JP2022015250A - Effluent quality prediction device and effluent quality prediction method

Info

Publication number: JP2022015250A
Application number: JP2020117960A
Authority: JP
Inventors: 充森本; Mitsuru Morimoto; 嵐黄; Lan Huang; 輝巳竹原; Terumi Takehara; 和雄清滝; Kazuo Kiyotaki; 章大久保; Akira Okubo; 輝尚寶珍; Teruhisa Hochin
Original assignee: Kyoto Institute of Technology NUC; Nissin Electric Co Ltd
Current assignee: Kyoto Institute of Technology NUC; Nissin Electric Co Ltd
Priority date: 2020-07-08
Filing date: 2020-07-08
Publication date: 2022-01-21

Abstract

To eliminate abnormal data included in time-series data from a plurality of sensors at a water treatment site. SOLUTION: In an effluent quality prediction device (100), a pre-processing unit (103a) filters a learning dataset, which is the portion of the learning data covering a first prescribed time. A model generation unit (103c) generates a prediction model for predicting effluent quality data a prescribed time after the learning dataset. Meanwhile, an effluent quality prediction unit (105) applies the prediction model to the latest data stored in a latest data database (104) and predicts effluent quality data. SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for predicting effluent quality based on time-series data from a plurality of sensors in a sewage treatment plant.

As a technique for predicting the effluent quality of a sewage treatment plant, there is the invention disclosed in Patent Document 1 below. Patent Document 1 relates to a technique for predicting a monitored quantity of plant equipment such as water treatment equipment.

The primary prediction unit uses a data group, a reference annual average model, and a reference seasonal average model to generate an annual average primary prediction model group and a seasonal average primary prediction model group, each including a neural model, a multiple regression model, and a clustering model. It then uses these two model groups to calculate, as prediction result data, the predicted value of the monitored quantity for each prediction model.

The secondary prediction unit generates a reference secondary prediction model using operation record data (annual record data and seasonal record data). Then, using the reference secondary prediction model, the prediction result data, and the operation record data, it generates a secondary prediction model for predicting the annual and seasonal average values of the monitored quantity.

Finally, the secondary prediction unit assigns weights to the predicted values of the monitored quantity (prediction result data) and calculates the predicted value of the monitored quantity using a neural network whose nodes are the plurality of weighted monitored quantities.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2013-161336 (published August 19, 2013)

However, Patent Document 1 does not describe a technique for removing abnormal data included in time-series data from a plurality of sensors in a sewage treatment plant.

An object of one aspect of the present invention is to provide an effluent quality prediction device capable of removing abnormal data included in time-series data from a plurality of sensors in a sewage treatment plant.

In order to solve the above problem, an effluent quality prediction device according to one aspect of the present invention includes: a sensor information acquisition unit that acquires time-series data of detected values from a plurality of sensors installed in a sewage treatment plant; a learning data database that stores the time-series data acquired by the sensor information acquisition unit as learning data; a latest data database that stores the time-series data acquired by the sensor information acquisition unit as latest data; a preprocessing unit that takes the learning data for a first predetermined time out of the learning data as a learning dataset and filters the learning dataset; a model generation unit that generates, from the learning dataset filtered by the preprocessing unit, a prediction model for predicting the value of effluent quality data after a second predetermined time; and an effluent quality prediction unit that predicts the value of the effluent quality data by applying the prediction model to the latest data.

Further, in order to solve the above problem, an effluent quality prediction method according to one aspect of the present invention includes the steps of: acquiring time-series data of detected values from a plurality of sensors installed in a sewage treatment plant; storing the acquired time-series data as learning data; storing the acquired time-series data as latest data; taking the learning data for a first predetermined time out of the learning data as a learning dataset and filtering the learning dataset; generating, from the filtered learning dataset, a prediction model for predicting the value of effluent quality data after a second predetermined time; and predicting the value of the effluent quality data by applying the prediction model to the latest data.

According to one aspect of the present invention, abnormal data included in time-series data from a plurality of sensors in a sewage treatment plant can be removed.

FIG. 1 is a functional block diagram showing a schematic configuration of an effluent quality prediction device according to one embodiment of the present invention.
FIG. 2 is a diagram showing an example of learning data after processing by the preprocessing unit.
FIG. 3 is a diagram for explaining steady-state data and non-steady-state data in effluent quality data.
FIG. 4 is a flowchart for explaining the processing procedure of the prediction model generation unit.
FIG. 5 is a diagram showing an example of adjustment items for the RNN hyperparameters used by the model generation unit.
FIG. 6 is a diagram schematically showing the generation of a prediction model.
FIG. 7 is a diagram showing a case where non-steady-state data is not predicted correctly.
FIG. 8 is a diagram showing a case where non-steady-state data is predicted correctly.
FIG. 9 is a flowchart for explaining the processing procedure of the preprocessing unit.
FIG. 10 is a flowchart for explaining the optimization of each filter parameter.
FIG. 11 is a diagram schematically showing the optimization of each filter parameter.
FIG. 12 is a flowchart for explaining the processing procedure of the non-steady-state data change unit.
FIG. 13 is a diagram for explaining the sampling of learning data.
FIG. 14 is a diagram for explaining a method of increasing the number of non-steady-state data.
FIG. 15 is a diagram for explaining another method of increasing the number of non-steady-state data.
FIG. 16 is a diagram showing the initially extracted non-steady-state data.
FIG. 17 is a diagram for explaining a method of reducing the number of non-steady-state data.
FIG. 18 is a diagram schematically showing the processing of the non-steady-state data change unit shown in FIG. 12.
FIG. 19 is a flowchart for explaining the processing procedure of the effluent quality prediction unit.
FIG. 20 is a diagram schematically showing the prediction of effluent quality data.
FIG. 21 is a graph showing the measured COD values and the COD values predicted by the effluent quality prediction unit when the number of non-steady-state data is not increased or decreased.
FIG. 22 is a graph showing the measured COD values and the COD values predicted by the effluent quality prediction unit when the number of non-steady-state data is increased or decreased.
FIG. 23 is a diagram showing a configuration example of a computer for realizing the effluent quality prediction device.

Hereinafter, one embodiment of the present invention will be described in detail. For convenience of explanation, identical members are given identical reference numerals, and their names and functions are also identical. Detailed descriptions of them are therefore not repeated.

(Configuration of effluent quality prediction device 100)
FIG. 1 is a functional block diagram showing a schematic configuration of an effluent quality prediction device 100 according to one embodiment of the present invention. The effluent quality prediction device 100 includes a sensor information acquisition unit 101, a learning data database 102, a prediction model generation unit 103, a latest data database 104, an effluent quality prediction unit 105, and an output unit 106.

Although not illustrated, the sewage treatment plant is equipped with sensors that measure data (detected values) such as pumped water volume, return flow rate, air supply volume, mixed liquor suspended solids (MLSS) concentration, dissolved oxygen (DO) concentration, chemical oxygen demand (COD), total nitrogen (TN), and total phosphorus (TP). The data from these sensors are sent to the effluent quality prediction device 100 as they are measured.

COD, TN, and TP are known as indicators of the effluent quality of a sewage treatment plant. Hereinafter, the data sent from the three sensors that measure COD, TN, and TP are referred to as effluent quality data. The data sent from all other sensors are referred to as sensor information data.

The sensor information acquisition unit 101 acquires the sensor information data and the effluent quality data, and generates learning data by attaching time information data (year, month, day, day of week/holiday, and time) to them. In the following description, the sensor information acquisition unit 101 acquires the sensor information data and the effluent quality data every minute, but the acquisition interval is not limited to this.
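The attachment of time information to each one-minute reading can be sketched as follows. This is only an illustration: the holiday table, the numeric encoding of the day of week/holiday field, the minutes-of-day encoding of the time, and the function name `add_time_info` are all assumptions, since the text does not specify how these fields are represented.

```python
from datetime import datetime

# Hypothetical holiday table; the source does not say how holidays are determined.
HOLIDAYS = {(2020, 7, 23), (2020, 7, 24)}

def add_time_info(timestamp, sensor_values):
    """Return one learning-data row: five time fields followed by the sensor values."""
    is_holiday = (timestamp.year, timestamp.month, timestamp.day) in HOLIDAYS
    # Assumed encoding: 0-6 for Monday-Sunday, 7 for a holiday.
    day_code = 7 if is_holiday else timestamp.weekday()
    # Assumed encoding: time of day as minutes since midnight.
    minutes_of_day = timestamp.hour * 60 + timestamp.minute
    time_fields = [timestamp.year, timestamp.month, timestamp.day,
                   day_code, minutes_of_day]
    return time_fields + list(sensor_values)

row = add_time_info(datetime(2020, 7, 8, 9, 30), [12.5, 3.1])
```

In a row produced this way, the first five entries are the time information data and the remaining entries are the readings in whatever order the sensors are listed.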

The learning data database 102 sequentially stores the learning data generated by the sensor information acquisition unit 101 for the period required for learning, for example, one year or more.

The prediction model generation unit 103 includes a preprocessing unit 103a, a non-steady-state data change unit 103b, a model generation unit 103c, a model evaluation unit 103d, and a model change unit 103e. The preprocessing unit 103a processes the learning data accumulated in the learning data database 102 into a format that can be used for learning.

FIG. 2 is a diagram showing an example of learning data after processing by the preprocessing unit 103a. Each row of learning data contains 5 items of time information data (year, month, day, day of week/holiday, and time), 15 items of sensor information data (pumped water volume, return flow rate, air supply volume, MLSS concentration, dissolved oxygen concentration, and so on), and 3 items of effluent quality data (COD, TN, and TP).

FIG. 3 is a diagram for explaining steady-state data and non-steady-state data in the effluent quality data. The alarm level is the threshold at which the effluent quality has deteriorated to the point that countermeasures such as an operation change are required. Steady-state data is the data for times when the effluent quality data is below the alarm level. Non-steady-state data is the data for times when the effluent quality data exceeds the alarm level.

When the steady-state data and the non-steady-state data are well balanced (that is, the non-steady-state data makes up a reasonably large share of the effluent quality data over the whole period), both can be predicted with high accuracy. When the share of non-steady-state data is small, however, the prediction accuracy for steady-state data is high but the prediction accuracy for non-steady-state data is low.
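As a minimal sketch of the distinction described above, the following labels each sample against the alarm level and computes the share of non-steady-state data. The readings and the alarm level of 12.0 are made-up values, the function names are hypothetical, and treating a sample exactly at the alarm level as steady-state is an assumption, because the text only defines "below" and "exceeds".

```python
def label_non_steady(values, alarm_level):
    """Return True for samples above the alarm level (non-steady-state), else False."""
    # A sample exactly at the alarm level is treated as steady-state (an assumption).
    return [v > alarm_level for v in values]

def non_steady_ratio(values, alarm_level):
    """Fraction of samples that are non-steady-state (0.0 for an empty series)."""
    labels = label_non_steady(values, alarm_level)
    return sum(labels) / len(labels) if labels else 0.0

cod = [8.2, 9.1, 12.4, 13.0, 9.8, 7.5, 7.9, 8.0]  # made-up COD readings
ratio = non_steady_ratio(cod, alarm_level=12.0)
```

A ratio computed this way is the kind of quantity that would be compared against a target share when deciding whether to add or remove non-steady-state data.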

The non-steady-state data change unit 103b extracts the non-steady-state data from the learning data generated by the preprocessing unit 103a, and changes the learning data by increasing or decreasing the number of non-steady-state data according to their share. The non-steady-state data change unit 103b will be described in detail later.

The model generation unit 103c performs re-learning at a predetermined update cycle using the learning data changed by the non-steady-state data change unit 103b, and generates a prediction model that follows operation changes at the sewage treatment plant. The model generation unit 103c will be described in detail later.

The model evaluation unit 103d evaluates the prediction model generated by the model generation unit 103c and, if the evaluation error is equal to or less than a target error, determines that the prediction model has been sufficiently trained. The model evaluation unit 103d will be described in detail later.

The model change unit 103e determines whether the effluent quality prediction unit 105 is currently executing prediction processing. If it is not, the model change unit 103e replaces the current prediction model with the one generated by the model generation unit 103c, making it the latest prediction model.

The latest data database 104 sequentially stores the learning data generated by the sensor information acquisition unit 101 only for the period required for prediction, for example, one hour or more. In other words, the latest data database 104 always holds at least the most recent hour of learning data as the latest data.

The effluent quality prediction unit 105 includes a preprocessing unit 105a, a prediction processing unit 105b, and a post-processing unit 105c. The preprocessing unit 105a processes the latest data accumulated in the latest data database 104 into a format that can be used for prediction.

The prediction processing unit 105b predicts the effluent quality after a predetermined time (for example, several hours) using the latest data generated by the preprocessing unit 105a and the prediction model updated by the model change unit 103e. Details will be described later. The following description mainly illustrates the case where the predetermined time is 2.5 hours. In this specification, this predetermined time is also referred to as the "second predetermined time".

The post-processing unit 105c converts the effluent quality predicted by the prediction processing unit 105b into a format that can be output.

The output unit 106 outputs the various information generated by the effluent quality prediction unit 105 to a control device, a display device, and the like (not illustrated) of the sewage treatment plant.

(Processing procedure of prediction model generation unit 103)
FIG. 4 is a flowchart for explaining the processing procedure of the prediction model generation unit 103. First, the preprocessing unit 103a acquires the learning data accumulated in the learning data database 102 (step S1). The preprocessing unit 103a then interpolates missing values and removes abnormal values in the learning data (step S2).

Here, a missing value is a value recorded while the measurement is not being output correctly, for example during sensor inspection. An abnormal value is a value that is wrong because of a sensor failure or the like. Both are identified, for example, by checking whether the value falls outside a certain range. The processing in step S2 will be described in detail later.
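The range check mentioned above can be illustrated with a short sketch. Here a sample is treated as invalid when it is absent or falls outside an allowed range, and invalid samples are replaced by linear interpolation between the nearest valid neighbours. The use of linear interpolation, the function name `clean_series`, and the range limits are assumptions made for illustration; the text defers the actual step S2 processing to a later section.

```python
import math

def clean_series(values, lower, upper):
    """Replace missing or out-of-range samples by linear interpolation."""
    valid = [v is not None and not math.isnan(v) and lower <= v <= upper
             for v in values]
    known = [i for i, ok in enumerate(valid) if ok]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    cleaned = list(values)
    for i, ok in enumerate(valid):
        if ok:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:          # gap at the start: copy the first valid value
            cleaned[i] = values[nxt]
        elif nxt is None:         # gap at the end: copy the last valid value
            cleaned[i] = values[prev]
        else:                     # interior gap: interpolate linearly
            w = (i - prev) / (nxt - prev)
            cleaned[i] = values[prev] + w * (values[nxt] - values[prev])
    return cleaned

cleaned = clean_series([1.0, None, 3.0, 999.0, 5.0], lower=0.0, upper=100.0)
```

In this example the missing second sample and the out-of-range fourth sample are both replaced by the midpoint of their neighbours.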

Next, the non-steady-state data change unit 103b calculates the share of non-steady-state data in the learning data preprocessed by the preprocessing unit 103a. It then changes the learning data by increasing or decreasing the number of non-steady-state data so that this share reaches a predetermined value (step S3). The processing in step S3 will be described in detail later.

Next, the model generation unit 103c creates a plurality of learning datasets from the learning data changed by the non-steady-state data change unit 103b. As shown in FIG. 2, the 24 hours (1440 points) of learning data enclosed by the thick line form one learning dataset. The 24 hours of learning data shifted down by one point (one row) form the next learning dataset. Learning datasets are created one after another in this way, each shifted down by one point (one row). In this specification, the time span of a learning dataset is also referred to as the "first predetermined time".

In the following, 24 hours (1440 points) of learning data are described as one learning dataset, but this is only an example, and learning data covering a different time span (number of points) may be used as the learning dataset.

The model generation unit 103c then creates a prediction model. In the present embodiment, a learning method for a recurrent neural network (hereinafter, RNN), which is well suited to processing time-series data, is used, and the prediction model is created with the current hyperparameters.

FIG. 5 is a diagram showing an example of adjustment items for the RNN hyperparameters used by the model generation unit 103c. The hyperparameters include the time span of the learning dataset, the number of epochs, the optimization function, the weight decay coefficient, the learning rate multiplier, the learning rate update interval, the number of nodes, and so on. The model generation unit 103c generates prediction models while adjusting the values of these hyperparameters in turn.

Each hyperparameter item has a defined adjustment range, shown on the right side of FIG. 5, and can be set to any of the values in that range. A prediction model can therefore be created for every combination of values across the adjustment ranges. The hyperparameters are not limited to the items shown in FIG. 5 and may also include the number of layers, the number of latent variables, the activation function, and the like.
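Enumerating every combination of adjustment values amounts to a Cartesian product over the per-item ranges. The sketch below shows this with `itertools.product`. The item names and the candidate values are hypothetical placeholders, because the actual ranges of FIG. 5 are not reproduced in the text.

```python
from itertools import product

# Hypothetical adjustment ranges; the real values in FIG. 5 are not given here.
search_space = {
    "window_hours": [12, 24],
    "epochs": [50, 100],
    "learning_rate_multiplier": [0.1, 0.5],
    "nodes": [32, 64, 128],
}

def all_combinations(space):
    """Return one dict per combination of hyperparameter values."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in product(*(space[k] for k in keys))]

combos = all_combinations(search_space)
```

With these placeholder ranges there are 2 x 2 x 2 x 3 = 24 candidate settings, which illustrates how quickly an exhaustive search grows as items are added.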

The model generation unit 103c adjusts the hyperparameters by selecting a value for each item shown in FIG. 5 and applying it to the RNN structure. It then applies an RNN machine learning method, for example the BPTT (Back Propagation Through Time) method, and creates a prediction model using the adjusted hyperparameters (step S4).

It is known that an RNN does not always reflect past intermediate-layer outputs easily. To mitigate this problem, a network called LSTM (Long Short Term Memory) can be used. LSTM introduces the notion of how long the intermediate-layer output of the RNN is remembered, so the influence of outputs from the distant past can be retained.

A method called GRU (Gated Recurrent Unit) can also be used. In a GRU, the input gate and the forget gate are merged into a single update gate. A GRU makes it easier to retain the features of events that occurred many steps earlier.

Next, the model generation unit 103c divides the plurality of learning datasets into training data, verification data, and evaluation data. Using the training data and the verification data, the model generation unit 103c performs learning and verification with the prediction model created in step S4, and adjusts the network parameters so that the predicted values approach the measured values in the learning data (step S5).

This learning and verification will be described in detail with reference to FIG. 2 again. The model generation unit 103c uses the learning dataset enclosed by the thick line in FIG. 2 as training data and the COD value 2.5 hours later as verification data. The model generation unit 103c sets the training data at the input layer of the prediction model created in step S4 and sets the COD value 2.5 hours later as the verification data at the output layer. The model generation unit 103c updates the prediction model by re-learning with a predetermined learning method (such as the BPTT method, the LSTM method, or the GRU method). The verification data is the COD value 2.5 hours after the most recent learning data in the learning dataset.
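The pairing of each learning dataset with its COD value 2.5 hours later can be sketched as a sliding window over the rows. At one row per minute, 24 hours correspond to 1440 rows and 2.5 hours to 150 rows. The function name `make_samples` and the in-memory list representation are assumptions made for illustration.

```python
def make_samples(rows, cod_values, window_len=1440, horizon=150):
    """Pair each window of rows with the COD value `horizon` rows after its last row.

    Windows are shifted by one row at a time, as described for FIG. 2.
    """
    samples = []
    last_start = len(rows) - window_len - horizon
    for start in range(last_start + 1):
        end = start + window_len             # exclusive end of the window
        target_index = end - 1 + horizon     # horizon rows after the newest row
        samples.append((rows[start:end], cod_values[target_index]))
    return samples

# Small demonstration with a 3-row window and a 2-row horizon.
demo_rows = list(range(10))
demo_cod = [10 * i for i in range(10)]
demo = make_samples(demo_rows, demo_cod, window_len=3, horizon=2)
```

Each pair produced this way holds the window that would be set at the input layer and the value that the text calls verification data at the output layer.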

FIG. 6 is a diagram schematically showing the generation of a prediction model. As learning data 201, the 23 items in each row of the learning dataset (5 items of time information data, 15 items of sensor information data, and 3 items of effluent quality data) are fed to the input layer of the RNN layer, and the COD value 2.5 hours later is set at the output layer as verification data. Alternatively, only 20 items (the 5 items of time information data and the 15 items of sensor information data) may be fed to the input layer.

The RNN layer 202 is a recurrent neural network, so the evolution of the intermediate layer over time must be taken into account. For this reason, the error backpropagation (BP) method cannot be applied to the RNN layer 202 as it is. Instead, a fully connected layer 203 is created by unrolling the RNN layer in the time direction through the intermediate-layer outputs, which yields a prediction model 204 in a form to which the BP method can be applied. This is what makes the so-called BPTT method applicable. A learning method such as the LSTM method or the GRU method described above may also be used.

The items in each row of the learning data 201 are fed to this prediction model 204 in sequence to compute a predicted value, and learning proceeds by adjusting the parameters (weights and biases) with the BPTT method or the like so that the predicted value approaches the verification data (the measured COD value 2.5 hours later). Re-learning is then repeated while shifting the rows of the learning dataset, and the prediction model 204 is updated.
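To make the idea of unrolling in time concrete, the following sketch runs the forward pass of a single-layer Elman-style RNN: the same weights are applied at every time step, and the hidden state carries information from earlier rows to later ones. It covers only the forward computation; the gradient step of BPTT is omitted. The tanh activation, the zero initial state, the scalar output, and all names are assumptions, not details given in the text.

```python
import math

def rnn_forward(inputs, w_in, w_rec, w_out, bias_h, bias_out):
    """Unroll a one-layer Elman RNN over `inputs` and return the final scalar output.

    inputs: list of rows (one list of features per time step)
    w_in:   hidden x features weights, w_rec: hidden x hidden weights
    w_out:  hidden weights for the output, bias_h: hidden biases
    """
    hidden = [0.0] * len(bias_h)              # assumed zero initial state
    for x in inputs:                          # one unrolled copy of the layer per step
        hidden = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + bias_h[j]
            )
            for j in range(len(hidden))
        ]
    return sum(w_out[j] * hidden[j] for j in range(len(hidden))) + bias_out

# Two time steps, one feature, one hidden unit.
prediction = rnn_forward([[0.5], [0.0]], [[1.0]], [[1.0]], [1.0], [0.0], 0.0)
```

Training would compare such an output with the measured value 2.5 hours later and propagate the error back through every unrolled step, which is the part a library implementation normally handles.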

The example in FIG. 6 describes the case of learning COD. To learn TN instead, the measured TN value 2.5 hours later is used as the verification data. Likewise, to learn TP, the measured TP value 2.5 hours later is used as the verification data. To learn all three effluent quality items (COD, TN, and TP) at once, the measured values of these three items are used as the verification data.

The above example illustrates the case where the predetermined time is 2.5 hours. The predetermined time is not limited to this, however, and may be any length of time.

Next, the model evaluation unit 103d makes predictions on the evaluation data using the prediction model adjusted in step S5, and compares the predicted values with the measured values (step S6). Here, the model evaluation unit 103d calculates the absolute difference between each predicted value and the corresponding measured value as an error, and stores the maximum error as the evaluation error.
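The evaluation error described above is the largest absolute difference between prediction and measurement over the evaluation data. A minimal sketch, with a hypothetical function name and made-up values:

```python
def evaluation_error(predicted, measured):
    """Return the maximum absolute difference between predicted and measured values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return max(abs(p - m) for p, m in zip(predicted, measured))

error = evaluation_error([1.0, 2.0, 3.0], [1.5, 2.0, 1.0])
```

Because the maximum is used rather than an average, a single badly predicted point is enough to push the evaluation error above the target error.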

In addition, when a measured value is non-steady-state data, the model evaluation unit 103d determines whether the predicted value is also correctly non-steady-state data. It then calculates the predicted correct answer rate as the number of non-steady-state locations that were correctly predicted as non-steady-state, divided by the total number of non-steady-state locations.

FIG. 7 is a diagram showing a case where non-steady-state data is not predicted correctly. FIG. 7 shows one non-steady-state location that contains four non-steady-state data points. Because all of the predicted data are steady-state data, the model evaluation unit 103d judges the prediction to be incorrect. In the figure, 〇 marks a location that is predicted correctly and × marks a location that is not.

FIG. 8 is a diagram showing a case where non-steady-state data is predicted correctly. FIG. 8 likewise shows one non-steady-state location that contains four non-steady-state data points. Because at least one of the predicted data points is predicted as non-steady-state data, the model evaluation unit 103d judges the prediction to be correct.

In this way, the model evaluation unit 103d treats a run of consecutive non-stationary data as one non-stationary location, judges whether the prediction for that location is correct, and counts the non-stationary locations that were predicted correctly. This count divided by the total number of non-stationary locations is the predicted correct answer rate.
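The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function name `nonstationary_hit_rate` and the `alarm_level` argument are hypothetical, and the sketch assumes that a point is non-stationary when its value exceeds the alarm level.

```python
def nonstationary_hit_rate(measured, predicted, alarm_level):
    """Share of non-stationary locations that the prediction catches.

    A location is a run of consecutive measured points above alarm_level
    (assumption). It counts as correct when at least one predicted point
    inside the run is also above alarm_level.
    """
    locations = 0
    hits = 0
    in_run = False
    run_hit = False
    for m, p in zip(measured, predicted):
        if m > alarm_level:
            if not in_run:          # a new non-stationary location starts
                in_run = True
                run_hit = False
                locations += 1
            if p > alarm_level and not run_hit:
                run_hit = True      # count each location at most once
                hits += 1
        else:
            in_run = False
    return hits / locations if locations else None
```

With two non-stationary locations of which only one is caught, the function returns 0.5; when the measured data contains no non-stationary location it returns `None` rather than dividing by zero.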

All input items of the 24-hour learning data set may be used as the evaluation data. Alternatively, the input items for the several hours preceding the most recent time in the learning data set may be extracted and used as the evaluation data.

Next, the model evaluation unit 103d determines whether the evaluation error is equal to or less than the target error (step S7). The target error is a predetermined value used to judge whether the prediction model 204 has been trained sufficiently.

If the evaluation error is larger than the target error (S7, No), the process proceeds to step S9. If the evaluation error is equal to or less than the target error (S7, Yes), the model evaluation unit 103d determines whether the predicted correct answer rate is equal to or greater than the target correct answer rate (step S8). If the predicted correct answer rate is smaller than the target correct answer rate (S8, No), the process proceeds to step S9.
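As a minimal sketch of steps S6 to S8, the two checks can be written as follows. The function names are hypothetical and chosen only for illustration.

```python
def evaluation_error(predicted, measured):
    """S6: largest absolute difference between prediction and measurement."""
    return max(abs(p - m) for p, m in zip(predicted, measured))


def model_is_acceptable(eval_error, hit_rate, target_error, target_rate):
    """S7 then S8: error at or below target AND correct answer rate at or above target."""
    return eval_error <= target_error and hit_rate >= target_rate
```

A model that fails either check falls through to step S9, which is why both conditions are combined with a logical AND.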

In step S9, the model evaluation unit 103d determines whether this is the second or a later update of the prediction processing unit 105b. If it is (S9, Yes), the model evaluation unit 103d determines whether a predetermined update time has been exceeded (step S10). The predetermined update time is the time after which the process is terminated if the evaluation error has still not become small enough, and is set to, for example, several days.

If the predetermined update time has been exceeded (S10, Yes), the model evaluation unit 103d determines that a prediction model 204 with an evaluation error at or below the target error and a predicted correct answer rate at or above the target correct answer rate cannot be created from the current learning data, and ends the process without updating the prediction model 204 (step S11).

If the predetermined update time has not been exceeded (S10, No), the model generation unit 103c and the model evaluation unit 103d return to step S4 and repeat the processing of S4 to S8. Here, the model generation unit 103c waits for the first set time and then performs the processing from step S3 onward. The first set time is the time required for one round of learning; it depends on the amount of data and is set to, for example, half a day or one day.

By repeating steps S4 to S8, the model generation unit 103c and the model evaluation unit 103d successively create prediction models 204 with the various hyperparameter combinations shown in FIG. 5, train each created prediction model 204 on the learning data set, and successively update the prediction model 204.

If the predicted correct answer rate is equal to or greater than the target correct answer rate (S8, Yes), the model evaluation unit 103d stores the prediction model 204 in the model change unit 103e (step S12). This prediction model 204 can be regarded as close to optimal among the prediction models generated with the various hyperparameter combinations.

Next, the model change unit 103e determines whether the effluent quality prediction unit 105 is currently executing the prediction process (step S13). If it is (S13, Yes), the model change unit 103e checks the state of the effluent quality prediction unit 105 at every second set time. The second set time is the time required for the effluent quality prediction unit 105 to carry out one prediction process, and is set to, for example, several minutes.

If the effluent quality prediction unit 105 is not executing the prediction process (S13, No), the model change unit 103e replaces the prediction model 204 used by the prediction processing unit 105b with the latest prediction model 204 (step S14) and ends the process.

The predetermined update time, the first set time, and the second set time can each be set to any value; appropriate values are to be chosen in response to, for example, a change in the number of sensors installed at the sewage treatment plant or a change in the amount of learning data.

FIG. 9 is a flowchart for explaining the processing procedure of the preprocessing unit 103a (step S2 in FIG. 4). This processing removes abnormal data contained in the learning data by filtering the learning data set.

First, when the learning data contains a missing value, the preprocessing unit 103a interpolates it using the values of the same item in the learning data at the preceding and following times. Alternatively, the preprocessing unit 103a may interpolate the missing value using the value of the same item at the same time on the same day of the previous week (S61). The preprocessing unit 103a then adds time information data to the interpolated learning data (S62).
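The two interpolation options in S61 might look like the sketch below. It makes assumptions the text does not state: missing values are marked with `None`, "values before and after" is read as linear interpolation between the nearest known neighbours, and `period` is a hypothetical argument giving the number of samples in one week.

```python
def fill_missing(values, period=None):
    """Fill None entries in one sensor item (illustrative only).

    If period is given and the value one period earlier is known, copy it
    (previous-week option). Otherwise interpolate linearly between the
    nearest known neighbours, or copy the only available neighbour.
    """
    filled = list(values)
    n = len(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        if period is not None and i - period >= 0 and values[i - period] is not None:
            filled[i] = values[i - period]
            continue
        prev_i = next((j for j in range(i - 1, -1, -1) if values[j] is not None), None)
        next_i = next((j for j in range(i + 1, n) if values[j] is not None), None)
        if prev_i is not None and next_i is not None:
            w = (i - prev_i) / (next_i - prev_i)
            filled[i] = values[prev_i] + w * (values[next_i] - values[prev_i])
        elif prev_i is not None:
            filled[i] = values[prev_i]
        elif next_i is not None:
            filled[i] = values[next_i]
    return filled
```

For example, `fill_missing([1.0, None, 3.0])` fills the gap with the midpoint of its neighbours, while passing `period` makes the function prefer the value recorded one period earlier.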

Next, the preprocessing unit 103a acquires the number N of filter parameters to be optimized (S63). N depends on the digital filter applied; this embodiment uses a Butterworth filter, which is a type of low-pass filter. The Butterworth filter has four parameters (N = 4): the passband frequency $F_p$ and its gain $G_p$, and the stopband frequency $F_s$ and its gain $G_s$. The Butterworth filter is described below, but other digital filters may be used.
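To show how four such specifications pin down a Butterworth filter, the sketch below applies the standard analog design formula for the minimum filter order. It is not taken from the patent: the function names are hypothetical, the gains are assumed to be given as positive dB values (maximum passband loss and minimum stopband attenuation), and no digital frequency warping is applied.

```python
import math


def butterworth_min_order(f_pass, f_stop, loss_pass_db, atten_stop_db):
    """Smallest analog Butterworth order meeting the four specifications.

    Assumes 0 < f_pass < f_stop and positive dB figures for the allowed
    passband loss and the required stopband attenuation.
    """
    if not 0 < f_pass < f_stop:
        raise ValueError("need 0 < f_pass < f_stop")
    num = 10 ** (atten_stop_db / 10) - 1
    den = 10 ** (loss_pass_db / 10) - 1
    return math.ceil(math.log10(num / den) / (2 * math.log10(f_stop / f_pass)))


def butterworth_gain_db(f, f_cutoff, order):
    """Gain in dB of an analog Butterworth low-pass filter at frequency f."""
    return -10 * math.log10(1 + (f / f_cutoff) ** (2 * order))
```

For a passband edge of 1, a stopband edge of 2, a 3 dB passband loss, and a 40 dB stopband attenuation, the formula gives order 7. In practice a signal-processing library would normally be used for the design; the point here is only that the four parameters fully determine the filter.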

Next, the preprocessing unit 103a acquires the preset maximum value ($Max_i$) and minimum value ($Min_i$) of each filter parameter to be optimized (S64). Here $i = 1, 2, \ldots, N$, with N = 4 for the Butterworth filter.

Next, the preprocessing unit 103a sets N + 1 combinations of random values, each value lying between the minimum ($Min_i$) and the maximum ($Max_i$) of the corresponding parameter (S65). These random parameter sets are denoted $P_1 = (F_{p1}, G_{p1}, F_{s1}, G_{s1})$, $P_2 = (F_{p2}, G_{p2}, F_{s2}, G_{s2})$, ..., $P_5 = (F_{p5}, G_{p5}, F_{s5}, G_{s5})$.
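A minimal sketch of step S65, assuming uniform sampling (the text only says "random"); the function name `random_parameter_sets` and the `bounds` layout are hypothetical.

```python
import random


def random_parameter_sets(bounds, rng=None):
    """Draw N + 1 parameter sets, one value per (min, max) pair in bounds."""
    rng = rng or random.Random()
    n = len(bounds)                      # N parameters
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        for _ in range(n + 1)            # N + 1 sets
    ]
```

With four `(min, max)` pairs for $F_p$, $G_p$, $F_s$, and $G_s$, the function returns the five starting sets $P_1$ to $P_5$. Passing a seeded `random.Random` makes the draw reproducible.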

Next, the preprocessing unit 103a filters the learning data set with each parameter set $P_j$ ($j = 1, 2, \ldots, N+1$) to create five learning data sets. The model generation unit 103c then trains the prediction model created from the hyperparameters on each of the five learning data sets, producing five prediction models (S66). The learning data set used first is, as described above, for example 24 hours of learning data. The model generation unit 103c creates the prediction model using optimal hyperparameters prepared in advance.

Next, the model evaluation unit 103d inputs the evaluation data to each of the five prediction models trained by the model generation unit 103c and, for each model, calculates the absolute value of the difference between the predicted value and the measured value (the error). The model evaluation unit 103d then obtains the maximum error for each of the five prediction models. The function that returns the maximum error of a prediction model is taken as the objective function $f$. As described above, the measured value is, for example, the value of COD (TN, TP) 2.5 hours later.

The preprocessing unit 103a ranks the five parameter sets using the maximum errors $f(P_1), f(P_2), \ldots, f(P_{N+1})$ obtained by the model evaluation unit 103d (S67). After ranking, the maximum errors satisfy $f(P_1) \le f(P_2) \le f(P_3) \le f(P_4) \le f(P_5)$. Letting $m$ ($1 \le m$) be the iteration count of the optimization, the parameter sets after the $m$-th ranking are $P_1^{(m)}$, $P_2^{(m)}$, $P_3^{(m)}$, $P_4^{(m)}$, and $P_5^{(m)}$. Each of these parameter sets is called a point; the point $P_1^{(m)}$ is called the best point $B^{(m)}$, the point $P_4^{(m)}$ the second-worst point $V^{(m)}$, and the point $P_5^{(m)}$ the worst point $W^{(m)}$.
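The ranking in S67 amounts to sorting the points by their objective value. In the sketch below, `rank_points` is a hypothetical name, and `f` stands in for the whole filter, train, and evaluate pipeline that yields the maximum error.

```python
def rank_points(points, f):
    """Sort points by ascending objective value and pick out B, V, and W."""
    ranked = sorted(points, key=f)
    best = ranked[0]            # B: smallest maximum error
    second_worst = ranked[-2]   # V
    worst = ranked[-1]          # W: largest maximum error
    return ranked, best, second_worst, worst
```

Because `f` is expensive in this setting (each call trains a model), a real implementation would probably cache the objective values instead of recomputing them during the sort.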

Finally, the preprocessing unit 103a optimizes each parameter using the best point $B^{(m)}$, the second-worst point $V^{(m)}$, and the worst point $W^{(m)}$ (S68). This embodiment uses, for example, the Nelder-Mead method as the optimization method, but the method is not limited to this and may be the steepest descent method, a genetic algorithm, or the like.

FIG. 10 is a flowchart for explaining the processing procedure for optimizing each filter parameter (step S68 in FIG. 9). The initial value of $m$ is taken to be 0. First, the preprocessing unit 103a increments $m$ ($m = m + 1$) and obtains the centroid $G^{(m)} = \frac{1}{4}\sum_{j=1}^{4} P_j^{(m)}$ of the four points $P_j^{(m)}$ ($j = 1, 2, 3, 4$) other than the point $W^{(m)}$ (S71).

Next, the preprocessing unit 103a obtains the point $R^{(m)}$ that externally divides the line segment $W^{(m)}G^{(m)}$ in the ratio 2:1 (S72).

$$R^{(m)} = 2G^{(m)} - W^{(m)} \tag{2}$$

Next, the preprocessing unit 103a creates the parameter sets corresponding to $R^{(m)}$ and $B^{(m)}$. The model generation unit 103c creates the prediction models corresponding to $R^{(m)}$ and $B^{(m)}$, and the model evaluation unit 103d evaluates these prediction models to calculate the respective maximum errors $f(R^{(m)})$ and $f(B^{(m)})$. Hereinafter, this series of steps is referred to simply as "calculating the value of the objective function $f$".

The preprocessing unit 103a determines whether $f(R^{(m)})$ is equal to or less than $f(B^{(m)})$ (S73). If it is (S73, Yes), the preprocessing unit 103a obtains the point $E^{(m)}$ that externally divides the line segment $W^{(m)}G^{(m)}$ in the ratio 3:2 (S74).

$$E^{(m)} = 3G^{(m)} - 2W^{(m)} \tag{3}$$

Next, the value of $f(E^{(m)})$ is calculated, and the preprocessing unit 103a determines whether $f(E^{(m)})$ is equal to or less than $f(R^{(m)})$ (S75). If it is (S75, Yes), the point $W^{(m)}$ is deleted and the point $E^{(m)}$ is added. The worst point $W^{(m)}$ is then updated to $V^{(m)}$ (S76), and the process proceeds to step S84. If $f(E^{(m)})$ is larger than $f(R^{(m)})$ (S75, No), the process proceeds to step S77.

If $f(R^{(m)})$ is larger than $f(B^{(m)})$ (S73, No), the value of $f(V^{(m)})$ is calculated, and the preprocessing unit 103a determines whether $f(R^{(m)})$ is equal to or less than $f(V^{(m)})$ (S78). If it is (S78, Yes), the process proceeds to step S77.

In step S77, the preprocessing unit 103a deletes the point $W^{(m)}$ and adds the point $R^{(m)}$. The worst point $W^{(m)}$ is then updated to $V^{(m)}$, and the process proceeds to step S84.

If $f(R^{(m)})$ is larger than $f(V^{(m)})$ (S78, No), the preprocessing unit 103a obtains the midpoint $S^{(m)}$ of the line segment $W^{(m)}G^{(m)}$ (S79).

$$S^{(m)} = \tfrac{1}{2}G^{(m)} + \tfrac{1}{2}W^{(m)} \tag{4}$$

Then the values of $f(S^{(m)})$ and $f(W^{(m)})$ are calculated, and the preprocessing unit 103a determines whether $f(S^{(m)})$ is equal to or less than $f(W^{(m)})$ (S80). If it is (S80, Yes), the preprocessing unit 103a deletes the point $W^{(m)}$, adds the point $S^{(m)}$, and the process proceeds to step S83.

If $f(S^{(m)})$ is larger than $f(W^{(m)})$ (S80, No), the preprocessing unit 103a obtains the midpoint $P_j^{(m)\prime}$ of each line segment $B^{(m)}P_j^{(m)}$ ($j = 2, 3, 4, 5$), updates each point $P_j^{(m)}$ to $P_j^{(m)\prime}$ (S82), and the process proceeds to step S83.

$$P_j^{(m)\prime} = \tfrac{1}{2}P_j^{(m)} + \tfrac{1}{2}B^{(m)} \tag{5}$$

In step S83, the preprocessing unit 103a ranks the points using the objective function $f$, and the process proceeds to step S84.

In step S84, the preprocessing unit 103a determines whether a preset termination condition is satisfied. Examples of the termination condition include: whether the optimization iteration count $m$ exceeds a preset upper limit $q$; whether the optimization time exceeds a preset upper limit; whether the number of iterations in which the best point $B^{(m)}$ did not change exceeds a preset upper limit; whether $f(B^{(m)})$ falls below a preset threshold; and whether expression (6) falls below a preset threshold.

If the termination condition is not satisfied (S84, No), the process returns to step S71 and the subsequent processing is repeated. If it is satisfied (S84, Yes), the preprocessing unit 103a adopts the parameter combination of the point $B^{(m)}$ (S85). That is, the preprocessing unit 103a filters the original learning data set using the parameter combination of the point $B^{(m)}$, and the processing from step S3 of FIG. 4 onward is performed on the filtered learning data set.
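The flow of steps S71 to S85 can be sketched as a small Nelder-Mead loop. This is an illustration under stated assumptions rather than the patented implementation: `nelder_mead_step` and `optimize` are hypothetical names, points are plain tuples, `f` stands for the expensive objective (filter, train, evaluate, take the maximum error), the expansion uses Eq. (3) as printed, and the points are re-sorted after every branch for simplicity, whereas the flowchart re-ranks only in S83.

```python
def nelder_mead_step(points, f):
    """One pass of S71 to S83 over a list of N + 1 points (tuples)."""
    pts = sorted(points, key=f)
    best, second_worst, worst = pts[0], pts[-2], pts[-1]
    others = pts[:-1]
    dim = len(worst)
    # S71: centroid of every point except the worst one
    g = tuple(sum(p[i] for p in others) / len(others) for i in range(dim))
    # S72, Eq. (2): reflection of W through G
    r = tuple(2 * gi - wi for gi, wi in zip(g, worst))
    if f(r) <= f(best):                                   # S73
        # S74, Eq. (3): expansion
        e = tuple(3 * gi - 2 * wi for gi, wi in zip(g, worst))
        new_pts = others + [e if f(e) <= f(r) else r]     # S75: S76 or S77
    elif f(r) <= f(second_worst):                         # S78
        new_pts = others + [r]                            # S77
    else:
        # S79, Eq. (4): midpoint of W and G
        s = tuple(0.5 * gi + 0.5 * wi for gi, wi in zip(g, worst))
        if f(s) <= f(worst):                              # S80
            new_pts = others + [s]
        else:
            # S82, Eq. (5): move every other point halfway toward B
            new_pts = [best] + [
                tuple(0.5 * pi + 0.5 * bi for pi, bi in zip(p, best))
                for p in pts[1:]
            ]
    return sorted(new_pts, key=f)                         # re-rank


def optimize(points, f, max_iter=200, threshold=1e-9):
    """Repeat the step until an S84-style stop condition holds."""
    pts = sorted(points, key=f)
    for _ in range(max_iter):           # upper limit q on the iteration count
        if f(pts[0]) < threshold:       # f(B) below a preset threshold
            break
        pts = nelder_mead_step(pts, f)
    return pts[0]                       # S85: adopt the best point
```

The best point is never discarded in any branch, so $f(B^{(m)})$ cannot increase from one iteration to the next. Since every call to `f` would train a model in this setting, caching objective values per point would be a natural refinement.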

If the non-stationary data change processing by the non-stationary data change unit 103b (step S3 in FIG. 4) is unnecessary, step S3 may be skipped and the process may proceed directly to step S4.

FIG. 11 schematically shows the optimization of the filter parameters. To simplify the explanation, FIG. 11 shows the case N = 2 (two dimensions). G11 shows the best point $B^{(1)}$, the second-worst point $V^{(1)}$, and the worst point $W^{(1)}$ for $m = 1$.

G12 shows the case $m = 2$: the point added by the processing of FIG. 10 becomes the best point $B^{(2)}$, the best point $B^{(1)}$ of G11 becomes the second-worst point $V^{(2)}$, the second-worst point $V^{(1)}$ of G11 becomes the worst point $W^{(2)}$, and the worst point $W^{(1)}$ of G11 is deleted. The processing of FIG. 10 is then repeated in the same way.

G13 shows the best point $B^{(q-1)}$, the second-worst point $V^{(q-1)}$, and the worst point $W^{(q-1)}$ for $m = q - 1$, where $q$ is the upper limit set for $m$.

G14 shows the best point $B^{(q)}$, the second-worst point $V^{(q)}$, and the worst point $W^{(q)}$ for $m = q$. By repeating the processing of FIG. 10 up to the upper limit $q$ in this way, the error gradually decreases and the optimum parameters can be obtained.

FIG. 12 is a flowchart for explaining the processing procedure of the non-stationary data change unit 103b (step S3 in FIG. 4). First, the non-stationary data change unit 103b samples the 1-minute learning data processed by the preprocessing unit 103a to create, for example, learning data at 10-minute intervals.

From the 10-minute data, the non-stationary data change unit 103b extracts the non-stationary data, that is, the data whose sensor value exceeds the alarm level. It then obtains the total number of extracted non-stationary data points and the ratio P of that total to all the data (step S31).

FIG. 13 is a diagram for explaining the sampling of the learning data. The following description deals with non-stationary data of COD, which is one of the discharged water quality data items, but the same applies to non-stationary data of the other discharged water quality data items (TN, TP).

FIG. 13 shows COD sensor data at 1-minute intervals; the non-stationary data change unit 103b extracts the fifth point out of every ten points of sensor data. FIG. 13 shows only the non-stationary COD data that exceeds the alarm level, but the steady-state COD data is handled in the same way: the fifth point out of every ten is extracted to generate 10-minute data. In FIG. 13, the 1-minute COD sensor data is labeled as sampling data and the 10-minute COD sensor data is labeled as initial extraction data.
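The extraction in FIG. 13 can be sketched as picking one fixed position out of each block of samples. The helper name `downsample` and its arguments are hypothetical, and trailing samples that do not fill a whole block are simply dropped in this sketch.

```python
def downsample(samples, block=10, pick=5):
    """Keep the pick-th sample (1-based) of every complete block."""
    return [
        samples[start + pick - 1]
        for start in range(0, len(samples) - block + 1, block)
    ]
```

Applied to 30 one-minute samples numbered 1 to 30, the defaults keep samples 5, 15, and 25, which corresponds to one value every 10 minutes.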

The preset items that the non-stationary data change unit 103b uses when increasing or decreasing the non-stationary data are as follows. $P_{MAX}$ is the maximum ratio of non-stationary data and $P_{MIN}$ is the minimum ratio. $\mu$ is an adjustment coefficient for the increase/decrease width of the ratio of non-stationary data, and $N_{MIN}$ is the minimum increase/decrease width of that ratio.

Letting $P0^{(k)}$ be the minimum ratio of non-stationary data at the $k$-th selection and $P3^{(k)}$ the maximum ratio at the $k$-th selection, the increase/decrease width of the ratio at the $k$-th selection is $N^{(k)} = \mu \times (P3^{(k)} - P0^{(k)})$.

Based on the idea that the optimal ratio of non-stationary data lies between the preset minimum ratio $P_{MIN}$ and maximum ratio $P_{MAX}$, an optimization method (nonlinear programming) is used to find the optimal ratio of non-stationary data.

The non-stationary data change unit 103b sets the calculation count $k$ to 1 (step S32), assigns $P_{MIN}$ to the first minimum ratio $P0^{(k)}$ and $P_{MAX}$ to the first maximum ratio $P3^{(k)}$ (step S33), and assigns the value of $\mu \times (P3^{(k)} - P0^{(k)})$ to the $k$-th increase/decrease width $N^{(k)}$ (step S34).

Next, the non-stationary data change unit 103b assigns $P0^{(k)} + N^{(k)}$ to the ratio $P1^{(k)}$ and $P3^{(k)} - N^{(k)}$ to the ratio $P2^{(k)}$ (step S35). It then creates the learning data set D1 by increasing the non-stationary data so that the ratio becomes $P1^{(k)}$, and creates the learning data set D2 by decreasing the non-stationary data so that the ratio becomes $P2^{(k)}$ (steps S36, S37). In doing so, the non-stationary data change unit 103b increases or decreases the non-stationary data relative to the ratio P calculated in step S31 to create the learning data sets D1 and D2.

FIG. 14 is a diagram for explaining a method of increasing the number of non-stationary data points. As shown in FIG. 14, the sampling data around the initial extraction data often also contains non-stationary data, so additional non-stationary data can be extracted from it. FIG. 14 shows an example in which the second and the eighth points out of every ten are added, tripling the amount of non-stationary data.

FIG. 15 is a diagram for explaining another method of increasing the number of non-stationary data points. As shown in FIG. 15, non-stationary data whose amplitude is 1.05 times that of the initial extraction data is created and added to the learning data. Likewise, non-stationary data whose amplitude is 0.95 times that of the initial extraction data is created and added to the learning data. In this way, the number of non-stationary data points can be tripled.
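Both ways of tripling the non-stationary data can be sketched as follows. The function names and default arguments are hypothetical, and scaling the "amplitude" is read here as multiplying each value by the factor, which is an assumption.

```python
def augment_with_neighbors(samples, block=10, picks=(2, 5, 8)):
    """FIG. 14 style: keep several positions (1-based) from every block."""
    out = []
    for start in range(0, len(samples) - block + 1, block):
        out.extend(samples[start + p - 1] for p in picks)
    return out


def augment_by_scaling(extracted, factors=(1.0, 1.05, 0.95)):
    """FIG. 15 style: add scaled copies of each extracted value."""
    return [value * k for value in extracted for k in factors]
```

Either function returns three values for every one that the initial extraction kept, so the share of non-stationary data in the learning data set rises accordingly.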

FIG. 16 is a diagram showing the initial extraction data of the non-stationary data. FIG. 16 is described for the case where the sampling interval of the initial extraction data is 1 minute, but the sampling interval may be longer than 1 minute.

FIG. 17 is a diagram for explaining a method of reducing the number of non-stationary data points. In FIG. 17, the sampling interval is set to 5 minutes, and the second point out of every five non-stationary data points (initial extraction data) is extracted. This reduces the number of non-stationary data points to 0.2 times that of the initial extraction data.

For example, suppose the initial extraction data contains 180,000 points, of which 6,000 are non-stationary data. To predict non-stationary data correctly, the number of non-stationary data points needs to be increased. If 20,000 non-stationary data points are added using the methods described above, the total becomes 200,000 points, of which 26,000 are non-stationary. As a result, the ratio of non-stationary data becomes 13%, and non-stationary data can be predicted correctly.
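The arithmetic in this example can be checked with a short sketch. Both helper names are hypothetical.

```python
def ratio_after_adding(total, nonstationary, added):
    """Ratio of non-stationary points after adding more of them."""
    return (nonstationary + added) / (total + added)


def points_needed(total, nonstationary, target_ratio):
    """Non-stationary points to add so the ratio reaches target_ratio."""
    return (target_ratio * total - nonstationary) / (1 - target_ratio)
```

Starting from 6,000 out of 180,000 points (about 3.3%), adding 20,000 points gives 26,000 out of 200,000, and solving for a 13% target returns the same 20,000 points.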

Next, the model generation unit 103c creates the prediction model M1 from the learning data set D1 created by the non-stationary data change unit 103b, and creates the prediction model M2 from the learning data set D2 (step S38). Here, the model generation unit 103c creates the prediction models using the hyperparameter combination considered optimal among the hyperparameter adjustment items shown in FIG. 5, and updates the prediction models M1 and M2 by training and verifying them.

Next, the model evaluation unit 103d uses the evaluation data to make predictions with the prediction models M1 and M2 created in step S38, and evaluates the predicted values of each model against the measured values (step S39). Here, the model evaluation unit 103d calculates, for each of the prediction models M1 and M2, the difference between the predicted value and the measured value as an error, and saves the maximum error as the evaluation error.

Next, the non-stationary data change unit 103b determines whether the evaluation error of the prediction model M1 is equal to or less than the evaluation error of the prediction model M2 (step S40). If it is (S40, Yes), the non-stationary data change unit 103b temporarily stores the learning data set D1 (step S41).

次に、非定常時データ変更部１０３ｂは、ｋ＋１回目の最小割合Ｐ０^{（ｋ＋１）}にＰ０^（ｋ）を代入し、ｋ＋１回目の最大割合Ｐ３^{（ｋ＋１）}にＰ２^（ｋ）を代入する（ステップＳ４２）。そして、ｋ＋１回目の割合Ｐ１^{（ｋ＋１）}にＰ１^（ｋ）を代入し、ｋ＋１回目の割合Ｐ２^{（ｋ＋１）}にＰ２^（ｋ）－Ｎ^（ｋ）の算出値を代入する（ステップＳ４３）。 Next, the non-stationary data changing unit 103b substitutes P0 ^(k) for the minimum ratio P0 ^{(k + 1)} at the k + 1st time, and substitutes P2 ^(k) for the maximum ratio P3 ^{(k + 1)} at the k + 1th time (step S42). ). Then, P1 ^(k) is substituted into the k + 1th ratio P1 ^{(k + 1)} , and the calculated value of P2 ^(k) −N ^(k ) is substituted into the k + 1th ratio P2 ^{(k + 1)} (step S43).

If, on the other hand, the evaluation error of the prediction model M1 is larger than the evaluation error of the prediction model M2 (S40, No), the non-stationary data changing unit 103b temporarily stores the learning data set D2 (step S44).

Next, the non-stationary data changing unit 103b assigns P1^(k) to the minimum ratio P0^(k+1) for the (k+1)-th iteration, and assigns P3^(k) to the maximum ratio P3^(k+1) for the (k+1)-th iteration (step S45). It then assigns the value P1^(k) + N^(k) to the ratio P1^(k+1) for the (k+1)-th iteration, and assigns P2^(k) to the ratio P2^(k+1) for the (k+1)-th iteration (step S46).

Next, the non-stationary data changing unit 103b updates N^(k+1) to the value μ × (P3^(k+1) - P0^(k+1)) (step S47), and determines whether the step width N^(k+1) is less than or equal to the minimum step width N_MIN (step S48).

If the step width N^(k+1) is larger than the minimum step width N_MIN (S48, No), the non-stationary data changing unit 103b assigns k + 1 to the iteration count k (step S49), returns to step S36, and repeats the subsequent processing. If the step width N^(k+1) is less than or equal to the minimum step width N_MIN (S48, Yes), the non-stationary data changing unit 103b adopts the temporarily stored learning data set (step S50), and processing proceeds to step S4 in FIG. 4.
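Steps S38 to S50 amount to an interval-narrowing search over the ratio of non-stationary data. The following is a minimal sketch of that loop rather than the patented implementation. The callable `evaluate`, the guard `max_iter`, and the restriction 0 < μ < 0.5 are assumptions added for illustration; the initial values of P1 and P2 follow the description of FIG. 18.

```python
def search_ratio(evaluate, p_min, p_max, mu, n_min, max_iter=1000):
    """Narrow [p_min, p_max] following steps S38-S50 and return the adopted ratio.

    evaluate(ratio) is a hypothetical callable that builds a learning data set
    with that ratio of non-stationary data, trains a model on it, and returns
    the evaluation error of that model.
    """
    if not 0.0 < mu < 0.5:
        raise ValueError("this sketch assumes 0 < mu < 0.5")
    p0, p3 = p_min, p_max
    n = mu * (p3 - p0)                 # initial step width N^(1)
    p1, p2 = p0 + n, p3 - n            # initial inner ratios P1^(1), P2^(1)
    adopted = None
    for _ in range(max_iter):          # max_iter is a safety guard, not in the source
        e1, e2 = evaluate(p1), evaluate(p2)    # S38-S39
        if e1 <= e2:                           # S40, Yes
            adopted = p1                       # S41: keep data set D1
            p3, p2 = p2, p2 - n                # S42-S43 (P0 and P1 unchanged)
        else:                                  # S40, No
            adopted = p2                       # S44: keep data set D2
            p0, p1 = p1, p1 + n                # S45-S46 (P3 and P2 unchanged)
        n = mu * (p3 - p0)                     # S47
        if n <= n_min:                         # S48, Yes -> S50
            break
    return adopted


# Usage with a stand-in error curve whose minimum lies at a ratio of 0.3.
best = search_ratio(lambda p: (p - 0.3) ** 2, 0.0, 1.0, mu=0.25, n_min=0.01)
```

In practice each call to `evaluate` involves training a model, so a real implementation would likely cache the error of the inner ratio that is left unchanged between iterations.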

FIG. 18 is a diagram schematically showing the processing of the non-stationary data changing unit 103b shown in FIG. 12 (step S3 in FIG. 4). It is drawn as a graph with the ratio of non-stationary data on the horizontal axis and the evaluation error E on the vertical axis.

In the first iteration, the search range for the optimum ratio runs from the minimum ratio P0^(1) (= P_MIN) to the maximum ratio P3^(1) (= P_MAX). The first step width N^(1) is set to μ × (P3^(1) - P0^(1)), the ratio P1^(1) is set to P0^(1) + N^(1), and the ratio P2^(1) is set to P3^(1) - N^(1). Here, the evaluation error E2 of the prediction model M2 is larger than the evaluation error E1 of the prediction model M1, so P2^(1) is assigned to the maximum ratio P3^(2) and the value P2^(1) - N^(1) is assigned to the ratio P2^(2). As a result, the maximum ratio P3^(2) and the ratio P2^(2) each move toward the center (to the left) by N^(1) (G1).

In the second iteration, the search range for the optimum ratio runs from the minimum ratio P0^(2) to the maximum ratio P3^(2), and the second step width N^(2) is set to μ × (P3^(2) - P0^(2)). Here, the evaluation error E2 of the prediction model M2 is smaller than the evaluation error E1 of the prediction model M1, so P1^(2) is assigned to the minimum ratio P0^(3) and the value P1^(2) + N^(2) is assigned to the ratio P1^(3). As a result, the minimum ratio P0^(3) and the ratio P1^(3) each move toward the center (to the right) by N^(2) (G2).

In the third iteration, the search range for the optimum ratio runs from the minimum ratio P0^(3) to the maximum ratio P3^(3), and the third step width N^(3) is set to μ × (P3^(3) - P0^(3)). Here, the evaluation error E2 of the prediction model M2 is larger than the evaluation error E1 of the prediction model M1, so P2^(3) is assigned to the maximum ratio P3^(4) and the value P2^(3) - N^(3) is assigned to the ratio P2^(4). As a result, the maximum ratio P3^(4) and the ratio P2^(4) each move toward the center (to the left) by N^(3) (G3).

The same processing is repeated. In the T-th iteration, the search range for the optimum ratio runs from the minimum ratio P0^(T) to the maximum ratio P3^(T), and the T-th step width N^(T) is set to μ × (P3^(T) - P0^(T)). At this point the T-th step width N^(T) is smaller than the minimum step width N_MIN, so the temporarily stored learning data set is adopted.

(Processing procedure of the effluent quality prediction unit 105)
FIG. 19 is a flowchart for explaining the processing procedure of the effluent quality prediction unit 105. First, at each predetermined prediction cycle, the preprocessing unit 105a acquires the latest data accumulated in the latest data database 104 (step S21). The preprocessing unit 103a performs preprocessing of the learning data. More specifically, when the learning data contains missing values or abnormal values, the preprocessing unit 103a interpolates the missing values or removes the abnormal values (step S22). The missing values, the abnormal values, and the methods for interpolating and removing them are the same as those described for the preprocessing unit 103a of the prediction model generation unit 103.

The processing shown in FIG. 19 is repeated at each predetermined prediction cycle. The predetermined prediction cycle is the time the effluent quality prediction unit 105 needs to carry out one prediction process, and is set to, for example, a few minutes. Any value can be set for the prediction cycle, and an appropriate value is to be chosen in response to, for example, a change in the amount of learning data.

Next, the prediction processing unit 105b applies the latest data preprocessed by the preprocessing unit 105a to the prediction model 204 created by the prediction model generation unit 103, and predicts the values of the effluent quality data (COD, TN, TP) 2.5 hours ahead (step S23).

FIG. 20 is a diagram schematically showing the prediction of the effluent quality data. As the latest data 205, the 23 items in each row of the learning data set (5 items of time information data + 15 items of sensor information data + 3 items of effluent quality data) are input to the input layer of the prediction model 204. The prediction model 204 predicts the COD value 2.5 hours ahead and outputs it as output data 206.

Similarly, the TN value 2.5 hours ahead can be predicted using a prediction model 204 created for TN prediction, and the TP value 2.5 hours ahead can be predicted using a prediction model 204 created for TP prediction.

Finally, the post-processing unit 105c performs post-processing on the effluent quality data (prediction result) predicted by the prediction processing unit 105b. Specifically, the post-processing unit 105c converts the prediction result into a format that the output unit 106 can output (step S24), and ends the processing.
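One pass through steps S21 to S24 can be sketched as follows. This is an illustration under stated assumptions, not the device's actual code: `fetch_latest`, `model`, and `to_output` are hypothetical callables, missing values are represented as `None`, and linear interpolation is assumed as the interpolation method. Removal of abnormal values is omitted.

```python
def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known values.

    Gaps at either end are filled with the nearest known value.
    Linear interpolation is an assumption made for this sketch.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def run_prediction_cycle(fetch_latest, model, to_output):
    """One pass of steps S21-S24 with injected, hypothetical callables."""
    latest = fetch_latest()                                   # S21: one list per item
    cleaned = [interpolate_missing(col) for col in latest]    # S22: preprocessing
    prediction = model(cleaned)                               # S23: apply the model
    return to_output(prediction)                              # S24: post-processing


# Usage with stand-ins: one column with a gap, and a "model" that averages it.
result = run_prediction_cycle(
    lambda: [[1.0, None, 3.0]],
    lambda cols: sum(cols[0]) / len(cols[0]),
    lambda y: {"COD": y},
)
```

Passing the three stages in as callables keeps the cycle itself independent of how the latest data database, the trained model, and the output unit are implemented.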

(Example of prediction results)
The effluent quality was predicted at a sewage treatment plant. The data consisted of 1-minute data covering 4 years and 8 months (about 2.45 million records), with 20 input data items (including the time information data) and 3 output data items (COD, TN, and TP). Each learning data set was 23 items (input data items + output data items) × 12 hours. The LSTM method was used as the machine learning method.
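With 1-minute data, a 12-hour learning data set corresponds to 720 rows and a 2.5-hour prediction horizon to 150 rows. The sketch below shows one way such rows could be cut into (window, target) pairs for training. The function name `make_windows` and the choice to measure the horizon from the last row of the window are assumptions; the source does not describe the slicing.

```python
def make_windows(rows, target_col, window=720, horizon=150):
    """Pair each window of consecutive rows with a target `horizon` steps later.

    rows: list of per-minute records, each a list of item values.
    window: rows per learning data set (12 h x 60 min = 720).
    horizon: steps between the last row of a window and its target
             (2.5 h x 60 min = 150). Measuring from the last row is an assumption.
    """
    samples = []
    last_start = len(rows) - window - horizon
    for start in range(last_start + 1):
        end = start + window                    # exclusive end of the window
        target_row = rows[end - 1 + horizon]    # row `horizon` steps after the window
        samples.append((rows[start:end], target_row[target_col]))
    return samples


# Small usage example with 2 items per row instead of 23.
rows = [[i, 10 * i] for i in range(10)]
samples = make_windows(rows, target_col=1, window=3, horizon=2)
```

If fewer than `window + horizon` rows are available, the function returns an empty list rather than raising an error.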

FIG. 21 is a graph showing the measured COD values and the COD values predicted by the effluent quality prediction unit 105 when the amount of non-stationary data was not increased or decreased. The vertical axis shows the COD value and the horizontal axis shows the time. The graph shows the measured and predicted COD values for one day, and shows that the non-stationary data is not predicted correctly.

FIG. 22 is a graph showing the measured COD values and the COD values predicted by the effluent quality prediction unit 105 when the amount of non-stationary data was increased or decreased. The graph shows that the non-stationary data is predicted correctly.

(Effects)
As described above, in the effluent quality prediction device 100 according to the present embodiment, the preprocessing unit 103a uses a digital filter to remove abnormal data and noise contained in the learning data, so that high-quality learning data can be generated and the prediction accuracy can be improved.

In addition, because an RNN is used to predict the effluent quality data and learning methods such as the BPTT method, the LSTM method, and the GRU method are applied, the effluent quality data can be predicted with a single prediction model.

In addition, because an RNN is used to predict the effluent quality data, a change in the operation of the sewage treatment plant can be handled simply by changing the hyperparameters and creating a new prediction model, so a drop in prediction accuracy can be prevented.

In addition, because the preprocessing unit 103a optimizes the parameters of the digital filter, learning data of even higher quality can be generated and the prediction accuracy can be improved.

In addition, because the model generation unit 103c creates prediction models using various hyperparameters and the effluent quality data is predicted using the prediction model with the smallest evaluation error among them, the effluent quality data can be predicted accurately.

[Example of implementation by software]
The control blocks of the effluent quality prediction device 100 (in particular the sensor information acquisition unit 101, the prediction model generation unit 103, the effluent quality prediction unit 105, and the output unit 106) may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software.

In the latter case, the effluent quality prediction device 100 includes a computer that executes the instructions of a program, which is software that realizes each function. The computer includes, for example, one or more processors and a computer-readable recording medium that stores the program. The object of the present invention is achieved when the processor reads the program from the recording medium and executes it. As the processor, for example, a CPU (Central Processing Unit) can be used. As the recording medium, a "non-transitory tangible medium" can be used, for example a ROM (Read Only Memory), a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit. The computer may further include a RAM (Random Access Memory) or the like into which the program is loaded. The program may also be supplied to the computer via any transmission medium capable of transmitting it (a communication network, broadcast waves, or the like). One aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

[Computer configuration example]
FIG. 23 is a diagram showing a configuration example of a computer for realizing an effluent quality prediction device according to one aspect of the present invention (for example, the effluent quality prediction device 100). The computer includes a computer body 300, a display 401, a keyboard 402, and a mouse 403. By operating the keyboard 402 and the mouse 403 while viewing the screen shown on the display 401, the user starts a program that causes the computer body 300 to function as the effluent quality prediction device (hereinafter referred to as the effluent quality prediction program).

The computer body 300 includes a CPU 301, a GPGPU (General-Purpose computing on Graphics Processing Units) 302, a ROM 303 that stores a boot program and the like, a RAM 304 that stores the program to be executed, working data, and the like, a hard disk 305, a DVD (Digital Versatile Disc) drive 306, a network I/F (Interface) 307, a memory port 308, and an RTC (Real Time Clock) 309, each of which is connected to an internal bus 310.

The GPGPU 302 applies the computing resources of a GPU to purposes other than image processing. Because of its high processing performance, it is used in a wide range of fields, including matrix operations for neural networks and the like, optimization problems, cryptanalysis, and audio processing.

The effluent quality prediction program is recorded on a recording medium such as a DVD 406 or a removable memory 405, and is installed on the hard disk 305 via the DVD drive 306 or the memory port 308 under the control of the CPU 301. The effluent quality prediction program may also be downloaded via a public line 404 and the network I/F 307 and installed on the hard disk 305.

The sensor information acquisition unit 101 is realized by the CPU 301 executing the effluent quality prediction program stored in the RAM 304, receiving sensor information data and effluent quality data from the plurality of sensors installed in the sewage treatment plant via the network I/F 307 and the public line 404, and generating learning data by attaching time information data acquired from the RTC 309 to those data.

The learning data database 102 and the latest data database 104 are realized by the CPU 301 executing the effluent quality prediction program stored in the RAM 304 and sequentially storing the learning data generated by the sensor information acquisition unit 101 on the hard disk 305 or the like.

The prediction model generation unit 103 is realized by the CPU 301 executing the effluent quality prediction program stored in the RAM 304, performing RNN training and the like together with the GPGPU 302, and generating the prediction model 204.

Similarly, the effluent quality prediction unit 105 is realized by the CPU 301 executing the effluent quality prediction program stored in the RAM 304 and, together with the GPGPU 302, performing the computation that applies the latest data 205 to the prediction model 204 to generate prediction data.

The output unit 106 is realized by the CPU 301 executing the effluent quality prediction program stored in the RAM 304 and transmitting the various information generated by the effluent quality prediction unit 105, via the network I/F 307 and the public line 404, to a control device, a display device, or the like (not shown) of the sewage treatment plant.

[Summary]
An effluent quality prediction device according to aspect 1 of the present invention includes: a sensor information acquisition unit that acquires time-series data of detected values from a plurality of sensors installed in a sewage treatment plant; a learning data database that stores the time-series data acquired by the sensor information acquisition unit as learning data; a latest data database that stores the time-series data acquired by the sensor information acquisition unit as latest data; a preprocessing unit that takes the learning data for a first predetermined time within the learning data as a learning data set and performs filtering on the learning data set; a model generation unit that generates, from the learning data set filtered by the preprocessing unit, a prediction model for predicting the value of effluent quality data a second predetermined time later; and an effluent quality prediction unit that predicts the value of the effluent quality data by applying the prediction model to the latest data.

In an effluent quality prediction device according to aspect 2 of the present invention, in aspect 1 above, the model generation unit may use the learning data set as training data, use the measured values of the effluent quality data the second predetermined time later as verification data, and generate the prediction model by training a recurrent neural network using the training data and the verification data.

In an effluent quality prediction device according to aspect 3 of the present invention, in aspect 2 above, the model generation unit may generate prediction models while changing the hyperparameters of the recurrent neural network, and the effluent quality prediction device may further include a model evaluation unit that uses at least part of the learning data of the learning data set as evaluation data and sets, in the effluent quality prediction unit, a prediction model for which the error between the predicted value obtained by inputting the evaluation data into the prediction model and the verification data is less than or equal to a predetermined value.

In an effluent quality prediction device according to aspect 4 of the present invention, in aspect 3 above, the preprocessing unit may generate a plurality of sets of parameters used in the filtering and perform the filtering on the learning data set using the plurality of parameter sets to create a plurality of learning data sets; the model generation unit may generate a plurality of prediction models using the plurality of learning data sets; the model evaluation unit may calculate, for each of the plurality of prediction models, the maximum error between the predicted value obtained by inputting the evaluation data into that prediction model and the verification data; and the preprocessing unit may rank the plurality of parameter sets according to the maximum error calculated for each of the plurality of prediction models and extract the optimum parameter set for the filtering by performing optimization.
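The ranking part of aspect 4 can be sketched as follows. This is an illustration only: a centered moving average stands in for the digital filter, and `train` and `max_error` are hypothetical callables standing in for the model generation unit and the model evaluation unit. The sketch stops at ranking the parameter sets and does not cover the subsequent optimization.

```python
def moving_average(series, width):
    """Centered moving average; the window is truncated at both ends.

    A stand-in for the digital filter, chosen only for illustration.
    """
    half = width // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def rank_filter_params(param_sets, learning_data, train, max_error):
    """Return (max_error, params) pairs sorted from best to worst.

    train(filtered) -> model and max_error(model) -> float are hypothetical
    callables standing in for the model generation and evaluation units.
    """
    scored = []
    for params in param_sets:
        filtered = moving_average(learning_data, **params)   # one data set per parameter set
        model = train(filtered)                              # one model per data set
        scored.append((max_error(model), params))            # maximum error per model
    return sorted(scored, key=lambda pair: pair[0])


# Usage with stand-ins: the "model" is the peak of the filtered series and
# its error is the distance of that peak from a nominal level of 1.0.
data = [1.0, 1.0, 9.0, 1.0, 1.0]
ranked = rank_filter_params(
    [{"width": 1}, {"width": 3}, {"width": 5}],
    data,
    train=lambda filtered: max(filtered),
    max_error=lambda model: abs(model - 1.0),
)
```

The first entry of `ranked` is the parameter set whose model has the smallest maximum error, which would serve as the starting point for any further optimization.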

An effluent quality prediction method according to aspect 5 of the present invention includes the steps of: acquiring time-series data of detected values from a plurality of sensors installed in a sewage treatment plant; storing the acquired time-series data as learning data; storing the acquired time-series data as latest data; taking the learning data for a first predetermined time within the learning data as a learning data set and performing filtering on the learning data set; generating, from the filtered learning data set, a prediction model for predicting the value of effluent quality data a second predetermined time later; and predicting the value of the effluent quality data by applying the prediction model to the latest data.

[Additional notes]
The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.

100 Effluent quality prediction device
101 Sensor information acquisition unit
102 Learning data database
103 Prediction model generation unit
103a Preprocessing unit
103b Non-stationary data changing unit
103c Model generation unit
103d Model evaluation unit
103e Model changing unit
104 Latest data database
105 Effluent quality prediction unit
201 Learning data
204 Prediction model
205 Latest data

Claims

1. An effluent quality prediction device comprising:
a sensor information acquisition unit that acquires time-series data of detected values from a plurality of sensors installed in a sewage treatment plant;
a learning data database that stores the time-series data acquired by the sensor information acquisition unit as learning data;
a latest data database that stores the time-series data acquired by the sensor information acquisition unit as latest data;
a preprocessing unit that takes the learning data for a first predetermined time within the learning data as a learning data set and performs filtering on the learning data set;
a model generation unit that generates, from the learning data set filtered by the preprocessing unit, a prediction model for predicting the value of effluent quality data a second predetermined time later; and
an effluent quality prediction unit that predicts the value of the effluent quality data by applying the prediction model to the latest data.

2. The effluent quality prediction device according to claim 1, wherein the model generation unit
uses the learning data set as training data,
uses measured values of the effluent quality data the second predetermined time later as verification data, and
generates the prediction model by training a recurrent neural network using the training data and the verification data.

3. The effluent quality prediction device according to claim 2, wherein the model generation unit generates prediction models while changing hyperparameters of the recurrent neural network, and
the effluent quality prediction device further comprises a model evaluation unit that uses at least part of the learning data of the learning data set as evaluation data and sets, in the effluent quality prediction unit, a prediction model for which the error between the predicted value obtained by inputting the evaluation data into the prediction model and the verification data is less than or equal to a predetermined value.

4. The effluent quality prediction device according to claim 3, wherein the preprocessing unit generates a plurality of sets of parameters used in the filtering, and performs the filtering on the learning data set using the plurality of parameter sets to create a plurality of learning data sets,
the model generation unit generates a plurality of prediction models using the plurality of learning data sets,
the model evaluation unit calculates, for each of the plurality of prediction models, the maximum error between the predicted value obtained by inputting the evaluation data into that prediction model and the verification data, and
the preprocessing unit ranks the plurality of parameter sets according to the maximum error calculated for each of the plurality of prediction models, and extracts the optimum parameter set for the filtering by performing optimization.

5. An effluent quality prediction method comprising the steps of:
acquiring time-series data of detected values from a plurality of sensors installed in a sewage treatment plant;
storing the acquired time-series data as learning data;
storing the acquired time-series data as latest data;
taking the learning data for a first predetermined time within the learning data as a learning data set and performing filtering on the learning data set;
generating, from the filtered learning data set, a prediction model for predicting the value of effluent quality data a second predetermined time later; and
predicting the value of the effluent quality data by applying the prediction model to the latest data.