JP7443609B1

JP7443609B1 - Learning device, learning method, and learning program

Info

Publication number: JP7443609B1
Application number: JP2023133001A
Authority: JP
Inventors: 知範泉谷; 浩二伊藤; 大悟藤原
Original assignee: NTT Communications Corp
Current assignee: NTT Communications Corp
Priority date: 2022-10-28
Filing date: 2023-08-17
Publication date: 2024-03-05
Anticipated expiration: 2042-10-28
Also published as: JP2024064827A; JP7335414B1; JP2024064997A

Abstract

[Problem] To improve model accuracy in imitation learning by using training data suited to sequential learning with the JIT method. [Solution] A processing device 10 includes: a collection unit 132 that collects histories, each a combination of explanatory variables representing the situation in a product production process and an objective variable representing an operation of equipment in the production process; a registration unit 133 that attaches to each history at least an autopilot flag indicating whether the operation was automatic and registers the history in a history DB 121; an acquisition unit 134 that acquires, from the histories registered in the history DB 121, at least a first history that dates from a predetermined period or more before the current time and whose autopilot flag indicates that the operation was not automatic; and an update unit 135 that uses the histories acquired by the acquisition unit 134 to update a model that outputs the objective variable from the explanatory variables. [Selected drawing] FIG. 2

Description

The present invention relates to a learning device, a learning method, and a learning program.

Conventionally, a technique called imitation learning is known, in which a machine learning model learns human behavior and the model is then used to teach actions to humans, robots, or the like.

Also known is the Just-In-Time (JIT) method, in which a large amount of observed data is accumulated, data in the neighborhood of a query point is extracted from the accumulated data, and the model is trained sequentially on the extracted data (see, for example, Non-Patent Document 1).

In a chemical plant, for example, the environment changes over time through aging of equipment, catalyst degradation, changes to the production load plan, and so on.

One conceivable response is to apply the JIT method to imitation learning of operators' equipment operations in a chemical plant, so that the model adapts to changes in the environment.

Japanese Unexamined Patent Application Publication No. 2019-185194

Shigeru Yamamoto, "Just-In-Time Predictive Control: Predictive Control Based on Accumulated Data," Keisoku to Seigyo (Measurement and Control), Vol. 52, No. 10, October 2013 (https://www.jstage.jst.go.jp/article/sicejl/52/10/52_878/_pdf/-char/ja)

A plant operation support system based on a model that performs imitation learning has been proposed. For example, imitation learning is performed by having the model learn operations that operators actually performed in a chemical plant. Specifically, the model learns the amounts of raw material that operators fed in the past in a particular process, and outputs a raw material feed amount as a recommended operation value. By setting the feed amount according to the model's output, an operator can imitate past operators' operations.

With a model that performs such imitation learning, not only plant operation support but also autopilot of the operations becomes possible.

Operation data also accumulates while the plant is on autopilot. However, data recorded during autopilot reflects operations that the model itself output for the autopilot, so sequentially training the model on that data merely reinforces the current model. Sequential learning on autopilot data may therefore degrade the model's accuracy and leave it unable to respond when actual plant conditions change.

The present invention has been made in view of the above, and its object is to provide a learning device, a learning method, and a learning program that can improve model accuracy in imitation learning by using training data suited to sequential learning with the JIT method.

To solve the above problem and achieve the object, a learning device includes: a collection unit that collects histories, each a combination of explanatory variables representing the situation in a product production process and an objective variable representing an operation of equipment in the production process; a registration unit that attaches to each history at least first attached information indicating whether the operation was an automatic operation by a control device and registers the history in a database; an acquisition unit that acquires, from the histories registered in the database, at least a first history that dates from a predetermined period or more before the current time and whose first attached information indicates that the operation was not an automatic operation by the control device; and an update unit that uses the histories acquired by the acquisition unit to update a model that outputs the objective variable from the explanatory variables.

According to the present invention, model accuracy in imitation learning can be improved by using training data suited to sequential learning with the JIT method.

FIG. 1 is a diagram illustrating a plant operation system.
FIG. 2 is a diagram illustrating a configuration example of a processing device according to an embodiment.
FIG. 3 is a diagram illustrating an example of a history DB.
FIG. 4 is a diagram illustrating the processing of the processing device.
FIG. 5 is a flowchart illustrating the processing procedure in the embodiment.
FIG. 6 is a diagram illustrating an example of a computer that executes a program.

Embodiments of the learning device, learning method, and learning program according to the present application are described in detail below with reference to the drawings. The present invention is not limited to the embodiments described below.

[Embodiment]
[Configuration of embodiment]
First, the plant operation system will be described with reference to FIG. 1. The plant operation system 1 is a system for managing and controlling the production process of products in a plant. Plants include chemical plants that produce chemical products.

As shown in FIG. 1, the plant operation system 1 includes a processing device 10, a terminal device 20, and a plant system 30.

The processing device 10 performs processing related to a model (machine learning model) for imitation learning. The processing device 10 can function as a learning device.

The processing device 10 and the plant system 30 are connected via a network so that they can exchange data. The network is, for example, the Internet or an intranet. When a predetermined autopilot (automatic operation) condition set by an operator or the like is satisfied and the operator instructs the autopilot to start, the processing device 10 performs autopilot control of the plant system 30 using the model.

The plant system 30 may include the equipment used in the production process and a distributed control system (DCS). The equipment is, for example, a reactor, a cooler, or a gas-liquid separator.

The terminal device 20 is an information processing device such as a personal computer, a tablet terminal, or a smartphone, or a dedicated terminal for operating plant equipment.

The operator is a user who operates the equipment included in the plant system 30 via the terminal device 20. When the processing device 10 indicates that the autopilot can be started, the operator may instruct the processing device 10 to start the autopilot. When the processing device 10 indicates that the autopilot should be stopped, the operator instructs the processing device 10 to stop it. The model used in the processing device 10 is managed as appropriate by a system administrator or the like.

The processing of each device in the plant operation system 1 will be described with reference to FIG. 1.

The terminal device 20 operates the equipment of the plant system 30 in response to the operator's input (step S1). For example, through such operations the terminal device 20 sets the temperature and pressure inside the equipment, the target production volume for the production process, the amount of raw material fed into the equipment, and so on.

The plant system 30 operates in accordance with the operations from the terminal device 20 (step S2), and transmits its operation history to the processing device 10 (step S3).

For example, the history includes the values of sensors installed at various points in the plant system 30 and the set values specified by operations from the terminal device 20. The history may be time-series data in which each record carries a time (timestamp).

The terminal device 20 transmits autopilot conditions in response to the operator's input (step S4). An autopilot condition is set based on the error between the model's predicted value and the measured value. For example, the condition may be that the error between the predicted and measured values has stayed below a predetermined threshold for a predetermined number of times, or that the average of the most recent predetermined number of errors is below a predetermined threshold. The autopilot conditions may also be set by a system administrator or the like.
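The two example conditions can be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in the patent: the function names, the use of the error magnitude, and the reading of "a predetermined number of times" as the most recent n consecutive observations are all assumptions made here.

```python
from typing import Sequence


def errors_all_below(errors: Sequence[float], n: int, threshold: float) -> bool:
    """True if each of the last n errors is below the threshold in magnitude."""
    if n <= 0 or len(errors) < n:
        return False  # not enough observations to judge
    return all(abs(e) < threshold for e in errors[-n:])


def mean_error_below(errors: Sequence[float], n: int, threshold: float) -> bool:
    """True if the mean magnitude of the last n errors is below the threshold."""
    if n <= 0 or len(errors) < n:
        return False
    recent = errors[-n:]
    return sum(abs(e) for e in recent) / n < threshold


# Hypothetical errors (predicted value minus measured value), oldest first.
errors = [0.9, 0.1, -0.2, 0.1]
autopilot_allowed = errors_all_below(errors, n=3, threshold=0.5)
```

The first check is stricter about individual outliers, while the second tolerates an occasional large error as long as the recent average stays small.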

The processing device 10 attaches to each history collected from the plant system 30 at least an autopilot flag (first attached information) indicating whether the operation was performed by the autopilot of the processing device 10. To histories in which the equipment was operated manually by an operator, the processing device 10 attaches a manual operation flag (second attached information) indicating a manual operation. The processing device 10 then registers each history in a history database (DB).

Next, the processing device 10 acquires from the histories the training data to be used for training the model (for example, by machine learning), assigns weights for example, trains the model, and performs inference with the model (step S5).

In doing so, the processing device 10 refers to at least the autopilot flag and the manual operation flag to acquire the model's training data from the histories stored in the history DB. Specifically, the processing device 10 acquires as training data, from the group of past histories, a first history that dates from a predetermined period or more before the current time (inference time) and whose autopilot flag is "OFF". The processing device 10 also acquires as training data, from the group of past histories, a second history whose autopilot flag is "ON" and whose manual operation flag is "ON".

The processing device 10 uses the autopilot conditions to determine whether autopilot by the processing device 10 is permissible. Each process of the processing device 10 is described in detail later.

The processing device 10 further presents, on the operator's terminal device 20, a guidance screen 21 showing the inference result and the result of the autopilot determination (step S6). If the autopilot condition is satisfied, the processing device 10 indicates on the guidance screen 21 that the autopilot can now be started. When the terminal device 20 instructs the processing device 10 to start the autopilot (step S7), the processing device 10 performs autopilot control of the plant system 30 using the model (step S8).

If the autopilot condition is not satisfied while the autopilot is running, the processing device 10 displays an instruction to stop the autopilot on the guidance screen 21. When the terminal device 20 instructs the autopilot to stop (step S7), the processing device 10 stops autopilot control of the plant system 30. The terminal device 20 then operates the equipment of the plant system 30 in response to the operator's input (step S1).

Here, the model learns the operator's operations through imitation learning. Other operators can therefore imitate those operations by following the operation obtained as the model's inference result.

The processing device 10 will now be described in detail with reference to FIG. 2, which shows a configuration example of the processing device 10 according to the embodiment.

As shown in FIG. 2, the processing device 10 includes a communication unit 11, a storage unit 12, and a control unit 13.

The communication unit 11 exchanges data with other devices via the network. For example, the communication unit 11 is a NIC (Network Interface Card).

The storage unit 12 is a storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or an optical disc. The storage unit 12 may also be a rewritable semiconductor memory such as a RAM (Random Access Memory), a flash memory, or an NVSRAM (Non Volatile Static Random Access Memory).

The storage unit 12 stores the OS (Operating System) and various programs executed by the processing device 10. The storage unit 12 also stores a history DB 121 and model information 122.

The history DB 121 holds information including the histories provided by the plant system 30. For example, a record is added to the history DB 121 every minute. FIG. 3 shows an example of the history DB 121. As shown in FIG. 3, the history DB 121 includes items such as time, operator, situation, and operation, which make up the list of explanatory variables and the set value that is the objective variable. The history DB 121 may also include weights. Each history in the history DB 121 carries an autopilot flag and/or a manual operation flag, and also a manual-operation-before/after flag (third attached information).

The time indicates when the operation was performed. The situation item has the sub-items first temperature, second temperature, and first flow rate. It may also include a first pressure, a second pressure, the concentration of a gas generated in the production process, and so on.

The first temperature, second temperature, first pressure, second pressure, and first flow rate are each the value of a sensor installed somewhere in the plant system 30.

The first temperature, second temperature, first pressure, second pressure, and first flow rate are explanatory variables of the model, and are examples of explanatory variables representing the situation in the product production process.

Note that the time is a timestamp indicating the date and time at which the first temperature, second temperature, first pressure, second pressure, flow rate, and gas concentration were acquired.

The operation item is, for example, a set value specified by an operation from the terminal device 20. The set value may be a normalized version of the value actually set. The set value corresponds to the objective variable of the model.

The set value is the objective variable of the model, and is an example of an objective variable representing an operation of equipment in the production process.

The autopilot flag indicates whether the operation was performed by the autopilot of the processing device 10 (control device). A history whose autopilot flag is "ON" was obtained while autopilot by the processing device 10 was "ON". A history whose autopilot flag is "OFF" was obtained while autopilot by the processing device 10 was "OFF".

The manual operation flag indicates that the equipment was operated manually by an operator. A history whose manual operation flag is "ON" was obtained when the operator performed a manual operation.

The manual-operation-before/after flag is attached to histories that fall within a predetermined period (for example, w minutes) before or after a history carrying the manual operation flag. This period is set based on the operating status of the equipment, the trends of the values in the histories so far, and so on, and is updated as appropriate.

For example, FIG. 3 shows that at time "13:21:01" the first temperature is "102.1°C", the second temperature is "102.8°C", the first flow rate is "311.5 m³/s", and the operation is "203.5". The autopilot flag is "OFF" and the manual-operation-before/after flag is "ON" (cell C1-9). This history falls within the predetermined period before or after the time at which an operator performed a manual operation.

Likewise, FIG. 3 shows that at time "13:23:01" the first temperature is "101.5°C", the second temperature is "102.3°C", the first flow rate is "311.4 m³/s", and the operation is "206.3". The autopilot flag is "OFF" and the manual operation flag is "ON" (cell C3-8). This history was recorded when an operator performed a manual operation.

FIG. 3 also shows that at time "15:33:01" the first temperature is "102.5°C", the second temperature is "103.3°C", the first flow rate is "311.4 m³/s", and the operation is "206.3". The autopilot flag is "ON" and the manual operation flag is "ON" (cell C9-8). This history was recorded when a manual operation was performed during autopilot, that is, when the operator manually overwrote the predicted value calculated by the model while the autopilot was in use.
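One way to picture a single record of the history DB 121 is as a small data class. The sketch below is illustrative only: the class name, field names, dictionary keys, date, and operator name are assumptions and do not appear in the patent. Only the sensor values, the set value, and the flag states of the 15:33:01 row are taken from the description of FIG. 3.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass
class HistoryRecord:
    time: datetime                 # timestamp of the record
    operator: str                  # operator item (hypothetical value below)
    situation: Dict[str, float]    # explanatory variables (sensor values)
    operation: float               # objective variable (set value)
    autopilot: bool = False        # autopilot flag (first attached information)
    manual: bool = False           # manual operation flag (second attached information)
    near_manual: bool = False      # manual-operation-before/after flag (third)
    weight: float = 1.0            # optional weight


# The 15:33:01 row of FIG. 3; the date and operator name are placeholders.
override = HistoryRecord(
    time=datetime(2023, 1, 1, 15, 33, 1),
    operator="operator_a",
    situation={"temp1_c": 102.5, "temp2_c": 103.3, "flow1_m3_s": 311.4},
    operation=206.3,
    autopilot=True,
    manual=True,
)

# Autopilot ON together with manual ON marks an operator override.
is_override = override.autopilot and override.manual
```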

The model information 122 is information such as the parameters needed to construct the model. For example, if the model is a neural network, the model information 122 comprises the weights and biases of each layer. The model information 122 further includes parameters such as the order of preprocessing steps and the window size used in moving-average processing.

The control unit 13 controls the processing device 10 as a whole. The control unit 13 is, for example, an electronic circuit such as a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a GPU (Graphics Processing Unit), or an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

The control unit 13 has an internal memory for storing programs that define various processing procedures and control data, and executes each process using the internal memory. The control unit 13 functions as various processing units as the various programs run.

For example, the control unit 13 includes a reception unit 131, a collection unit 132, a registration unit 133, an acquisition unit 134, an update unit 135, an inference unit 136, a determination unit 137, a display control unit 138, and an autopilot control unit 139. The processing of each functional unit of the processing device 10 will be described with reference to FIG. 4, which illustrates the processing of the processing device 10.

The reception unit 131 receives autopilot conditions from, for example, the terminal device 20. Besides new autopilot conditions, the reception unit 131 also accepts modifications and deletions (including partial deletions) of autopilot conditions. The reception unit 131 also receives autopilot start and stop instructions from the terminal device 20.

The collection unit 132 collects the operation history of the plant system 30 ((1) in FIG. 4) and outputs the collected history to the registration unit 133. A history is a combination of explanatory variables and an objective variable. The collection unit 132 also collects information indicating whether each history resulted from manual operation or from autopilot.

The registration unit 133 collects the ON or OFF state of the autopilot for the plant system 30 ((2) in FIG. 4). Specifically, the registration unit 133 collects updates on the ON or OFF state of the autopilot for the plant system 30 from the autopilot control unit 139.

The registration unit 133 then sets the autopilot flag of each history to "ON" or "OFF" based on the information indicating whether the history resulted from manual operation or autopilot and the information indicating the ON or OFF state of the autopilot. For histories in which the equipment was operated manually by an operator, the registration unit 133 sets the manual operation flag to "ON" (for example, cells C3-8, C9-8, and C10-8 in FIG. 3). For histories that fall within the predetermined period before or after a history whose manual operation flag is "ON", the registration unit 133 sets the manual-operation-before/after flag to "ON" (for example, cells C1-9, C2-9, C4-9, C5-9, C8-9, and C11-9 in FIG. 3).
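The before/after flagging step can be sketched as below. This is a minimal illustration, assuming each record is a plain dictionary with hypothetical keys `time`, `manual`, and `near_manual`. The choice not to set the before/after flag on the manually operated record itself is also an assumption, made here because the cells cited from FIG. 3 list the two flags on different rows.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def flag_near_manual(records: List[Dict], w_minutes: int) -> List[Dict]:
    """Set 'near_manual' on records within w_minutes of a manual record."""
    window = timedelta(minutes=w_minutes)
    manual_times = [r["time"] for r in records if r["manual"]]
    for r in records:
        near = any(abs(r["time"] - t) <= window for t in manual_times)
        # The manually operated record itself keeps only its manual flag.
        r["near_manual"] = near and not r["manual"]
    return records


# Six per-minute records starting at 13:21:01; only the third is manual.
start = datetime(2023, 1, 1, 13, 21, 1)  # placeholder date
records = [
    {"time": start + timedelta(minutes=i), "manual": i == 2}
    for i in range(6)
]
flag_near_manual(records, w_minutes=2)
flagged = [i for i, r in enumerate(records) if r["near_manual"]]
```

With a two-minute window, the two records on either side of the manual record are flagged and the record three minutes later is not.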

The registration unit 133 registers the flagged histories in the history DB 121 ((3) in FIG. 4).

The acquisition unit 134 refers to the flags of the histories in the history DB 121 and, based on the distance between each history's explanatory variables and the specified explanatory variables, acquires similar histories to be used as training data ((4) in FIG. 4). In addition to this distance, the acquisition unit 134 may also use the weights when acquiring similar histories from the history DB 121.

When a history search key (explanatory variables and/or an objective variable) is specified, the acquisition unit 134 acquires from the history DB 121 a group of past histories similar to the key. That is, the acquisition unit 134 acquires from the history DB 121 the group of past histories (explanatory variables and objective variables) collected in situations close to the search key (for example, explanatory variables indicating the current situation).

The specified explanatory variables are called the query point. For example, the query point is the set of explanatory variables at a given time (corresponding to the sensor values in the history DB 121). The objective variable (set value) at the query point may be unknown.

In the JIT method, the group of past histories is acquired based on the Euclidean distance between the training data, which are multidimensional vectors (corresponding to the history DB 121 in this embodiment), and the query point, which is also a multidimensional vector. For example, using the JIT method, the acquisition unit 134 acquires the k nearest neighbors (k-NN), that is, the k records (k is an integer) with the smallest calculated Euclidean distances. The distance between the training data and the query point is not limited to the Euclidean distance and may be, for example, the Mahalanobis distance or cosine similarity.
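The k-nearest-neighbor retrieval can be sketched with the standard library alone. The function name, the tuple layout of a record, and the third sample record are assumptions for illustration; the first two records reuse the sensor values and set values quoted from FIG. 3.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Record = Tuple[Sequence[float], float]  # (explanatory variables, set value)


def k_nearest(history: List[Record], query: Sequence[float], k: int) -> List[Record]:
    """Return the k records closest to the query point, nearest first."""
    return heapq.nsmallest(k, history, key=lambda rec: math.dist(rec[0], query))


# (first temperature, second temperature, first flow rate) -> set value.
history = [
    ((102.1, 102.8, 311.5), 203.5),
    ((101.5, 102.3, 311.4), 206.3),
    ((110.0, 111.0, 300.0), 180.0),  # hypothetical distant record
]
neighbors = k_nearest(history, query=(101.6, 102.4, 311.4), k=2)
```

Because the raw Euclidean distance is dominated by whichever variable has the largest numeric range, a practical implementation would likely scale the variables first or use one of the alternative distances mentioned above.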

The acquisition unit 134 may also refer to the weights in the history DB 121, not just the distance between the training data and the query point, when acquiring records. Assuming that a larger weight marks data that is more desirable to acquire, the acquisition unit 134, for example, preferentially acquires from the history DB 121 data that are among the k nearest neighbors and have large weights.

Further, the acquisition unit 134 refers to the flags attached to each history and filters the acquired past history group to obtain the similar histories that serve as learning data.

From the past history group, the acquisition unit 134 acquires a first history that is at least a predetermined period before the current time (inference time), that is, older than the most recent H hours (for example, 12 hours), and whose autopilot flag is "OFF". In the example of FIG. 3, the acquisition unit 134 acquires, as the first history, the history B11, which is at least 12 hours before the current time (for example, 16:00:00) and whose autopilot flag is "OFF".

In this way, based on the autopilot flag and the time, the acquisition unit 134 can exclude histories recorded during autopilot from the learning data and acquire only histories of manual operation by the operator. The value of H is set based on, for example, the interval between operator operations.

The acquisition unit 134 then acquires, from the past history group, a second history whose autopilot flag is "ON" and whose manual operation flag is "ON". In the example of FIG. 3, the acquisition unit 134 acquires the history group B12, whose autopilot flag and manual operation flag are both "ON", as the second history.

This allows the acquisition unit 134 to include in the learning data histories in which the operator intervened during autopilot and manually overwrote the predicted value calculated by the model. Such a history is considered to reflect a situation not previously encountered, so it is desirable to include it in model training. When acquiring the second history, the condition used for the first history, namely that the record be at least a predetermined period before the current time, is disabled.

The acquisition unit 134 also acquires, from the past history group, a third history whose autopilot flag is "ON" and whose before/after-manual-operation flag is "ON". In the example of FIG. 3, the acquisition unit 134 acquires the history groups B15 and B16, whose autopilot flag and before/after-manual-operation flag are both "ON", as the third history. By including the histories immediately before and after the operator's manual operation in the learning data, the acquisition unit 134 makes it possible to reinforce learning of the operator's manual operations.
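The three selection rules above can be sketched as a simple filter over flagged records. This is an illustrative sketch only: the `HistoryRecord` class, its field names, the record identifiers, and the sample timestamps are assumptions, and the conditions are applied literally as stated in the text (so a record could in principle fall into more than one group).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class HistoryRecord:
    record_id: str            # hypothetical identifier
    time: datetime
    autopilot_flag: bool      # True means "ON"
    manual_flag: bool         # manual operation flag
    near_manual_flag: bool    # before/after-manual-operation flag


def select_learning_data(records: List[HistoryRecord],
                         now: datetime,
                         h_hours: float = 12.0) -> Dict[str, List[HistoryRecord]]:
    """Split records into the first, second, and third histories."""
    cutoff = now - timedelta(hours=h_hours)
    # First history: autopilot OFF and at least H hours before the current time.
    first = [r for r in records if not r.autopilot_flag and r.time <= cutoff]
    # Second history: manual override during autopilot (no time condition).
    second = [r for r in records if r.autopilot_flag and r.manual_flag]
    # Third history: autopilot records just before/after a manual operation.
    third = [r for r in records if r.autopilot_flag and r.near_manual_flag]
    return {"first": first, "second": second, "third": third}


now = datetime(2024, 1, 1, 16, 0, 0)
records = [
    HistoryRecord("a", datetime(2024, 1, 1, 3, 0), False, False, False),
    HistoryRecord("b", datetime(2024, 1, 1, 10, 0), False, False, False),
    HistoryRecord("c", datetime(2024, 1, 1, 12, 0), True, True, False),
    HistoryRecord("d", datetime(2024, 1, 1, 11, 55), True, False, True),
    HistoryRecord("e", datetime(2024, 1, 1, 9, 0), True, False, False),
]
selected = select_learning_data(records, now)
```

Here record "b" is dropped because it falls within the most recent 12 hours, and record "e" is dropped because it is a plain autopilot record with no operator involvement.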

The update unit 135 uses the learning data acquired by the acquisition unit 134 to train a model that outputs the objective variable from the explanatory variables ((5) in FIG. 4) and updates the model ((6) in FIG. 4).

The update unit 135 calculates an objective function representing the difference between the objective variable computed by inputting the explanatory variables into the model constructed from the model information 122 and the objective variable contained in the learning data narrowed down by the acquisition unit 134. It then repeatedly updates the model parameters, that is, the model information 122, so that the objective function decreases, until a training termination condition is satisfied.

If weights are assigned to the learning data, the model is trained on the learning data using those weights.

For example, the update unit 135 uses the second history with a higher importance than the first history. Alternatively, the update unit 135 may use the third history with a higher importance than other histories (for example, the first history). For example, the update unit 135 sets the importance of the second and third histories higher than that of the first history. Typically, the second and third histories are given the same importance. When a method that updates parameters to minimize the sum (or mean) of squared errors is used, the importance can be set by, for example, replacing the simple sum or mean with a weighted mean in which the second and/or third histories carry larger weights. The weight given to each history is set in advance according to, for example, the type of the history or the time interval between the second history and the third history.
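The effect of a weighted squared error can be illustrated with a deliberately small sketch. The one-parameter model y = a·x, the closed-form fit, the function names, and the weight value 3.0 are assumptions chosen for clarity; the embodiment itself describes iterative parameter updates and does not fix a model form.

```python
from typing import Sequence


def weighted_mse(preds: Sequence[float],
                 targets: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Weighted mean of squared errors."""
    total = sum(weights)
    return sum(w * (p - t) ** 2 for p, t, w in zip(preds, targets, weights)) / total


def fit_slope(xs: Sequence[float], ys: Sequence[float], weights: Sequence[float]) -> float:
    """Slope a minimizing the weighted squared error of the model y = a * x."""
    num = sum(w * x * y for x, y, w in zip(xs, ys, weights))
    den = sum(w * x * x for x, w in zip(xs, weights))
    return num / den


# Illustrative data: the first two samples stand in for the first history
# (operator behaves like y = 2x), the last two for the second history
# (manual override during autopilot, behaving like y = 3x).
xs = [1.0, 2.0, 1.0, 2.0]
ys = [2.0, 4.0, 3.0, 6.0]

a_equal = fit_slope(xs, ys, [1.0, 1.0, 1.0, 1.0])     # simple mean
a_weighted = fit_slope(xs, ys, [1.0, 1.0, 3.0, 3.0])  # second history weighted higher
loss = weighted_mse([a_weighted * x for x in xs], ys, [1.0, 1.0, 3.0, 3.0])
```

With equal weights the fitted slope sits halfway between the two behaviors (2.5); raising the weight of the second history pulls it toward the override behavior (2.75).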

The inference unit 136 calculates the objective variable by inputting explanatory variables for prediction into the model constructed from the updated model information 122. That is, the inference unit 136 performs inference processing ((7) in FIG. 4). The inferred objective variable is, for example, the operation predicted from the situation.

The determination unit 137 uses an autopilot condition to determine whether autopilot can be executed ((8) in FIG. 4). If a calculation result based on the error between the objective variable predicted by the model and the measured value satisfies the autopilot condition, the determination unit 137 determines that autopilot can be started. If the calculation result does not satisfy the autopilot condition, the determination unit 137 determines that autopilot cannot be executed.
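This passage does not spell out the autopilot condition itself. As a hedged illustration, the sketch below assumes the condition is "mean absolute error between recent predictions and measured values is at most a threshold". The metric, the threshold, and the function name `autopilot_allowed` are hypothetical.

```python
from typing import Sequence


def autopilot_allowed(predicted: Sequence[float],
                      measured: Sequence[float],
                      max_mean_abs_error: float) -> bool:
    """Return True if the assumed autopilot condition is satisfied."""
    if not predicted or len(predicted) != len(measured):
        raise ValueError("predicted and measured must be non-empty and of equal length")
    mae = sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)
    return mae <= max_mean_abs_error


# Illustrative values: model predictions vs. the operator's actual set values.
can_start = autopilot_allowed([5.0, 5.2], [5.1, 5.0], max_mean_abs_error=0.2)
must_stop = not autopilot_allowed([5.0, 5.2], [5.1, 5.0], max_mean_abs_error=0.1)
```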

The display control unit 138 causes the terminal device 20 to display a guidance screen 21 showing the inferred objective variable (for example, the operation) together with the result of the autopilot execution determination. In this way, the display control unit 138 presents the inference result and the autopilot execution determination result to the operator.

For example, when it is determined that autopilot can be started, the display control unit 138 causes the terminal device 20 to display a guidance screen that includes a message indicating that autopilot can be started and a button for instructing the start of autopilot.

When instructed by the terminal device 20 to start autopilot, the autopilot control unit 139 performs autopilot control of the plant system 30 using the model ((10) in FIG. 4).

Alternatively, when it is determined that autopilot cannot be started, the display control unit 138 causes the terminal device 20 to display a guidance screen that includes an instruction to stop autopilot and an autopilot stop button.

When instructed by the terminal device 20 to stop autopilot, the autopilot control unit 139 stops autopilot control of the plant system 30 and switches to manual operation by the operator ((9) in FIG. 4).

[Processing of embodiment]
The processing procedure in the embodiment will be described with reference to FIG. 5. FIG. 5 is a flowchart showing the processing procedure in the embodiment.

As shown in FIG. 5, the processing device 10 first collects the operation history of the plant system 30 (step S11). The processing device 10 also collects information indicating whether each history results from manual operation or from autopilot, as well as the ON or OFF state of autopilot for the plant system 30.

Next, based on the information indicating whether each history results from manual operation or from autopilot and the information indicating the ON or OFF state of autopilot, the processing device 10 attaches an autopilot flag, a manual operation flag, and a before/after-manual-operation flag to each history and registers it in the history DB 121 (step S12).
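The flag assignment in step S12 can be sketched as follows. The dictionary keys, the function name `assign_flags`, the 10-minute window, and the choice not to set the before/after flag on the manual record itself are assumptions for illustration; the text only states that histories within a predetermined period before and after a manual operation receive the before/after-manual-operation flag.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def assign_flags(raw: List[Dict], window: timedelta) -> List[Dict]:
    """Attach the three flags to each raw history record (hypothetical layout)."""
    manual_times = [r["time"] for r in raw if r["manual"]]
    flagged = []
    for r in raw:
        # Assumption: the manual record itself is not marked as "before/after".
        near_manual = (not r["manual"]) and any(
            abs(r["time"] - t) <= window for t in manual_times
        )
        flagged.append({
            **r,
            "autopilot_flag": r["autopilot_on"],
            "manual_flag": r["manual"],
            "near_manual_flag": near_manual,
        })
    return flagged


raw = [
    {"time": datetime(2024, 1, 1, 12, 0), "autopilot_on": True, "manual": False},
    {"time": datetime(2024, 1, 1, 12, 8), "autopilot_on": True, "manual": True},
    {"time": datetime(2024, 1, 1, 12, 15), "autopilot_on": True, "manual": False},
    {"time": datetime(2024, 1, 1, 12, 40), "autopilot_on": True, "manual": False},
]
flagged = assign_flags(raw, window=timedelta(minutes=10))
```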

The processing device 10 refers to the flags of the histories in the history DB 121 and, based on the distance between each history's explanatory variables and the specified explanatory variables, acquires similar histories as learning data (step S13). From the past history group, the processing device 10 acquires a first history that is at least a predetermined period before the current time and whose autopilot flag is "OFF", a second history whose autopilot flag is "ON" and whose manual operation flag is "ON", and/or a third history whose autopilot flag is "ON" and whose before/after-manual-operation flag is "ON".

The processing device 10 uses the learning data to train a model that outputs the objective variable from the explanatory variables (step S14) and updates the model (step S15).

The processing device 10 infers the objective variable (for example, the operation) by inputting explanatory variables for prediction (for example, a first temperature, a second temperature, and a first flow rate) into the model constructed from the updated model information 122 (step S16).

The processing device 10 uses the autopilot condition to determine whether autopilot can be executed (step S17).

When the processing device 10 determines that autopilot can be started (step S17: Yes), it causes the terminal device 20 to display a guidance screen that includes a message indicating that autopilot can be started and a button for instructing the start of autopilot. When instructed by the terminal device 20 to start autopilot, the processing device 10 performs autopilot control of the plant system 30 using the model (step S18).

When the processing device 10 determines that autopilot cannot be started (step S17: No), it causes the terminal device 20 to display a guidance screen that includes an instruction to stop autopilot and an autopilot stop button. When instructed by the terminal device 20 to stop autopilot, the processing device 10 stops autopilot control of the plant system 30 and either switches to manual operation by the operator or continues manual operation (step S19).

[Effects of embodiment]
As described above, the processing device 10 according to the embodiment collects, for example from the plant system 30, histories that are each a combination of explanatory variables representing the situation in a product production process and an objective variable representing the operation of equipment in the production process. The processing device 10 then attaches at least an autopilot flag, and also a manual operation flag, to each history and registers it in the history DB 121.

From the histories registered in the history DB 121, the processing device 10 acquires at least a first history that is at least a predetermined period before the current time and whose autopilot flag is "OFF". The processing device 10 also acquires a second history whose autopilot flag is "ON" and whose manual operation flag is "ON".

The processing device 10 uses the acquired first and second histories to update the model that outputs the objective variable from the explanatory variables. Because the processing device 10 excludes histories of automatic operation during autopilot from the learning data and updates the model using only histories of manual operation by the operator, it can prevent degradation of model accuracy. In addition, by including in the learning data histories in which the operator intervened during autopilot and manually overwrote the predicted value calculated by the model, the processing device 10 can have the model learn operator actions that it had not seen before.

The processing device 10 can therefore improve model accuracy by using, as learning data, the first and second histories, which are suited to sequential learning by the JIT method in imitation learning.

Furthermore, by using the second history with a higher importance than the first history when updating the model, the processing device 10 can have the model learn previously unseen operator actions as particularly important histories, which improves model accuracy.

The processing device 10 also updates the model using a third history whose autopilot flag is "ON" and whose before/after-manual-operation flag is "ON". By including in the learning data the histories from the periods before and after a manual operation, during which autopilot is active but the operator is monitoring, the processing device 10 reinforces learning of the operator's manual operations and improves model accuracy. By using the third history with a higher importance than other histories when updating the model, the processing device 10 can have the model learn the operations in those monitored periods as particularly important histories, which further improves model accuracy.

Accordingly, even during autopilot, the processing device 10 can update the model using only histories suited to learning, which improves the inference accuracy of the model. In particular, because the processing device 10 updates the model using learning data suited to sequential learning by the JIT method in imitation learning, it can achieve both appropriate operation of the plant system 30 and improved model accuracy even when autopilot is applied.

[System configuration, etc.]
Each component of each illustrated device is a functional concept and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to what is illustrated, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of the processing functions performed by each device may be realized by a CPU (Central Processing Unit) and a program analyzed and executed by the CPU, or may be realized as hardware using wired logic. The program may be executed not only by the CPU but also by another processor such as a GPU.

Among the processes described in this embodiment, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and drawings may be changed arbitrarily unless otherwise specified.

[Program]
In one embodiment, the processing device 10 can be implemented by installing, on a desired computer, a learning program that executes the above learning processing as packaged software or online software. For example, by causing an information processing device to execute the learning program, the information processing device can be made to function as the processing device 10. The information processing device referred to here includes desktop and notebook personal computers. The category also includes tablet terminals, smartphones, mobile communication terminals such as mobile phones and PHS (Personal Handyphone System) terminals, and slate terminals such as PDAs (Personal Digital Assistants).

The processing device 10 can also be implemented as a server that treats a terminal device used by a user as a client and provides the client with services related to the above learning processing. For example, the server is implemented as a server device that provides a learning service that takes the specification of a request point as input and outputs a trained model. In this case, the server may be implemented as a Web server or as a cloud that provides services related to the above learning processing by outsourcing.

FIG. 6 is a diagram showing an example of a computer that executes the program. The computer 1000 includes, for example, a memory 1010 and a CPU 1020. The computer 1000 also includes a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These components are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.

The hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the processing device 10 is implemented as the program module 1093, in which computer-executable code is written. The program module 1093 is stored, for example, in the hard disk drive 1090. For example, the program module 1093 for executing processing equivalent to the functional configuration of the processing device 10 is stored in the hard disk drive 1090. The hard disk drive 1090 may be replaced by an SSD (Solid State Drive).

The setting data used in the processing of the embodiment described above is stored as the program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes the processing of the embodiment described above.

The program module 1093 and the program data 1094 are not limited to being stored in the hard disk drive 1090. They may be stored, for example, in a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (a LAN (Local Area Network), a WAN (Wide Area Network), or the like) and read by the CPU 1020 from that computer via the network interface 1070.

1 Plant operation system
10 Processing device
20 Terminal device
30 Plant system
11 Communication unit
12 Storage unit
13 Control unit
121 History DB
122 Model information
131 Reception unit
132 Collection unit
133 Registration unit
134 Acquisition unit
135 Update unit
136 Inference unit
137 Determination unit
138 Display control unit
139 Autopilot control unit

Claims

A learning device comprising:
a collection unit that collects histories, each being a combination of an explanatory variable representing a situation in a production process of a product and an objective variable representing an operation of equipment in the production process;
a registration unit that adds to each history at least first added information indicating whether or not the operation is an automatic operation by a control device, and registers the history in a database;
an acquisition unit that acquires, from the histories registered in the database, at least a first history that is at least a predetermined period before the current time and whose first added information indicates that the operation is not an automatic operation by the control device; and
an updating unit that updates a model that outputs the objective variable from the explanatory variable, using the history acquired by the acquisition unit.

The learning device according to claim 1, wherein
the registration unit adds third added information to histories within a predetermined period before and after a history in which the operation of the equipment is a manual operation by an operator, and
the acquisition unit acquires, from the histories registered in the database, a third history to which the first added information indicating an automatic operation by the control device and the third added information are added.

The learning device according to claim 2, wherein the updating unit uses the third history with higher importance than other histories.

A learning method executed by a learning device, the method comprising:
collecting histories, each being a combination of an explanatory variable representing a situation in a production process of a product and an objective variable representing an operation of equipment in the production process;
adding to each history at least first added information indicating whether or not the operation is an automatic operation by a control device, and registering the history in a database;
acquiring, from the histories registered in the database, at least a first history that is at least a predetermined period before the current time and whose first added information indicates that the operation is not an automatic operation by the control device; and
updating a model that outputs the objective variable from the explanatory variable, using the history acquired in the acquiring.

A learning program for causing a computer to execute:
collecting histories, each being a combination of an explanatory variable representing a situation in a production process of a product and an objective variable representing an operation of equipment in the production process;
adding to each history at least first added information indicating whether or not the operation is an automatic operation by a control device, and registering the history in a database;
acquiring, from the histories registered in the database, at least a first history that is at least a predetermined period before the current time and whose first added information indicates that the operation is not an automatic operation by the control device; and
updating a model that outputs the objective variable from the explanatory variable, using the history acquired in the acquiring.