JP2022035737A

JP2022035737A - Control system, control method, control device and program

Info

Publication number: JP2022035737A
Application number: JP2020140257A
Authority: JP
Inventors: 大地木村; Daichi Kimura; 浩二伊藤; Koji Ito; 健一郎島田; Kenichiro Shimada; 知範泉谷; Tomonori Izumitani
Original assignee: NTT Communications Corp
Current assignee: NTT Communications Corp
Priority date: 2020-08-21
Filing date: 2020-08-21
Publication date: 2022-03-04

Abstract

To provide a control system capable of obtaining highly explainable control parameter values when intervening in automatic control. SOLUTION: A control system 1 comprises: a creation unit 201 for creating, by imitation learning, a model representing the relation between the state of a controlled object 40 and a control parameter value, on the basis of a history of control parameter values from when an operator intervened in the controlled object; and a calculation unit 103 for calculating a control parameter value using the model in accordance with the state of the controlled object. SELECTED DRAWING: Figure 3

Description

The present invention relates to a control system, a control method, a control device, and a program.

In various plants such as chemical plants, steel plants, and energy plants, automatic control based on PID (Proportional-Integral-Differential) control is widely used. PID control is a simple yet effective automatic control method, but it is known that, depending on the state of the plant, a human operator often has to intervene in the control manually. For example, when changes in the state of the plant or disturbances make it difficult for automatic control to bring the controlled object close to its target, the operator must monitor sensor values and the like and intervene manually as needed.

Because more operator intervention means a heavier workload and higher labor costs, it is desirable to reduce it. For this reason, automatic control methods that use reinforcement learning have attracted attention in recent years as a way of reducing operator intervention. Reinforcement learning is effective for automatic control of complex systems, but it applies random control to the plant in the early stage of learning, which degrades controllability and makes it difficult to apply to a plant in operation. An alternative is not to perform automatic control itself, but to learn in advance the optimal control parameter values for the case where the plant is automatically controlled by reinforcement learning, and to propose those values to the operator when intervention becomes necessary.

Japanese Unexamined Patent Publication No. 2019-67238

However, because reinforcement learning models the optimal automatic control of the plant, the explainability of the control parameter values proposed to the operator (that is, the ability to explain the basis for the decision, such as why a given control parameter value was proposed) was low. It was therefore difficult for the operator to judge whether a proposed control parameter value was really optimal.

One embodiment of the present invention has been made in view of the above, and its object is to obtain highly explainable control parameter values when intervening in automatic control.

To achieve the above object, a control system according to one embodiment includes: a creation unit that creates, by imitation learning, a model representing the relationship between the state of a controlled object and a control parameter value, based on a history of control parameter values from when an operator intervened in the controlled object; and a calculation unit that calculates a control parameter value using the model according to the state of the controlled object.

Highly explainable control parameter values can be obtained when intervening in automatic control.

FIG. 1 is a diagram showing an example of the overall configuration of the control system according to the present embodiment.
FIG. 2 is a diagram showing an example of the hardware configuration of the control device according to the present embodiment.
FIG. 3 is a diagram showing an example of the functional configuration of the control system according to the present embodiment.
FIG. 4 is a flowchart showing an example of the flow of the model creation process according to the present embodiment.
FIG. 5 is a flowchart showing an example of the flow of the control process according to the present embodiment.

Hereinafter, an embodiment of the present invention will be described. This embodiment describes a control system 1 that can obtain highly explainable control parameter values when intervening in the automatic control of a controlled object (for example, various plants, facilities, or devices). The control system 1 according to the present embodiment uses imitation learning, which is one machine learning method, to model the relationship between the state of the controlled object and the control parameter values at the times an operator intervened in the past, and then uses this model (hereinafter also referred to as the "intervention model") to obtain control parameter values when intervening in automatic control. Because the resulting control parameter values resemble those the operator actually used in past interventions, they are highly explainable. The reliability of the control parameters proposed to the operator is therefore ensured, which can contribute, for example, to the stable operation of a plant.

A control parameter value is the value of a parameter used to control the controlled object, for example, a manipulated variable for the controlled object (MV: Manipulative Variable) or a target value that affects the manipulated variable (SV: Set Variable).

<Overall configuration>
First, the overall configuration of the control system 1 according to the present embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of the overall configuration of the control system 1 according to the present embodiment.

As shown in FIG. 1, the control system 1 according to the present embodiment includes a control device 10, a server 20, an operator terminal 30, and a controlled object 40. The control device 10 and the server 20 are communicably connected via a communication network such as the Internet. The control device 10, the operator terminal 30, and the controlled object 40 are communicably connected via a communication network such as a control network.

The control device 10 is a computer or computer system that controls the controlled object 40. It does so by an automatic control method such as PID control, which is one form of feedback control. For example, a PLC (Programmable Logic Controller) or a DCS (Distributed Control System) can be used as the control device 10.

In addition, when operator intervention becomes necessary (for example, when the state of the controlled object 40, that is, the observed value (PV: Process Variable), is about to deviate from the target), the control device 10 calculates a control parameter value using the intervention model and transmits it to the operator terminal 30. The control parameter value is thereby proposed to the operator using the operator terminal 30.

The server 20 is a computer or computer system that creates the intervention model by imitation learning, using the history of past operator interventions (hereinafter also referred to as the "intervention history"), and transmits the model to the control device 10.

The operator terminal 30 is any of various terminals used by an operator who monitors the control of the controlled object 40 and intervenes in it. For example, a PC (personal computer), a tablet terminal, or a smartphone can be used as the operator terminal 30.

The controlled object 40 is any of various plants, facilities, devices, and the like controlled by the control device 10. The controlled object 40 is equipped with various sensors (for example, temperature sensors, flow meters, pressure gauges, and concentration meters), and observed values indicating its state are transmitted (fed back) to the control device 10 every control cycle. The observed values are various sensor values representing the state of the controlled object 40 (for example, temperature, flow rate, pressure, and the concentration of a specific component), but they may also include any other information representing that state (for example, captured images of the controlled object 40 or recorded audio of the sound it emits).

The overall configuration of the control system 1 shown in FIG. 1 is an example, and other configurations are possible. For example, the control system 1 may omit the server 20, with the control device 10 creating the intervention model itself.

<Hardware configuration>
Next, the hardware configuration of the control device 10 according to the present embodiment will be described with reference to FIG. 2. FIG. 2 is a diagram showing an example of the hardware configuration of the control device 10 according to the present embodiment.

As shown in FIG. 2, the control device 10 according to the present embodiment is realized by the hardware configuration of a general computer or computer system, and includes an input device 11, a display device 12, an external I/F 13, a communication I/F 14, a processor 15, and a memory device 16. These hardware components are communicably connected to one another via a bus 17.

The input device 11 is, for example, a keyboard, a mouse, or a touch panel. The display device 12 is, for example, a display. The control device 10 may lack at least one of the input device 11 and the display device 12.

The external I/F 13 is an interface to external devices, such as a recording medium 13a. The control device 10 can read from and write to the recording medium 13a via the external I/F 13. Examples of the recording medium 13a include a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), and a USB (Universal Serial Bus) memory card.

The communication I/F 14 is an interface for connecting the control device 10 to a communication network. The processor 15 is any of various arithmetic units, such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit). The memory device 16 is any of various storage devices, such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), RAM (Random Access Memory), ROM (Read Only Memory), or flash memory.

With the hardware configuration shown in FIG. 2, the control device 10 according to the present embodiment can realize the various processes described later. The hardware configuration shown in FIG. 2 is only an example, however, and the control device 10 may have a different one. For example, the control device 10 may have a plurality of processors 15 or a plurality of memory devices 16.

The server 20 and the operator terminal 30 are likewise realized by the hardware configuration of a general computer or computer system, and each includes an input device, a display device, an external I/F, a communication I/F, a processor, and a memory device. The server 20, however, may lack at least one of the input device and the display device. The server 20 and the operator terminal 30 may each have a plurality of processors or a plurality of memory devices.

<Functional configuration>
Next, the functional configuration of the control system 1 according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram showing an example of the functional configuration of the control system 1 according to the present embodiment.

<<Control device 10>>
As shown in FIG. 3, the control device 10 according to the present embodiment includes a control unit 101, an intervention determination unit 102, a calculation unit 103, a proposal unit 104, and a re-learning unit 105. Each of these units is realized by processing that one or more programs installed in the control device 10 cause the processor 15 to execute.

The control device 10 according to the present embodiment also has a storage unit 106. The storage unit 106 is realized by, for example, the memory device 16. The storage unit 106 may instead be realized by a storage device (for example, a database server) connected to the control device 10 via a communication network.

The control unit 101 controls the controlled object 40 either by an automatic control method such as PID control or by a control parameter value calculated by the intervention model. That is, the control unit 101 controls the controlled object 40 by sending it either a manipulated variable calculated by the automatic control method from the observed value and the target value, or a manipulated variable based on the control parameter value calculated by the intervention model. Here, the manipulated variable based on the control parameter value is the control parameter value itself when that value is a manipulated variable; when the control parameter value is a target value, it is the manipulated variable calculated by the automatic control method from the observed value and that target value.
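The two cases handled by the control unit 101 can be illustrated with a minimal Python sketch. The `PID` class, the `manipulated_variable` function, and the `"MV"` / `"SV"` tags are hypothetical names introduced here for illustration; the publication does not specify an implementation, and the discrete PID form below is only one common textbook variant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Simple discrete PID controller (illustrative, not from the publication)."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, sv: float, pv: float) -> float:
        """Return the manipulated variable for target value sv and observed value pv."""
        error = sv - pv
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def manipulated_variable(pid: PID, pv: float, sv: float,
                         param_kind: Optional[str] = None,
                         param_value: Optional[float] = None) -> float:
    """Choose the manipulated variable to send to the controlled object.

    param_kind is None for plain automatic control, "MV" when the intervention
    model output is itself a manipulated variable, and "SV" when it is a target value.
    """
    if param_kind == "MV":
        # The control parameter value is used as the manipulated variable as-is.
        return param_value
    if param_kind == "SV":
        # The control parameter value replaces the target value for automatic control.
        return pid.step(param_value, pv)
    return pid.step(sv, pv)
```

With a proportional-only controller (`ki = kd = 0`), the three branches are easy to check by hand, which is why the sketch keeps the controller state explicit.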

The intervention determination unit 102 determines whether intervention in the automatic control is necessary. Intervention is necessary, for example, when the difference between the observed value indicating the state of the controlled object 40 and the target value exceeds a predetermined threshold, or when the observed value itself exceeds (or falls below) a predetermined threshold.
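The two example conditions can be written as a small predicate. This is a sketch under assumptions: the function name `needs_intervention` and the parameters `max_deviation`, `pv_low`, and `pv_high` are hypothetical, and a scalar observed value is assumed for simplicity.

```python
from typing import Optional


def needs_intervention(pv: float, sv: float, max_deviation: float,
                       pv_low: Optional[float] = None,
                       pv_high: Optional[float] = None) -> bool:
    """Return True when intervention in automatic control is judged necessary."""
    # Condition 1: the observed value is too far from the target value.
    if abs(pv - sv) > max_deviation:
        return True
    # Condition 2: the observed value itself is above an upper limit ...
    if pv_high is not None and pv > pv_high:
        return True
    # ... or below a lower limit.
    if pv_low is not None and pv < pv_low:
        return True
    return False
```

The limits are optional so that either condition can be used on its own, matching the "for example" wording of the description.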

When the intervention determination unit 102 determines that intervention is necessary, the calculation unit 103 calculates a control parameter value with the intervention model, using the observed value indicating the current state of the controlled object 40.

The proposal unit 104 transmits the control parameter value calculated by the calculation unit 103 to the operator terminal 30, thereby proposing it to the operator.

The re-learning unit 105 retrains the intervention model using the observed value indicating the current state of the controlled object 40 and the control parameter value proposed by the proposal unit 104, depending on whether the operator adopted that value (that is, whether the intervention was carried out with that control parameter value).

The storage unit 106 stores the intervention model created by the server 20. It also stores the intervention histories of past operator interventions: each time an intervention in the automatic control is carried out, an intervention history for that intervention is stored in the storage unit 106.

Each intervention history includes, for example, the date and time of the intervention, the observed value indicating the state of the controlled object 40 at the time of the intervention, and the control parameter value used in the intervention. In addition, each intervention history may include, for example, the ID of the operator who carried out the intervention (hereinafter also referred to as the "operator ID"), and information indicating the result of the intervention (for example, the difference between the observed value and its target value in the next control cycle or a later one).
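The fields of one intervention history entry can be sketched as a small record type. The class name `InterventionRecord` and its field names are assumptions made for illustration; the description only lists what information an entry contains.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class InterventionRecord:
    """One intervention history entry (field names are illustrative)."""
    timestamp: datetime                  # date and time of the intervention
    observation: Tuple[float, ...]       # observed values (PV) at the time of the intervention
    control_parameter: float             # control parameter value (MV or SV) used
    operator_id: Optional[str] = None    # optional: who carried out the intervention
    result_error: Optional[float] = None # optional: |PV - SV| in a later control cycle


# Example entry with the optional operator ID but no recorded result.
record = InterventionRecord(
    timestamp=datetime(2020, 8, 21, 9, 30),
    observation=(351.2, 12.4),
    control_parameter=48.0,
    operator_id="op-017",
)
```

Keeping the optional fields as `None` by default mirrors the description, in which the operator ID and the result information may or may not be present.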

<<Server 20>>
As shown in FIG. 3, the server 20 according to the present embodiment has a model creation unit 201. The model creation unit 201 is realized, for example, by processing that one or more programs installed in the server 20 cause its processor to execute.

The server 20 according to the present embodiment also has a storage unit 202. The storage unit 202 is realized by, for example, a memory device. The storage unit 202 may instead be realized by a storage device (for example, a database server) connected to the server 20 via a communication network.

The model creation unit 201 creates (trains) the intervention model by imitation learning, using a plurality of intervention histories stored in the storage unit 202. That is, using the intervention histories, the model creation unit 201 models by imitation learning the relationship between the observed value indicating the state of the controlled object 40 and the control parameter value of the intervention carried out in that state, producing an intervention model that takes an observed value as input and outputs a control parameter value. The model creation unit 201 then transmits the intervention model to the control device 10. Imitation learning is a machine learning method (specifically, one with a framework similar to reinforcement learning) that uses a history of actions (in this embodiment, the intervention history) to learn the optimal action (in this embodiment, the control parameter value) for an environment (in this embodiment, the observed value).
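The publication does not name a specific imitation learning algorithm. As a minimal sketch, the simplest form (behavioral cloning, i.e. supervised regression from observed values to the operator's control parameter values) can be illustrated with a k-nearest-neighbor regressor. The class `InterventionModel`, its methods, and the choice of k-NN are assumptions made here, not the authors' method.

```python
import math
from typing import List, Sequence, Tuple


class InterventionModel:
    """Behavioral-cloning sketch: k-nearest-neighbor regression over past interventions."""

    def __init__(self, k: int = 3) -> None:
        self.k = k
        self.examples: List[Tuple[Tuple[float, ...], float]] = []

    def fit(self, histories: Sequence[Tuple[Sequence[float], float]]) -> "InterventionModel":
        """Store (observation, control parameter value) pairs from the intervention history."""
        self.examples = [(tuple(obs), float(param)) for obs, param in histories]
        return self

    def predict(self, observation: Sequence[float]) -> float:
        """Average the control parameter values of the k most similar past states."""
        if not self.examples:
            raise ValueError("model has not been fitted")
        query = tuple(observation)
        nearest = sorted(self.examples, key=lambda ex: math.dist(ex[0], query))[: self.k]
        return sum(param for _, param in nearest) / len(nearest)
```

A nearest-neighbor model is used in this sketch because its output can be traced directly to specific past interventions; any regressor trained on the same pairs would fit the input/output description equally well.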

The storage unit 202 stores a plurality of intervention histories transmitted from the control device 10. These intervention histories are transmitted from the control device 10, for example, when the intervention model is to be created.

The functional configuration of the control system 1 shown in FIG. 3 is an example, and other configurations are possible. For example, when the intervention model is created by the control device 10, the control device 10 may have the model creation unit 201.

<Model creation process>
Next, the flow of the model creation process according to the present embodiment will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the flow of the model creation process according to the present embodiment. The model creation process shown in FIG. 4 is executed before the control process described later. In the following, it is assumed that the storage unit 202 of the server 20 stores a plurality of intervention histories transmitted from the control device 10.

The model creation unit 201 creates the intervention model by imitation learning, using a plurality of intervention histories stored in the storage unit 202 (step S101). The model creation unit 201 may use all of the intervention histories stored in the storage unit 202, or only a subset selected from them. Any method may be used to select the intervention histories used for creating the intervention model; for example, one of the following selection methods 1 to 3 may be used.

Selection method 1: The intervention histories used to create the intervention model are selected from those stored in the storage unit 202 at the discretion of an operator (or the person in charge of creating the intervention model). In other words, a human decides which past intervention histories are "histories of good interventions" and which are "histories of bad interventions" and selects accordingly.

Selection method 2: The intervention histories stored in the storage unit 202 are divided into predetermined periods, a predetermined statistic (for example, an autocorrelation function value or a cross-correlation function value) is calculated for each period, and the intervention histories used to create the intervention model are selected based on these statistics. Specifically, when the statistic is an autocorrelation or cross-correlation function value, the intervention histories in periods where that value is greater than or equal to (or less than) a predetermined threshold are selected. This makes it possible to select the intervention histories in periods that exhibit (or do not exhibit) correlation (autocorrelation or cross-correlation).

Selection method 3: From the intervention histories stored in the storage unit 202, either the histories that contain a specific operator ID are selected, or all histories except those that contain a specific operator ID are selected. Specifically, for example, the histories containing the operator ID of an expert operator may be selected, or the histories other than those containing the operator ID of an inexperienced operator may be selected. This makes it possible to select the intervention histories of operators who are good at choosing control parameter values during an intervention, or conversely to exclude the intervention histories of operators who are poor at it.
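Selection methods 2 and 3 above can be sketched as simple filters. Several details here are assumptions, since the publication leaves them open: each history is a dict with the hypothetical keys `"control_parameter"` and `"operator_id"`, periods are fixed-size chunks of time-ordered records rather than calendar intervals, and the statistic is the lag-1 autocorrelation of the control parameter values.

```python
from statistics import fmean
from typing import Dict, List, Sequence, Set


def lag1_autocorrelation(values: Sequence[float]) -> float:
    """Lag-1 sample autocorrelation; returns 0.0 for constant or too-short series."""
    if len(values) < 2:
        return 0.0
    mean = fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0.0:
        return 0.0
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(len(values) - 1))
    return num / denom


def select_by_autocorrelation(records: List[Dict], period_len: int, threshold: float) -> List[Dict]:
    """Selection method 2: keep periods whose statistic is at or above the threshold."""
    selected: List[Dict] = []
    for start in range(0, len(records), period_len):
        period = records[start:start + period_len]
        params = [r["control_parameter"] for r in period]
        if lag1_autocorrelation(params) >= threshold:
            selected.extend(period)
    return selected


def select_by_operator(records: List[Dict], operator_ids: Set[str], exclude: bool = False) -> List[Dict]:
    """Selection method 3: keep (or, with exclude=True, drop) records of the given operators."""
    return [r for r in records if (r["operator_id"] in operator_ids) != exclude]
```

Flipping the comparison in `select_by_autocorrelation` gives the "less than the threshold" variant mentioned in selection method 2.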

The model creation unit 201 then transmits the intervention model created in step S101 to the control device 10 (step S102). The intervention model is thereby stored in the storage unit 106 of the control device 10.

As described above, the control system 1 according to the present embodiment uses the history of actual interventions carried out by operators in the past to model, by imitation learning, the relationship between the observed value indicating the state of the controlled object 40 at the time of an intervention and the control parameter value used in that intervention. This makes it possible to model a control law equivalent to that of an operator during an intervention. As described later, it reduces the burden on the operator when intervention becomes necessary and allows highly explainable control parameter values to be proposed to the operator.

<Control process>
Next, the flow of the control process according to the present embodiment will be described with reference to FIG. 5. FIG. 5 is a flowchart showing an example of the flow of the control process according to the present embodiment. The control process shown in FIG. 5 is executed repeatedly, once per control cycle. The following describes the control process in one such control cycle, and assumes that the storage unit 106 of the control device 10 stores the intervention model created by the server 20.

The control unit 101 receives an observed value indicating the current state of the controlled object 40 (step S201).

The intervention determination unit 102 determines, from the observed value received in step S201, whether intervention is necessary (step S202). As described above, intervention is necessary, for example, when the difference between the observed value and the target value exceeds a predetermined threshold, or when the observed value exceeds (or falls below) a predetermined threshold.

When it is determined in step S202 that intervention is not necessary, the control unit 101 transmits to the controlled object 40 a manipulated variable calculated by the automatic control method from the observed value and the target value (step S203). The controlled object 40 is then controlled according to that manipulated variable.

On the other hand, when it is determined in step S202 that intervention is necessary, the calculation unit 103 calculates a control parameter value using the intervention model stored in the storage unit 106 (step S204). That is, the calculation unit 103 inputs the observed value received in step S201 into the intervention model and obtains the control parameter value as its output.

Here, in addition to the control parameter value, the calculation unit 103 may create evidence information representing the basis for that control parameter value. For example, the calculation unit 103 may create one or more of evidence information 1 to evidence information 4 below.

Evidence information 1: Using the observed value indicating the current state of the controlled object 40 and the control parameter value, the plurality of intervention histories used for creating and relearning the intervention model are searched, and the search result is created as evidence information. This makes it possible to present to the operator of the operator terminal 30, for example, an operator ID included in the search result (that is, the ID of an operator who intervened with a similar control parameter value when the controlled object 40 was in a similar state in the past). If the intervention history also includes information indicating the result of the intervention, that information can be presented to the operator as well. Relearning of the intervention model is described later.

Evidence information 2: Information obtained by quantifying the operator ID (and the information indicating the result of the intervention) obtained as evidence information 1 may be used as evidence information. For example, the value of the evidence information may be set higher the more skilled or experienced the operator is, and lower the less skilled or experienced the operator is. Likewise, based on the information indicating the result of the intervention, the value may be set higher the closer the state of the controlled object 40 came to the target, and lower otherwise.

Evidence information 3: Using the observed value indicating the current state of the controlled object 40, the control parameter value, and the most recent N-1 intervention histories (where N is a predetermined natural number) among the plurality of intervention histories stored in the storage unit 106, the cross-correlation function value with N intervention histories among the plurality of intervention histories used for creating and relearning the intervention model is calculated as a similarity. The N intervention histories that yield the highest similarity are then created as evidence information together with that similarity. This makes it possible to present to the operator a past intervention history that resembles the current state of the controlled object 40 and how closely it resembles it. Instead of the cross-correlation function, the similarity may be calculated by dynamic time warping (DTW).

Evidence information 4: Using a known factor visualization technique, information indicating which of the plurality of intervention histories used for creating and relearning the intervention model forms the basis for the decision is created as evidence information. Such factor visualization techniques are generally known as techniques for visualizing the basis (factors) behind the inference result of a machine learning model.
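One possible way to compute the similarity used in evidence information 3 is sketched below in Python. This is a minimal sketch under simplifying assumptions: each intervention history is reduced to a single scalar, the cross-correlation is evaluated only at zero lag and normalized, and all function names are hypothetical. A plain DTW distance is included as the alternative mentioned above; for DTW, a smaller distance corresponds to a higher similarity.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A constant sequence has no defined correlation; treat it as 0.
    return num / den if den > 0 else 0.0


def most_similar_window(query, past, n):
    """Return (start_index, similarity) of the best length-n window in past."""
    best_start, best_score = None, -math.inf
    for start in range(len(past) - n + 1):
        score = normalized_correlation(query, past[start:start + n])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]
```

Here `query` would hold the most recent N-1 histories followed by the current observation, and `past` would hold the histories used for training. The returned window and score correspond to the N intervention histories and the similarity that are presented to the operator.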

Following step S204, the proposal unit 104 transmits the control parameter value calculated in step S204 (and its evidence information) to the operator terminal 30 (step S205). The control parameter value is thereby proposed to the operator of the operator terminal 30. On receiving the control parameter value, the operator terminal 30 may, for example, display it on the screen in any form (such as a numerical value or a graph) and may also issue an alert or blink a warning light. In response, the operator operates the operator terminal 30 to reply to the control device 10 whether or not the control parameter value proposed by the control device 10 is adopted. If the operator does not adopt the proposed control parameter value, the operator replies with a new control parameter value that differs from it.

If the operator judges that intervention is unnecessary, the operator may operate the operator terminal 30 to reply to the control device 10 with information indicating that intervention is unnecessary. In this case, step S203 is executed and automatic control is performed.

Next, when information indicating adoption is returned from the operator terminal 30, the control unit 101 transmits to the controlled object 40 a manipulated variable based on the control parameter value calculated in step S204. When information indicating rejection and a new control parameter value are returned from the operator terminal 30, the control unit 101 transmits to the controlled object 40 a manipulated variable based on the new control parameter value (step S206). At this time, the control unit 101 creates an intervention history containing the observed value received in step S201 (that is, the observed value indicating the current state of the controlled object 40) and either the control parameter value calculated in step S204 or the new control parameter value, and stores it in the storage unit 106.
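The handling of the operator's reply in steps S205 and S206 can be sketched in Python as below. This is only a sketch: the `Reply` type, its `kind` strings, and the dictionary layout of an intervention history are assumptions made for illustration and are not specified in the embodiment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    """Hypothetical reply from the operator terminal."""
    kind: str                          # "adopt", "reject", or "no_intervention"
    new_value: Optional[float] = None  # operator's own value when rejecting


def resolve_control_parameter(proposed, reply):
    """Pick the control parameter value to use, or None for automatic control."""
    if reply.kind == "adopt":
        return proposed
    if reply.kind == "reject":
        if reply.new_value is None:
            raise ValueError("a rejection must carry a new control parameter value")
        return reply.new_value
    if reply.kind == "no_intervention":
        return None  # fall back to automatic control (step S203)
    raise ValueError(f"unknown reply kind: {reply.kind!r}")


def record_intervention(history, observation, proposed, reply):
    """Resolve the reply and append an intervention history when one occurred."""
    value = resolve_control_parameter(proposed, reply)
    if value is not None:
        history.append({
            "observation": observation,
            "control_parameter": value,
            "adopted": reply.kind == "adopt",
        })
    return value
```

Storing the `adopted` flag alongside each history is a design choice of this sketch; it keeps the information that relearning in step S207 would need in order to treat rejected proposals differently.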

Subsequently, the re-learning unit 105 relearns the intervention model stored in the storage unit 106 according to the reply (adoption or rejection) from the operator terminal 30 in step S205 (step S207). That is, the re-learning unit 105 relearns the intervention model by imitation learning using the observed value received in step S201 and the control parameter value calculated in step S204. If the reply from the operator terminal 30 in step S205 indicates rejection, the re-learning unit 105 relearns the intervention model so that a penalty is imposed. Such a penalty can be realized by adding, to the objective function used for creating and relearning the intervention model, a term (called a penalty term) that penalizes the objective function value when information indicating rejection is returned from the operator terminal 30.
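The embodiment does not specify the form of the objective function or of the penalty term. Purely as one hypothetical form, the sketch below uses a mean squared imitation error and adds, for every rejected proposal, an extra term proportional to that sample's squared error. The function name, the record layout, the use of the operator's final value as the imitation target, and the default weight are all assumptions.

```python
def penalized_objective(predict, records, penalty_weight=2.0):
    """Mean squared imitation error plus a penalty term for rejected proposals.

    predict        -- callable mapping an observation to a control parameter value
    records        -- list of dicts with keys "observation", "final_value", "adopted"
    penalty_weight -- hypothetical weight of the penalty term
    """
    if not records:
        raise ValueError("at least one record is required")
    base = 0.0
    penalty = 0.0
    for record in records:
        error = (predict(record["observation"]) - record["final_value"]) ** 2
        base += error
        # Penalty term: active only when the operator rejected the proposal.
        if not record["adopted"]:
            penalty += penalty_weight * error
    return (base + penalty) / len(records)
```

With this form, a model that reproduces the operator's replacement values exactly incurs no penalty, while errors on rejected samples weigh more heavily than errors on adopted ones, which pushes relearning toward the cases where the previous model was wrong.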

As described above, the control system 1 according to the present embodiment uses an intervention model, obtained by modeling through imitation learning the history of interventions that operators actually performed in the past, to propose a control parameter value to the operator when intervention in the automatic control of the controlled object 40 becomes necessary. The control system 1 can also present to the operator information representing the basis on which the intervention model calculated that control parameter value. This reduces the burden on the operator and allows highly explainable control parameter values to be proposed to the operator.

<Modification examples>
Modifications of the present embodiment are described below.

<< Modification 1 >>
In the present embodiment, the control parameter value calculated by the intervention model is proposed to the operator. Alternatively, a manipulated variable based on the control parameter value may be transmitted to the controlled object 40 without making a proposal to the operator. That is, when it is determined that intervention in the automatic control is necessary, the controlled object 40 may be controlled by a manipulated variable based on the control parameter value calculated by the intervention model.

In this case, the manipulated variable based on the control parameter value may be transmitted to the controlled object 40 only when the value of evidence information 2 or the similarity of evidence information 3 (these values and similarities may be referred to as "confidence" or the like) exceeds a predetermined threshold, that is, only when the confidence is high and the control parameter value calculated by the intervention model is likely to control the controlled object 40 appropriately.
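The confidence gate of Modification 1 can be sketched in Python as follows. The function name and return labels are hypothetical. The embodiment states only when the value is applied directly; falling back to a proposal to the operator when the confidence is not high enough is an assumption of this sketch.

```python
def decide_action(confidence, threshold):
    """Decide how to use a control parameter value from the intervention model.

    Returns "apply" when the confidence strictly exceeds the threshold,
    otherwise "propose" (an assumed fallback to the operator proposal flow).
    """
    if confidence > threshold:
        return "apply"
    return "propose"
```

For instance, with a threshold of 0.9, a similarity of 0.95 from evidence information 3 would let the value be applied without operator confirmation, whereas 0.80 would route it through the proposal of step S205.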

<< Modification 2 >>
In the present embodiment, one intervention model is created and the control parameter value is calculated by that model. Alternatively, a plurality of intervention models may be created and switched according to predetermined conditions. For example, an intervention model for nighttime and an intervention model for daytime may be created, and the intervention model may be switched according to the operating time period of the controlled object 40. Similarly, an intervention model may be created for each product type, and the intervention model may be switched according to the product manufactured by the controlled object 40. As a further example, a plurality of intervention models may be created, one for each range that the state of the controlled object 40 can take (for example, each temperature range), and the intervention model may be switched according to the state of the controlled object 40.
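Model switching as in Modification 2 can be sketched in Python as a list of condition and model pairs in which the first matching condition wins. Everything in the sketch is illustrative: the function names, the `context` dictionary, the daytime hours of 6 to 18, and the two stand-in models are assumptions, not part of the embodiment.

```python
def select_model(models, context):
    """Return the first model whose condition accepts the given context."""
    for condition, model in models:
        if condition(context):
            return model
    raise LookupError("no intervention model matches the given context")


def is_daytime(context, start_hour=6, end_hour=18):
    """Assumed definition of the daytime operating period."""
    return start_hour <= context["hour"] < end_hour


def is_nighttime(context):
    return not is_daytime(context)


# Stand-in intervention models: observation -> control parameter value.
day_model = lambda observation: 0.5 * observation
night_model = lambda observation: 0.8 * observation

MODELS = [(is_daytime, day_model), (is_nighttime, night_model)]
```

The same structure extends to the other examples: a condition could test the product type or check whether a temperature in `context` falls inside a given range.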

The present invention is not limited to the specifically disclosed embodiment above. Various modifications, changes, and combinations with known techniques are possible without departing from the scope of the claims.

1 Control system
10 Control device
11 Input device
12 Display device
13 External I/F
13a Recording medium
14 Communication I/F
15 Processor
16 Memory device
17 Bus
20 Server
30 Operator terminal
40 Controlled object
101 Control unit
102 Intervention determination unit
103 Calculation unit
104 Proposal unit
105 Re-learning unit
106 Storage unit
201 Model creation unit
202 Storage unit

Claims

A control system comprising:
a creation unit that creates, by imitation learning based on a history of control parameter values from when an operator intervened in a controlled object, a model representing the relationship between the state of the controlled object and the control parameter value; and
a calculation unit that calculates a control parameter value using the model according to the state of the controlled object.

The control system according to claim 1, further comprising a proposal unit that proposes a control parameter value calculated by the calculation unit to the operator.

The control system according to claim 2, wherein the calculation unit creates evidence information representing the basis of the control parameter value calculated by the model, and
the proposal unit proposes the evidence information to the operator in addition to the control parameter value.

The control system according to claim 2 or 3, further comprising a re-learning unit that relearns the model by imitation learning depending on whether or not an intervention was performed on the controlled object with the control parameter value proposed by the proposal unit.

The control system according to claim 1, further comprising a control unit that controls the controlled object by the control parameter value calculated by the calculation unit.

The control system according to claim 5, wherein the calculation unit further calculates a predetermined index value related to the control parameter value calculated by the model, and
the control unit controls the controlled object by the control parameter value calculated by the calculation unit when the index value exceeds a predetermined threshold.

The control system according to any one of claims 1 to 6, wherein the creation unit creates a plurality of the models according to the time period, the type of product manufactured by the controlled object, or the range of values that the state of the controlled object can take, and
the calculation unit calculates a control parameter value using one of the plurality of models according to the state of the controlled object and the time period or the type of product manufactured by the controlled object.

The control system according to any one of claims 1 to 7, wherein the creation unit calculates a predetermined statistical value from the history for each predetermined period, selects, from the calculated statistical values, a control parameter value to be used for creating the model, and creates the model using the selected control parameter value and the state of the controlled object at the time of the intervention with that control parameter value.

A control method in which a computer executes:
a creation procedure for creating, by imitation learning based on a history of control parameter values from when an operator intervened in a controlled object, a model representing the relationship between the state of the controlled object and the control parameter value; and
a calculation procedure for calculating a control parameter value using the model according to the state of the controlled object.

A control device comprising:
a creation unit that creates, by imitation learning based on a history of control parameter values from when an operator intervened in a controlled object, a model representing the relationship between the state of the controlled object and the control parameter value; and
a calculation unit that calculates a control parameter value using the model according to the state of the controlled object.

A program that causes a computer to execute:
a creation procedure for creating, by imitation learning based on a history of control parameter values from when an operator intervened in a controlled object, a model representing the relationship between the state of the controlled object and the control parameter value; and
a calculation procedure for calculating a control parameter value using the model according to the state of the controlled object.