JP7462905B2

JP7462905B2 - CONTROL DEVICE, METHOD, PROGRAM AND SYSTEM

Info

Publication number: JP7462905B2
Application number: JP2020203983A
Authority: JP
Inventors: 純一出澤; 志門菅原
Original assignee: AISing Ltd
Current assignee: AISing Ltd
Priority date: 2019-10-21
Filing date: 2020-12-09
Publication date: 2024-04-08
Anticipated expiration: 2039-10-21
Also published as: JP2021068458A

Description

The present invention relates to a control device that performs feedback control.

Feedback control, for example PID control, is widely used to control various devices. Although feedback control belongs to classical control, it remains the mainstay control method in industry today, owing to the trust earned by its track record and the ease with which engineers can tune it by rule of thumb.

FIG. 11 shows the basic configuration of conventional feedback control, that is, a conventional feedback system 200. As is clear from the figure, the output y obtained from a detector (for example, a sensor) of the control mechanism 202 is fed back to the input side on the left of the figure, and its deviation from the target value r is calculated. The calculated deviation is input to the controller 201, which calculates the manipulated variable u. An actuating element (for example, an actuator) of the control mechanism 202 operates according to this manipulated variable u and controls a control target (not shown). A disturbance w may be superimposed at this point. Repeating this series of processes performs control that reduces the deviation, that is, control that brings the output y closer to the target value r.

Meanwhile, the field of machine learning has attracted considerable attention in recent years. Against this background, the inventors of the present application have proposed a new machine learning framework with a tree structure (a learning tree) (Patent Document 1).

FIG. 12 is an explanatory diagram of this new machine learning framework, that is, of the structure of a learning tree. FIG. 12(a) shows the structure of the learning tree used in this learning method, and FIG. 12(b) shows an image of the state space corresponding to that structure. As is clear from the figure, the learning tree is constructed by arranging nodes, each corresponding to one of the hierarchically divided state spaces, so that they branch in a tree or lattice shape from the top node (the start node or root node) to the bottom nodes (the terminal nodes or leaf nodes). The figure shows an example of a learning tree with N layers, d dimensions, and n divisions in which d is 2 and n is 2; the numbers 1 to 4 assigned to the four terminal nodes in the first layer of the learning tree in FIG. 12(a) correspond to the four state spaces shown in FIG. 12(b).

When learning is performed with this learning tree, each input data item is associated in turn with the divided state spaces to which it belongs and is accumulated there. When data is input into a state space that previously contained no data, new nodes are generated one after another. After learning, the predicted output is calculated as the arithmetic mean of the output values or output vectors corresponding to the data contained in each state space.

This machine learning technique enables fast machine learning with low memory consumption.

Patent Document 1: JP 2016-173686 A

In conventional feedback control, including PID control, the gains were generally adjusted and set before control started and then used as fixed values once control was running. Consequently, when the characteristics of the control target or the actuating element changed, for example through aging, the control could not adapt, and control accuracy could deteriorate.

The present invention was made against this technical background. Its purpose is to perform adaptive control based on data obtained during control while still using highly reliable feedback control that has been in use for many years.

Other objects and effects of the present invention will be readily understood by those skilled in the art from the following description.

The above technical problem can be solved by a control device, method, program, system, or the like having the following configuration.

That is, the control device according to the present invention is a control device for performing feedback control on a predetermined device, and comprises: a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value; a predicted output generation unit having a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable; a second controller that generates a second manipulated variable for the device based on the predicted output and the target value; an integrated manipulated variable generation unit that generates, based on the first manipulated variable and the second manipulated variable, an integrated manipulated variable, which is the manipulated variable applied to the device; and a storage unit that, when the second manipulated variable is invalidated, stores the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data.

With this configuration, adaptive control based on data obtained during control can be performed by means of machine learning, while still relying on feedback control, a highly reliable control technique that has been in use for many years.

The control device may further comprise a learning processing unit that performs learning processing based on the machine learning data and updates the trained model.

With this configuration, learning processing can be performed while the device is being controlled, so that the control can be optimized.

The control device may further comprise a determination unit that determines whether the second manipulated variable satisfies an invalidation condition, and an invalidation processing unit that invalidates the second manipulated variable when the determination unit determines that the invalidation condition is satisfied.

With this configuration, when the second manipulated variable satisfies a predetermined condition, it is invalidated and control is performed based only on the first manipulated variable, so more reliable control can be achieved. In addition, the data from that period is used as machine learning data, so control accuracy can be expected to improve in the future.

The invalidation condition may be that the second manipulated variable is greater than a first threshold, or smaller than a second threshold that is itself smaller than the first threshold.

With this configuration, the second manipulated variable is invalidated when it falls outside the expected range, so more reliable control can be achieved. In addition, the data from that period is used as machine learning data, so control accuracy can be expected to improve in the future.
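The threshold-based invalidation condition can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the `upper`/`lower` parameters are hypothetical, and invalidation is modeled as replacing the second manipulated variable with 0, which the description mentions only as one example.

```python
def is_invalidated(u2: float, upper: float, lower: float) -> bool:
    """Return True when u2 lies outside the band [lower, upper].

    `upper` plays the role of the first threshold and `lower` the second
    threshold (which must be smaller than the first). Names are hypothetical.
    """
    if lower >= upper:
        raise ValueError("the second threshold must be smaller than the first")
    return u2 > upper or u2 < lower


def apply_invalidation(u2: float, upper: float, lower: float):
    """Return (effective u2, invalidated flag).

    Assumption: invalidation is modeled as forcing u2 to 0.
    """
    invalidated = is_invalidated(u2, upper, lower)
    return (0.0 if invalidated else u2), invalidated
```

Values exactly equal to a threshold are treated as valid here, because the condition is stated with strict inequalities.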

The storage unit may further store, when the second manipulated variable is 0 or a value close to 0, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data.

With this configuration, learning can also proceed using the periods in which the second manipulated variable is 0 or close to 0, so a further improvement in control accuracy can be expected.

In addition to the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable for the reference time step at which the second manipulated variable is invalidated, the storage unit may further store, as machine learning data, the same three quantities for one or more time steps preceding the reference time step.

With this configuration, data from one or more time steps before the reference time step is also learned, which makes generalization easier and can be expected to improve learning speed.
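One way to keep the preceding time steps available for storage is a small fixed-length buffer. The sketch below is an assumption-laden illustration: the class name `SampleRecorder`, its fields, and the choice to clear the buffer after each flush (to avoid storing the same sample twice) are all hypothetical and not taken from the description.

```python
from collections import deque


class SampleRecorder:
    """Keeps the last n_prior samples so they can be stored together
    with the sample of the reference time step (hypothetical helper)."""

    def __init__(self, n_prior: int):
        # Buffer holds the reference step plus n_prior preceding steps.
        self._recent = deque(maxlen=n_prior + 1)
        self.training_data = []

    def push(self, u1, y_prev, y, invalidated: bool):
        """Register one time step as (u1(t), y(t-1), y(t))."""
        self._recent.append((u1, y_prev, y))
        if invalidated:
            # Store the reference step and the buffered preceding steps.
            self.training_data.extend(self._recent)
            # Assumption: clear so consecutive invalidations do not duplicate.
            self._recent.clear()
```

For example, with `n_prior=2`, an invalidation at step 3 stores the samples of steps 1, 2, and 3.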

The first controller and/or the second controller may each perform P control, PI control, PD control, or PID control.

With this configuration, control accuracy can be further improved by machine learning based on data obtained while the device is operating, while still relying on highly reliable control techniques that have been in use for many years.

The trained model may be obtained by performing machine learning with a learning model that has a tree structure constructed by hierarchically arranging a plurality of nodes, each associated with one of the hierarchically divided state spaces.

With this configuration, learning can be performed faster and with less memory than training an artificial neural network or the like, which is particularly advantageous when learning is performed concurrently with operation of the device (online learning).

The present invention can also be conceived as a method. That is, the control method according to the present invention is a control method in a control device for performing feedback control on a predetermined device, the control device comprising: a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value; a predicted output generation unit having a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable; and a second controller that generates a second manipulated variable for the device based on the predicted output and the target value; the method comprising: an integrated manipulated variable generation step of generating, based on the first manipulated variable and the second manipulated variable, an integrated manipulated variable, which is the manipulated variable applied to the device; and a storage step of storing, when the second manipulated variable is invalidated, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data.

The present invention can also be conceived as a program. That is, the control program according to the present invention is a control program for a control device for performing feedback control on a predetermined device, the control device comprising: a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value; a predicted output generation unit having a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable; and a second controller that generates a second manipulated variable for the device based on the predicted output and the target value; the program comprising: an integrated manipulated variable generation step of generating, based on the first manipulated variable and the second manipulated variable, an integrated manipulated variable, which is the manipulated variable applied to the device; and a storage step of storing, when the second manipulated variable is invalidated, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data.

The present invention can also be conceived as a system. That is, the control system according to the present invention is a control system for performing feedback control on a predetermined device, and comprises: a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value; a predicted output generation unit having a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable; a second controller that generates a second manipulated variable for the device based on the predicted output and the target value; an integrated manipulated variable generation unit that generates, based on the first manipulated variable and the second manipulated variable, an integrated manipulated variable, which is the manipulated variable applied to the device; and a storage unit that, when the second manipulated variable is invalidated, stores the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data.

According to the present invention, adaptive control based on data obtained during control can be performed while still using highly reliable feedback control.

FIG. 1 is a hardware configuration diagram of the control system.
FIG. 2 is a general flowchart of the operation of the system.
FIG. 3 is a block diagram of the basic system.
FIG. 4 is a detailed flowchart of the operation of the basic system.
FIG. 5 is a detailed flowchart of the initial learning.
FIG. 6 is a detailed flowchart of the operation of the extended system.
FIG. 7 is a block diagram of the extended system.
FIG. 8 is a detailed flowchart (part 1) of the control processing in the extended system.
FIG. 9 is a detailed flowchart (part 2) of the control processing in the extended system.
FIG. 10 is an explanatory diagram of the condition on the second manipulated variable.
FIG. 11 is a block diagram of the basic configuration of a feedback system.
FIG. 12 is an explanatory diagram of a learning tree.

An embodiment of the present invention is described in detail below with reference to the accompanying drawings.

<1. First Embodiment>
<1.1 Configuration>
FIG. 1 is a hardware configuration diagram of a control system consisting of a control device 100 and a control mechanism 12.

As is clear from the figure, the control device 100 comprises a control unit 1, a storage unit 2, an I/O unit 3, an input unit 4, a display unit 5, and a communication unit 6, which are connected to one another via a bus. The control device 100 is also connected to an operation unit 121 and a detection unit 122 that constitute the control mechanism 12, and is configured to be able to control a control target (not shown).

The control unit 1 is an information processing unit such as a CPU, and reads and executes various programs stored in the storage unit 2. The storage unit 2 is a volatile or non-volatile storage device such as a ROM, RAM, hard disk, or flash memory, and stores the various data described below, including the data to be used for machine learning. The I/O unit 3 is an interface for input and output with external devices. The input unit 4 processes signals input via a keyboard, touch panel, buttons, or the like. The display unit 5 is connected to a display or the like, performs display control, and provides a GUI to the user through the display. The communication unit 6 is a communication unit that communicates with external devices by wire or wirelessly.

The operation unit 121 acts on the control target based on a given manipulated variable, and is composed of, for example, an actuator. The detection unit 122 detects the state or the like of the control target, and is composed of, for example, a sensor.

The hardware configuration is not limited to that of this embodiment, and components and functions may be distributed or integrated. For example, processing may be distributed across multiple control devices 100, or a large-capacity storage device may be provided externally and connected to the control device 100. Processing may also be performed over a computer network formed via the Internet or the like.

Furthermore, the processing according to this embodiment may be implemented in hardware, using a semiconductor circuit (an IC or the like) such as an FPGA.

<1.2 Operation>
Next, the operation of the control device 100 is described with reference to FIGS. 2 to 10.

FIG. 2 is a general flowchart of the operation of the control device 100.

As is clear from the figure, when processing starts, the gains to be set in the first PID controller 11 of the basic system 10 described below, that is, the P (proportional) gain, the I (integral) gain, and the D (derivative) gain, are set (S1).

FIG. 3 is a block diagram of the basic system 10. As is clear from the figure, the basic system 10 consists of a first PID controller 11; a control mechanism 12 that is provided downstream of the first PID controller 11 and includes the operation unit 121 and the detection unit 122; and a data logger 13 that records the manipulated variable u_0 output from the first PID controller 11 and the output value y output from the detection unit 122 of the control mechanism 12. Its operation is substantially the same as that of the feedback system 200 shown in FIG. 11, except that the data logger 13 records the manipulated variable u_0 and the output value y.

The user adjusts each gain of the first PID controller 11 by a known method, for example by operating the basic system 10 or running a simulation, and then enters and sets the final gains via the input unit 4 or the like. The entered gains are stored in the storage unit 2.

Returning to FIG. 2, once the gain setting process (S1) is complete, the basic system 10 is actually operated with those gains; that is, data for machine learning is acquired and stored (S3).

FIG. 4 is a detailed flowchart of the operation of the basic system 10. As is clear from the figure, when processing starts, a predetermined integer value t corresponding to the time step is initialized (for example, set to 1) (S31). After initialization, the target value r(t) and the output value y(t-1) of the previous time step (t-1) are read out, their deviation (r(t)-y(t-1)) is calculated, and the deviation is input to the first controller 11 (S32).

When the deviation is input, the first controller 11 calculates the manipulated variable u(t) based on the set gains (S33). This manipulated variable u(t) is supplied to the operation unit 121 of the control mechanism 12, which applies the corresponding control action to the control target. The output value y(t) at the current time step (t) is then detected via the detection unit 122 of the control mechanism 12 (S34).

When this series of processes is complete, the output value y(t-1) of the previous time step, the manipulated variable u(t), and the output value y(t) at the current time step (t) are stored in the storage unit 2 via the data logger 13 (S36). The value of t is then incremented by 1 (S38), and the series of processes (S32 to S38) is performed again.

That is, while the control target is being controlled, the output value y(t-1) of the previous time step, the manipulated variable u(t), and the output value y(t) at the current time step are continuously stored in the storage unit 2 via the data logger 13. In this way, the desired amount of machine learning data is accumulated for generating the trained model used in the prediction processing unit 35, which is described later.
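The data-collection loop of the basic system (S31 to S38) can be sketched as follows. This is a simplified illustration under stated assumptions: the discrete PID form, the first-order `toy_plant` standing in for the control mechanism 12, and all names and gain values are hypothetical and not taken from the description.

```python
class PID:
    """Textbook discrete PID controller (assumed form, not from the source)."""

    def __init__(self, kp, ki, kd, dt=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0  # no derivative information on the first step
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def toy_plant(y_prev, u):
    """Hypothetical first-order plant standing in for the control mechanism."""
    return 0.9 * y_prev + 0.1 * u


def run_basic_system(controller, plant, target, steps, y0=0.0):
    """Run the loop and return logged tuples (y(t-1), u(t), y(t))."""
    log = []
    y_prev = y0
    for t in range(1, steps + 1):       # S31: t starts at 1; S38: t += 1
        error = target(t) - y_prev      # S32: deviation r(t) - y(t-1)
        u = controller.step(error)      # S33: manipulated variable u(t)
        y = plant(y_prev, u)            # S34: detect output y(t)
        log.append((y_prev, u, y))      # S36: record via the data logger
        y_prev = y
    return log


log = run_basic_system(PID(2.0, 0.5, 0.0), toy_plant, lambda t: 1.0, steps=200)
```

Each logged tuple pairs the model inputs (y(t-1), u(t)) with the value to be predicted, y(t), which is the form the initial learning consumes.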

Returning to FIG. 2, once data acquisition and storage through operation of the basic system 10 is complete (S3), initial learning is performed on the acquired data (S5).

FIG. 5 is a detailed flowchart of the initial learning. In this embodiment, the machine learning technique used is the tree-structure-based technique described above with reference to FIG. 12.

As is clear from the figure, when processing starts, a learning parameter file, which contains the structure of the learning tree (number of layers, number of dimensions, number of divisions, and so on) and various initial parameters, is read from the storage unit 2. A predetermined integer value t is then initialized (for example, set to 1) (S52).

After this initialization, the t-th input data, that is, the output value y(t-1) of the previous time step and the manipulated variable u(t), is read out and used as the input to the learning tree (S53). The input is then classified according to predetermined branching conditions, the nodes on the path from the root node to a leaf node are identified, and the input is stored in association with each of those nodes (S54).

Then, at each of those nodes, the arithmetic mean of the output values y accumulated so far is updated to include the new output value y(t), and the updated mean is stored in association with the node (S56).

It is then determined whether the value of t equals a predetermined maximum value (t_max). If t has not yet reached the maximum (S57 NO), t is incremented by 1 and the learning process described above (S53 to S56) is repeated. If t equals the maximum (S57 YES), the process ends.

In this way, a trained model is generated that predicts the output value y(t) from the output value y(t-1) of the previous time step and the manipulated variable u(t) at the current time step.
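The initial learning (S53 to S56) can be illustrated with a small learning tree that divides a bounded input space evenly at each layer and keeps a running arithmetic mean per node. This is a minimal sketch, not the implementation of Patent Document 1: the class, the even-grid branching rule, the input bounds, and the choice to predict from the deepest node that already holds data are all assumptions made for illustration.

```python
class LearningTree:
    """Toy learning tree: n_layers layers, len(bounds) dimensions,
    n_div divisions per dimension at each layer (hypothetical sketch)."""

    def __init__(self, bounds, n_layers, n_div=2):
        self.bounds = bounds          # list of (low, high) per input dimension
        self.n_layers = n_layers
        self.n_div = n_div
        self.nodes = {}               # (depth, cell index tuple) -> [count, mean]

    def _path(self, x):
        """Return node keys from the root (depth 0) down to the leaf."""
        unit = []
        for v, (lo, hi) in zip(x, self.bounds):
            s = (v - lo) / (hi - lo)
            unit.append(min(max(s, 0.0), 1.0 - 1e-12))  # clamp into [0, 1)
        path = []
        for depth in range(self.n_layers + 1):
            cells = self.n_div ** depth
            path.append((depth, tuple(int(s * cells) for s in unit)))
        return path

    def learn(self, x, y):
        """S54/S56: create nodes on demand and update each running mean."""
        for key in self._path(x):
            node = self.nodes.setdefault(key, [0, 0.0])
            node[0] += 1
            node[1] += (y - node[1]) / node[0]

    def predict(self, x):
        """Assumption: use the deepest node on the path that holds data."""
        for key in reversed(self._path(x)):
            if key in self.nodes:
                return self.nodes[key][1]
        raise ValueError("the tree has not been trained")
```

For the model described above, `x` would be the pair (y(t-1), u(t)) and `y` the output value y(t), fed in for t = 1 to t_max.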

Returning to FIG. 2, once the initial learning is complete, the extended system 30, which extends the basic system 10 and is described below, is operated (S7).

FIG. 6 is a detailed flowchart of the operation of the extended system 30. As is clear from the figure, when processing starts, control processing based on the extended system 30 is performed (S71).

FIG. 7 is a block diagram of the extended system 30. As is clear from the figure, in addition to the configuration of the basic system 10 with its first feedback loop, the extended system 30 includes a second feedback loop and a learning processing unit 34. The second feedback loop consists of a prediction processing unit 35 having the trained model, a second controller 37 provided downstream of the prediction processing unit 35, and an invalidation processing unit 38 and a determination unit 39 provided further downstream of the second controller 37.

The prediction processing unit 35 has a trained model that generates a predicted output value y_hat(t) based on the output value y(t-1) of the previous time step and the first manipulated variable u_1(t) at the current time step. The second controller 37 generates a second manipulated variable u_2(t) based on the deviation (r(t)-y_hat(t)) between the target value r(t) and the predicted output value y_hat(t). The determination unit 39 evaluates a predetermined condition on the second manipulated variable u_2(t) and provides the result to the invalidation processing unit 38. Depending on that result, the invalidation processing unit 38 either invalidates the second manipulated variable u_2(t) (for example, by setting u_2(t) to 0) or passes it on unchanged.

The learning processing unit 34 reads the data stored in the storage unit 2 through the data logger 33, performs learning processing under predetermined conditions, and provides the updated trained model to the prediction processing unit 35.

Figures 8 and 9 are detailed flowcharts of the control processing in the extended system 30.

In Figure 8, when processing starts, the flags used in the processing described later are initialized (S711). Next, the deviation (r(t) - y(t-1)) between the target value r(t) and the output value y(t-1) of the previous time step is input to the first controller 31 (S712). The first controller 31 calculates the first manipulated variable u_1(t) based on this input and the set gains (S713).

Thereafter, the first manipulated variable u_1(t) and the output value y(t-1) of the previous time step are input to the prediction processing unit 35 (S714). The prediction processing unit 35 calculates the predicted output y_hat(t) by feeding u_1(t) and y(t-1) into the trained model (S715). After this calculation, the deviation (r(t) - y_hat(t)) between the target value r(t) and the predicted output y_hat(t) is input to the second controller 37 (S716). The second controller 37 calculates the second manipulated variable u_2(t) based on this deviation (S717).
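Steps S712 to S717 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: both controllers are reduced to proportional-only laws for brevity, `toy_model` is a hypothetical linear stand-in for the trained model, and all function names and numeric values are invented for the example.

```python
from typing import Callable, Tuple

Controller = Callable[[float], float]      # maps a deviation to a manipulated variable
Model = Callable[[float, float], float]    # maps (y(t-1), u_1(t)) to y_hat(t)


def p_controller(gain: float) -> Controller:
    """Proportional-only control law (an assumption made for brevity)."""
    return lambda error: gain * error


def toy_model(y_prev: float, u1: float) -> float:
    """Hypothetical stand-in for the trained model of the prediction processing unit 35."""
    return 0.9 * y_prev + 0.1 * u1


def compute_u1_yhat_u2(r: float, y_prev: float, controller1: Controller,
                       controller2: Controller, model: Model) -> Tuple[float, float, float]:
    u1 = controller1(r - y_prev)   # S712-S713: first manipulated variable u_1(t)
    y_hat = model(y_prev, u1)      # S714-S715: predicted output y_hat(t)
    u2 = controller2(r - y_hat)    # S716-S717: second manipulated variable u_2(t)
    return u1, y_hat, u2


u1, y_hat, u2 = compute_u1_yhat_u2(
    r=1.0, y_prev=0.0,
    controller1=p_controller(2.0), controller2=p_controller(0.5),
    model=toy_model,
)
print(u1, y_hat, u2)
```

Note that the second controller acts on the predicted output rather than the measured one, so its contribution depends on the quality of the trained model.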

Continuing to Figure 9, once the second manipulated variable u_2(t) has been calculated, the determination unit 39 determines whether u_2(t) satisfies a predetermined condition (S719).

Figure 10 is an explanatory diagram outlining the predetermined condition for the second manipulated variable u_2(t). As is clear from the figure, the condition is whether u_2(t) lies in the range from a predetermined threshold U_L to a predetermined threshold U_H, inclusive (the range indicated by R in the figure), that is, whether U_L <= u_2(t) <= U_H.

If u_2(t) is not within this range (R) (NO in S719), that is, if the second manipulated variable u_2(t) is smaller than the predetermined threshold U_L or larger than the predetermined threshold U_H, the determination unit 39 provides the invalidation processing unit 38 with a determination signal indicating that u_2(t) is not within the predetermined range, and the invalidation processing unit 38 invalidates u_2(t) (S720). After this invalidation, the flag indicating that invalidation has been performed is set to ON (S721).

On the other hand, if the second manipulated variable u_2(t) is within the range (R) (YES in S719), the determination unit 39 provides the invalidation processing unit 38 with a determination signal indicating that u_2(t) is within the predetermined range, and the invalidation processing unit 38 passes u_2(t) unchanged to the stage downstream of the output of the first controller 31 in the first feedback loop (S722).

Thereafter, at the stage downstream of the output of the first controller 31 in the first feedback loop, the first manipulated variable u_1(t) and the second manipulated variable u_2(t) are added to calculate the manipulated variable u(t) (S723). This manipulated variable u(t) is input to the operation unit 121 of the control mechanism 32, and the resulting output value y(t) is detected through the detection unit 122 (S724).
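The range determination, invalidation, and addition of S719 to S723 can be sketched as follows. The function names are hypothetical, and invalidation is implemented as setting u_2(t) to 0, which the text gives only as one example; the returned flag corresponds to the one set to ON in S721.

```python
from typing import Tuple


def determine_in_range(u2: float, u_low: float, u_high: float) -> bool:
    """Determination unit 39 (S719): True when U_L <= u_2(t) <= U_H."""
    return u_low <= u2 <= u_high


def invalidate_if_needed(u2: float, in_range: bool) -> Tuple[float, bool]:
    """Invalidation processing unit 38.

    Returns (u_2 actually passed on, invalidation flag).
    """
    if in_range:
        return u2, False   # S722: pass u_2(t) on unchanged, flag stays OFF
    return 0.0, True       # S720-S721: invalidate u_2(t) and set the flag to ON


def integrated_u(u1: float, u2: float, u_low: float, u_high: float) -> Tuple[float, bool]:
    """S723: add u_1(t) and the (possibly invalidated) u_2(t)."""
    u2_effective, flag = invalidate_if_needed(u2, determine_in_range(u2, u_low, u_high))
    return u1 + u2_effective, flag


print(integrated_u(2.0, 0.4, u_low=-1.0, u_high=1.0))   # u_2 inside R
print(integrated_u(2.0, 5.0, u_low=-1.0, u_high=1.0))   # u_2 outside R
```

When u_2(t) is outside R, the integrated value equals u_1(t), so the system falls back to control by the first feedback loop alone.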

After this detection, the output value y(t-1) of the previous time step, the manipulated variable u(t), the output value y(t), and the flag signal are stored (S725), which completes the processing corresponding to one cycle of the control processing in the extended system 30.
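The storage step S725 can be illustrated with a small record type. This is a sketch only: the names `StepRecord` and `DataLog` and the in-memory list are assumptions, since the text does not specify how the data logger or the storage unit organizes its data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StepRecord:
    """One control cycle as stored in S725."""
    y_prev: float        # output value y(t-1) of the previous time step
    u: float             # manipulated variable u(t)
    y: float             # detected output value y(t)
    invalidated: bool    # flag signal (ON when u_2(t) was invalidated)


class DataLog:
    """Hypothetical in-memory stand-in for the data logger and storage unit."""

    def __init__(self) -> None:
        self.records: List[StepRecord] = []

    def store(self, y_prev: float, u: float, y: float, invalidated: bool) -> None:
        self.records.append(StepRecord(y_prev, u, y, invalidated))

    def flagged(self) -> List[StepRecord]:
        """Records whose flag is ON, i.e. candidates for learning."""
        return [rec for rec in self.records if rec.invalidated]


log = DataLog()
log.store(y_prev=0.0, u=2.4, y=0.3, invalidated=False)
log.store(y_prev=0.3, u=1.4, y=0.5, invalidated=True)
print(log.flagged())
```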

Returning to Figure 6, when the processing corresponding to one cycle of the control processing in the extended system 30 is completed, the state of the stored flag is checked (S73). If the flag is determined to be OFF (NO in S73), the processing of the extended system 30 for the next time step is performed (S71). If, on the other hand, the flag is determined to be ON (YES in S73), that is, if the second manipulated variable u_2(t) was invalidated, learning processing is performed (S75).

The learning processing (S75) is substantially the same as that shown in Figure 5, so its description is omitted here. After this learning processing, the processing of the extended system 30 for the next time step is performed (S71).
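The outer loop of Figure 6 (S71, S73, S75) can be sketched as follows. To keep the example self-contained, each control cycle is represented by a precomputed dictionary with a `flag` entry, and `learn` is an arbitrary callback; both are assumptions for illustration, and the loop is made finite here even though the flowchart simply repeats.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def run_extended_system(cycles: Iterable[Record],
                        learn: Callable[[Record], None]) -> int:
    """Process control cycles in order and return how many triggered learning."""
    n_learned = 0
    for record in cycles:      # each item stands for the result of one control cycle (S71)
        if record["flag"]:     # S73: was u_2(t) invalidated in this cycle?
            learn(record)      # S75: learning processing on the stored data
            n_learned += 1
        # in either case, continue with the next time step (back to S71)
    return n_learned


learned: List[Record] = []
cycles = [
    {"t": 1, "flag": False},
    {"t": 2, "flag": True},
    {"t": 3, "flag": False},
]
count = run_extended_system(cycles, learned.append)
print(count, learned)
```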

With this configuration, when the second manipulated variable u_2(t) meets the predetermined invalidation condition, that is, falls outside the range R, it is invalidated and control is performed based only on the first manipulated variable u_1(t), so highly reliable control can be achieved. In addition, the data from that period is supplied as machine learning data, so control accuracy can be expected to improve over time.

<2. Modified Examples>
The above-described embodiment is exemplary, and the present invention can be modified in various ways.

In the above embodiment, a PID controller was given as an example of the controller, but the present invention is not limited to such a configuration. Other controllers having similar functions may be used, and control that uses only some of the gains, such as P control, PI control, or PD control, may also be used.
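As an illustration of control that uses only some of the gains, the sketch below implements a textbook discrete PID law in which P, PI, and PD control are obtained by leaving the unused gains at zero. The class name, the sampling period `dt`, and the gain values are assumptions; the embodiment does not specify the discrete form of its controllers.

```python
from dataclasses import dataclass, field


@dataclass
class PID:
    """Discrete PID controller; unused gains default to zero."""
    kp: float = 0.0
    ki: float = 0.0
    kd: float = 0.0
    dt: float = 1.0   # sampling period (assumed)
    _integral: float = field(default=0.0, init=False, repr=False)
    _prev_error: float = field(default=0.0, init=False, repr=False)

    def step(self, error: float) -> float:
        """Return the manipulated variable for the current deviation."""
        self._integral += error * self.dt
        derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


p_control = PID(kp=2.0)                    # P control
pi_control = PID(kp=2.0, ki=0.5)           # PI control
pd_control = PID(kp=2.0, kd=0.1)           # PD control
pid_control = PID(kp=2.0, ki=0.5, kd=0.1)  # full PID control
```

Because the previous error starts at zero, the derivative term responds to the full first deviation; a practical controller might initialize it differently.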

In the above embodiment, the state of the flag is checked at every time step and learning processing is performed in real time each time (online learning), but the present invention is not limited to such a configuration. For example, learning may be performed in batches (batch learning or mini-batch learning) after waiting for a certain amount of data to be learned to accumulate.
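The batch variant can be sketched as a buffer that defers learning until enough flagged records have accumulated. The class name `BatchLearner`, the `batch_size` parameter, and the callback signature are hypothetical; the text does not state how much data should accumulate.

```python
from typing import Callable, List


class BatchLearner:
    """Accumulate records and call learn_fn once per full batch."""

    def __init__(self, learn_fn: Callable[[List[object]], None], batch_size: int) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.learn_fn = learn_fn
        self.batch_size = batch_size
        self._buffer: List[object] = []

    def add(self, record: object) -> bool:
        """Buffer a record; return True if this call triggered learning."""
        self._buffer.append(record)
        if len(self._buffer) < self.batch_size:
            return False
        self.learn_fn(list(self._buffer))   # hand over a copy of the batch
        self._buffer.clear()
        return True


batches: List[List[object]] = []
learner = BatchLearner(batches.append, batch_size=3)
results = [learner.add(i) for i in range(1, 5)]
print(results, batches)
```

With `batch_size=1` this reduces to the online learning of the embodiment.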

In the above embodiment, when the flag is set to ON (S721), the data for the immediately preceding single step is learned, but the present invention is not limited to such a configuration. For example, learning (S75) may also use the data from one or more steps leading up to that step. Such learning can be particularly effective when the learning target has continuity.
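One way to keep the steps leading up to a flagged step available is a fixed-length history window. The sketch below uses `collections.deque` with `maxlen`; the class name `HistoryWindow` and the parameter `n_prior` (the number of preceding steps to include) are assumptions for illustration.

```python
from collections import deque
from typing import Deque, List


class HistoryWindow:
    """Keep the most recent steps so they can be learned together."""

    def __init__(self, n_prior: int) -> None:
        # room for the flagged step itself plus n_prior preceding steps
        self._recent: Deque[object] = deque(maxlen=n_prior + 1)

    def push(self, record: object, flag: bool) -> List[object]:
        """Add a step; when its flag is ON, return it with the preceding steps."""
        self._recent.append(record)
        return list(self._recent) if flag else []


window = HistoryWindow(n_prior=2)
outputs = [window.push(name, flag) for name, flag in
           [("a", False), ("b", False), ("c", False), ("d", True)]]
print(outputs[-1])
```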

In the above embodiment, the invalidation process (S720) is performed when the second manipulated variable u_2(t) falls outside the predetermined range (the region indicated by "R" in Figure 10) (NO in S719), and accordingly the flag is set to ON and learning (S75) is performed when u_2(t) falls outside the region (R). However, the present invention is not limited to such a configuration. For example, regardless of whether u_2(t) is within the predetermined range (R), learning (S75) may be performed when u_2(t) is 0 or in its vicinity (within 0 ± ε, where ε is a small value). In this case, the small value ε may be made arbitrarily settable by the user.
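The near-zero trigger can be written as a one-line test. The function name and the default value of `eps` are assumptions; the text only says that ε is a small value that the user may set.

```python
def should_learn(u2: float, eps: float = 1e-6) -> bool:
    """True when u_2(t) is 0 or within the range 0 ± eps."""
    return abs(u2) <= eps


print(should_learn(0.0), should_learn(0.4), should_learn(0.4, eps=0.5))
```

If invalidation is implemented by setting u_2(t) to 0, an invalidated step also satisfies this test, so this trigger covers the case handled by the flag in the embodiment.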

In the above embodiment, a machine learning model based on a tree-structured model was used, but the present invention is not limited to such a configuration. For example, other machine learning models such as neural networks or support vector machines may be used.

The present invention can be used in various industries that use control devices.

REFERENCE SIGNS LIST
1 Control unit
2 Storage unit
3 I/O unit
4 Input unit
5 Display unit
6 Communication unit
10 Basic system
11 First PID controller
12 Control mechanism
100 Control device
121 Operation unit
122 Detection unit
13 Data logger
30 Extended system
31 First controller
32 Control mechanism
33 Data logger
34 Learning processing unit
35 Prediction processing unit
37 Second controller
38 Invalidation processing unit
39 Determination unit
200 Feedback system
201 Controller
202 Control mechanism

Claims

A control device for performing feedback control on a predetermined device, comprising:
a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value;
a predicted output generating unit including a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable;
a second controller that receives the deviation between the predicted output and the target value as an input and generates a second manipulated variable for the device;
an integrated manipulated variable generating unit that adds the first manipulated variable and the second manipulated variable to generate an integrated manipulated variable, which is the manipulated variable for the device;
a determination unit that determines whether the second manipulated variable is within a predetermined value range;
an invalidation processing unit that, when the determination unit determines that the second manipulated variable is not within the value range, does not provide the second manipulated variable to the integrated manipulated variable generating unit, so that the first manipulated variable becomes the integrated manipulated variable; and
a storage unit that, when it is determined that the second manipulated variable is not within the value range, stores the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data,
wherein the first controller and the second controller each perform any one of P control, PI control, PD control, and PID control.

The control device according to claim 1, further comprising a learning processing unit that performs a learning process based on the machine learning data and updates the trained model.

The control device according to claim 1, wherein the storage unit further stores, when the second manipulated variable becomes 0, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data.

The control device according to claim 1, wherein the storage unit further stores as machine learning data, in addition to the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable for a reference time step at which it is determined that the second manipulated variable is not within the value range, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable for one or more time steps temporally preceding the reference time step.

The control device according to claim 1, wherein the trained model is obtained by performing machine learning using a training model having a tree structure constructed by hierarchically arranging a plurality of nodes each corresponding to a hierarchically divided state space.

A control method in a control device for performing feedback control on a predetermined device,
the control device including:
a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value;
a predicted output generating unit including a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable; and
a second controller that receives the deviation between the predicted output and the target value as an input and generates a second manipulated variable for the device,
the control method comprising:
an integrated manipulated variable generating step of adding the first manipulated variable and the second manipulated variable to generate an integrated manipulated variable, which is the manipulated variable for the device;
a determination step of determining whether the second manipulated variable is within a predetermined value range;
an invalidation processing step of, when it is determined in the determination step that the second manipulated variable is not within the value range, not adding the second manipulated variable to the first manipulated variable in the integrated manipulated variable generating step, so that the first manipulated variable becomes the integrated manipulated variable; and
a storage step of storing, when it is determined that the second manipulated variable is not within the value range, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data,
wherein the first controller and the second controller each perform any one of P control, PI control, PD control, and PID control.

A control program for a control device for performing feedback control on a predetermined device,
the control device including:
a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value;
a predicted output generating unit including a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable; and
a second controller that receives the deviation between the predicted output and the target value as an input and generates a second manipulated variable for the device,
the control program comprising:
an integrated manipulated variable generating step of adding the first manipulated variable and the second manipulated variable to generate an integrated manipulated variable, which is the manipulated variable for the device;
a determination step of determining whether the second manipulated variable is within a predetermined value range;
an invalidation processing step of, when it is determined in the determination step that the second manipulated variable is not within the value range, not adding the second manipulated variable to the first manipulated variable in the integrated manipulated variable generating step, so that the first manipulated variable becomes the integrated manipulated variable; and
a storage step of storing, when it is determined that the second manipulated variable is not within the value range, the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data,
wherein the first controller and the second controller each perform any one of P control, PI control, PD control, and PID control.

A control system for performing feedback control on a predetermined device, comprising:
a first controller that generates a first manipulated variable for the device based on an output fed back from the device and a target value;
a predicted output generating unit including a trained model that has been machine-learned to generate a predicted output of the device based on the output fed back from the device and the first manipulated variable;
a second controller that receives the deviation between the predicted output and the target value as an input and generates a second manipulated variable for the device;
an integrated manipulated variable generating unit that adds the first manipulated variable and the second manipulated variable to generate an integrated manipulated variable, which is the manipulated variable for the device;
a determination unit that determines whether the second manipulated variable is within a predetermined value range;
an invalidation processing unit that, when the determination unit determines that the second manipulated variable is not within the value range, does not provide the second manipulated variable to the integrated manipulated variable generating unit, so that the first manipulated variable becomes the integrated manipulated variable; and
a storage unit that, when it is determined that the second manipulated variable is not within the value range, stores the first manipulated variable, the output fed back from the device, and the output from the device corresponding to the integrated manipulated variable as machine learning data,
wherein the first controller and the second controller each perform any one of P control, PI control, PD control, and PID control.