JP4627509B2 - Plant control apparatus and plant control method

Info

Publication number: JP4627509B2
Application number: JP2006091672A
Authority: JP
Inventors: 孝朗関合; 昭彦山田; 喜治林; 尚弘楠見; 悟清水; 雅之深井
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2006-03-29
Filing date: 2006-03-29
Publication date: 2011-02-09
Anticipated expiration: 2026-03-29
Also published as: JP2007265212A

Description

The present invention relates to a control device for a plant, such as a thermal power plant, and to a plant control method.

A plant control device processes the measurement signals obtained from the plant it controls and calculates the operation signals to be given to that plant. The control device implements an algorithm that calculates the operation signals so that the plant's measurement signals achieve the operation targets.

One control algorithm used for plant control is the PI (proportional-integral) control algorithm. In PI control, the operation signal is derived by multiplying the deviation between the operation target value and the plant measurement signal by a proportional gain and adding to it the time integral of the deviation. In some cases, the operation signal is instead derived using a learning algorithm.
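The PI calculation described above can be sketched as follows. This is a minimal discrete-time illustration, not the control logic of the invention: the class name, the integral gain `ki`, the control period `dt`, and the numeric values are all assumptions introduced for the example (the text itself names only the proportional gain).

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Discrete-time PI controller (illustrative; gains and period are assumed)."""
    kp: float              # proportional gain
    ki: float              # integral gain (assumption, not named in the text)
    dt: float              # control period in seconds (assumption)
    integral: float = 0.0  # running time integral of the deviation

    def step(self, target: float, measurement: float) -> float:
        """Return the operation signal for one control period."""
        deviation = target - measurement
        self.integral += deviation * self.dt  # rectangular time integration
        return self.kp * deviation + self.ki * self.integral


controller = PIController(kp=2.0, ki=0.5, dt=1.0)
u1 = controller.step(target=10.0, measurement=8.0)  # deviation 2, integral 2
u2 = controller.step(target=10.0, measurement=9.0)  # deviation 1, integral 3
```

The integral term keeps accumulating while a deviation remains, which is what lets the operation signal remove a steady offset that the proportional term alone would leave.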

As a method of deriving the operation signal using a learning algorithm, Japanese Patent Application Laid-Open No. 2000-35956 describes a technique relating to an agent learning device.

Pages 247 to 253 of the technical reference Reinforcement Learning describe a technique that uses the Dyna architecture.

In the methods based on these techniques, the control device has a model that predicts the characteristics of the controlled object and a learning unit that learns in advance how to generate model inputs such that the model output, which is the model's prediction, achieves the model output target. The operation signal given to the controlled object is generated according to the learning result of the learning unit.

If there is an error between the model and the control characteristics of the controlled object, the model is corrected using the measurement signals obtained as a result of operating the controlled object, and the method of generating the operation signal is learned again against the corrected model.

Patent Document 1: JP 2000-35956 A
Non-Patent Document 1: Reinforcement Learning, translated by Sadayoshi Mikami and Masaaki Minagawa, Morikita Publishing Co., Ltd., published December 20, 2000

When the methods described in Patent Document 1 and Non-Patent Document 1 are used to learn how to generate the operation signal, the constraint conditions for learning must be determined. For example, if the operating speed of an operation end of the controlled plant changes, the range over which the operation amount can be moved in a single operation changes, and so the learning result changes as well. To obtain a learning result, the learning constraints therefore have to be set appropriately using information on the operating speed of the operation end.
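How the operating speed bounds a single operation can be illustrated with a short sketch. The function name, its parameters, and the numbers are hypothetical and chosen only for illustration; the text does not prescribe this form.

```python
def apply_rate_limit(current: float, requested: float, max_step: float,
                     lower: float, upper: float) -> float:
    """Return the operation amount actually reachable in one operation.

    max_step is the largest change the operation end can make in one
    control period (assumed to be operating speed x control period).
    """
    step = max(-max_step, min(max_step, requested - current))
    return max(lower, min(upper, current + step))


# The same request reaches different positions when the operation end is slower.
fast = apply_rate_limit(20.0, 50.0, max_step=10.0, lower=0.0, upper=100.0)
slow = apply_rate_limit(20.0, 50.0, max_step=4.0, lower=0.0, upper=100.0)
```

Because the set of positions reachable in one operation differs between the two cases, a policy learned under one `max_step` is not necessarily valid under the other.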

However, it is difficult to set such learning constraints in advance. A plant is operated using a plurality of operation ends, and even operation ends with the same design specifications often differ in their actual operating speeds. The operation ends may also deteriorate with age, lowering their operating speed.

If the operating speeds of the operation ends vary or decrease, the desired control result is not obtained even when an operation signal generated according to the learned model input generation method is given to the controlled plant.

An object of the present invention is to provide a plant control device and a plant control method having a function of appropriately determining the learning constraints, so that the plant can be controlled satisfactorily even when the operating speeds of the plurality of operation ends used for plant control vary, or when the operation ends have deteriorated with age and their operating speed has decreased.

The plant control device of the present invention is a plant control device comprising an operation signal generation unit that calculates an operation signal, which is a control command given to the plant, using a measurement signal, which is an operating state quantity of the plant. The control device comprises: a model that simulates the control characteristics of the plant to be controlled; a control logic database storing control logic data including the control parameters used by the operation signal generation unit to calculate the operation signal; an operation end specification database storing operation end specification data of the operation ends that control the state quantities of the plant; an operation signal database storing past operation signals; a measurement signal database storing past measurement signals; a learning condition determination unit having a function of determining, using the control logic data and the operation end specification data stored in the control logic database and the operation end specification database, the initial values of learning parameters that include the operation limit speed of the operation end (the limit on the operation signal change width per unit time), an upper limit value, and a lower limit value, and a function of using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database to update the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time (the sample control period) is smaller than the change amount of the operation signal data; a learning unit that sets the limit on the operation signal change width per unit time included in the learning parameters as a learning constraint and, using the model, learns a method of generating model inputs such that the model output simulated by the model achieves the target value of the model output; and a learning information database storing learning information data, which is the result of the learning unit learning the model input generation method. The operation signal generation unit comprises a learning signal generation unit that calculates the operation signal for the plant using the measurement signal, which is the operating state quantity of the plant, and the learning information data stored in the learning information database.
The plant control device of the present invention is also a plant control device that controls a thermal power plant by calculating an operation signal, which is a control command given to the thermal power plant, using a measurement signal, which is an operating state quantity of the thermal power plant. The control device comprises: an operation signal generation unit that calculates the operation signal, which is the control command given to the plant, using the measurement signal, which is the operating state quantity of the thermal power plant; a model that simulates the control characteristics of the thermal power plant to be controlled; a control logic database storing control logic data including the control parameters used by the operation signal generation unit to calculate the operation signal; an operation end specification database storing operation end specification data of the operation ends that control the state quantities of the thermal power plant; an operation signal database storing past operation signals; a measurement signal database storing past measurement signals; a learning condition determination unit having a function of determining, using the control logic data and the operation end specification data stored in the control logic database and the operation end specification database, the initial values of learning parameters that include the operation limit speed of the operation end (the limit on the operation signal change width per unit time), an upper limit value, and a lower limit value, and a function of using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database to update the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time (the sample control period) is smaller than the change amount of the operation signal data; a learning unit that sets the limit on the operation signal change width per unit time included in the learning parameters as a learning constraint and, using the model, learns a method of operating the thermal power plant such that the model output simulated by the model achieves the target value of the model output; and a learning information database storing learning information data, which is the result of the learning unit learning the model input generation method. The operation signal generation unit comprises a learning signal generation unit that calculates the operation signal for the thermal power plant using the measurement signal, which is the operating state quantity of the plant, and the learning information data stored in the learning information database.

The plant control method of the present invention is a method of controlling a plant by calculating an operation signal, which is a control command given to the plant, using a measurement signal, which is an operating state quantity of the plant. In this method: the control characteristics of the plant to be controlled are simulated by a model provided in the plant control device; control logic data including the control parameters used by the operation signal generation unit of the control device to calculate the operation signal are stored in a control logic database of the control device; operation end specification data of the operation ends that control the state quantities of the plant are stored in an operation end specification database provided in the control device; past operation signals are stored in an operation signal database provided in the control device; past measurement signals are stored in a measurement signal database provided in the control device; a learning condition determination unit provided in the control device determines, using the control logic data and the operation end specification data stored in the control logic database and the operation end specification database, the initial values of learning parameters that include the operation limit speed of the operation end (the limit on the operation signal change width per unit time), an upper limit value, and a lower limit value, and, using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database, updates the value of the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time (the sample control period) is smaller than the change amount of the operation signal data; a learning unit provided in the control device sets the limit on the operation signal change width per unit time included in the learning parameters as a learning constraint and, using the model to simulate the characteristics of the plant, learns a method of generating model inputs such that the model output simulated by the model achieves the target value of the model output; the learning information data resulting from that learning are stored in a learning information database; and a learning signal generation unit provided in the operation signal generation unit calculates the operation signal that serves as the control command given to the plant, using the measurement signal, which is the operating state quantity of the plant, and the learning information data stored in the learning information database, thereby controlling the plant.
The plant control method of the present invention is also a method of controlling a thermal power plant by calculating an operation signal, which is a control command given to the thermal power plant, using a measurement signal, which is an operating state quantity of the thermal power plant. In this method: the control characteristics of the plant to be controlled are simulated by a model provided in the plant control device; control logic data including the control parameters used by the operation signal generation unit of the control device to calculate the operation signal are stored in a control logic database provided in the control device; operation end specification data of the operation ends that control the state quantities of the plant are stored in an operation end specification database; past operation signals are stored in an operation signal database provided in the control device; past measurement signals are stored in a measurement signal database provided in the control device; a learning condition determination unit provided in the control device determines, using the data stored in the control logic database and the operation end specification database, the initial values of learning parameters that include the operation limit speed of the operation end (the limit on the operation signal change width per unit time), an upper limit value, and a lower limit value, and, using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database, updates the value of the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time (the sample control period) is smaller than the change amount of the operation signal data; a learning unit provided in the control device sets the limit on the operation signal change width per unit time included in the learning parameters as a learning constraint and, using the model to simulate the characteristics of the plant, learns a method of generating model inputs such that the model output simulated by the model achieves the target value of the model output; the learning information data resulting from that learning are stored in a learning information database; and a learning signal generation unit provided in the operation signal generation unit calculates the operation signal that serves as the control command given to the plant, using the measurement signal, which is the operating state quantity of the plant, and the learning information data stored in the learning information database, thereby controlling the plant.

According to the present invention, it is possible to realize a plant control device and a plant control method having a function of appropriately determining the learning constraints, so that the plant can be controlled satisfactorily even when the operating speeds of the plurality of operation ends used for plant control vary, or when the operation ends have deteriorated with age and their operating speed has decreased.

Next, a plant control device according to an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 is a control system diagram showing a plant control device according to an embodiment of the present invention.

In FIG. 1, the plant 100 is configured to be controlled by the control device 200.

The control device 200, which controls the plant 100, is provided with the following arithmetic units: an operation signal generation unit 300, a learning unit 400, a model 500, an evaluation value calculation unit 600, a learning condition determination unit 700, and a learning information addition unit 800.

The control device 200 is also provided with the following databases: a measurement signal database 210, an operation end specification database 220, an operation signal database 230, a control logic database 240, a learning parameter database 250, an evaluation value calculation parameter database 260, a model parameter database 270, and a learning information database 280.

In addition, the control device 200 is provided with an external input interface 201 and an external output interface 202 as interfaces with the outside.

The control device 200 takes in the measurement signal 1, which is the control output of the plant 100, from the plant 100 via the external input interface 201. It also transmits the operation signal 24, which serves as the control command, to the plant 100 via the external output interface 202.

The control performed in the control device 200 is described in detail next. The measurement signal 1 of the plant 100 is taken in through the external input interface 201 as the measurement signal 2, which is transmitted to the operation signal generation unit 300 and also stored in the measurement signal database 210. The operation signal 23 generated by the operation signal generation unit 300 is transmitted to the external output interface 202 and also stored in the operation signal database 230.

The operation signal generation unit 300 generates the operation signal 23, using the control logic data 11 stored in the control logic database 240 and the learning information data 22 stored in the learning information database 280, so that the measurement signal 1 of the plant 100 achieves the operation target value.

The control logic database 240 stores the control circuits and control parameters used to calculate the control logic data 11 that is output to the operation signal generation unit 300.

The learning information data stored in the learning information database 280 is generated by the learning unit 400 or by the learning information addition unit 800. The learning unit 400 is connected to the model 500, the evaluation value calculation unit 600, and the learning condition determination unit 700.

The model 500 has the function of simulating the control characteristics of the plant 100. That is, it performs a simulated calculation equivalent to giving the operation signal 24, which serves as the control command, to the plant 100 and obtaining the measurement signal 1 as the control result.
For this simulated calculation, the model 500 receives from the learning unit 400 the model input 17 that drives it, simulates the control operation of the plant 100, and produces the model output 18 as the result. The model output 18 is thus a predicted value of the measurement signal 1 of the plant 100.

The model 500 contains a model that simulates the control characteristics of the plant 100. It calculates the model output 18 for the model input 17 using a physical model built from model equations based on physical laws, a statistical model built with a statistical technique such as a neural network, or a combination of the two.

Any other data that the model 500 needs in order to simulate control of the plant 100 and calculate the model output 18 from the model input 17 is read from the model parameter database 270.

The evaluation value calculation unit 600 calculates the evaluation value 19 using the evaluation value calculation parameter 15 stored in the evaluation value calculation parameter database 260 and the model output 18 received from the model 500.

The learning unit 400 generates the model input 17 to be given to the model 500, using the learning information data 21 stored in the learning information database 280 and the learning parameter 14 stored in the learning parameter database 250.

The model 500 receives the model input 17 and outputs the model output 18 calculated by its internal simulation model.

The evaluation value calculation unit 600 calculates the evaluation value 19 from the model output 18 simulated by the model 500 and passes the evaluation value 19 to the learning unit 400.

The learning unit 400 sets the limit on the operation signal change width per unit time, which is included in the learning parameters, as the learning constraint and uses the model to learn how to operate the plant. Specifically, using the model output 18 or the evaluation value 19, it learns a method of generating model inputs such that the model output 18 simulated by the model 500 achieves the model output target value. The learning information data 20, which is the learning result, is stored in the learning information database 280.
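The role of the change-width limit as a learning constraint can be illustrated with a toy example. The sketch below is not the learning algorithm of the embodiment, which the text does not specify at this point: it substitutes plain value iteration on a gridded model input, and `toy_model`, the grid spacing, the reward of 1 for meeting the target, and the discount factor are all assumptions introduced for illustration.

```python
def toy_model(model_input: float) -> float:
    """Hypothetical plant model: the output is lowest at an input of 60."""
    return abs(model_input - 60.0)


def learn_policy(limit_speed: float, lower: float = 0.0, upper: float = 100.0,
                 grid: float = 5.0, target: float = 0.0,
                 gamma: float = 0.9, sweeps: int = 50) -> dict[float, float]:
    """Value iteration over gridded model inputs; returns {input: next input}.

    limit_speed is the constraint: the largest change of the model input
    allowed in one step.
    """
    states = [lower + i * grid for i in range(int((upper - lower) / grid) + 1)]
    max_cells = int(limit_speed // grid)  # grid cells reachable in one step
    moves = [k * grid for k in range(-max_cells, max_cells + 1)]
    # Reward 1 when the model output meets the model output target, else 0.
    reward = {s: 1.0 if toy_model(s) <= target else 0.0 for s in states}
    value = {s: 0.0 for s in states}
    policy = {s: s for s in states}
    for _ in range(sweeps):
        for s in states:
            candidates = [s + m for m in moves if lower <= s + m <= upper]
            best = max(candidates, key=lambda n: reward[n] + gamma * value[n])
            policy[s] = best
            value[s] = reward[best] + gamma * value[best]
    return policy


def steps_to_target(policy: dict[float, float], start: float,
                    goal: float = 60.0, max_steps: int = 100) -> int:
    """Follow the learned policy from start and count steps until goal."""
    s, n = start, 0
    while s != goal and n < max_steps:
        s, n = policy[s], n + 1
    return n


steps_fast = steps_to_target(learn_policy(limit_speed=10.0), start=20.0)
steps_slow = steps_to_target(learn_policy(limit_speed=5.0), start=20.0)
```

In this toy setting the policy learned with a limit of 10 per step reaches the target input in 4 steps from 20, while the policy learned with a limit of 5 needs 8, so the learned result depends directly on the constraint that was set.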

The learning condition determination unit 700 generates the initial value of the learning parameter 8, which includes the limit on the operation signal change width per unit time, using the operation end specification data 4 (the operable range and operating speed of the plant's operation ends) stored in the operation end specification database 220 and the control logic data 6 stored in the control logic database 240.
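One way the initial values might be derived is sketched below. The data layout and the assumption that the control period is taken from the control logic data are hypothetical; the text states only which data are used, not the formula.

```python
from dataclasses import dataclass


@dataclass
class OperationEndSpec:
    """Operation end specification data (field names are assumptions)."""
    lower: float             # lower end of the operable range
    upper: float             # upper end of the operable range
    speed_per_second: float  # design operating speed


@dataclass
class LearningParameter:
    limit_speed: float  # limit on operation signal change per unit time
    lower: float        # lower limit value
    upper: float        # upper limit value


def initial_learning_parameter(spec: OperationEndSpec,
                               control_period_s: float) -> LearningParameter:
    """Initial learning parameter from the design specification.

    control_period_s is assumed to come from the control logic data.
    """
    return LearningParameter(
        limit_speed=spec.speed_per_second * control_period_s,
        lower=spec.lower,
        upper=spec.upper,
    )


damper = OperationEndSpec(lower=0.0, upper=100.0, speed_per_second=2.0)
initial = initial_learning_parameter(damper, control_period_s=5.0)
```

These values reflect only the design specification; the actual operating speed of an individual operation end may differ, which is why the parameter is later updated from recorded signals.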

The learning condition determination unit 700 also updates the learning parameter 8, using the measurement signal data 3 (past measurement signals stored in the measurement signal database 210), the operation signal data 5 (past operation signals stored in the operation signal database 230), and the learning parameter 9 stored in the learning parameter database 250.

When the values of the learning parameter 9 and the learning parameter 8 differ, the learning trigger 7 is set to "1" and this value is transmitted to the learning unit 400 and the learning information addition unit 800. Otherwise, the learning trigger 7 has the value "0".
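The update rule and the learning trigger can be sketched as follows, based on the condition stated earlier in the description: the operation limit speed is replaced by the change amount of the measurement signal when, over one sample control period, that change is smaller than the change in the operation signal. The function name is hypothetical, and two details are assumptions of this sketch: the measurement signal is taken to be the operation end position in the same units as the operation signal, and when several samples lag, the largest observed change is used.

```python
def update_limit_speed(stored_limit: float,
                       operation_signals: list[float],
                       measurement_signals: list[float]) -> tuple[float, int]:
    """Return (updated limit speed, learning trigger).

    Both lists hold one value per sample control period and are assumed
    to have equal length.
    """
    lagging = []
    for i in range(1, len(operation_signals)):
        commanded = abs(operation_signals[i] - operation_signals[i - 1])
        observed = abs(measurement_signals[i] - measurement_signals[i - 1])
        if observed < commanded:  # the operation end could not keep up
            lagging.append(observed)
    new_limit = max(lagging) if lagging else stored_limit
    trigger = 1 if new_limit != stored_limit else 0
    return new_limit, trigger


# An operation end that moves only 6 per period although 10 was commanded.
slowed = update_limit_speed(10.0, [0.0, 10.0, 20.0, 30.0], [0.0, 6.0, 12.0, 18.0])
# An operation end that follows the commanded signal exactly.
healthy = update_limit_speed(10.0, [0.0, 10.0, 20.0, 30.0], [0.0, 10.0, 20.0, 30.0])
```

A trigger of 1 corresponds to the case in which the stored and updated learning parameters differ, which is the condition under which relearning is requested.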

When the learning trigger 7 becomes "1", the learning information addition unit 800 generates the additional learning information data 13, using the learning parameter 10 stored in the learning parameter database 250 and the learning information data 12 stored in the learning information database 280. The additional learning information data 13 is stored in the learning information database 280.

An operator of the plant 100 can access the information stored in the various databases of the control device 200 by using an external input device 900 consisting of a keyboard 901 and a mouse 902, a maintenance tool 910 equipped with a data transmission/reception processing unit 930 that can exchange data with the control device 200, and an image display device 950.

The maintenance tool 910 consists of an external input interface 920, the data transmission/reception processing unit 930, and an external output interface 940.

The maintenance tool input signal 31 generated by the input device 900 is taken into the maintenance tool 910 via the external input interface 920. The data transmission/reception processing unit 930 of the maintenance tool 910 acquires the database information 30 held in the control device 200 according to the information in the maintenance tool input signal 32.

The data transmission/reception processing unit 930 sends the maintenance tool output signal 33, obtained by processing the database information 30, to the external output interface 940. The maintenance tool output signal 34 is displayed on the image display device 950.

尚、上記した本発明の実施例の制御装置２００では、計測信号データベース２１０、操作端仕様データベース２２０、操作信号データベース２３０、制御ロジックデータベース２４０、学習パラメータデータベース２５０、評価値計算パラメータデータベース２６０、モデルパラメータデータベース２７０、及び学習情報データベース２８０が制御装置２００の内部に配置されているが、これらの全て、あるいは一部を制御装置２００の外部に配置することもできる。 In the control device 200 of the embodiment described above, the measurement signal database 210, operation end specification database 220, operation signal database 230, control logic database 240, learning parameter database 250, evaluation value calculation parameter database 260, model parameter database 270, and learning information database 280 are arranged inside the control device 200, but all or some of them may instead be arranged outside the control device 200.

また同様に、学習部４００、モデル５００、評価値計算部６００、学習条件決定部７００、学習情報追加部８００が制御装置２００の内部に配置されているが、これらの全て、あるいは一部を制御装置２００の外部に配置することもできる。 Similarly, the learning unit 400, model 500, evaluation value calculation unit 600, learning condition determination unit 700, and learning information adding unit 800 are arranged inside the control device 200, but all or some of them may instead be arranged outside the control device 200.

例えば、学習部４００、モデル５００、評価値計算部６００、学習パラメータデータベース２５０、評価値計算パラメータデータベース２６０、及びモデルパラメータデータベース２７０を外部のシステムとして構成し、この外部のシステムと制御装置２００とをインターネットで接続して、外部のシステムの学習部４００で生成された学習情報データ２０をインターネット経由で制御装置２００に送信するようにしても良い。 For example, the learning unit 400, model 500, evaluation value calculation unit 600, learning parameter database 250, evaluation value calculation parameter database 260, and model parameter database 270 may be configured as an external system that is connected to the control device 200 via the Internet, so that the learning information data 20 generated by the learning unit 400 of the external system is transmitted to the control device 200 over the Internet.

また、評価値計算部６００及び学習情報追加部８００の一方、或いは両方を用いずに、制御装置２００を構築すれば、高度な制御機能は低下するがプラントの制御は可能である。 Further, if the control device 200 is constructed without using one or both of the evaluation value calculation unit 600 and the learning information addition unit 800, the plant can be controlled although the advanced control function is reduced.

また、プラント１００とモデル５００の特性が一致するように、モデルパラメータデータベース２７０に保存されているモデルパラメータ１６を修正する機能を付け加えるように構成しても良い。 In addition, a function of correcting the model parameter 16 stored in the model parameter database 270 may be added so that the characteristics of the plant 100 and the model 500 match.

以下では、本発明の実施例であるプラントに対する制御装置２００を、火力発電プラント１００aに適用した場合について説明する。尚、火力発電プラント以外のプラントを制御する際にも、本発明の実施例の制御装置２００を使用することができることはいうまでもない。 The following describes the case in which the plant control device 200, an embodiment of the present invention, is applied to a thermal power plant 100a. Needless to say, the control device 200 of this embodiment can also be used to control plants other than thermal power plants.

図２は、火力発電プラント１００aを制御対象のプラントにした場合のプラントの概略システムを示す図である。まず、火力発電プラント１００aにおける発電の仕組みについて説明する。 FIG. 2 is a diagram showing a schematic system of a plant when the thermal power plant 100a is a control target plant. First, a power generation mechanism in the thermal power plant 100a will be described.

火力発電プラント１００aを構成するボイラ１０１には、ミル１１０で石炭を細かく粉砕した燃料となる微粉炭と、微粉炭搬送用の１次空気、及び燃焼調整用の２次空気を供給するバーナー１０２が設けられており、このバーナー１０２を介して供給した微粉炭をボイラ１０１の内部で燃焼させる。尚、微粉炭と１次空気は配管１３４から、２次空気は配管１４１からバーナー１０２に導かれる。 A boiler 101 constituting the thermal power plant 100a has a burner 102 for supplying pulverized coal, which is fuel obtained by finely pulverizing coal in a mill 110, primary air for conveying pulverized coal, and secondary air for combustion adjustment. The pulverized coal supplied via the burner 102 is combusted inside the boiler 101. The pulverized coal and the primary air are led from the pipe 134 and the secondary air is led from the pipe 141 to the burner 102.

また、ボイラ１０１には２段燃焼用のアフタエアをボイラ１０１に投入するアフタエアポート１０３が設けられており、アフタエアは配管１４２からアフタエアポート１０３に導かれる。 Further, the boiler 101 is provided with an after air port 103 for introducing after-air for two-stage combustion into the boiler 101, and the after air is led from the pipe 142 to the after-air port 103.

微粉炭の燃焼により発生した高温の燃焼ガスは、ボイラ１０１の内部の経路に沿って下流側に流れた後、ボイラ１０１に配設された熱交換器１０６を通過して熱交換し、このエアーヒーター１０４にて高温・高圧の蒸気を発生させる。その後は、排ガス処理した後に煙突から大気に放出される。 The high-temperature combustion gas generated by the combustion of the pulverized coal flows downstream along the path inside the boiler 101 and then passes through the heat exchanger 106 disposed in the boiler 101 to exchange heat. High-temperature and high-pressure steam is generated by the heater 104. After that, after exhaust gas treatment, it is emitted from the chimney to the atmosphere.

ボイラ１０１の熱交換器１０６を循環する給水は、給水ポンプ１０５を介して熱交換器１０６に給水を供給し、熱交換器１０６においてボイラ１０１を流下する燃焼ガスによって過熱され、高温高圧の蒸気となる。尚、本実施例では熱交換器１０６の数を１つとしているが、熱交換器１０６を複数個配置してもよい。 Feed water is supplied to the heat exchanger 106 of the boiler 101 by the feed water pump 105 and is superheated in the heat exchanger 106 by the combustion gas flowing down through the boiler 101, becoming high-temperature, high-pressure steam. Although this embodiment uses a single heat exchanger 106, a plurality of heat exchangers 106 may be arranged.

熱交換器１０６を通過した高温高圧の蒸気はタービンガバナ１０７を介して蒸気タービン１０８に導かれ、蒸気の持つエネルギーによって蒸気タービン１０８を駆動して発電機１０９で発電する。 The high-temperature and high-pressure steam that has passed through the heat exchanger 106 is guided to the steam turbine 108 through the turbine governor 107, and the steam turbine 108 is driven by the energy of the steam to generate power by the generator 109.

火力発電プラント１００aには火力発電プラントの運転状態を検出する様々な計測器が配置されており、これらの計測器から取得されたプラントの制御出力に関する情報は、計測情報１として制御装置２００に送信される。例えば、図２には、プラントの制御出力に関する情報を検出するものとして、流量計測器１５０、温度計測器１５１、圧力計測器１５２、発電出力計測器１５３、及び濃度計測器１５４が図示されている。 Various measuring instruments that detect the operating state of the plant are arranged in the thermal power plant 100a, and the information on the plant's control outputs acquired from these instruments is transmitted to the control device 200 as measurement information 1. For example, FIG. 2 shows a flow rate measuring device 150, a temperature measuring device 151, a pressure measuring device 152, a power generation output measuring device 153, and a concentration measuring device 154 as instruments that detect information on the plant's control outputs.

流量計測器１５０では、給水ポンプ１０５からボイラ１０１に供給される給水の流量を計測する。また、温度計測器１５１及び圧力計測器１５２は、熱交換器１０６から蒸気タービン１０８に供給される蒸気の温度、圧力を計測する。 The flow rate measuring device 150 measures the flow rate of the feed water supplied from the feed water pump 105 to the boiler 101. The temperature measuring device 151 and the pressure measuring device 152 measure the temperature and pressure of the steam supplied from the heat exchanger 106 to the steam turbine 108.

発電機１０９で発電された電力量は、発電出力計測器１５３で計測する。ボイラ１０１を通過する燃焼ガスに含まれている成分（ＣＯ、ＮＯｘなど）の濃度に関する情報は、ボイラ１０１の下流側に設けた濃度計測器１５４で計測することができる。 The amount of power generated by the power generator 109 is measured by a power generation output measuring device 153. Information on the concentration of components (CO, NOx, etc.) contained in the combustion gas passing through the boiler 101 can be measured by a concentration measuring device 154 provided on the downstream side of the boiler 101.

尚、一般的には、図２に図示した以外にも多数の計測器が火力発電プラントに配置されているが、ここでは図示を省略する。 In general, a number of measuring instruments other than those shown in FIG. 2 are arranged in the thermal power plant, but the illustration is omitted here.

次に、ボイラ１０１の内部にバーナー１０２から投入される１次空気と２次空気の経路、及びアフタエアポート１０３から投入されるアフタエアの経路について説明する。 Next, the paths of primary air and secondary air that are input from the burner 102 into the boiler 101 and the path of after-air that is input from the after-air port 103 will be described.

１次空気は、ファン１２０から配管１３０に導かれ、途中でボイラ１０１の下流側に設置されたエアーヒーター１０４を通過する配管１３２と通過せずにバイパスする配管１３１とに分岐して再び配管１３３にて合流し、バーナー１０２の上流側に設置されたミル１１０に導かれる。 The primary air is led from the fan 120 into the pipe 130, which branches into a pipe 132 that passes through the air heater 104 installed downstream of the boiler 101 and a pipe 131 that bypasses it. The two branches merge again at the pipe 133, and the air is led to the mill 110 installed upstream of the burner 102.

エアーヒーター１０４を通過する空気は、ボイラ１０１を流下する燃焼ガスにより加熱される。この１次空気を用いてミル１１０において粉砕した微粉炭を１次空気と共にバーナー１０２に搬送する。 The air passing through the air heater 104 is heated by the combustion gas flowing down the boiler 101. Using this primary air, the pulverized coal pulverized in the mill 110 is conveyed to the burner 102 together with the primary air.

２次空気及びアフタエアは、ファン１２１から配管１４０に導かれ、エアーヒーター１０４で同様にして加熱された後に、２次空気用の配管１４１とアフタエア用の配管１４２とに分岐して、それぞれバーナー１０２とアフタエアポート１０３に導かれる。 The secondary air and the after-air are led from the fan 121 into the pipe 140 and heated in the air heater 104 in the same way. The flow then branches into the pipe 141 for secondary air and the pipe 142 for after-air, which lead to the burner 102 and the after-air port 103, respectively.

図３は、図２に示した１次空気、２次空気、及びアフタエアの通過する配管１３０、１３１、１３２、１３３、１４０、１４１、１４２の配管部、並びにエアーヒーター１０４を表した拡大図である。 FIG. 3 is an enlarged view of the piping section consisting of the pipes 130, 131, 132, 133, 140, 141, and 142, through which the primary air, secondary air, and after-air shown in FIG. 2 pass, together with the air heater 104.

図３に示すように、これらの配管のうち、配管１３１、１３２、１４１、１４２にはエアダンパ１６０、１６１、１６２、１６３が夫々配置されている。これらのエアダンパ１６０、１６１、１６２、１６３を夫々操作することにより、前記各配管１３１、１３２、１４１、１４２における空気が通過する面積を変更することができるので、配管１３１、１３２、１４１、１４２を通過する空気流量を夫々個別に調整できる。 As shown in FIG. 3, air dampers 160, 161, 162, and 163 are arranged in the pipes 131, 132, 141, and 142, respectively. Operating each air damper changes the cross-sectional area through which air passes in the corresponding pipe, so the air flow rate through each of the pipes 131, 132, 141, and 142 can be adjusted individually.

制御装置２００によって生成された各種の操作信号２４を用いて、制御対象の火力発電プラント１００aの状態量を制御する操作端を構成する給水ポンプ１０５、ミル１１０、エアダンパ１６０、１６１、１６２、１６３などの機器を夫々操作する。尚、本実施例では給水ポンプ１０５、ミル１１０、エアダンパ１６０、１６１、１６２、１６３などの機器のことを操作端と呼び、これを操作するのに必要な指令信号を操作信号２４と呼ぶ。 The various operation signals 24 generated by the control device 200 are used to operate devices such as the feed water pump 105, the mill 110, and the air dampers 160, 161, 162, and 163, which constitute the operation ends that control the state quantities of the thermal power plant 100a. In this embodiment, devices such as the feed water pump 105, the mill 110, and the air dampers 160, 161, 162, and 163 are called operation ends, and the command signal needed to operate them is called the operation signal 24.

また、燃焼用等の空気、或いは微粉炭等の燃料をボイラ１０１に投入する際に、その吐出角度を上下に動かすことのできる機能をバーナー１０２及びアフタエアポート１０３に付加して、これらの角度を操作信号２４に含めることもできる。 In addition, the burner 102 and the after-air port 103 may be given a function for moving their discharge angle up and down when combustion air or fuel such as pulverized coal is injected into the boiler 101, and these angles may also be included in the operation signal 24.

図４は、制御装置２００の操作信号生成部３００における信号処理を説明する詳細図である。図４において、操作信号生成部３００では、プラント１００の計測信号１を外部入力インターフェイス２０１を介して収集した計測信号２、学習情報データベース２８０に保存されている学習情報データ２２、及び制御ロジックデータベース２４０に保存されている制御ロジックデータ１１が夫々入力され、これらの信号及びデータを参照して操作信号生成部３００にて演算したプラント１００に対する制御指令である操作信号２４を外部入力インターフェイス２０２を介して出力する操作信号２３を生成する。 FIG. 4 is a detailed diagram explaining the signal processing in the operation signal generation unit 300 of the control device 200. The operation signal generation unit 300 receives the measurement signal 2, which is the measurement signal 1 of the plant 100 collected via the external input interface 201, the learning information data 22 stored in the learning information database 280, and the control logic data 11 stored in the control logic database 240. Referring to these signals and data, it generates the operation signal 23, which is output via the external interface 202 as the operation signal 24, the control command for the plant 100.

操作信号生成部３００には、学習信号生成部３１０、運転目標値３２０、加減算器３３０、３３１、３３２、比例積分制御器３４０、変化率制限器３５０、３５１、高値選択器３６０、３６１、低値選択器３７０、３７１が夫々配置されており、これらの各機器は図４に図示されている態様に接続されている。 The operation signal generation unit 300 includes a learning signal generation unit 310, an operation target value 320, adders / subtractors 330, 331 and 332, a proportional integration controller 340, change rate limiters 350 and 351, high value selectors 360 and 361, and a low value. Selectors 370 and 371 are arranged, and each of these devices is connected in the manner shown in FIG.

そして、操作信号生成部３００の前記各機器を動作させるのに必要な制御パラメータは、制御ロジックデータベース２４０及び学習情報データベース２８０に保存されているものを入力して使用する。尚、操作信号生成部３００の構成は図４に示した機器構成以外のものを用いてもよい。 The control parameters necessary for operating each device of the operation signal generation unit 300 are input and used from those stored in the control logic database 240 and the learning information database 280. Note that the configuration of the operation signal generation unit 300 may be other than the device configuration shown in FIG.

加減算器３３０、３３１、３３２では、入力された２つの信号を用いてゼロの値に信号値を加算、或いは減算の演算を夫々行なう。図４では加算する信号を「＋」、減算する信号を「−」で表記している。 Each of the adders/subtractors 330, 331, and 332 takes its two input signals and, starting from a value of zero, adds or subtracts each signal value. In FIG. 4, a signal to be added is marked “+” and a signal to be subtracted is marked “−”.

前記加減算器３３０では、加減算器３３０に組み込まれた（１）式の関数に基づいて、操作信号生成部３００に取り込まれた計測信号２及び運転目標値信号３８０を用いて信号３８１を計算する。 The adder / subtractor 330 calculates the signal 381 using the measurement signal 2 and the operation target value signal 380 taken into the operation signal generation unit 300 based on the function of the expression (1) incorporated in the adder / subtractor 330.

ここで、χ₁は信号３８１の値、χ₂は運転目標値信号３８０、χ₃は計測信号２の値である。 Here, χ₁ is the value of the signal 381, χ₂ is the value of the operation target value signal 380, and χ₃ is the value of the measurement signal 2.

次に比例積分制御器３４０では、比例積分制御器３４０に組み込まれた（２）式の関数に基づいて、信号３８１、信号３８１の前回値と、基準信号３８２の前回値を用いて基準信号３８２を計算する。尚、前回値とは、１サンプル制御周期前の値であることを意味する。 Next, the proportional-integral controller 340 calculates the reference signal 382 from the signal 381, the previous value of the signal 381, and the previous value of the reference signal 382, based on the function of equation (2) built into the controller. The previous value means the value one sample control period earlier.

ここで、Ｐ₁、及びＰ₂は制御パラメータ、χ₄は基準信号３８２の値、χ₅は信号３８１、χ₆は信号３８１の前回値、χ₇は基準信号３８２の前回値である。 Here, P₁ and P₂ are control parameters, χ₄ is the value of the reference signal 382, χ₅ is the value of the signal 381, χ₆ is the previous value of the signal 381, and χ₇ is the previous value of the reference signal 382.
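Equations (1) and (2) themselves are not reproduced in this text, so the following is only a sketch under stated assumptions. It assumes that equation (1) is the deviation χ₁ = χ₂ − χ₃ (target minus measurement) and that equation (2) is an incremental PI update, χ₄ = χ₇ + P₁(χ₅ − χ₆) + P₂χ₅, which is one common form that uses exactly the three inputs listed above. The function names and the numeric gains are hypothetical.

```python
def deviation(target: float, measured: float) -> float:
    """Assumed form of equation (1): signal 381 = target value - measurement."""
    return target - measured


def pi_step(error: float, prev_error: float, prev_output: float,
            p1: float, p2: float) -> float:
    """Assumed incremental PI update standing in for equation (2).

    p1 acts as the proportional gain and p2 as the integral gain
    per sample control period. Returns the new reference signal 382.
    """
    return prev_output + p1 * (error - prev_error) + p2 * error


# Example: a constant deviation of 2.0 over three control periods,
# starting from zero output. The gains are illustrative values only.
out = 0.0
prev_err = 0.0
for _ in range(3):
    err = deviation(target=10.0, measured=8.0)
    out = pi_step(err, prev_err, out, p1=0.5, p2=0.1)
    prev_err = err
```

With a constant deviation the proportional term contributes only on the first step, and the integral term keeps raising the output by p2 times the deviation in every period.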

また、学習信号生成部３１０では、学習情報データベース２８０に保存されている学習情報データ２２を参照しながら、計測信号２を用いて推奨信号３８３を導出する。この推奨信号３８３は操作信号２３の推奨値である。 In addition, the learning signal generation unit 310 derives the recommendation signal 383 using the measurement signal 2 while referring to the learning information data 22 stored in the learning information database 280. This recommendation signal 383 is a recommended value of the operation signal 23.

学習情報データベース２８０保存されている学習情報データ２２は、学習部４００で評価値１９からモデル入力１７を生成する関数を構築するのに必要なデータである。学習部４００で評価値１９からモデル入力１７を生成するのと同じように、学習信号生成部３１０では計測信号２から推奨信号３８３を生成する。 The learning information data 22 stored in the learning information database 280 is data necessary for constructing a function for generating the model input 17 from the evaluation value 19 in the learning unit 400. The learning signal generation unit 310 generates the recommendation signal 383 from the measurement signal 2 in the same manner as the learning unit 400 generates the model input 17 from the evaluation value 19.

加減算器３３１では、加減算器３３１に組み込まれた（３）式の関数に基づいて、基準信号３８２と推奨信号３８３を用いて信号３８４を計算する。 The adder / subtractor 331 calculates the signal 384 using the reference signal 382 and the recommendation signal 383 based on the function of the expression (3) incorporated in the adder / subtracter 331.

ここで、χ₈は信号３８４、χ₉は推奨信号３８３、χ₁₀は基準信号３８２の値である。 Here, χ₈ is the value of the signal 384, χ₉ is the value of the recommendation signal 383, and χ₁₀ is the value of the reference signal 382.

変化率制限器３５０では、１サンプル制御周期あたりに変化する信号３８４の値を制限する。この変化率制限器３５０では、変化率制限器３５０に組み込まれた（４）式の関数に基づいて、信号３８５を計算する。 The change rate limiter 350 limits the value of the signal 384 that changes per sample control period. The change rate limiter 350 calculates the signal 385 based on the function of the equation (4) incorporated in the change rate limiter 350.

ここで、Ｐ₃、Ｐ₄は制御パラメータであり、χ₁₁は信号３８５、χ₁₂は信号３８４の前回値、χ₁₃は信号３８４の値である。Ｐ₃、Ｐ₄はそれぞれ、増レートパラメータ、減レートパラメータと呼ぶ。 Here, P₃ and P₄ are control parameters, χ₁₁ is the value of the signal 385, χ₁₂ is the previous value of the signal 384, and χ₁₃ is the value of the signal 384. P₃ and P₄ are called the increase rate parameter and the decrease rate parameter, respectively.

変化率制限器３５０を用いることにより、１サンプル制御周期あたりに変化する操作信号３８４の値が、増レートパラメータの値と減レートパラメータの範囲内になるように、信号３８５の値を制限できる。 By using the change rate limiter 350, the value of the signal 385 can be limited so that the change in the signal 384 per sample control period stays within the range set by the increase rate parameter and the decrease rate parameter.
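Equation (4) is not reproduced in this text. A minimal sketch of a change rate limiter consistent with the description is given below. It assumes that the increase rate P₃ and the decrease rate P₄ are both given as non-negative magnitudes per control period, and it leaves the choice of `previous` to the caller. The text defines the previous value as that of the signal 384, while many implementations use the previous limiter output instead. The function name and signature are hypothetical.

```python
def rate_limit(value: float, previous: float,
               inc_rate: float, dec_rate: float) -> float:
    """Limit the change from `previous` to `value` in one control period.

    inc_rate (P3) is the largest allowed increase and dec_rate (P4) the
    largest allowed decrease, both as non-negative magnitudes (assumption).
    """
    step = value - previous
    step = max(-dec_rate, min(inc_rate, step))
    return previous + step


# A request to jump from 4.0 to 10.0 is cut back to an increase of 2.0.
limited_up = rate_limit(10.0, 4.0, inc_rate=2.0, dec_rate=1.0)
# A request to drop from 4.0 to 0.0 is cut back to a decrease of 1.0.
limited_down = rate_limit(0.0, 4.0, inc_rate=2.0, dec_rate=1.0)
```

Changes that already lie within the allowed range pass through unchanged.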

高値選択器３６０は、信号３８６がある閾値以下の値にならないようにする機能を持つ。高値選択器３６０では、高値選択器３６０に組み込まれた（５）式の関数に基づいて、信号３８６を計算する。 The high value selector 360 has a function of preventing the signal 386 from becoming a value below a certain threshold value. The high value selector 360 calculates the signal 386 based on the function of the equation (5) incorporated in the high value selector 360.

ここで、Ｐ₅は制御パラメータであり、χ₁₄は信号３８６、χ₁₅は信号３８５の値である。Ｐ₅は下限パラメータと呼ぶ。高値選択器３６０を用いることにより、信号３８６の値がＰ₅の値以下にならないようにすることができる。 Here, P₅ is a control parameter, χ₁₄ is the value of the signal 386, and χ₁₅ is the value of the signal 385. P₅ is called the lower limit parameter. Using the high value selector 360 prevents the value of the signal 386 from falling below the value of P₅.

低値選択器３７０は、補正信号３８７がある閾値以上の値にならないようにする機能を持つ。低値選択器３７０では、低値選択器３７０に組み込まれた（６）式の関数に基づいて、補正信号３８７を計算する。 The low value selector 370 has a function of preventing the correction signal 387 from becoming a value greater than a certain threshold value. The low value selector 370 calculates the correction signal 387 based on the function of the equation (6) incorporated in the low value selector 370.

ここでＰ₆は制御パラメータであり、χ₁₆は補正信号３８７、χ₁₇は信号３８６の値である。Ｐ₆は上限パラメータと呼ぶ。低値選択器３７０を用いることにより、信号３８７の値がＰ₆の値以上にならないようにすることができる。 Here, P₆ is a control parameter, χ₁₆ is the value of the correction signal 387, and χ₁₇ is the value of the signal 386. P₆ is called the upper limit parameter. Using the low value selector 370 prevents the value of the correction signal 387 from exceeding the value of P₆.
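Equations (5) and (6) are not reproduced in this text, but the described behavior corresponds to a maximum and a minimum selection. The sketch below assumes χ₁₄ = max(χ₁₅, P₅) for the high value selector and χ₁₆ = min(χ₁₇, P₆) for the low value selector. The function names and the limit values are hypothetical.

```python
def high_select(value: float, lower_limit: float) -> float:
    """High value selector: output never falls below lower_limit (P5)."""
    return max(value, lower_limit)


def low_select(value: float, upper_limit: float) -> float:
    """Low value selector: output never exceeds upper_limit (P6)."""
    return min(value, upper_limit)


# Applying both in sequence confines a signal to [P5, P6].
def confine(value: float, lower_limit: float, upper_limit: float) -> float:
    return low_select(high_select(value, lower_limit), upper_limit)


samples = [confine(v, lower_limit=-1.0, upper_limit=1.0)
           for v in (-5.0, 0.3, 7.0)]
```

The naming follows the text: the high value selector picks the higher of the signal and the lower limit, so it enforces the lower bound, and the low value selector enforces the upper bound.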

図４では、変化率制限器（ＲＬ）、高値選択器（ＨＬ）、低値選択器（ＬＬ）が複数用いられているが、動作内容は（４）式〜（６）式の関数と同じである。尚、変化率制限器３５０、３５１、高値選択器３６０、３６１、低値選択器３７０、３７１の制御パラメータは個別に設定することができる。 In FIG. 4, a plurality of change rate limiters (RL), high value selectors (HL), and low value selectors (LL) are used, but the operation content is the same as the functions of equations (4) to (6). It is. The control parameters of the change rate limiters 350 and 351, the high value selectors 360 and 361, and the low value selectors 370 and 371 can be set individually.

これらの制御パラメータの設定は、プラント１００の運転員が外部入力装置９００、保守ツール９１０、及び画像表示装置９５０を用いて設定する。 These control parameters are set by an operator of the plant 100 using the external input device 900, the maintenance tool 910, and the image display device 950.

以上の各機器で計算で算出された基準信号３８２と補正信号３８７を用いて、加減算器３３２ではこの２つの信号を加算して信号３８８を計算する。変化率制限器３５１を用いて信号３８８から信号３８９を計算し、高値選択器３６１を用いて信号３８９から信号３９０を計算し、最後に低値選択器３７１を用いて信号３９０から操作信号２３が計算され、この操作信号２３が外部インタフェース２０２からプラント１００に対する指令信号２４となって制御装置２００から出力される。 Using the reference signal 382 and the correction signal 387 calculated by the above devices, the adder / subtractor 332 adds the two signals to calculate the signal 388. The signal 389 is calculated from the signal 388 using the change rate limiter 351, the signal 390 is calculated from the signal 389 using the high value selector 361, and finally the operation signal 23 is calculated from the signal 390 using the low value selector 371. The operation signal 23 is calculated and output from the control device 200 as the command signal 24 to the plant 100 from the external interface 202.

制御装置２００の操作信号生成部３００を図４で示したように構成することで、以下に述べる作用効果が得られる。 By configuring the operation signal generation unit 300 of the control device 200 as shown in FIG. 4, the following effects can be obtained.

まず、操作信号生成部３００に変化率制限器３５１、高値選択器３６１、低値選択器３７１を備えることにより、操作信号２３が予め設定された許容範囲内に制限され、さらに予め設定された値以上に急激に変化することを抑止できる。 First, because the operation signal generation unit 300 includes the change rate limiter 351, the high value selector 361, and the low value selector 371, the operation signal 23 is confined to a preset allowable range and is prevented from changing faster than a preset rate.

従って、操作端の動作速度、動作範囲を逸脱した操作信号２３が計算されて指令信号２４として出力されることを防止できる。 Accordingly, it is possible to prevent the operation signal 23 deviating from the operation speed and operation range of the operation end from being calculated and output as the command signal 24.

また、プラント１００の運転状況によっては、指令信号２４となる操作信号２３を大きく変化させるとプラント１００の安全運転に支障が出る場合がある。このような場合でも、変化率制限器３５１の制御パラメータを適切に設定することにより、プラント１００を安全に運転することができる。 In addition, depending on the operation status of the plant 100, if the operation signal 23 serving as the command signal 24 is greatly changed, the safe operation of the plant 100 may be hindered. Even in such a case, the plant 100 can be safely operated by appropriately setting the control parameter of the change rate limiter 351.

ところで、図４に示した操作信号生成部３００では、学習信号生成部３１０にて計算した推奨信号３８３を用いて直接操作信号２３を計算せずに、加減算器３３１にて推奨信号３８３から基準信号３８２を減算し、変化率制限器３５０、高値選択器３６０、低値選択器３７０を適用した後、再び基準信号３８２を加算している。 Incidentally, the operation signal generation unit 300 shown in FIG. 4 does not calculate the operation signal 23 directly from the recommendation signal 383 calculated by the learning signal generation unit 310. Instead, the adder/subtractor 331 subtracts the reference signal 382 from the recommendation signal 383, the change rate limiter 350, high value selector 360, and low value selector 370 are applied to the result, and the reference signal 382 is then added back.

学習信号生成部３１０では、モデル５００を用いて学習した結果が保存されている学習情報データベース２８０を参照して推奨信号３８３を生成しているので、仮にモデル５００とプラント１００の特性が異なる場合には推奨信号３８３を指令信号２４としてプラント１００に与えても、所望の性能を得ることができない可能性がある。 The learning signal generation unit 310 generates the recommendation signal 383 by referring to the learning information database 280, which stores the results of learning with the model 500. Therefore, if the characteristics of the model 500 and the plant 100 differ, giving the recommendation signal 383 to the plant 100 as the command signal 24 may not achieve the desired performance.

また、推奨信号３８３を指令信号２４としてプラント１００に与えることにより、プラント１００を安全に運転できなくなる可能性もある。 Further, by giving the recommendation signal 383 to the plant 100 as the command signal 24, there is a possibility that the plant 100 cannot be operated safely.

このような事態を回避するため、操作信号生成部３００では、変化率制限器３５０、高値選択器３６０、低値選択器３７０を用い、この制御パラメータを適切に設定することにより、学習信号生成部３１０が生成する推奨信号３８３が操作信号２３に寄与する度合いを調整できるように構成している。 In order to avoid such a situation, the operation signal generation unit 300 uses the change rate limiter 350, the high value selector 360, and the low value selector 370, and appropriately sets this control parameter, whereby the learning signal generation unit The degree of contribution of the recommended signal 383 generated by 310 to the operation signal 23 can be adjusted.

例えば、学習信号生成部３１０を導入した当初は、モデル５００とプラント１００の特性の違いに関する情報がないので、推奨信号３８３が操作信号２３に与える影響が小さくなるように制御パラメータを設定しておき、特性が一致することを確認した後、推奨信号３８３が操作信号２３に与える影響が大きくなるように制御パラメータを再設定するなどの対応を実施できる。 For example, at the beginning of the introduction of the learning signal generation unit 310, there is no information regarding the difference in characteristics between the model 500 and the plant 100, so the control parameters are set so that the influence of the recommendation signal 383 on the operation signal 23 is reduced. After confirming that the characteristics match, it is possible to implement a countermeasure such as resetting the control parameter so that the influence of the recommendation signal 383 on the operation signal 23 is increased.
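The way the limiter settings control how strongly the recommendation signal 383 affects the operation signal 23 can be illustrated with a short sketch of the FIG. 4 signal path. It assumes the limiter forms max, min, and a symmetric step clamp, since equations (4) to (6) are not reproduced in this text. The function names, the parameter dictionaries, and all numeric values are hypothetical.

```python
def limit_block(value, previous, inc_rate, dec_rate, lower, upper):
    """Change rate limiter followed by high value and low value selectors."""
    step = max(-dec_rate, min(inc_rate, value - previous))
    return min(upper, max(lower, previous + step))


def operation_signal(reference, recommended, prev_correction, prev_output,
                     corr_params, out_params):
    """reference: signal 382, recommended: signal 383.

    Returns (operation signal 23, correction signal 387).
    """
    diff = recommended - reference                                   # 331
    correction = limit_block(diff, prev_correction, **corr_params)  # 350/360/370
    total = reference + correction                                   # 332
    output = limit_block(total, prev_output, **out_params)          # 351/361/371
    return output, correction


out_params = dict(inc_rate=100.0, dec_rate=100.0, lower=0.0, upper=100.0)

# Cautious setting: the correction is confined to +/-1 around the PI output.
cautious = dict(inc_rate=10.0, dec_rate=10.0, lower=-1.0, upper=1.0)
out_cautious, _ = operation_signal(50.0, 60.0, 0.0, 50.0, cautious, out_params)

# Trusting setting: the correction may reach +/-20, so the output can
# follow the recommendation fully.
trusting = dict(inc_rate=10.0, dec_rate=10.0, lower=-20.0, upper=20.0)
out_trusting, _ = operation_signal(50.0, 60.0, 0.0, 50.0, trusting, out_params)
```

In this example the cautious setting moves the output only one unit from the PI reference toward the recommendation, while the trusting setting reaches the recommended value. This mirrors the staged approach described above: start with narrow limits, then widen them once the model and the plant are confirmed to behave alike.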

火力発電プラント１００aでは、発電出力を一定に保つ発電出力一定運転、発電出力を変化させる発電出力変化運転、ボイラ１０１のバーナーの点火を切り替えるバーナー切り替え運転、燃料とする石炭の種類を切り替える炭種切り替え運転など、様々な運転形態がある。また、発電出力一定運転であっても、燃料とする炭種が異なる場合もある。 The thermal power plant 100a has various operation modes, such as constant-output operation, which keeps the power generation output constant; output-change operation, which changes the power generation output; burner switching operation, which switches which burners of the boiler 101 are lit; and coal type switching operation, which switches the type of coal used as fuel. Even in constant-output operation, the type of coal used as fuel may differ.

本発明の実施例である火力発電プラント１００aの制御装置２００では、このような様々な運転形態毎に、制御パラメータを決定できるため、プラントの運転形態に合致した指令信号を生成できる。 In the control device 200 of the thermal power plant 100a which is an embodiment of the present invention, the control parameter can be determined for each of such various operation modes, so that a command signal that matches the plant operation mode can be generated.

図５は、本発明の実施例であるプラントの制御装置２００による制御パラメータ設定画面の１例を示している。図５では、火力発電プラント１００aの制御装置２００が備えている操作信号生成部３００が有する変化率制限器３５０において、制御パラメータを設定する画面を示している。 FIG. 5 shows an example of a control parameter setting screen by the plant control apparatus 200 according to the embodiment of the present invention. FIG. 5 shows a screen for setting control parameters in the change rate limiter 350 included in the operation signal generation unit 300 provided in the control device 200 of the thermal power plant 100a.

図５に示すように、操作信号生成部３００が有する変化率制限器３５０において、火力発電プラント１００aの運転形態毎に増レートの各パラメータ、及び減レートの各パラメータを設定する状況を表している。 As shown in FIG. 5, the change rate limiter 350 included in the operation signal generation unit 300 represents a situation in which each parameter for increasing rate and each parameter for decreasing rate are set for each operation mode of the thermal power plant 100 a. .

次に、図１に示す制御装置２００が備えている学習パラメータデータベース２５０に保存される学習パラメータを決定する学習条件決定部７００について説明する。学習条件決定部７００では、学習部４００が学習を実施する際に参照する学習パラメータ１４を決定する。 Next, a learning condition determination unit 700 that determines learning parameters stored in the learning parameter database 250 provided in the control device 200 shown in FIG. 1 will be described. The learning condition determination unit 700 determines a learning parameter 14 that is referred to when the learning unit 400 performs learning.

学習部４００が学習を実施する際には、１サンプリング制御周期あたりに動かすことのできるモデル入力１７の変化幅、モデル入力１７の上限値、モデル入力１７の下限値が夫々必要である。 When the learning unit 400 performs learning, a change range of the model input 17 that can be moved per sampling control period, an upper limit value of the model input 17, and a lower limit value of the model input 17 are required.

制御装置２００の学習条件決定部７００では、制御ロジックデータベース２４０に保存されている制御ロジックデータ６、操作端仕様データベース２２０に保存されている操作端仕様データ４、及び計測信号データベース２１０に保存されている計測信号データ３を参照して、学習パラメータデータベース２５０に保存する学習パラメータ８を決定する。 In the learning condition determination unit 700 of the control device 200, the control logic data 6 stored in the control logic database 240, the operation end specification data 4 stored in the operation end specification database 220, and the measurement signal database 210 are stored. The learning parameter 8 to be stored in the learning parameter database 250 is determined with reference to the measured signal data 3.

プラント１００を運転する前は計測信号を得ることはできないので、学習条件決定部７００では制御ロジックデータ６、及び操作端仕様データ４から学習パラメータ８の初期値を決定し、プラント１００を運転し、計測信号を得た後は計測信号データ３も用いて、学習パラメータ８を更新していく。 Since the measurement signal cannot be obtained before the plant 100 is operated, the learning condition determination unit 700 determines the initial value of the learning parameter 8 from the control logic data 6 and the operation end specification data 4, operates the plant 100, After obtaining the measurement signal, the learning parameter 8 is updated using the measurement signal data 3 as well.

図６は、本発明の実施例であるプラントの制御装置２００が備えている学習条件決定部７００において、学習パラメータ８の初期値を決定する方法を説明する図である。 FIG. 6 is a diagram illustrating a method for determining the initial value of the learning parameter 8 in the learning condition determination unit 700 provided in the plant control apparatus 200 according to the embodiment of the present invention.

図６では、操作端毎に、その変化率制限、上限、及び下限に関するデータが記載されている。制御ロジックデータ６の値はＲＬ、ＬＬ、ＨＬの欄に反映されて表示され、操作端仕様データ４の値は仕様の欄に反映されて表示されている。制御ロジックデータ６の値とは、例えば図５で示した操作信号生成部３００が有する変化率制限器３５０にて設定された制御パラメータのことである。また、操作端仕様データ４の値とは、例えば操作端の動作限界速度、上限値、下限値のことであり、これらの値はプラント１００の運転員によって設定される。 FIG. 6 lists, for each operation end, data on its change rate limit, upper limit, and lower limit. The values of the control logic data 6 are shown in the RL, LL, and HL columns, and the values of the operation end specification data 4 are shown in the specification column. The values of the control logic data 6 are, for example, the control parameters set for the change rate limiter 350 of the operation signal generation unit 300 shown in FIG. 5. The values of the operation end specification data 4 are, for example, the operating speed limit, upper limit value, and lower limit value of the operation end, and these values are set by the operator of the plant 100.

学習条件決定部７００では、図６に記載された値の中から、モデル入力１７を生成する際に自由度が最小となる値を選択し、この値を学習パラメータ８の初期値として学習パラメータデータベース２５０に送信する。例えば、変化率制限パラメータの増レート、及び減レートは、その絶対値が大きいほど１サンプル制御周期で動かせるモデル入力の変化幅を大きくすることができるので、自由度も大きくなる。 The learning condition determination unit 700 selects a value having the minimum degree of freedom when generating the model input 17 from the values described in FIG. 6, and uses this value as an initial value of the learning parameter 8 as a learning parameter database. 250. For example, the increase rate and the decrease rate of the change rate limiting parameter can increase the degree of change of the model input that can be moved in one sample control period as the absolute value increases, and the degree of freedom also increases.

逆に、変化率制限パラメータの絶対値が小さいと、自由度も小さくなる。従って、変化率制限パラメータの増レート、及び減レートは、その絶対値が小さい値を学習パラメータ８の初期値として、学習パラメータデータベース２５０に送信する。 Conversely, when the absolute value of the change rate limiting parameter is small, the degree of freedom is also small. Therefore, the increase rate and the decrease rate of the change rate limiting parameter are transmitted to the learning parameter database 250 with a value having a small absolute value as an initial value of the learning parameter 8.

また、上限値については最小値、下限値については最大値を選択することで、モデル入力１７を生成する際の自由度を最小にできる。 Further, by selecting the minimum value for the upper limit value and the maximum value for the lower limit value, the degree of freedom in generating the model input 17 can be minimized.
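The selection rule described above (smallest rate magnitude, smallest upper limit, largest lower limit) can be sketched as follows. The dictionary keys, the function name, and the numeric values are hypothetical and are not taken from FIG. 6.

```python
def initial_learning_params(candidates):
    """Pick the most restrictive value of each limit across all candidates.

    Each candidate is a dict with keys inc_rate, dec_rate, upper, lower,
    for example one from the control logic data (RL/HL/LL settings) and
    one from the operation end specification data.
    """
    return {
        # Smaller rate magnitude means less freedom per control period.
        "inc_rate": min(abs(c["inc_rate"]) for c in candidates),
        "dec_rate": min(abs(c["dec_rate"]) for c in candidates),
        # Lowest upper limit and highest lower limit give the narrowest range.
        "upper": min(c["upper"] for c in candidates),
        "lower": max(c["lower"] for c in candidates),
    }


control_logic = {"inc_rate": 2.0, "dec_rate": 3.0, "upper": 95.0, "lower": 5.0}
specification = {"inc_rate": 5.0, "dec_rate": 1.5, "upper": 90.0, "lower": 10.0}
params = initial_learning_params([control_logic, specification])
```

Taking the most restrictive value of each limit keeps the learned model inputs within what both the control logic and the actuator specification allow.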

尚、本実施例ではモデル入力１７を生成する際の自由度が最小となる値を選択し、学習パラメータ８の初期値を決定したが、操作端仕様データベース２２０に保存されている操作端仕様データ４の値をそのまま学習パラメータ８の初期値に決定するなど、様々な選択方法を設定することもできる。 In this embodiment, a value that minimizes the degree of freedom in generating the model input 17 is selected and the initial value of the learning parameter 8 is determined. However, the operation end specification data stored in the operation end specification database 220 is selected. Various selection methods can be set, such as determining the value of 4 as the initial value of the learning parameter 8 as it is.

また、学習条件決定部７００では、制御ロジックデータ６に含まれている信号、あるいは計測信号データ３を処理することにより、現状のプラント１００の運転形態を推定する機能がある。この機能を用いることにより、プラントの運転形態別に設定されている制御パラメータのうち、現在どの値が使用されているかを判定できる。 The learning condition determination unit 700 has a function of estimating the current operation mode of the plant 100 by processing the signal included in the control logic data 6 or the measurement signal data 3. By using this function, it is possible to determine which value is currently used among the control parameters set for each operation mode of the plant.

Next, the method for updating the learning parameter 8 is described. First, when the operation mode of the plant 100 changes and the value of the control logic data 6 changes accordingly, the learning parameter 8 is determined from the changed control logic data 6 by the method described with reference to FIG. 6.

The learning condition determination unit 700 also updates the learning parameter 8 using the measurement signal data 3 and the operation signal data 5. This update method is described with reference to FIG. 7.

FIG. 7 shows an example of how the learning condition determination unit 700 updates the learning parameter 8. It plots the operation signal data 5 and the measurement signal data 3 for operation end A at times t₁ and t₂. Δt is the duration of one sample control period, C₁ is the value of operation signal A at time t₁, C₂ is the value of the operation signal data 5 at time t₂, and C₃ is the value of the measurement signal data 3 at time t₂.

In FIG. 7, during the interval Δt from time t₁ to time t₂, the operation signal data 5 (operation signal A) changes by C₂ − C₁, whereas the measurement signal data 3 changes by only C₃ − C₁. The change in the measurement signal data is therefore smaller than the change in the operation signal data.

This occurs when the change in the operation signal exceeds the amount that operation end A can move in one sample control period, that is, its operating limit speed. In such a case, the value of the learning parameter 8 for the increase rate of operation signal A is set to the difference C₃ − C₁.
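
The update rule of FIG. 7 can be illustrated with a short Python sketch. The function name `update_increase_rate` and the tolerance argument `tol` are assumptions made for this example; the text only states that the increase rate is set to C₃ − C₁ when the measured change falls short of the commanded change.

```python
def update_increase_rate(c1: float, c2: float, c3: float,
                         current_rate: float, tol: float = 1e-9) -> float:
    """Return the increase-rate learning parameter after one sample period.

    c1: operation signal value at time t1
    c2: operation signal value at time t2 (commanded)
    c3: measurement signal value at time t2 (achieved)
    """
    commanded_change = c2 - c1
    achieved_change = c3 - c1
    # The operation end could not follow the command: adopt the achieved
    # change per sample control period as the new increase rate.
    if commanded_change > 0 and 0 < achieved_change < commanded_change - tol:
        return achieved_change
    # Otherwise the command was followed and the current rate is kept.
    return current_rate
```

With C₁ = 10, C₂ = 16, and C₃ = 13, the commanded change is 6 but only 3 is achieved, so the increase rate becomes 3. If C₃ equals C₂, the existing rate is kept.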

The learning condition determination unit 700 determines the learning parameter 8 by the above method and stores it in the learning parameter database 250. The learning parameter 8 is also updated when the operation mode changes and the control parameters change.

Next, as an example, a case is described in which the learning unit 400 of the control device 200 determines the model input 17 for the model 500 so as to reduce nitrogen oxides (NOx), which is one of the model outputs 18 produced by the model 500.

The plant control device of this embodiment can also be used when a model output 18 other than nitrogen oxides, such as carbon monoxide (CO), carbon dioxide concentration, sulfur oxides, mercury, steam temperature, or steam pressure, is to be controlled to a desired value.

FIG. 8 illustrates the relationship between the model input 17 supplied to the model 500 and the model output 18 produced by the model 500. In FIG. 8, two model inputs, model input A and model input B, constitute the model input 17, and NOx is the model output 18.

As shown in FIG. 8, when model input A is A₁ and model input B is B₁, the NOx value of the model output 18 is high (the high-NOx region), and when model input A is A₂ and model input B is B₂, it is low (the low-NOx region). The learning unit 400 can thus learn how to reach the low-NOx region from the initial state, as shown in FIG. 8.

FIG. 9 illustrates an example of the result obtained when the learning unit 400 learns, using the model, how to generate the model input. It shows the result of learning under the conditions that the low-NOx region is reached in as few operations as possible and that the state never enters the high-NOx region.

The low-NOx region cannot be reached directly in a single operation because the amounts by which model input A and model input B can be moved in one sample control period are limited.

The amount by which the model input 17 can be moved per sample is determined from the learning parameter 8 (learning parameter 14), such as the increase rate and decrease rate of the operation end described with reference to FIG. 6, so that each operation end corresponds to its model input item.
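
The effect of the per-sample limit can be sketched as follows. This Python example is an illustration under assumptions: the function names `limit_step` and `operations_needed` are hypothetical, and a single scalar model input is used for simplicity.

```python
def limit_step(current: float, target: float,
               increase_rate: float, decrease_rate: float) -> float:
    """Move from current toward target by at most one rate-limited step.

    increase_rate is positive and decrease_rate is negative, both expressed
    per sample control period.
    """
    delta = target - current
    if decrease_rate <= delta <= increase_rate:
        return target  # the target is within reach in this period
    return current + (increase_rate if delta > 0 else decrease_rate)


def operations_needed(start: float, target: float,
                      increase_rate: float, decrease_rate: float,
                      max_ops: int = 1000) -> int:
    """Count the rate-limited operations required to reach the target."""
    value, ops = start, 0
    while value != target and ops < max_ops:
        value = limit_step(value, target, increase_rate, decrease_rate)
        ops += 1
    return ops
```

Halving the allowed rate increases the number of operations: moving from 0.0 to 10.0 takes 3 operations at a rate of 4.0 per period and 5 operations at a rate of 2.0 per period.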

As shown in FIG. 9, the low-NOx region is reached in the state after the second operation, by way of the state after the first operation. In other words, the learning unit 400 learned a method that reaches the low-NOx region in two operations.

FIG. 10, like FIG. 9, illustrates an example of the result obtained when the learning unit 400 learns how to generate the operation signal, shown here as the relationship between operation signal A and operation signal B. Model input A corresponds to operation signal A, and model input B corresponds to operation signal B.

The operation method indicated by the dotted arrows in FIG. 10 is the result learned by the learning unit 400 of the control device 200. In FIG. 10, when the operating speed for operation signal A is low, the state moves into the high-NOx region after one operation.

This means that when the operating limit speeds of the operation signal 24 and the model input 17 differ, the conditions set for learning may no longer be satisfied. That is, if the operation signal 24 is generated according to a method for generating the model input 17 that the learning unit 400 learned under the conditions of reaching the low-NOx region in as few operations as possible without entering the high-NOx region, and that signal is applied to the plant, those conditions may be violated.

The embodiment of the present invention avoids this situation as follows. The control device 200 is provided with the learning condition determination unit 700, which determines the learning parameter 8, including the operating limit speed of the operation ends of the plant 100, as described above, and stores it in the learning parameter database 250. The learning unit 400 refers to the learning parameter 14 stored in the learning parameter database 250 and performs learning on the premise that the operating limit speeds of the operation signal 24 and the model input 17 match.

Next, the control operation of the control device 200 is described with reference to the flowchart shown in FIG. 11.

FIG. 11 is a flowchart showing the computation performed by the plant control device 200 of the embodiment shown in FIG. 1 for simulating the plant with the model and for learning.

The flowchart of the control operation of the control device 200 shown in FIG. 11 also applies when the learning information adding unit 800 shown in FIG. 1 is not provided. The operation of the learning information adding unit 800, and the flowchart for the case in which it is provided, are described later.

As shown in FIG. 11, the control operation of the control device 200 is carried out by combining steps 1010, 1020, 1030, 1040, 1050, and 1060. Each step is described below.

First, in step 1010, the learning unit 400 and the model 500 are operated to learn a method for generating the model input 17 such that the model output 18 achieves the model output target value.

Learning may also use the evaluation value 19, which the evaluation value calculation unit 600 computes with the evaluation value calculation parameter data 15 as a quantitative measure of whether the model output 18 has achieved the model output target value, or of how close the model output 18 is to that target value.

The evaluation value calculation parameter database 260 stores the parameter values needed to calculate the evaluation value 19, such as the model output target value. Optimization techniques such as genetic algorithms, dynamic programming, and reinforcement learning can be applied to the learning.

Next, in step 1020, the learning unit 400 is operated, and the result learned in step 1010 is sent from the learning unit 400 to the learning information database 280 as learning information data 20. The learning information data 20 is, for example, information about the function needed to generate the model input 17 from the model output 18.

Next, in step 1030, the operation signal generation unit 300 is operated to generate the operation signal 23. The operation signal 23 is sent to the operation signal database 230 and to the external output interface 202, and the external output interface 202 supplies the operation signal 24 to the plant 100 as a control command.

Next, in step 1040, the external input interface 201 is operated to take the measurement signal 1, which is the control output of the plant 100, into the control device 200, and the measurement signal 2 is sent to the operation signal generation unit 300 and the measurement signal database 210.

Next, in step 1050, the learning condition determination unit 700 determines the learning parameter 8 that serves as the learning condition and sends it to the learning parameter database 250.

Then, in step 1060, the learning condition determination unit 700 compares the learning parameter 8 with the learning parameter 9, which is the previous value of the learning parameter stored in the learning parameter database 250. If the two values are the same, the learning trigger 7 is set to "0"; if they differ, it is set to "1". The learning trigger 7 is then sent to the learning unit 400.

A learning trigger 7 of "1" means that the value of the learning parameter has changed. In that case the process returns to step 1010 and learning is performed with the new learning parameter 14. This is called relearning.

The learning unit 400 can also relearn using the learning information data 21, which is the previous learning result. When the learning trigger 7 is "0" and no relearning is performed, the process returns to step 1030.
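
The branching at step 1060 of FIG. 11 can be sketched in Python as follows. The function names and the representation of a learning parameter as a dictionary are assumptions introduced for this illustration; only the comparison rule and the two branch targets come from the description above.

```python
def learning_trigger(previous_params: dict, new_params: dict) -> int:
    """Return 0 when the learning parameter is unchanged, 1 when it differs."""
    return 0 if previous_params == new_params else 1


def next_step_fig11(trigger: int) -> int:
    """Select the next step of FIG. 11 from the learning trigger."""
    # Trigger 1: the learning parameter changed, so relearn from step 1010.
    # Trigger 0: keep the learned result and generate the next operation
    # signal in step 1030.
    return 1010 if trigger == 1 else 1030
```

For example, if the stored increase rate is 3.0 and the newly determined increase rate is 2.5, the trigger is 1 and the flow returns to step 1010.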

FIG. 12 illustrates the effect of learning with the computation method shown in the control operation flowchart of FIG. 11 for the control device 200 according to an embodiment of the present invention.

In FIG. 12, the learning condition determination unit 700 of the control device 200 takes the operating limit speed of the operation signal 24 into account and sets the operating limit speed of the model input 17 as the learning parameter 8. As a result, when the operation signal 24 generated according to the method for generating the model input 17 that the learning unit 400 learned with the model 500 (upper part of FIG. 12) is applied to the plant 100 as a control command, the low-NOx region is reached after four operations from the initial state without entering the high-NOx region, as shown in the lower part of FIG. 12.

In addition, even when several operation ends with the same design specification data are used but their actual operating speeds vary, learning can take the operating limit speed of each individual operation end into account. Likewise, when an operation end deteriorates with age and its operating speed falls, the reduced operating speed can be used as a condition for learning.

Furthermore, when the operating state of the plant changes, as in a power output change operation, a burner switching operation, or a coal type switching operation, and control parameters such as those of the change rate limiter are changed, learning can be performed under the changed conditions. Learning under the changed conditions is also possible when an operator of the plant 100 changes the control parameters.

As a result, by applying to the plant 100, as a control command, the operation signal 24 generated according to the learned method for generating the model input 17, the desired control result can be obtained.

Moreover, because the learning condition determination unit 700 of the control device 200 determines the learning constraints automatically, the plant operator no longer needs to determine them. This improves the usability of the control device and shortens the time needed to set the conditions for learning.

In the control operation flowchart of FIG. 11, however, relearning must be performed in step 1010 whenever the learning condition determination unit 700 sets the learning parameter to a value different from its previous value. Because this learning requires computational resources, either a control device capable of high-speed computation must be used or time must be spent on learning.

Using a control device capable of high-speed computation is costly. If time is spent on learning instead, the learning signal generation unit 310 must be stopped while learning is in progress, and the results learned by the learning unit 400 and the model 500 cannot be reflected in the generation of the operation signal 24 during that period.

As a countermeasure, in the embodiment of the present invention the control device 200 shown in FIG. 1 is provided with the learning information adding unit 800. When the learning trigger 7 becomes "1", the learning information adding unit 800 generates learning information data 13 from the learning parameter data 14 and the learning information data 12 and sends it to the learning information database 280. With the learning information adding unit 800, the learning information data 13, which is the learning result for the case where the learning parameter 14 is the learning condition, can be generated without relearning.

Consequently, there is no need to use a control device capable of high-speed computation to allow for changes of the learning parameter by the learning condition determination unit 700, and the learning signal generation unit 310 does not stop when the learning parameter is changed.

Next, the control operation for the case where the control device 200 is provided with the learning information adding unit 800 is described with reference to the flowchart shown in FIG. 13.

FIG. 13 is a flowchart showing the computation performed by the control device 200 for simulating the plant with the model and for learning when the learning information adding unit 800 is installed in the plant control device according to an embodiment of the present invention.

As shown in FIG. 13, the control operation of the control device 200 is carried out by combining steps 1110, 1120, 1130, 1140, 1150, 1160, and 1170. Each step is described below.

First, in step 1110, the learning unit 400 learns, using the model 500, a method for generating the model input 17 such that the model output 18 achieves the model output target value. As in step 1010 of the flowchart of FIG. 11, the evaluation value calculation unit 600 may be used for learning, and the same optimization techniques as in step 1010 can be applied.

When learning in step 1110, the input space is divided into regions using the minimum set value of the change width of the model input 17. This minimum set value is specified by the operator of the plant 100.

FIG. 14 illustrates how the input space is divided into regions when the learning unit 400 learns the method for generating the model input 17 in step 1110.

As shown in FIG. 14, the learning unit 400 divides the operable ranges of model input A and model input B into intervals equal to the minimum set value of the model input change width. Learning is then performed with the change in the model input allowed in one operation limited to that minimum set value.

In other words, for each region the learning unit learns an operation that moves to an adjacent region. For example, when operation starts from the initial state using the result learned under the conditions of a minimum number of operations and no transition into the high-NOx region, the low-NOx region is reached along the path labeled in FIG. 14 as reaching the low-NOx region with the minimum number of operations.
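
The division of an operable range into regions can be sketched in Python as follows. The function names `grid_size` and `cell_index`, and the choice to clip out-of-range values into the first or last region, are assumptions made for this example.

```python
import math


def grid_size(lower: float, upper: float, min_step: float) -> int:
    """Number of regions when [lower, upper] is cut into min_step intervals."""
    return max(1, math.ceil((upper - lower) / min_step))


def cell_index(value: float, lower: float, upper: float, min_step: float) -> int:
    """Index of the region that contains value, clipped to the valid range."""
    n = grid_size(lower, upper, min_step)
    index = int((value - lower) // min_step)
    return min(max(index, 0), n - 1)


def adjacent_cells(cell: tuple, size_a: int, size_b: int) -> list:
    """Regions reachable from cell when each model input moves at most one region."""
    a, b = cell
    return [(a + da, b + db)
            for da in (-1, 0, 1) for db in (-1, 0, 1)
            if (da, db) != (0, 0)
            and 0 <= a + da < size_a and 0 <= b + db < size_b]
```

With an operable range of 0.0 to 3.0 and a minimum change width of 0.5, the range is divided into 6 regions and the value 1.2 falls in region 2 (counting from 0). A corner region of a two-input grid has 3 adjacent regions.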

Next, in step 1120, the learning unit 400 is operated, and the result learned in step 1110 is sent from the learning unit 400 to the learning information database 280 as learning information data 20.

Next, in step 1130, the learning condition determination unit 700 is operated to determine the learning condition, and the learning parameter 8 is sent to the learning parameter database 250.

In step 1140, the learning condition determination unit 700 compares the learning parameter 8 with the learning parameter 9, which is the previous value of the learning parameter stored in the learning parameter database 250. If the two values are the same, the learning trigger 7 is set to "0"; if they differ, it is set to "1". The process proceeds to step 1150 when the learning trigger 7 is "1" and to step 1160 when it is "0".

Next, in step 1150, the learning information adding unit 800 is operated. It generates additional learning information data 13 from the learning information data 12 stored in the learning information database 280 and the learning parameter 10 stored in the learning parameter database 250, and sends the result to the learning information database 280.

The learning information data 12 used in step 1150 is the result learned in step 1110.

Next, the operation of the learning information adding unit 800 provided in the control device 200 is described.

FIG. 15 is a flowchart that explains the operation of the learning information adding unit 800 provided in the control device 200 shown in FIG. 1, and shows the details of step 1150 in the flowchart of FIG. 13.

In FIG. 15, step 810 uses the learning information data 12, which is the result learned in step 1110, to derive for each region the number of operations required to reach the target state. This can be derived, for example, by setting a region as the initial state, counting the operations needed to reach the target state from there, and repeating this for every region.
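
Step 810 can be sketched in Python as follows. This is an illustration under assumptions: the learned result is represented as a hypothetical dictionary `policy` that maps each region to the region it moves to in one operation, and `goal_cells` is the set of target regions. Neither name nor representation comes from the source.

```python
from typing import Dict, Hashable, Iterable, Optional


def operation_counts(policy: Dict[Hashable, Hashable],
                     goal_cells: Iterable[Hashable],
                     max_steps: int = 10_000) -> Dict[Hashable, Optional[int]]:
    """For every region, count the operations needed to reach a goal region.

    Each region is taken in turn as the initial state and the learned policy
    is followed until a goal region is reached. A region from which the goal
    cannot be reached is given the count None.
    """
    goals = set(goal_cells)
    counts: Dict[Hashable, Optional[int]] = {}
    for start in set(policy) | goals:
        cell, steps = start, 0
        while cell not in goals:
            if cell not in policy or steps >= max_steps:
                steps = None  # dead end, or a loop that never reaches the goal
                break
            cell = policy[cell]
            steps += 1
        counts[start] = steps
    return counts
```

For a policy that moves region (0, 0) to (1, 0) and region (1, 0) to (2, 0), with (2, 0) as the goal, the counts are 2, 1, and 0 respectively.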

Next, in step 820, the range of states that can be reached in one operation (the operable range) is determined for each region using the learning parameter 10, and the operation counts obtained in step 810 are extracted for all regions within the operable range.

Next, in step 830, for each region, the operation that moves to the region with the smallest operation count among those extracted in step 820 is judged to be the optimal operation, and that operation is sent from the learning information adding unit 800 to the learning information database 280 as additional learning information data 13.
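
Steps 820 and 830 can be sketched in Python as follows. The sketch assumes a two-input grid, a hypothetical dictionary `counts` that gives the number of operations from each region to the target (as derived in step 810), and a symmetric operable range `max_move` expressed in regions per operation. In the embodiment the increase and decrease rates may differ, so the symmetric range is a simplification.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]


def additional_policy(counts: Dict[Cell, Optional[int]],
                      max_move: Tuple[int, int]) -> Dict[Cell, Cell]:
    """For each region, pick the reachable region with the fewest operations left.

    counts:   region -> operations to the target (None if unreachable)
    max_move: largest number of regions each model input may move per operation
    """
    new_policy: Dict[Cell, Cell] = {}
    for (a, b), own in counts.items():
        if own == 0:
            continue  # already in the target region
        best: Optional[Cell] = None
        # Step 820: scan every region inside the operable range.
        for da in range(-max_move[0], max_move[0] + 1):
            for db in range(-max_move[1], max_move[1] + 1):
                cand = (a + da, b + db)
                n = counts.get(cand)
                if cand == (a, b) or n is None:
                    continue
                # Step 830: keep the region with the smallest operation count.
                if best is None or n < counts[best]:
                    best = cand
        if best is not None:
            new_policy[(a, b)] = best
    return new_policy
```

With counts of 3, 2, 1, and 0 for regions (0, 0) to (3, 0) and an operable range of two regions along the first input, region (0, 0) is directed to (2, 0) and region (1, 0) is directed straight to the target region (3, 0). No search over the model is repeated; only the stored counts are reused.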

FIG. 16 illustrates the result obtained with the flowchart of FIG. 15, which describes the operation of the learning information adding unit 800. As shown in FIG. 16, the additional learning information data 13 generated by the learning information adding unit 800 specifies that, in the initial state, the operation indicated by the arrow in the figure is to be performed.

Operating according to the arrow from the initial state in FIG. 16 leads, within the operable range of the initial state, to the region from which the low-NOx region can be reached in the smallest number of operations.

This completes the description of the operation of step 1150 shown in FIG. 13.

Next, in step 1160, the operation signal generation unit 300 is operated to generate the operation signal 23 using the learning information data 22 generated in step 1150 and the control logic data 11. The operation signal 23 is sent to the plant 100 through the external output interface 202 as the operation signal 24, which serves as the control command.

Next, in step 1170, the external input interface 201 is operated to take the measurement signal 1, which is the control output of the plant, into the control device 200. The process then returns to step 1130, and steps 1130 to 1170 are repeated.

In the control operation flowchart of FIG. 11, when the learning condition determination unit 700 of the control device 200 sets the learning trigger 7 to "1", the process has to return to step 1010 and relearn.

In the control operation flowchart of FIG. 13, by contrast, even when the learning trigger 7 becomes "1", operating the learning information adding unit 800 on the result learned in step 1110 produces the same learning information data as would be obtained by learning the method for generating the model input 17 with the learning parameter 14 (learning parameter 10) as the learning condition.

As a result, in addition to the effects obtained with the flowchart of FIG. 11, the plant can be controlled without stopping the learning signal generation unit 310 even when a control device capable of high-speed computation is not used.

One effect of the embodiment in which the plant control device and control method of the present invention are applied to a thermal power plant is that the concentration of NOx in the exhaust gas discharged from the thermal power plant can be reduced.

Furthermore, as the NOx concentration falls, the amount of ammonia consumed by the denitration apparatus that removes NOx from the exhaust gas can be reduced, and the catalytic activity of the denitration apparatus can be sustained for a longer time.

According to the plant control device of the embodiment of the present invention, the initial value of the learning parameter used to determine the learning constraints is set from prior information (specifications) on the operating limit speed of the operation ends. Because this learning parameter is then corrected successively using the measurement signal, the actual operating speed of the operation ends of the plant can be reflected in the learning parameter.

For example, when several operation ends of the same design specification are used and their actual operating speeds vary, learning can take the operating speed of each individual operation end into account. Even when an operation end deteriorates with age and its operating speed falls, the reduced operating speed can be learned as a constraint and the plant can be controlled well, so the plant can be operated safely.

In addition, using the plant control device of this embodiment removes the need for the plant operator to determine the learning constraints, which improves the usability of the control device and shortens the time needed to set the conditions for learning.

The present invention is applicable to control devices and control methods for plants such as thermal power plants.

FIG. 1 is a block diagram showing the overall configuration of a plant control device according to an embodiment of the present invention.
FIG. 2 is a configuration diagram of a thermal power plant to which the plant control device according to an embodiment of the present invention is applied.
FIG. 3 is an enlarged view of the piping section and the air heater section of the thermal power plant shown in FIG. 2.
FIG. 4 is a block diagram of the operation signal generation unit in the plant control device shown in FIG. 1.
FIG. 5 is an explanatory diagram of the control parameter setting screen in the plant control device shown in FIG. 1.
FIG. 6 is an explanatory diagram of the function of the learning condition determination unit in the plant control device shown in FIG. 1.
FIG. 7 is an explanatory diagram showing an example of the learning parameter update method of the learning condition determination unit in the plant control device shown in FIG. 1.
FIG. 8 is an explanatory diagram showing the relationship between the model input and the model output of the model in the plant control device shown in FIG. 1.
FIG. 9 is an explanatory diagram showing the result obtained when the learning unit in the plant control device shown in FIG. 1 learns, using the model, how to generate the model input.
FIG. 10 is an explanatory diagram showing the learning result for the operation signal generated through learning by the learning unit in the plant control device shown in FIG. 1.
FIG. 11 is a flowchart showing the computation performed by the plant control device according to an embodiment of the present invention.
FIG. 12 is an explanatory diagram showing the learning results for the model input and the operation signal learned according to the flowchart shown in FIG. 11.
FIG. 13 is a flowchart showing the computation performed when the learning information adding unit is installed in the plant control device according to an embodiment of the present invention.
FIG. 14 is an explanatory diagram of the method of dividing into regions the input space of the model input learned according to the flowchart shown in FIG. 13.
FIG. 15 is a flowchart showing the details of step 1150 in the flowchart shown in FIG. 13.
FIG. 16 is an explanatory diagram showing the learning result obtained with the flowchart shown in FIG. 15.

Explanation of symbols

1, 2: measurement signal; 3: measurement signal data; 8, 9, 10: learning parameters; 17: model input; 18: model output; 19: evaluation value; 23: operation signal; 24: command signal; 100: plant; 100a: thermal power plant; 101: pulverized coal boiler; 200: control device; 201: external input interface; 202: external output interface; 210: measurement signal database; 220: operation end specification database; 230: operation signal database; 240: control logic database; 250: learning parameter database; 260: evaluation value calculation parameter database; 270: model parameter database; 280: learning information database; 300: operation signal generation unit; 400: learning unit; 500: model; 600: evaluation value calculation unit; 700: learning condition determination unit; 800: learning information adding unit; 900: external input device; 901: keyboard; 902: mouse; 910: maintenance tool; 920: external input interface; 930: data transmission/reception processing unit; 940: external output interface; 950: image display device.

Claims

A plant control apparatus including an operation signal generation unit that calculates an operation signal, which is a control command to be given to a plant, using a measurement signal that is an operating state quantity of the plant, wherein
the control apparatus comprises: a model that simulates the control characteristics of the plant to be controlled; a control logic database that stores control logic data including control parameters used to calculate the operation signal in the operation signal generation unit; an operation end specification database that stores operation end specification data for an operation end that controls a state quantity of the plant; an operation signal database that stores past operation signals; and a measurement signal database that stores past measurement signals;
a learning condition determination unit having: a function of determining, using the control logic data and the operation end specification data stored in the control logic database and the operation end specification database, initial values of learning parameters including an operation limit speed of the operation end, which is a limit value of the change width of the operation signal per unit time, and an upper limit value and a lower limit value of the operation end; and a function of updating, using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database, the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time, the unit time being the sample control cycle, is smaller than the change amount of the operation signal data;
a learning unit that sets the limit value of the change width of the operation signal per unit time, included in the learning parameters, as a learning constraint condition and uses the model to learn a method of generating a model input such that the model output simulated by the model achieves a target value of the model output; and
a learning information database that stores learning information data, which is the result of learning the model input generation method in the learning unit,
and wherein the operation signal generation unit includes a learning signal generation unit that calculates the operation signal for the plant using the measurement signal, which is the operating state quantity of the plant, and the learning information data stored in the learning information database.
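
The update function recited above for the learning condition determination unit can be illustrated with a minimal Python sketch. It assumes scalar signals and a single pair of per-cycle change amounts; the function name `update_operation_limit_speed` and the use of absolute values to compare change amounts are illustrative assumptions, not details taken from this document.

```python
def update_operation_limit_speed(
    limit_speed: float,
    operation_change: float,
    measurement_change: float,
) -> float:
    """Return the operation limit speed after one sample control cycle.

    operation_change:   change of the operation signal over one control cycle
    measurement_change: change of the measurement signal over the same cycle
    Comparing magnitudes (abs) is an assumption made for this sketch.
    """
    if abs(measurement_change) < abs(operation_change):
        # The measured change is smaller than the commanded change, so the
        # limit speed is replaced by the measured change amount.
        return abs(measurement_change)
    # Otherwise the stored limit speed is left unchanged.
    return limit_speed


# Hypothetical numbers: a 4.0 change was commanded but only 3.0 was measured.
print(update_operation_limit_speed(5.0, 4.0, 3.0))
```

In this sketch the limit is only ever replaced when the measured change lags the commanded one; how several cycles of stored data are combined is left open here.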

The plant control apparatus according to claim 1,
further comprising a learning information adding unit having a function of, when a learning parameter is changed to a value different from its previous value, using the learning information data stored in the learning information database to generate the learning information data that would be obtained if learning were performed in the learning unit with the limit value of the change width of the operation signal per unit time included in the learning parameter set as the learning constraint condition, and transmitting the generated additional learning information data to the learning information database.

A plant control apparatus for controlling a thermal power plant by calculating an operation signal, which is a control command to be given to the thermal power plant, using a measurement signal that is an operating state quantity of the thermal power plant, wherein
the control apparatus comprises: an operation signal generation unit that calculates the operation signal, which is the control command to be given to the thermal power plant, using the measurement signal that is the operating state quantity of the thermal power plant; a model that simulates the control characteristics of the thermal power plant to be controlled; a control logic database that stores control logic data including control parameters used to calculate the operation signal in the operation signal generation unit; an operation end specification database that stores operation end specification data for an operation end that controls a state quantity of the thermal power plant; an operation signal database that stores past operation signals; and a measurement signal database that stores past measurement signals;
a learning condition determination unit having: a function of determining, using the control logic data and the operation end specification data stored in the control logic database and the operation end specification database, initial values of learning parameters including an operation limit speed of the operation end, which is a limit value of the change width of the operation signal per unit time, and an upper limit value and a lower limit value of the operation end; and a function of updating, using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database, the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time, the unit time being the sample control cycle, is smaller than the change amount of the operation signal data;
a learning unit that sets the limit value of the change width of the operation signal per unit time, included in the learning parameters, as a learning constraint condition and uses the model to learn a method of generating a model input such that the model output simulated by the model achieves a target value of the model output; and
a learning information database that stores learning information data, which is the result of learning the model input generation method in the learning unit,
and wherein the operation signal generation unit includes a learning signal generation unit that calculates the operation signal for the thermal power plant using the measurement signal, which is the operating state quantity of the thermal power plant, and the learning information data stored in the learning information database.

The plant control apparatus according to claim 3, wherein
the measurement signal includes at least one of nitrogen oxide concentration, carbon monoxide concentration, carbon dioxide concentration, sulfur oxide, and mercury; the operation signal determines at least one of an air damper opening, an air flow rate, and a fuel flow rate; and the learning condition determination unit provided in the control apparatus has: a function of estimating, using the data stored in the control logic database, the operation signal database, and the measurement signal database, whether the thermal power plant is performing an operation including at least one of a burner switching operation, a coal type switching operation, and a load change operation, and updating the learning parameters based on the estimation result; and a function of estimating the operating speed of the operation end using the data stored in the operation signal database and the measurement signal database, and updating the learning parameters based on the estimation result.

The plant control apparatus according to claim 4,
further comprising a learning information adding unit having a function of, when a learning parameter is changed to a value different from its previous value, using the learning information data stored in the learning information database to generate the learning information data that would be obtained if learning were performed in the learning unit with the limit value of the change width of the operation signal per unit time included in the learning parameter set as the learning constraint condition, and transmitting the generated additional learning information data to the learning information database, wherein the learning signal generation unit of the operation signal generation unit is configured to calculate the operation signal using the additional learning information data stored in the learning information database.

The plant control apparatus according to claim 1 or 3, wherein
the learning condition determination unit of the control apparatus has a function of comparing, among the control parameters stored in the control logic database, the parameter set to limit the change width of the signal per unit time with the value of the operating speed of the operation end stored in the operation end specification database, and setting the value with the smaller absolute value as the initial value of the learning parameter.
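
The initial-value selection described in this claim amounts to choosing, between two candidate values, the one with the smaller absolute value. A minimal Python sketch follows; the function and argument names are hypothetical and the numbers are made up for illustration.

```python
def initial_limit_speed(control_logic_rate_limit: float,
                        operation_end_speed: float) -> float:
    """Pick the candidate with the smaller absolute value.

    control_logic_rate_limit: per-unit-time change limit taken from the
                              control parameters (control logic database)
    operation_end_speed:      operating speed taken from the operation end
                              specification database
    """
    # min() with key=abs compares magnitudes but returns the original value.
    return min(control_logic_rate_limit, operation_end_speed, key=abs)


# Hypothetical values: the operation end is slower than the control logic limit.
print(initial_limit_speed(2.0, 1.5))
```

Choosing the smaller magnitude means the learning constraint never assumes a faster change than either the control logic or the operation end itself allows.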

The plant control apparatus according to claim 3, wherein
a user interface for setting the control parameters used in the control apparatus is provided for each of normal operation, burner switching operation, coal type switching operation, or load change operation, which are operation modes of the thermal power plant.

A plant control method for controlling a plant by calculating an operation signal, which is a control command to be given to the plant, using a measurement signal that is an operating state quantity of the plant, the method comprising:
simulating the control characteristics of the plant to be controlled with a model provided in a plant control apparatus; storing control logic data, including control parameters used to calculate the operation signal in an operation signal generation unit provided in the control apparatus, in a control logic database provided in the control apparatus; storing operation end specification data for an operation end that controls a state quantity of the plant in an operation end specification database provided in the control apparatus; storing past operation signals in an operation signal database provided in the control apparatus; and storing past measurement signals in a measurement signal database provided in the control apparatus;
determining, by a learning condition determination unit provided in the control apparatus and using the control logic data and the operation end specification data stored in the control logic database and the operation end specification database, initial values of learning parameters including an operation limit speed of the operation end, which is a limit value of the change width of the operation signal per unit time, and an upper limit value and a lower limit value of the operation end, and updating, using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database, the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time, the unit time being the sample control cycle, is smaller than the change amount of the operation signal data;
setting, by a learning unit provided in the control apparatus, the limit value of the change width of the operation signal per unit time included in the learning parameters as a learning constraint condition, and learning, in the learning unit and by simulating the characteristics of the plant with the model, a method of generating a model input such that the model output simulated by the model achieves a target value of the model output;
storing learning information data, which is the result of learning the model input generation method in the learning unit, in a learning information database; and
calculating, by a learning signal generation unit provided in the operation signal generation unit, the operation signal serving as the control command to be given to the plant, using the measurement signal that is the operating state quantity of the plant and the learning information data stored in the learning information database, thereby controlling the plant.
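
The role of the learning constraint in the method above can be illustrated with a toy Python sketch. The claim does not specify the learning algorithm, so a simple greedy one-step search stands in for it here; the quadratic `toy_model`, the function name `generate_model_inputs`, and all numeric values are assumptions made purely for illustration.

```python
from typing import Callable, List


def generate_model_inputs(
    model: Callable[[float], float],
    start: float,
    target: float,
    limit_speed: float,
    lower: float,
    upper: float,
    max_steps: int = 100,
) -> List[float]:
    """Greedy stand-in for learning: move the model input toward a model
    output at or below `target`, changing it by at most `limit_speed` per
    step and keeping it within [lower, upper]."""
    x = start
    trajectory = [x]
    for _ in range(max_steps):
        if model(x) <= target:
            break  # target value of the model output achieved
        # Admissible next inputs: stay, or move by the limit in either
        # direction, clipped to the upper and lower limit values.
        candidates = [
            min(max(x + step, lower), upper)
            for step in (0.0, -limit_speed, limit_speed)
        ]
        best = min(candidates, key=model)  # ties prefer staying in place
        if best == x:
            break  # no admissible move improves the model output
        x = best
        trajectory.append(x)
    return trajectory


def toy_model(x: float) -> float:
    """Hypothetical plant model: output is smallest at input 3.0."""
    return (x - 3.0) ** 2


print(generate_model_inputs(toy_model, start=0.0, target=0.01,
                            limit_speed=1.0, lower=0.0, upper=10.0))
```

Each consecutive pair of inputs in the returned trajectory differs by no more than `limit_speed`, which is the property the learning constraint condition is meant to guarantee.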

The plant control method according to claim 8, wherein,
in learning the model input generation method by simulating the characteristics of the plant, when a learning parameter is changed to a value different from its previous value, the learning information data stored in the learning information database is used to generate the learning information data that would be obtained if learning were performed with the limit value of the change width of the operation signal per unit time included in the learning parameter set as the learning constraint condition, the generated additional learning information data is added to the learning information data in the learning information database, and the operation signal serving as the control command given to the plant is calculated to control the plant.

A plant control method for controlling a thermal power plant by calculating an operation signal, which is a control command to be given to the thermal power plant, using a measurement signal that is an operating state quantity of the thermal power plant, the method comprising:
simulating the control characteristics of the thermal power plant to be controlled with a model provided in a plant control apparatus; storing control logic data, including control parameters used to calculate the operation signal in an operation signal generation unit provided in the control apparatus, in a control logic database provided in the control apparatus; storing operation end specification data for an operation end that controls a state quantity of the thermal power plant in an operation end specification database provided in the control apparatus; storing past operation signals in an operation signal database provided in the control apparatus; and storing past measurement signals in a measurement signal database provided in the control apparatus;
determining, by a learning condition determination unit provided in the control apparatus and using the data stored in the control logic database and the operation end specification database, initial values of learning parameters including an operation limit speed of the operation end, which is a limit value of the change width of the operation signal per unit time, and an upper limit value and a lower limit value of the operation end, and updating, using the operation signal data and the measurement signal data stored in the operation signal database and the measurement signal database, the operation limit speed of the operation end included in the learning parameters, which are control logic data, to the value of the change amount of the measurement signal data when the change amount of the measurement signal data per unit time, the unit time being the sample control cycle, is smaller than the change amount of the operation signal data;
setting, by a learning unit provided in the control apparatus, the limit value of the change width of the operation signal per unit time included in the learning parameters as a learning constraint condition, and learning, in the learning unit and by simulating the characteristics of the thermal power plant with the model, a method of generating a model input such that the model output simulated by the model achieves a target value of the model output;
storing learning information data, which is the result of learning the model input generation method in the learning unit, in a learning information database; and
calculating, by a learning signal generation unit provided in the operation signal generation unit, the operation signal serving as the control command to be given to the thermal power plant, using the measurement signal that is the operating state quantity of the thermal power plant and the learning information data stored in the learning information database, thereby controlling the thermal power plant.

The plant control method according to claim 10, wherein
the measurement signal includes the concentration of a component contained in the combustion gas, and the operation signal includes a signal necessary for operating the equipment constituting the operation end; the initial values of the learning parameters are determined using the data stored in the control logic database and the operation end specification database; control parameters are determined for each of constant power output operation, power output change operation, burner switching operation, and coal type switching operation, which are operation modes of the thermal power plant; and the learning parameters are updated based on the control parameters.

The plant control method according to claim 11, wherein,
when a learning parameter is changed to a value different from its previous value, the learning information data stored in the learning information database is used to generate the learning information data that would be obtained if learning were performed in the learning unit with the limit value of the change width of the operation signal per unit time included in the learning parameter set as the learning constraint condition, and the operation signal for the thermal power plant is calculated using the generated additional learning information data.

The plant control method according to claim 8 or 10, wherein,
among the control parameters stored in the control logic database, the parameter set to limit the change width of the signal per unit time is compared with the value of the operating speed of the operation end stored in the operation end specification database, and the value with the smaller absolute value is set as the initial value of the learning parameter.