JPWO2020241657A1

JPWO2020241657A1 - Optimal control device, optimal control method and computer program

Info

Publication number: JPWO2020241657A1
Application number: JP2021522795A
Authority: JP
Inventors: 理山中; 祐太大西; 由紀夫平岡
Original assignee: Toshiba Corp; Toshiba Infrastructure Systems and Solutions Corp
Current assignee: Toshiba Corp; Toshiba Infrastructure Systems and Solutions Corp
Priority date: 2019-05-29
Filing date: 2020-05-27
Publication date: 2021-11-18
Anticipated expiration: 2040-05-27
Also published as: CN113874794A; JP7183411B2; WO2020241657A1

Abstract

Embodiments provide an optimal control device, an optimal control method, and a computer program that allow extremum control to operate more stably by adapting to the dynamics of a controlled process. The optimal control device of an embodiment has a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit. The first gradient estimation unit estimates the Jacobian of an evaluation function based on a signal indicating an evaluation quantity. The manipulated variable determination unit determines the direction and amount by which the manipulated variable should be moved by integrating the estimated value of the Jacobian. The second gradient estimation unit estimates the Hessian of the evaluation function based on the signal indicating the evaluation quantity. The parameter adjustment unit adjusts the integral gain of the manipulated variable determination unit in accordance with changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become zero.

Description

Embodiments of the present invention relate to an optimal control device, an optimal control method, and a computer program.

In recent years, a technique called extremum control has attracted attention as a plant control method. Extremum control is a model-free, real-time optimal control technique that does not use a complex model of the plant. In outline, extremum control forcibly varies the manipulated variable and thereby searches for the manipulated variable that optimizes an evaluation quantity based on the controlled variable of the controlled process. When such extremum control is applied to plant control, the various parameters of extremum control (hereinafter "extremum control parameters") must be set appropriately for the characteristics of the controlled process. Several guidelines for designing extremum control parameters have been proposed, but none of them has yet made it possible to operate extremum control stably while adapting to temporal changes in the controlled process (hereinafter "dynamics").

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2017-033104

Non-Patent Document 1: D. Nesic et al., 'A Unifying Approach to Extremum Seeking: Adaptive Schemes Based on Estimation of Derivatives', Proc. 49th IEEE Conference on Decision and Control, December 15-17, 2010
Non-Patent Document 2: Yan et al., 'On the choice of dither in extremum seeking systems: A case study', Automatica, 44, pp. 1446-1450 (2008)

The problem to be solved by the present invention is to provide an optimal control device, an optimal control method, and a computer program that allow extremum control to operate more stably by adapting to the dynamics of a controlled process.

The optimal control device of an embodiment executes extremum control that updates the manipulated variable of a controlled process so that an evaluation quantity approaches the optimal value of an evaluation function, where the evaluation quantity is a value based on the controlled variable of the controlled process and is expressed by an evaluation function that is unknown with respect to the manipulated variable. The optimal control device has a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit. The first gradient estimation unit estimates the Jacobian of the evaluation function based on a signal indicating the evaluation quantity. The manipulated variable determination unit determines the direction and amount by which the manipulated variable should be moved by integrating the estimated value of the Jacobian. The second gradient estimation unit estimates the Hessian of the evaluation function based on the signal indicating the evaluation quantity. The parameter adjustment unit adjusts the integral gain of the manipulated variable determination unit in accordance with changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become zero.

FIG. 1A is a diagram illustrating the basic concept of extremum control in the first embodiment.
FIG. 1B is a diagram illustrating the basic concept of extremum control in the first embodiment.
FIG. 1C is a diagram illustrating the basic concept of extremum control in the first embodiment.
FIG. 2 is a diagram showing a basic configuration example of an extremum control system in the first embodiment.
FIG. 3 is a diagram showing a configuration example of the extremum control system of the first embodiment.
FIG. 4 is a diagram showing a specific example of the method for adjusting the extremum control parameters in the first embodiment.
FIG. 5 is a diagram showing a first configuration example of the gradient estimator in the first embodiment.
FIG. 6 is a diagram showing a second configuration example of the gradient estimator in the first embodiment.
FIG. 7 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the first method in the first embodiment.
FIG. 8 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the first method in the first embodiment.
FIG. 9 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the second method in the first embodiment.
FIG. 10 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the second method in the first embodiment.
FIG. 11 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the third method in the first embodiment.
FIG. 12 is a diagram for explaining an example of the regularization signal generated by the fourth method in the first embodiment.
FIG. 13 is a diagram for explaining another example of the regularization signal generated by the fourth method in the first embodiment.
FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using a regularization signal in the first embodiment.
FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using, as the regularization signal, the sign signal of the regularization signal gradient in the first embodiment.
FIG. 14C is a diagram for explaining an example of the effect obtained when the regularization signal generated by the fourth method is used in the first embodiment.
FIG. 14D is a diagram for explaining an example of the effect obtained when the regularization signal generated by the fourth method is used in the first embodiment.
FIG. 15 is a diagram showing a configuration example of the extremum control system of the second embodiment.
FIG. 16 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the second method in the second embodiment.
FIG. 17 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the third method in the second embodiment.
FIG. 18 is a diagram showing an application example of the extremum control system of the first embodiment or the second embodiment.

Embodiment

The optimal control device, optimal control method, and computer program of the embodiments are described below with reference to the drawings.

(First Embodiment) [Outline of extremum control]
FIGS. 1A to 1C are diagrams illustrating the basic concept of extremum control.
Extremum control is a control method that updates the manipulated variable in the direction that brings the evaluation quantity closer to its optimal value while observing how the evaluation quantity changes in response to the manipulated variable. The evaluation quantity is a value that serves as an index for optimizing the process to be controlled (hereinafter "controlled process") and is determined based on the controlled variable of the controlled process. For example, the evaluation quantity is expressed by a predetermined evaluation function whose variable is the controlled variable. The evaluation quantity may be defined by any evaluation criterion as long as it is a value based on the controlled variable. For example, the evaluation quantity may be the controlled variable itself. In general, in extremum control, the evaluation function of the controlled process may be a function that is unknown with respect to the manipulated variable.

Specifically, extremum control varies the manipulated variable by applying a dither signal to the signal indicating the manipulated variable. A dither signal is a signal whose value changes periodically and is usually given as a sine wave. In extremum control, the dither signal continuously oscillates the manipulated variable, and the resulting change (increase or decrease) in the evaluation quantity is observed. Based on the observed change, a new manipulated variable that moves the evaluation quantity toward the optimal value (maximum or minimum) of the evaluation function is calculated, and the current manipulated variable is updated with it. Extremum control searches for the optimal value of the evaluation function by repeating this observation of the evaluation quantity and update of the manipulated variable.

For example, FIG. 1A shows an evaluation function curve EV that assumes a downward-convex quadratic function as an example of an evaluation function that is unknown with respect to the manipulated variable. FIG. 1B shows the case where, as a result of oscillating the manipulated variable of the controlled process with the dither signal, the signal indicating the evaluation quantity changes in opposite phase to the dither signal (for example, the evaluation quantity decreases as the manipulated variable increases). Such a change occurs when the operating point moves in the region to the left of the minimum point P10 of the evaluation function curve EV (for example, when it moves from operating point P11 toward the minimum point P10).

FIG. 1C, on the other hand, shows the case where, as a result of varying the manipulated variable of the controlled process with the same dither signal as in FIG. 1B, the signal indicating the evaluation quantity changes in phase with the dither signal (for example, the evaluation quantity increases as the manipulated variable increases). Such a change occurs when the operating point moves in the region to the right of the minimum point P10 of the evaluation function curve EV (for example, when it moves from operating point P12 toward the minimum point P10).
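
The two cases of FIGS. 1B and 1C can be reproduced numerically. The following is a minimal sketch, assuming a hypothetical quadratic evaluation function `evaluate` with its minimum at 2.0 and a sine dither of amplitude 0.1; the function names and all numeric values are illustrative and do not come from the embodiment.

```python
import math

def evaluate(u):
    # Hypothetical evaluation function: downward-convex quadratic, minimum at u = 2.
    return (u - 2.0) ** 2

def phase_correlation(u0, amplitude=0.1, samples=200):
    """Average of dither * (evaluation - mean) over one dither period.

    Negative: the evaluation quantity moves in opposite phase to the dither.
    Positive: the evaluation quantity moves in phase with the dither.
    """
    dither = [amplitude * math.sin(2.0 * math.pi * k / samples)
              for k in range(samples)]
    values = [evaluate(u0 + d) for d in dither]
    mean = sum(values) / samples
    return sum(d * (v - mean) for d, v in zip(dither, values)) / samples

left_of_minimum = phase_correlation(1.0)    # operating point left of the minimum
right_of_minimum = phase_correlation(3.0)   # operating point right of the minimum
```

Under these assumptions the correlation is negative to the left of the minimum (opposite phase, as in FIG. 1B) and positive to the right of it (in phase, as in FIG. 1C).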

Therefore, when the manipulated variable is periodically increased and decreased in the minimum-search example of FIG. 1A, the evaluation quantity can be brought closer to the optimal value by decreasing the manipulated variable if the evaluation quantity varies in phase with it and by increasing the manipulated variable if the evaluation quantity varies in opposite phase. PID (Proportional-Integral-Derivative) control, which has been widely used as a control method for industrial plants, is a target-tracking control method that controls the manipulated variable so that the controlled variable follows a preset target value. In contrast, extremum control is an optimum-searching control method that optimizes the evaluation quantity, so it does not require, as PID control does, a process model expressing the relationship between the manipulated variable and the controlled variable of the controlled process to be created in advance. Because extremum control with this property can function effectively even for a controlled process whose target value cannot be set in advance, it has the potential to become widespread. At the same time, an extremum control system that implements extremum control can be realized with a relatively simple configuration, as shown in FIG. 2.

FIG. 2 is a diagram showing a basic configuration example of an extremum control system.
The extremum control system 9 of FIG. 2 includes a modulation dither signal output unit 11, a high-pass filter 12 (HPF), a demodulation dither signal output unit 13, a low-pass filter 14 (LPF), and an integrator 15. The configuration of the extremum control system 9 is thus of about the same complexity as a conventional PID controller. Like a PID controller, the extremum control system 9 can therefore be implemented easily using hardware such as a PLC (Programmable Logic Controller). An outline of the operation of the extremum control system 9 of FIG. 2 is given below, taking as an example the case where the minimum of the evaluation function is searched for as the optimal value.

First, the modulation dither signal output unit 11 applies a dither signal to force a change in the manipulated variable of the controlled process. For example, it applies a dither signal such as a sine wave to vary the manipulated variable periodically. Hereinafter, this operation is called modulation, and the dither signal used for modulation is called the modulation dither signal. The controlled variable changes in response to the change in the manipulated variable caused by this modulation. The controlled process acquires an evaluation quantity based on the controlled variable that changes in this way and feeds the acquired evaluation quantity back to the extremum control system 9.

In general, the controlled variable often responds to a change in the manipulated variable with some time delay, so the evaluation quantity acquired from the controlled variable also responds to a change in the manipulated variable with some time delay. The function of acquiring the evaluation quantity from the controlled variable need not be part of the controlled process. For example, it may be included in the extremum control system 9, or it may be implemented by another device interposed between the controlled process and the extremum control system 9.

Based on the evaluation quantity fed back in this way, the extremum control system 9 updates the manipulated variable so that the evaluation quantity approaches the extremum of the evaluation function. This presupposes that the evaluation function of the controlled process has a minimum, but as noted above, the evaluation function is unknown with respect to the manipulated variable, so its extremum is also unknown. The extremum control system 9 therefore observes, from the fed-back evaluation quantity signal, the magnitude and direction of the change in the evaluation quantity caused by modulation, and determines a new manipulated variable based on the observed magnitude and direction.

Specifically, the new manipulated variable is determined by the high-pass filter 12, the demodulation dither signal output unit 13, the low-pass filter 14, and the integrator 15, which have the following functions.

The high-pass filter 12 removes from the fed-back evaluation quantity signal the constant bias corresponding to the unknown minimum value. In other words, this processing always adjusts the unknown minimum value to zero, and it is preprocessing that allows the integrator 15, described later, to determine the direction (increase or decrease) in which to update the manipulated variable.
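
The bias removal can be pictured with a first-order discrete high-pass filter. This is only a sketch: the embodiment does not specify the filter order or its discretization, and the function name `high_pass`, the cutoff, and the sample period are hypothetical.

```python
import math

def high_pass(samples, dt, cutoff):
    """First-order discrete high-pass filter (RC form).

    `cutoff` is in rad/s and `dt` is the sample period in seconds.
    A constant input produces a zero output, which is the bias removal
    described in the text.
    """
    alpha = 1.0 / (1.0 + cutoff * dt)
    out = []
    prev_in = samples[0]
    prev_out = 0.0
    for x in samples:
        prev_out = alpha * (prev_out + x - prev_in)
        prev_in = x
        out.append(prev_out)
    return out

# Evaluation signal with a constant bias of 5 plus a dither-frequency ripple.
dt = 0.01
signal = [5.0 + math.sin(10.0 * k * dt) for k in range(3000)]
filtered = high_pass(signal, dt, cutoff=1.0)
```

With the cutoff well below the dither frequency, the ripple passes almost unchanged while the constant offset is removed.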

The demodulation dither signal output unit 13 applies a demodulation dither signal to the evaluation quantity signal adjusted in this way, thereby extracting, from the evaluation quantity that changed in response to the modulation of the manipulated variable, the component at the same frequency as the modulation dither signal. Hereinafter, this operation is called demodulation, and the dither signal used for demodulation is called the demodulation dither signal. The role of demodulation is as follows.

An evaluation function that is unknown with respect to the manipulated variable may contain nonlinear elements. In that case, the evaluation function is assumed to be a nonlinear function that is convex downward (convex upward when searching for a maximum). Because of such nonlinear elements, harmonic and subharmonic components related to the frequency ω of the modulation dither signal are likely to appear in the evaluation quantity. Demodulation is processing that removes the influence of these harmonics and subharmonics. Through demodulation, the component at the same frequency ω as the modulation dither signal that varied the evaluation quantity is extracted from the components contained in the evaluation quantity signal.

The demodulated evaluation quantity signal is input to the low-pass filter 14, which extracts its steady component (low-frequency component). Specifically, the steady component is considered to indicate the first derivative of the evaluation function (hereinafter "Jacobian") and to represent the direction (increase or decrease) of the change in the evaluation quantity caused by modulation.
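
Why the steady component carries the first derivative can be seen from a short Taylor-expansion argument. The derivation below is a sketch under stated assumptions: a static evaluation function J of the manipulated variable u, a sine dither of amplitude a, an ideal high-pass filter that removes only the constant term, and multiplication by sin ωt for demodulation. The symbols J, u, and a are introduced here for illustration; only ω appears in the text.

```latex
% Assumed symbols: J = evaluation function, u = manipulated variable,
% a = dither amplitude, \omega = dither frequency.
\begin{align}
J(u + a\sin\omega t) &\approx J(u) + a\,J'(u)\sin\omega t
    + \frac{a^{2}}{2}\,J''(u)\sin^{2}\omega t \\
y_{\mathrm{HPF}}(t) &\approx a\,J'(u)\sin\omega t
    - \frac{a^{2}}{4}\,J''(u)\cos 2\omega t \\
y_{\mathrm{HPF}}(t)\,\sin\omega t &\approx \frac{a}{2}\,J'(u)
    - \frac{a}{2}\,J'(u)\cos 2\omega t
    - \frac{a^{2}}{8}\,J''(u)\left(\sin 3\omega t - \sin\omega t\right)
\end{align}
```

Only the first term of the last line is constant in time, so a low-pass filter leaves approximately (a/2)J'(u), whose sign gives the direction of change and whose magnitude is proportional to the slope.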

The integrator 15 integrates the steady component extracted by the low-pass filter 14. Based on the integrated value of the steady component, the integrator 15 functions as an estimator of the direction in which the manipulated variable should be moved to bring the evaluation quantity closer to the minimum (hereinafter "search direction"). This way of estimating the search direction is generally called the gradient method and is one of the basic methods for estimating the search direction in adaptive control systems.

Specifically, the integrator 15 estimates the gradient of the evaluation function from the integrated value of the steady component and, based on the estimated gradient, adjusts the search direction of the manipulated variable and the amount by which the manipulated variable is moved in that direction. The manipulated variable adjusted in this way is modulated by the modulation dither signal and input to the controlled process.
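
For the minimum search, the integrator's role can be written as a gradient-method update. This is a sketch in assumed notation: the search point, the low-pass filter output, and the integral gain are given symbols here that do not appear in the text.

```latex
% Assumed symbols: \hat{u} = search point held by the integrator,
% \hat{g} = steady component from the low-pass filter, K = integral gain.
\begin{equation}
\hat{u}(t) = \hat{u}(0) - K \int_{0}^{t} \hat{g}(\tau)\,d\tau, \qquad K > 0
\end{equation}
```

A positive gradient estimate lowers the search point and a negative one raises it, and a larger gradient magnitude or a larger K moves it faster.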

The configuration has been described here for the case where the extremum control system 9 searches for a minimum; to search for a maximum, the sign of the gradient estimated by the integrator 15 need only be inverted. In addition, because an integrator generally has low-pass characteristics, the extremum control system 9 need not include the low-pass filter 14 if the integrator 15 has sufficient low-pass characteristics.
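
The whole loop of FIG. 2 can be put together in a short simulation. The following is a minimal sketch, assuming a static evaluation function with no process lag, first-order filters, and hand-picked values for the dither, cutoffs, and gain; the function name `run_extremum_control` and every parameter value are hypothetical and are not taken from the embodiment.

```python
import math

def run_extremum_control(evaluate, u0, search_max=False, steps=20000, dt=0.01,
                         amp=0.1, omega=10.0, hpf_cut=1.0, lpf_cut=1.0, gain=0.5):
    """Simulate the FIG. 2 loop on a static evaluation function (no process lag)."""
    sign = 1.0 if search_max else -1.0   # invert the gradient sign for maximum search
    alpha = 1.0 / (1.0 + hpf_cut * dt)   # first-order high-pass coefficient
    u_hat = u0                           # integrator state: current search point
    y_prev = evaluate(u0)
    hp = 0.0                             # high-pass filter state
    lp = 0.0                             # low-pass filter state (gradient estimate)
    for k in range(steps):
        dither = math.sin(omega * k * dt)
        y = evaluate(u_hat + amp * dither)   # modulation, then evaluation
        hp = alpha * (hp + y - y_prev)       # high-pass filter 12: remove the bias
        y_prev = y
        demod = hp * dither                  # demodulation with the same dither
        lp += dt * lpf_cut * (demod - lp)    # low-pass filter 14: steady component
        u_hat += sign * dt * gain * lp       # integrator 15: move the search point
    return u_hat

# Minimum search on a hypothetical quadratic with its minimum at u = 2.
u_min = run_extremum_control(lambda u: (u - 2.0) ** 2, 0.0)
# Maximum search on the mirrored function, with the gradient sign inverted.
u_max = run_extremum_control(lambda u: -(u - 2.0) ** 2, 4.0, search_max=True)
```

The fixed `gain` is the quantity that the parameter adjustment of the embodiment makes adaptive. In this sketch it is a constant, so the convergence speed depends on the curvature of the chosen evaluation function.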

Because the extremum control system 9 realized by this configuration is of about the same complexity as the PID control systems common in conventional process control, it can be implemented easily using hardware such as a PLC (Programmable Logic Controller), just like a PID control system.

The basic configuration of an extremum control system has been described above. Such a conventional extremum control system, however, cannot always achieve extremum control that adapts to the dynamics of the controlled process. The configuration of the extremum control system of the embodiment, which can solve this problem, is described in detail below.

[Details of the embodiment]
FIG. 3 is a diagram showing a configuration example of the extremum control system 1 of the first embodiment.
The plant P shown in FIG. 3 is an example of a means for realizing the controlled process and is, for example, a water treatment plant that realizes a biological wastewater treatment process. The plant P includes various process equipment that realizes the controlled process and operates that equipment based on the manipulated variable input from the extremum control system 1. The plant P also includes various measuring instruments that measure the controlled variable of the controlled process, and it outputs information indicating the measured values (hereinafter "measurement information") to the extremum control system 1. Based on the measurement information acquired from the plant P, the extremum control system 1 updates the manipulated variable in the direction (search direction) that brings the evaluation quantity of the controlled process closer to the optimal value.

These basic operations of extremum control are realized because the extremum control system 1 of the embodiment includes the same modulation dither signal output unit 11, high-pass filter 12, demodulation dither signal output unit 13, low-pass filter 14, and integrator 15 as the conventional extremum control system 9. The extremum control system 1 of the embodiment differs from the conventional extremum control system 9 in that it includes a parameter adjustment unit 2 that adjusts the extremum control parameters based on the evaluation quantity signal.

For example, the extremum control system 1 includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus, and executes an extremum control program. By executing the extremum control program, the extremum control system 1 functions as a device or system including the functional units described above. All or some of the functions of the extremum control system 1 may be implemented using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The program may also be transmitted over a telecommunication line.

The parameter adjustment unit 2 has the function of adjusting the extremum control parameters adaptively to the dynamics of the controlled process. Specifically, it adaptively adjusts the integral gain of the integrator 15 based on the estimated gradient of the evaluation function, which changes from moment to moment because of the dynamics.
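
The effect of such a gain adjustment can be illustrated with a single integrator step in which the Jacobian estimate is divided by a signal that is kept away from zero. This is only a sketch: the clamp in `regularize`, the function names, and the numeric values are hypothetical, and this passage does not state how the embodiment generates its regularization signal.

```python
def regularize(signal, floor=1e-3):
    """Hypothetical regularization: magnitude of the signal, never below `floor`."""
    return max(abs(signal), floor)

def integrator_step(u, jacobian_est, hessian_est, gain=1.0, dt=0.1):
    """One minimum-search update with the Jacobian estimate divided by the
    regularized Hessian estimate, so the effective integral gain follows
    the curvature of the evaluation function."""
    return u - dt * gain * jacobian_est / regularize(hessian_est)

# Two hypothetical evaluation functions c * (u - 2)^2 with very different curvature.
# At u = 0 the exact Jacobian is 2c(u - 2) and the exact Hessian is 2c.
flat = integrator_step(0.0, jacobian_est=2 * 0.5 * (0.0 - 2.0), hessian_est=2 * 0.5)
steep = integrator_step(0.0, jacobian_est=2 * 50.0 * (0.0 - 2.0), hessian_est=2 * 50.0)
```

With exact derivative values, both cases take the same step toward the minimum even though the slopes differ by a factor of 100, and a zero Hessian estimate does not cause a division by zero.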

FIG. 4 is a diagram showing specific examples of methods for adjusting the extremum control parameters in the first embodiment. Specifically, FIG. 4 reproduces the adjustment methods described in Patent Document 1.
As described in No. 5 of FIG. 4, the parameter adjustment unit 2 in the present embodiment estimates the second derivative of the evaluation function (hereinafter referred to as the "Hessian") based on the evaluation quantity fed back from the controlled process, and determines a new integral gain using the estimated Hessian value. To adjust the integral gain in this way, the parameter adjustment unit 2 includes a first multiplier 21, a gradient estimation unit 22, a regularization signal output unit 23, and a second multiplier 24.

The first multiplier 21 multiplies the evaluation quantity signal input from the high-pass filter 12 by the dither signal (more precisely, by its squared signal) and outputs the result to the gradient estimation unit 22. The gradient estimation unit 22 extracts the Hessian signal H(t) from the output signal of the first multiplier 21 and outputs it to the regularization signal output unit 23. In this case, for example, the gradient estimation unit 22 can estimate derivatives of order 0 or higher of the evaluation function by using the method described in Non-Patent Document 1. That is, the gradient estimation unit 22 can function as a first gradient estimation unit that estimates the Jacobian of the evaluation function, or as a second gradient estimation unit that estimates the Hessian of the evaluation function. Specifically, Non-Patent Document 1 describes a configuration that uses a low-pass filter to estimate derivatives of order 0 or higher of the evaluation function. The basic idea is as follows.

In general, the manipulated variable may contain harmonic and subharmonic components, but when the dither signal is a sine wave, the modulated manipulated variable also varies sinusoidally at approximately the same frequency as the dither signal. It is therefore assumed that the manipulated variable U varies sinusoidally as U(t) = U0 + a × sin ωt, and that the evaluation quantity, which changes according to the manipulated variable, is expressed by the evaluation function J shown in equation (1).

Equation (1) defines the evaluation quantity J(t) as an (unknown) function f of the manipulated variable U(t). Strictly speaking, when the dynamics of the plant are taken into account, f should be treated as an operator of a dynamic system rather than as a function. However, when the frequency ω of the dither signal is low enough that the resulting variation is sufficiently slow compared with the plant dynamics, f can be regarded approximately as a function. In the present embodiment, f is regarded as a function under this premise. Taylor expansion of equation (1) yields equation (2).

Here, D^k f (k is an integer of 1 or more) denotes the k-th derivative of the function f with respect to U. Multiplying equation (2) by sin^n ωt (n is an integer of 1 or more) yields equation (3). Furthermore, applying periodic averaging (or time integration) to equation (3) leaves only the components associated with sin^n ωt, owing to the orthogonality of sine waves, and yields equation (4).

Here, the amplitude a of the dither signal and the exponent n are constants. Assuming in addition that the value of the n-th derivative D^n f does not change significantly within one control period, equation (4) can be expressed as equations (5) and (6). Solving equation (5) for the derivative then gives equation (7), which expresses the n-th derivative D^n f.
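Equations (1) to (7) are not reproduced in this text. As a sketch of the reasoning only, under the assumptions stated above (sinusoidal U(t), f treated as a static function, derivatives evaluated at U_0 and roughly constant over one dither period), the two lowest-order cases can be worked out as follows. The angle brackets, denoting the average over one dither period, are notation introduced here and may differ from the notation of the original equations.

```latex
J(t) = f(U_0 + a\sin\omega t)
     = f(U_0) + \sum_{k \ge 1} \frac{a^{k}}{k!}\, D^{k}f(U_0)\, \sin^{k}\omega t

\langle \sin^{2}\omega t \rangle = \tfrac{1}{2}, \qquad
\langle \sin^{4}\omega t \rangle = \tfrac{3}{8}, \qquad
\langle \sin^{2m+1}\omega t \rangle = 0

% n = 1 (Jacobian)
\langle J \sin\omega t \rangle = \frac{a}{2}\, Df(U_0) + O(a^{3})
\;\Longrightarrow\;
Df(U_0) \approx \frac{2}{a}\, \langle J \sin\omega t \rangle

% n = 2 (Hessian)
\langle J \sin^{2}\omega t \rangle = \frac{1}{2} f(U_0) + \frac{3a^{2}}{16}\, D^{2}f(U_0) + O(a^{4}), \qquad
\langle J \rangle = f(U_0) + \frac{a^{2}}{4}\, D^{2}f(U_0) + O(a^{4})
\;\Longrightarrow\;
D^{2}f(U_0) \approx \frac{16\,\langle J \sin^{2}\omega t \rangle - 8\,\langle J \rangle}{a^{2}}
```

The factors 2/a, 16, 8, and 1/a^2 obtained this way agree with the estimator structures described for FIG. 5 and FIG. 6.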

FIG. 5 and FIG. 6 are diagrams showing configuration examples of the gradient estimator expressed by equation (7).
Specifically, FIG. 5 shows a configuration example of a Jacobian estimator (the case n = 1). In this configuration, the evaluation quantity signal J(t) multiplied by the dither signal sin ωt is processed by a low-pass filter and then multiplied by 2/a to obtain the Jacobian of the evaluation function. This is equivalent to defining a new integral gain KImod = KI × (a/2) for the integrator 15 having the integral gain KI. The conventional basic extremum control system shown in FIG. 2 can therefore be regarded as a configuration in which the low-pass filter 14 estimates the Jacobian of the evaluation function.
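The Jacobian estimator described for FIG. 5 can be sketched numerically as follows. This is a minimal illustration, not the patented implementation: the low-pass filter is replaced by an exact average over one dither period, the plant is assumed to be a static function, and the names `estimate_jacobian`, `f`, and `n_samples` are hypothetical.

```python
import math


def estimate_jacobian(f, u0, a, n_samples=64):
    """Estimate df/du at u0 by demodulating J(t) with sin(wt) (assumed static plant)."""
    acc = 0.0
    for i in range(n_samples):
        s = math.sin(2.0 * math.pi * i / n_samples)  # dither sin(wt) over one period
        acc += f(u0 + a * s) * s                     # demodulation: J(t) * sin(wt)
    mean = acc / n_samples                           # period average stands in for the low-pass filter
    return (2.0 / a) * mean                          # scale by 2/a as described for FIG. 5


def f(u):
    # Hypothetical evaluation function with its minimum at u = 3.
    return (u - 3.0) ** 2


jacobian_estimate = estimate_jacobian(f, u0=1.0, a=0.1)
print(jacobian_estimate)  # close to the true derivative 2 * (1 - 3) = -4
```

For a quadratic evaluation function the estimate is exact up to rounding. For other functions it carries an error of order a^2, which is one reason to keep the dither amplitude small.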

FIG. 6, on the other hand, shows a configuration example of a Hessian estimator (the case n = 2). In this configuration, a first signal is obtained by multiplying the evaluation quantity signal J(t) by the squared dither signal sin^2 ωt, processing the product with a low-pass filter, and multiplying the result by 16. A second signal is obtained by processing the evaluation quantity signal J(t) with a low-pass filter and multiplying the result by 8. The second signal is subtracted from the first signal, and the difference is multiplied by 1/a^2 to obtain the Hessian of the evaluation function.
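The Hessian estimator described for FIG. 6 can be sketched in the same way. As before, this is only an illustration under stated assumptions: a static plant, an exact period average in place of the low-pass filters, and hypothetical names (`estimate_hessian`, `f`, `n_samples`).

```python
import math


def estimate_hessian(f, u0, a, n_samples=64):
    """Estimate d2f/du2 at u0 from J(t)*sin^2(wt) and J(t) (assumed static plant)."""
    acc_sq = 0.0
    acc = 0.0
    for i in range(n_samples):
        s = math.sin(2.0 * math.pi * i / n_samples)  # dither sin(wt) over one period
        j = f(u0 + a * s)                            # evaluation quantity J(t)
        acc_sq += j * s * s                          # J(t) * sin^2(wt)
        acc += j                                     # J(t) itself
    first = 16.0 * acc_sq / n_samples                # first signal: averaged, then x16
    second = 8.0 * acc / n_samples                   # second signal: averaged, then x8
    return (first - second) / (a * a)                # subtract, then multiply by 1/a^2


def f(u):
    # Hypothetical evaluation function with constant second derivative 2.
    return (u - 3.0) ** 2


hessian_estimate = estimate_hessian(f, u0=1.0, a=0.1)
print(hessian_estimate)  # close to 2
```

Note that the two signals are nearly equal and their small difference is divided by a^2, so in practice this estimate is far more sensitive to measurement noise than the Jacobian estimate.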

The parameter adjustment unit 2 could adjust the integral gain by using the Hessian of the evaluation function estimated in this way as it is, but in that case the extremum control may become unstable for the reasons described later. In the extremum control system 1 of the present embodiment, therefore, the Jacobian estimated by the low-pass filter 14 is regularized based on the estimated Hessian value, and the regularized Jacobian signal is supplied to the integrator 15. This allows the extremum control system 1 of the present embodiment to update the integral gain adaptively while avoiding destabilization of the extremum control.

Specifically, the regularization signal output unit 23 generates a signal for regularizing the Jacobian signal output by the gradient estimation unit 22 (hereinafter referred to as the "regularization signal") and outputs it to the second multiplier 24. The second multiplier 24 receives the Jacobian signal G(t) from the low-pass filter 14 and the regularization signal from the regularization signal output unit 23, and regularizes the Jacobian signal by multiplying it by the regularization signal. The second multiplier 24 supplies the regularized Jacobian signal Gn(t) to the integrator 15.

[First method of generating a regularization signal]
In general, "regularization" of a signal means avoiding an ill-conditioned situation in which an inverse operation on the signal cannot be carried out because the inverse does not exist. Division by zero is a typical example of such an ill condition.

As shown in FIG. 4, the extremum control system 1 of the present embodiment adaptively updates the integral gain of the integrator 15 using the Hessian of the evaluation function. This can equivalently be replaced by executing extremum control with the integral gain of each control period fixed to a constant value while the Jacobian of the evaluation function is divided by (normalized by) the Hessian. In other words, the configuration of the extremum control system 1 in the present embodiment can be regarded as the basic configuration shown in FIG. 2 with an added configuration that normalizes the Jacobian by the Hessian.

In the present embodiment, therefore, "regularization" is defined as avoiding ill conditions such as division by zero in the "normalization" of the Jacobian signal, and a signal that acts on the Jacobian signal to realize such regularization is generated as the regularization signal. Specifically, the regularization signal output unit 23 generates, as the regularization signal, a signal that realizes a signal conversion (⇔) satisfying each of the following conditions.

[Condition 1] G(t) = 0 ⇔ Gn(t) = 0
[Condition 2] G(t) is positive (negative) ⇔ Gn(t) is positive (negative)
[Condition 3] G(t) < ∞ ⇔ Gn(t) < ∞
[Condition 4] G(t) → ∞ ⇔ Gn(t) → k (0 < k < ∞)

G(t) denotes the Jacobian signal, and Gn(t) denotes the Jacobian signal regularized (that is, divided by the Hessian) through the action of the regularization signal. [Condition 1] states that Gn(t) is 0 if and only if G(t) is 0. [Condition 2] states that G(t) and Gn(t) have the same sign. [Condition 3] states that Gn(t) is finite whenever G(t) is finite (that is, no division by zero occurs). [Condition 4] states that when G(t) diverges to ∞, Gn(t) does not diverge but converges to some positive finite value. A regularization signal with these properties is expressed, for example, by equation (8).

In equation (8), δ is a positive constant (δ > 0), the so-called regularization constant, and H(t) denotes the estimated Hessian of the evaluation function. The regularization signal expressed by equation (8) can be generated by taking the absolute value of the Hessian signal and adding the regularization constant δ to it, so the stability of the extremum control can be improved without enlarging the device.
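Equation (8) itself is not reproduced in this text. The sketch below assumes the form Gn = G / (|H| + δ), which is consistent with the description above (absolute value of the Hessian signal plus the regularization constant δ, applied so that the Jacobian is divided by the Hessian). The function name `regularize_with_hessian` and the numeric values are hypothetical.

```python
def regularize_with_hessian(g, h, delta):
    """Return the regularized Jacobian Gn = G / (|H| + delta) (assumed form of eq. (8))."""
    if delta <= 0.0:
        raise ValueError("delta must be positive to rule out division by zero")
    return g / (abs(h) + delta)


# A vanishing Hessian estimate no longer causes division by zero.
gn_flat = regularize_with_hessian(0.8, 0.0, delta=0.01)

# A negative Hessian estimate does not flip the sign of the Jacobian signal.
gn_concave = regularize_with_hessian(0.8, -2.0, delta=0.01)

# A zero Jacobian stays zero (Condition 1).
gn_zero = regularize_with_hessian(0.0, 5.0, delta=0.01)
```

Taking the absolute value is what keeps the search direction determined by the sign of the Jacobian alone, which matters for the evaluation functions with negative Hessian near the extremum that the text discusses next.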

For example, when a minimum (local minimum) is searched for by extremum control, the extremum can be found stably not only for a downward-convex evaluation function of order M (M > 1), whose Hessian is positive near the extremum, but also for an upward-convex evaluation function of order M (0 < M < 1), whose Hessian is negative near the extremum.

The regularization signal generated using the Hessian does not necessarily have to be expressed by a first-power function of the Hessian as in equation (8). For example, the regularization signal may be expressed by a squared function of the Hessian as in equations (9) and (10), or by an M-th power (M = 1, 2, 3, ...) function of the Hessian as in equations (11) and (12).

FIG. 7 and FIG. 8 are diagrams showing configuration examples of an extremum control system that generates the regularization signal by the first method. Specifically, FIG. 7 shows a configuration example in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (8) is input to the integrator 15. FIG. 8 shows a configuration example in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (9) is input to the integrator 15.

[Second method of generating a regularization signal]
Because the first method uses the Hessian signal to generate the regularization signal, it requires some mechanism for estimating the Hessian of the evaluation function. For example, the extremum control system 1 shown in FIG. 7 and FIG. 8 requires the first multiplier 21 and the gradient estimation unit 22 (for example, a low-pass filter) to generate the Hessian signal from the evaluation quantity signal. The second method, by contrast, uses the Jacobian estimate in place of the Hessian estimate, and thereby generates the regularization signal without the first multiplier 21 and the gradient estimation unit 22. For example, the regularization signal generated by the second method is expressed by equation (13).

Here, G(t) denotes the Jacobian of the evaluation function estimated by the gradient estimation unit 22, or a quantity proportional to the Jacobian. The regularization signal generated by the second method does not necessarily have to be expressed by a first-power function of the Jacobian. As in the first method, it may be expressed by an M-th power (M = 1, 2, 3, ...) function of the Jacobian (for example, equations (14) and (15)).

FIG. 9 and FIG. 10 are diagrams showing configuration examples of an extremum control system that generates the regularization signal by the second method. Specifically, FIG. 9 shows a configuration example in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (13) is input to the integrator 15. FIG. 10 shows a configuration example in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (14) (the case M = 2) is input to the integrator 15. The second method keeps the regularization of the Jacobian signal from complicating the configuration of the extremum control system.
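As a rough check of the second method against Conditions 1 to 4, the sketch below assumes that the regularized Jacobian takes the form Gn = G / (|G| + δ). Equation (13) is not reproduced in this text, so this form is an assumption inferred from the surrounding description, and the names `regularize_with_jacobian` and `samples` are hypothetical.

```python
import math


def regularize_with_jacobian(g, delta):
    """Return Gn = G / (|G| + delta), an assumed form of the second method."""
    return g / (abs(g) + delta)


def sign(x):
    return (x > 0) - (x < 0)


delta = 0.1
samples = [-1e6, -3.0, -0.02, 0.02, 3.0, 1e6]
normalized = [regularize_with_jacobian(g, delta) for g in samples]

cond1 = regularize_with_jacobian(0.0, delta) == 0.0              # zero maps to zero
cond2 = all(sign(g) == sign(gn) for g, gn in zip(samples, normalized))  # sign is preserved
cond3 = all(math.isfinite(gn) for gn in normalized)              # finite in, finite out
cond4 = abs(regularize_with_jacobian(1e12, delta) - 1.0) < 1e-9  # saturates at k = 1
```

Only the Jacobian estimate is needed here, which reflects the point made above that the first multiplier 21 and the gradient estimation unit 22 can be omitted.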

[Third method of generating a regularization signal]
The third method simplifies the second method further by setting δ = 0 in equations (13) to (15). In this case, the regularization signal generated by the third method is expressed by equation (16).

In the first and second methods, δ (> 0) serves to avoid division by zero. If δ is set to 0, the regularization signal RS diverges to infinity when G(t) = 0, and the regularized Jacobian signal Gn(t) diverges as well. Such a signal does not satisfy Conditions 1 to 4 and therefore should not, strictly speaking, be used as a regularization signal. It can nevertheless be used as one when the regularization of the Jacobian signal is defined as in equation (17).

Equation (17) defines regularization of the Jacobian signal as dividing the Jacobian signal by the regularization signal RS of equation (16). Because this operation divides the Jacobian signal by its own absolute value, Gn(t) in equation (17) takes either the value -1 or the value +1. In other words, the regularization signal generated by the third method is a signal that acts on the Jacobian signal to extract its sign information (-1 or +1). When the regularization defined in equation (17) is performed, the signal expressed by equation (16) therefore functions, in principle, as a regularization signal.

Since the purpose of this regularization is to extract sign information, the actual division of G(t) by its absolute value |G(t)| is not necessarily required as long as the sign information of the Jacobian signal can be extracted. The regularization signal output unit 23 in this case may therefore be configured to calculate sgn(G(t)) directly. Such a configuration also avoids the occurrence of division by zero. Here, sgn(x) denotes a function that returns the sign of the value x.
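The difference between dividing by |G(t)| and computing sgn(G(t)) directly can be illustrated with a short sketch. The convention sgn(0) = 0 used here is an assumption, since the text only says that sgn(x) returns the sign of x, and the function names are hypothetical.

```python
def sgn(x):
    """Return -1, 0, or +1 without any division (sgn(0) = 0 is an assumed convention)."""
    return (x > 0) - (x < 0)


def sign_by_division(g):
    """Literal form of the third method: divide the Jacobian by its absolute value."""
    return g / abs(g)


# The literal division fails exactly at the extremum, where the Jacobian is zero.
try:
    sign_by_division(0.0)
    division_failed = False
except ZeroDivisionError:
    division_failed = True

direction_left = sgn(-3.2)    # Jacobian negative
direction_at_extremum = sgn(0.0)
direction_right = sgn(7.5)    # Jacobian positive
```

Away from G(t) = 0 the two forms agree, so replacing the division by a comparison changes nothing except the behavior at the singular point.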

FIG. 11 is a diagram showing a configuration example of an extremum control system that generates the regularization signal by the third method in the first embodiment.
The sign information extracted in this way is a signal that indicates only the direction (increase or decrease) in which the manipulated variable should be moved. Providing this regularization configuration therefore makes it possible to build a simple extremum control system such as the one shown in FIG. 11. In this case, the amount by which the manipulated variable changes in each control period is constant, so the amount of change may be adjusted to a desired value, for example by multiplying sgn(G(t)) by a coefficient.

Equation (16) is obtained by setting δ = 0 not only in equation (13) but also in equations (14) and (15). Equations (13) to (15) can therefore be regarded as adjusting the action of equation (16) with δ > 0 as a parameter. Indeed, regularization by sgn() gives only the direction in which the manipulated variable should be moved, so chattering may occur near the extremum if the parameters are not tuned appropriately. A regularization signal generated using equations (13) to (15), in which equation (16) is adjusted by δ > 0, is considered to act on the Jacobian signal so as to suppress chattering near the extremum.
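The chattering argument can be illustrated with a toy discrete-time search. Everything in this sketch is an assumption made for illustration: the evaluation function J(u) = (u - 2)^2, an exactly known Jacobian (no dither or filters), the update u ← u - K × Gn, the assumed form Gn = G / (|G| + δ) for the second method, and the numeric values of K and δ.

```python
def sgn(x):
    return (x > 0) - (x < 0)


def run_search(regularize, u0=0.05, gain=0.1, steps=200):
    """Iterate u <- u - gain * Gn for the hypothetical J(u) = (u - 2)^2."""
    u = u0
    trajectory = [u]
    for _ in range(steps):
        g = 2.0 * (u - 2.0)            # exact Jacobian of J, assumed known here
        u = u - gain * regularize(g)   # integrator update with the regularized Jacobian
        trajectory.append(u)
    return trajectory


# Third method: sign only, so every step has the same size and the search never settles.
sign_traj = run_search(lambda g: sgn(g))

# Second method (assumed form): the step shrinks near the extremum.
soft_traj = run_search(lambda g: g / (abs(g) + 0.5))

chatter_step = abs(sign_traj[-1] - sign_traj[-2])
soft_error = abs(soft_traj[-1] - 2.0)
print(chatter_step, soft_error)
```

In this toy setting the sign-based search keeps jumping by the full step K around u = 2, while the δ-regularized search converges. The δ-regularized update is itself only stable here because K is smaller than δ, which illustrates why the text calls for appropriate parameter tuning.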

The second multiplier 24 regularizes the Jacobian signal G(t) by multiplying it by the regularization signal generated in this way, and outputs the regularized Jacobian signal Gn(t) to the integrator 15. This regularization of the Jacobian signal by the parameter adjustment unit 2 is expressed by equation (18).

The parameter adjustment unit 2 adaptively updates the integral gain of the integrator 15 using the Jacobian signal regularized in this way. In general, in extremum control the direction in which the manipulated variable should be moved is given by the sign of the gradient (Jacobian) estimated for the evaluation function, which is an unknown function of the manipulated variable, and the amount of movement is adjusted by the integral gain. The following describes a method of adjusting the integral gain based on the concept of the average system described in Non-Patent Document 2 and elsewhere.

An average system is a system that describes the dynamic behavior of a system, averaged over one period, when a periodic input is applied to it. Average systems are commonly used, for example, in the stability analysis of extremum control systems. Non-Patent Document 2, for instance, gives the specific dynamics of the average system for an extremum control system applied to a static plant without dynamics. That average system is expressed by equations (19) and (20).

In equation (19), G(u) denotes the estimated Jacobian of the evaluation function, which is an unknown function of the manipulated variable u, and a denotes the amplitude of the dither signal. P denotes the power of the dither signal: P = 1/2 for a sine-wave dither signal, P = 1/3 for a triangular-wave dither signal, and P = 1 for a rectangular-wave dither signal. τ denotes time rescaled by the frequency ω of the dither signal (τ = ωt), and KI0 denotes the integral gain on the τ time axis. KI0 is converted into the integral gain KI on the real-time axis t by equation (20).
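The three values of P quoted above are the mean-square values of unit-amplitude waveforms, which can be confirmed numerically. The sketch below is only such a check. The waveform definitions and the name `mean_square` are introduced here for illustration.

```python
import math


def mean_square(wave, n_samples=100_000):
    """Average of wave(theta)^2 over one period, theta in [0, 2*pi)."""
    total = 0.0
    for i in range(n_samples):
        theta = 2.0 * math.pi * i / n_samples
        total += wave(theta) ** 2
    return total / n_samples


def sine(theta):
    return math.sin(theta)


def triangle(theta):
    # Unit-amplitude triangular wave with the same phase as sin(theta).
    return (2.0 / math.pi) * math.asin(math.sin(theta))


def square(theta):
    # Unit-amplitude rectangular wave.
    return 1.0 if math.sin(theta) >= 0.0 else -1.0


power_sine = mean_square(sine)
power_triangle = mean_square(triangle)
power_square = mean_square(square)
print(power_sine, power_triangle, power_square)  # approximately 1/2, 1/3, 1
```

Because P multiplies the integral gain in the average system, switching the dither waveform at a fixed amplitude changes the effective convergence speed unless the gain is rescaled accordingly.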

Furthermore, letting u* denote the equilibrium point of the manipulated variable u and taking the periodic average ũ = u − u* in equation (19) yields equation (21). Here, "ũ" denotes the symbol "u" with a tilde "~" placed directly above it.

The average system expressed by equations (19) and (21) describes the convergence dynamics of extremum control, that is, how quickly the evaluation quantity converges to the minimum (local minimum) in response to the oscillation of the manipulated variable caused by the dither signal. Non-Patent Document 2 assumes a static controlled process. However, when the period of the dither signal is set sufficiently longer than the time constant of the plant, that is, when the frequency ω of the dither signal is set sufficiently lower than the cutoff frequency of the controlled process, a controlled process with dynamics can be regarded approximately as a static controlled process. This is also supported by the singular perturbation theory used in the stability analysis of extremum control.

Even for an extremum control system with the basic configuration of FIG. 2, which includes a high-pass filter and a low-pass filter, the average system of equation (19) can characterize the overall behavior of the extremum control system, provided that the cutoff frequencies of these filters are set appropriately and the dominant (slowest) dynamics are those of the integrator 15. In such a case, the integral gain can be adjusted using equation (19).

Because the Jacobian G(u) of the evaluation function in equation (19) is often a nonlinear function of u, equation (19) is in general a nonlinear differential equation. The average system obtained by linearizing equation (19) with respect to u around a suitable operating point u0 can be expressed by equation (22).

Here, "û" denotes the symbol "u" with a hat "^" placed directly above it, and û = u − u0. H(u0) denotes the Jacobian of G(u), that is, the Hessian of the evaluation function. In the present embodiment, therefore, the convergence speed of the extremum control is adjusted by adaptively adjusting the integral gain KI0 so that it is proportional to the reciprocal of H(u0). In this adjustment of the integral gain, the regularization signal generated by the first method (see equations (8) to (12)) avoids division by zero by the Hessian estimate and also suppresses abrupt sign changes.
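The effect of making the integral gain proportional to the reciprocal of the Hessian can be illustrated with a toy averaged model. This sketch does not reproduce equation (19) or (22), which are not shown in this text. It assumes a generic linearized gradient flow dû/dτ = −K × H × û for a quadratic evaluation function, integrated with the explicit Euler method, and uses the exact Hessian where the embodiment would use an estimate. All names and numeric values are hypothetical.

```python
def final_deviation(h, gain_for, steps=500, dtau=0.01):
    """Integrate d(u_hat)/d(tau) = -K * H * u_hat with explicit Euler steps."""
    u_hat = 1.0                       # initial deviation from the operating point
    gain = gain_for(h)
    for _ in range(steps):
        jacobian = h * u_hat          # Jacobian of the quadratic J = 0.5 * H * u_hat^2
        u_hat -= dtau * gain * jacobian
    return u_hat


def fixed_gain(h):
    return 1.0


def hessian_normalized_gain(h, delta=1e-3):
    # Gain proportional to 1 / (|H| + delta); delta keeps the gain bounded by 1 / delta.
    return 1.0 / (abs(h) + delta)


flat_fixed = final_deviation(0.5, fixed_gain)
steep_fixed = final_deviation(50.0, fixed_gain)
flat_norm = final_deviation(0.5, hessian_normalized_gain)
steep_norm = final_deviation(50.0, hessian_normalized_gain)
print(flat_fixed, steep_fixed, flat_norm, steep_norm)
```

With a fixed gain the flat and steep evaluation functions converge at very different speeds, whereas with the Hessian-normalized gain the remaining deviation after the same time is nearly identical, so the convergence speed is set by the constant part of the gain alone.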

In addition, the regularization signal of the first method takes the time-varying Hessian estimate out of the formula for calculating the integral gain and defines it as a separate signal. Moving the factor that makes the integral gain variable out of the formula in this way removes the 1/H(u0) term from the formula, so the integral gain KI0 can be tuned as a fixed value. The small constant δ (> 0) in the defining equation of the regularization signal serves to avoid division by zero by the Hessian. When the Hessian estimate becomes 0, KI0 takes its maximum value (1/δ times the constant term). The integral gain can therefore be adjusted by choosing δ based on the maximum value assumed for KI0.

Equation (22) is a linear approximation of the nonlinear equation (19). A more direct way of suppressing the influence of G(u), the nonlinear element of equation (19), is to eliminate G(u) altogether. The second and third methods described above generate regularization signals along this line of thinking. The integral gain KI0 in equation (19) is a tuning parameter that the designer can choose freely. It can therefore be regarded as a kind of manipulated variable that transforms the differential equation (19), and the nonlinear element in equation (19) can be eliminated by newly defining an integral gain KI0' that satisfies equation (23).

However, applying equation (23) removes all information about the Jacobian from equation (19), so the information needed to determine the direction in which the manipulated variable should be moved is lost. Taking the absolute value of the Jacobian as in equation (24), instead of using equation (23), leaves the sign information of the Jacobian in equation (19). In this case, equation (19) is expressed as equation (25).

This eliminates the nonlinearity of the Jacobian while retaining the information needed to determine the search direction of the manipulated variable. In other words, the regularization signal of the third method is a signal for taking the nonlinearity of the Jacobian out of the formula for calculating the integral gain. Because equation (25) contains only the sign information of the Jacobian, tuning the new integral gain KI0' is simple. Given that the transient behavior of extremum control based on equation (25) is linear in time, KI0' may be tuned with reference to the adjustment method shown in FIG. 5.

The regularization of the third method has a very simple form, but because it retains only the sign information of the Jacobian, chattering may occur near the extremum. The regularization of the second method can be regarded as the third method with a small constant δ > 0 introduced to avoid this chattering. In this case, if the integral gain is considered to be converted as in equation (26), the average system corresponding to equation (25) is expressed by equation (27).

When the value of the Jacobian is large, the influence of δ is small, so the average system of equation (27) behaves like the average system of equation (25). When the value of the Jacobian is small, the influence of δ is large, so the average system of equation (27) moves in proportion to the Jacobian. Therefore, in regularization by the second method as well, the integral gain KI0' can be adjusted on basically the same principle as for equation (25), simply by setting δ to prevent chattering near the extremum.
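The two regimes described above can be checked numerically. The following is a minimal sketch, assuming the regularized Jacobian takes the form G / (|G| + δ); this form and the function name `regularize` are assumptions for illustration, not the literal equations (26) and (27).

```python
def regularize(g: float, delta: float = 0.0) -> float:
    """Divide a Jacobian estimate g by (|g| + delta).

    delta = 0 reproduces the pure sign signal (third method);
    delta > 0 is the assumed form of the second method.
    """
    denom = abs(g) + delta
    if denom == 0.0:
        return 0.0  # g == 0 and delta == 0: treat sgn(0) as 0
    return g / denom


if __name__ == "__main__":
    for g in (100.0, 1.0, 0.01, 0.0, -0.01):
        # Columns: Jacobian, sign-only result, result with delta = 0.1
        print(g, regularize(g), regularize(g, 0.1))
```

For large |G| the output approaches ±1 (the sign signal), while for |G| much smaller than δ it is approximately G / δ, that is, proportional to the Jacobian.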

[Fourth method of generating a regularization signal]
The regularization signal of the fourth method is a signal represented by an approximate sign estimate, obtained by approximating the sign estimate of the Jacobian signal with a continuous function.
Regularizing the Jacobian signal with the regularization signal generated by the fourth method yields a signal that approximates, by a smooth continuous function, the Jacobian signal regularized by the third method (the sign signal), and that corresponds to a generalization of the Jacobian signal regularized by the second method.

Setting δ = 0 in the second method gives the regularization signal of the third method. In other words, introducing a small constant δ > 0 into the third method gives the regularization signal of the second method. However, the approximation functions that approximate the sign function (the regularization signal of the third method) while satisfying conditions 1 to 4 for a regularization signal described above are not limited to the regularization signal of the second method.

For example, any function that approximates the sign function by a continuous function, or by a smooth continuous function, satisfies the definition of a regularization signal given by conditions 1 to 4. Many such (smooth) approximations of the sign function exist. They are not limited to the approximation given by the regularization signal of the second method; for example, the following approximation functions are conceivable.

A. Saturation function

Here, m(G(t)) is a strictly monotonically increasing function of G(t) satisfying m(0) = 0. A typical example is a power function of G(t) such as −sgn(G(t))/α·|G(t)|^ρ, with α > 0 and ρ > 0; for example, when ρ = 1, this corresponds to the gradient G(t) clipped at ±1.

B. Sigmoid function (hyperbolic tangent)

C. Arctangent

In addition to examples A to C above, functions included in the broad class of sigmoid functions, such as the cumulative normal distribution function, the Gompertz function, and the Gudermannian function, may be used after being translated so that they are centered on the origin and rescaled so that their range is ±1. Alternatively, take the step response of a stable transfer function that exhibits neither overshoot nor oscillation (e.g., a higher-order lag system), such as 1 − exp(−t) for a first-order lag system; replace the time t with G(t), giving 1 − exp(−G(t)) in the first-order case; and mirror it about the origin so that the result is point-symmetric, giving sgn(G(t))(1 − exp(−|G(t)|)) in the first-order case. Such a function also acts as a smooth approximation of the sign function and can therefore be used as a regularization signal.
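The candidate approximations can be sketched as follows. The exact expressions of examples A to C are not reproduced in the text, so the forms below (a clipped power function, tanh(αG), and (2/π)·arctan(αG)) are assumptions, as are all function names. The leading sign of the power function in example A is treated here as absorbed into the integral gain, which is also an assumption.

```python
import math


def sgn(g: float) -> int:
    """Sign function with sgn(0) = 0."""
    return (g > 0) - (g < 0)


def saturation(g: float, alpha: float = 1.0, rho: float = 1.0) -> float:
    """A: power function sgn(g)|g|^rho / alpha, clipped to [-1, 1]."""
    m = sgn(g) * abs(g) ** rho / alpha
    return max(-1.0, min(1.0, m))


def sigmoid_tanh(g: float, alpha: float = 1.0) -> float:
    """B: hyperbolic tangent, range (-1, 1)."""
    return math.tanh(alpha * g)


def arctangent(g: float, alpha: float = 1.0) -> float:
    """C: arctangent rescaled from (-pi/2, pi/2) to (-1, 1)."""
    return (2.0 / math.pi) * math.atan(alpha * g)


def folded_first_order(g: float) -> float:
    """First-order-lag step response mirrored about the origin."""
    return sgn(g) * (1.0 - math.exp(-abs(g)))


if __name__ == "__main__":
    for g in (-5.0, -0.5, 0.0, 0.5, 5.0):
        print(g, saturation(g), sigmoid_tanh(g), arctangent(g),
              folded_first_order(g))
```

All four functions pass through the origin, are odd, and stay within ±1, so each keeps the sign of G(t) while limiting its magnitude.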

FIG. 12 is a diagram for explaining an example of the regularization signal generated by the fourth method in the first embodiment. FIG. 12 illustrates the relationship between the gradient (Jacobian signal) G(t) and the regularized Jacobian signal Gn(t); the origin, G(t) = Gn(t) = 0, corresponds to the extremum sought by the extremum search.

Approximating the Jacobian signal regularized by the third method (the sign function) with a continuous function amounts to adjusting the behavior of the regularization signal near the extremum. That is, if a saturation function that simply clips the gradient at ±1 is applied, conventional gradient-type extremum control is applied as-is near the extremum. In conventional extremum control, when the evaluation function is exactly quadratic in the manipulated variable, its derivative, the gradient, is a first-order (linear) function. Applying the gradient method then gives the convergence characteristic of a linear system, that is, exponential convergence, near the extremum. Such convergence is generally considered desirable.

Accordingly, if the gradient of the evaluation function is proportional to a power of the manipulated variable u, as in G(t) ∝ u^n (so that the evaluation function is proportional to u^(n+1)), then setting Gn(t) = G(t)^(1/n) gives Gn(t) ∝ u, and the search can be made to converge exponentially near the extremum. In practice, the shape of the evaluation function near the extremum is unknown, so n cannot be obtained theoretically. However, by selecting the approximation function of the sign function and adjusting its parameters while observing the behavior near the extremum, the behavior near the extremum can be fine-tuned.
For example, as shown in FIG. 12, changing the value of α in the sigmoid function changes the value of G(t) at which the absolute value of the regularization signal Gn(t) falls below 1, so the range A that counts as the "neighborhood" of the extremum can itself be adjusted.
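The effect of Gn(t) = G(t)^(1/n) can be sketched with a toy simulation. The example assumes a gradient of exactly G = u^n with odd n (here n = 3), uses the exact gradient instead of a dither-based estimate, and integrates with a forward-Euler step; the names `gain` and `dt` are illustrative.

```python
def gradient(u: float, n: int = 3) -> float:
    """Assumed gradient shape G = u**n with odd n."""
    return u ** n


def linearize(g: float, n: int = 3) -> float:
    """Gn = sgn(G) * |G|**(1/n), so that Gn is proportional to u."""
    s = (g > 0) - (g < 0)
    return s * abs(g) ** (1.0 / n)


def simulate(u0: float, gain: float = 0.5, dt: float = 0.1,
             steps: int = 50, n: int = 3) -> list:
    """Forward-Euler integration of du/dt = -gain * Gn."""
    us = [u0]
    for _ in range(steps):
        u = us[-1]
        us.append(u - dt * gain * linearize(gradient(u, n), n))
    return us


if __name__ == "__main__":
    trajectory = simulate(2.0)
    ratios = [b / a for a, b in zip(trajectory, trajectory[1:])]
    # A constant ratio between successive samples indicates
    # geometric (exponential) decay toward the extremum at u = 0.
    print(min(ratios), max(ratios))
```

Both printed ratios are close to 1 − gain·dt, which is the discrete-time signature of exponential convergence.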

FIG. 13 is a diagram for explaining another example of the regularization signal generated by the fourth method in the first embodiment. Like FIG. 12, FIG. 13 illustrates the relationship between the Jacobian signal G(t) and the regularized Jacobian signal Gn(t); the origin, G(t) = Gn(t) = 0, corresponds to the extremum sought by the extremum search.

For example, as shown in FIG. 13, switching between an upward-convex continuous function and a downward-convex continuous function in the neighborhood B of the extremum makes it possible to suppress chattering near the extremum, or to mitigate the phenomenon in which extremum search control stalls near the extremum without converging to it.

If a regularized Jacobian signal Gn(t) with an upward-convex shape (for example, α = 3 in the sigmoid function) is selected, Gn(t) approaches a pure sign function. Selecting a function of this shape when the converged value does not reach the extremum therefore allows the search to come closer to the true extremum. Note, however, that if the upward convexity is too strong (too close to the sign function), chattering may occur.
Conversely, selecting a regularization signal Gn(t) with a downward-convex shape (for example, α = 10, ρ = 1.5 in the saturation function) has the effect of suppressing chattering when it occurs near the extremum. Note that as the downward convexity weakens, the extremum search performance deteriorates.

FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using a regularization signal in the first embodiment. FIGS. 14A to 14D illustrate simulated responses of the manipulated variable when various regularization signals are applied to a virtual control target consisting of a first-order lag system followed by the function y = u^5.
The example shown in FIG. 14A is a comparative example to which no regularization signal is applied, tuned so that it operates well when the initial value is 2. If the initial value is then changed to 3, the extremum control diverges. This is because the first derivative of y = u^5 is y' = 5u^4: when the initial value is 2, y' = 5·2^4 = 80, whereas when the initial value is 3, y' = 5·3^4 = 405, so the gradient changes greatly.
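The sensitivity to the initial value can be made concrete with a small calculation. This sketch evaluates the gradient of y = u^5, namely y' = 5u^4, and takes one integrator step with a fixed gain. The first-order lag and the gradient estimator of the simulated system are ignored, and the combined gain-times-step value `gain_dt` is a hypothetical tuning chosen for illustration.

```python
def gradient(u: float) -> float:
    """Exact derivative of y = u**5."""
    return 5.0 * u ** 4


def integrator_step(u: float, gain_dt: float = 0.02) -> float:
    """One gradient-integrator update with a fixed gain (hypothetical)."""
    return u - gain_dt * gradient(u)


if __name__ == "__main__":
    for u0 in (2.0, 3.0):
        # Columns: initial value, gradient there, value after one step
        print(u0, gradient(u0), integrator_step(u0))
```

With a gain that produces a moderate first step from the initial value 2, the first step from the initial value 3 is about five times larger and overshoots far past the origin, which illustrates why a fixed gain tuned at one operating point can fail at another.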

Moreover, even with the initial value set to 2, the manipulated variable did not converge to the extremum (= 1) within the predetermined time. This is because y' rapidly approaches 0 as u approaches 0; that is, in the virtual control target, the gradient is almost flat near the extremum.

FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the sign signal of the gradient as the regularization signal in the first embodiment.
In contrast to the comparative example shown in FIG. 14A, the examples shown in FIGS. 14B to 14D apply a regularization signal. The example shown in FIG. 14B applies the sign function, which is the regularization signal of the third method. Here the extremum can be searched for at a similar convergence speed whether the initial value is 2 or 3, but chattering occurs near the extremum of 1. This is because the sign function is the most strongly discontinuous switching function.

FIGS. 14C and 14D are diagrams for explaining an example of the effect of using the regularization signal generated by the fourth method in the first embodiment.
The example shown in FIG. 14C applies the saturation function (ρ = 1) shown in A of the fourth method as the continuous approximation of the sign function. As in FIG. 14B, the extremum can be searched for at a similar convergence speed regardless of the initial value, but, as in FIG. 14A, the extremum is not reached within the predetermined time. This is because the gradient itself is used near the extremum, so the same phenomenon as in FIG. 14A occurs there.

The example shown in FIG. 14D applies, as the continuous approximation of the sign function, a saturation function based on a power function of G(t) with ρ ≠ 1, such as −sgn(G(t))/α·|G(t)|^ρ, α > 0, ρ > 0, shown in A of the fourth method, with α = 1 and ρ = 1/5. In this example, the extremum could be searched for regardless of the initial value, and chattering could be suppressed. As this example shows, the response of extremum search control can be adjusted by making good use of a continuous function that approximates the sign function.
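The difference between a pure sign signal and a power-type saturation function near the extremum can be sketched with a toy discrete-time search. This is not the simulated system of FIGS. 14A to 14D: the plant J(u) = |u|^5 with its minimum at u = 0, the exact gradient, the step size, and all names are assumptions chosen only to make the chattering visible.

```python
def sgn(x: float) -> int:
    return (x > 0) - (x < 0)


def gradient(u: float) -> float:
    """Gradient of the hypothetical plant J(u) = |u|**5."""
    return 5.0 * abs(u) ** 4 * sgn(u)


def sign_regularized(g: float) -> float:
    """Sign-only regularized Jacobian (third-method style)."""
    return float(sgn(g))


def power_saturation(g: float, alpha: float = 1.0, rho: float = 0.2) -> float:
    """Power function of the gradient clipped to [-1, 1]."""
    m = sgn(g) * abs(g) ** rho / alpha
    return max(-1.0, min(1.0, m))


def tail_amplitude(regularizer, u0: float = 2.0, step: float = 0.03,
                   steps: int = 400, tail: int = 50) -> float:
    """Largest |u| over the last `tail` steps of the search."""
    u = u0
    history = []
    for _ in range(steps):
        u -= step * regularizer(gradient(u))
        history.append(u)
    return max(abs(v) for v in history[-tail:])


if __name__ == "__main__":
    print("sign signal      :", tail_amplitude(sign_regularized))
    print("power saturation :", tail_amplitude(power_saturation))
```

In this toy setup the sign signal keeps switching with an amplitude on the order of the step size, while the residual oscillation with the clipped power function is several orders of magnitude smaller.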

An extremum control system using the regularization signal generated by the fourth method can be realized, for example, in the configuration example shown in FIG. 11 by replacing the sign function sgn() with an approximation function of G(t) (for example, functions A to C described above).
It can also be realized, for example, in the configuration examples shown in FIGS. 9 and 10 by having the regularization signal output unit 23 perform the calculation corresponding to the approximation function. That is, the arithmetic expression in the regularization signal output unit 23 may be set so that its output equals the regularized Jacobian signal divided by G(t).

According to the fourth method, the regularization function is not limited to that of the third method. By selecting the continuous approximation function and adjusting its parameters, the behavior near the extremum being searched for can be adjusted easily, and finer-grained extremum search control can be realized.

As described above, the regularization signal generated by the regularization signal output unit 23 serves to remove the complicated behavior caused by the nonlinear elements of the evaluation function from the average system that gives the direction and amount in which the manipulated variable should be moved. The parameter adjustment unit 2 uses a regularization signal with this property to regularize the Jacobian signal output by the gradient estimation unit 22. The integrator 15 then integrates the regularized Jacobian using the adjusted integral gain, which makes it possible to control the search direction of the extremum control more appropriately.

The extremum control system 1 of the first embodiment configured in this way includes a parameter adjustment unit that regularizes the Jacobian signal and adaptively adjusts the integral gain based on the regularized Jacobian signal. This allows the extremum control of the controlled process to operate more stably by adapting to the dynamics of that process.

Specifically, the extremum control system 1 of the first embodiment achieves the following (1) and (2) by regularizing the Jacobian signal, which makes it possible to adjust the integral gain easily without impairing the stability of the extremum control. (1) Division by zero is avoided when calculating the integral gain. (2) The sign information of the Jacobian signal is maintained (abrupt sign reversal is avoided).
As a result, the operator of the controlled process can adjust the convergence speed of the extremum control more easily and safely.

(Second embodiment)
FIG. 12 is a diagram showing a configuration example of the extremum control system 1a of the second embodiment. The extremum control system 1a differs from the extremum control system 1 of the first embodiment in that it further includes a manipulated variable correction unit 3. The extremum control system 1a shown in FIG. 12 is obtained by adding the manipulated variable correction unit 3 to the extremum control system 1 shown in FIG. 9, which is one of the configuration examples of the first embodiment in which the regularization signal is generated by the second method. The rest of the configuration of the extremum control system 1a is the same as that of the extremum control system 1 of the first embodiment, so the same reference numerals as in FIG. 3 are used here and the description of those functional units is omitted.

When a regularization signal based on the Hessian of the evaluation function is used, a linear approximation system such as equation (22) is obtained. If the Hessian is estimated accurately, the manipulated variable and the evaluation quantity can therefore be expected to converge exponentially to the extremum. This mode of convergence is characteristic of a (first-order) linear system, and improving the estimation accuracy of the Hessian makes it easier to obtain this characteristic as expected.

On the other hand, when a regularization signal based on the Jacobian of the evaluation function is used, an average system such as equation (25) is obtained, so convergence to the extremum is expected to be linear. Such a linear search is sometimes what is required, but in general it is often desirable to approach the extremum quickly when the manipulated variable or the evaluation quantity is far from the target extremum, and to approach it gradually when near it.

Extremum control represented by an average system such as equation (25) is likely to chatter near the extremum. Such chattering can be suppressed by adjusting equation (25) with a small constant δ (> 0), as in equation (27) above. However, this adjusts the behavior of the extremum search only near the extremum; it does not adjust the movement of the extremum search as a whole.

To adjust the overall behavior of the average system of equation (19) or (22), the right-hand side should therefore contain the manipulated variable u or the manipulated variable deviation ũ. That is, it suffices if equation (22) can be transformed into the form of equation (29).

Here, F(ũ) is a function that satisfies F(0) = 0 and whose sign matches the sign of ũ; that is, F(ũ) satisfies ũ × F(ũ) > 0. The simplest example is F(ũ) = ũ, in which case equation (29) is a linear differential equation and ũ converges exponentially to zero. The deviation ũ is considered instead of u because the optimal value u* of u is unknown, whereas ũ only needs to converge to zero, so the equilibrium point of equation (29) can be taken as 0. Using a suitable power function of ũ also makes it possible to move the operating point toward the extremum more quickly when it is far from the extremum and to move it gently near the extremum.
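A short worked derivation shows why the condition ũ × F(ũ) > 0 is enough for convergence. It is a sketch that assumes equation (29) has the form of a first-order system driven by F(ũ) with a positive gain K; the symbols K and p, and the specific form, are assumptions for illustration.

```latex
% Assumed form of the target average system
\dot{\tilde{u}} = -K\,F(\tilde{u}), \qquad K > 0,\quad F(0) = 0,\quad
\tilde{u}\,F(\tilde{u}) > 0 \ \ (\tilde{u} \neq 0).

% The squared deviation decreases monotonically:
\frac{d}{dt}\left(\tfrac{1}{2}\tilde{u}^{2}\right)
  = \tilde{u}\,\dot{\tilde{u}}
  = -K\,\tilde{u}\,F(\tilde{u}) < 0 \qquad (\tilde{u} \neq 0).

% Linear choice: exponential convergence
F(\tilde{u}) = \tilde{u}
  \;\Longrightarrow\;
  \tilde{u}(t) = \tilde{u}(0)\,e^{-Kt}.

% Power choice: faster far from the extremum, gentler near it
F(\tilde{u}) = \operatorname{sgn}(\tilde{u})\,\lvert\tilde{u}\rvert^{p},\ p > 1
  \;\Longrightarrow\;
  \lvert F(\tilde{u})\rvert > \lvert\tilde{u}\rvert \ \text{for}\ \lvert\tilde{u}\rvert > 1,
  \quad
  \lvert F(\tilde{u})\rvert < \lvert\tilde{u}\rvert \ \text{for}\ \lvert\tilde{u}\rvert < 1.
```

Under these assumptions the sign condition guarantees that the deviation always shrinks, and the choice of F only shapes how fast it shrinks at each distance from the extremum.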

The manipulated variable correction unit 3 serves to obtain an average system that behaves like equation (29). It feeds back the manipulated variable u and multiplies it with the regularization signal, thereby correcting the manipulated variable so that the farther the operating point is from the extremum, the more it is moved toward the extremum. For example, when the regularization signal is generated by the third method, the average system of equation (25) is obtained. To obtain a system like equation (29), noting that the derivative of u equals the derivative of ũ, the right-hand side of equation (25) may be multiplied by F(ũ).

However, since ũ = u − u* and the optimal value of the manipulated variable is unknown, ũ cannot be used directly. The manipulated variable u, however, is available, so ũ can be obtained approximately by applying a high-pass filter to u to remove the unknown constant term u*. That is, ũ can be estimated by equation (30).

Multiplying the regularized Jacobian signal by the signal ũ estimated by equation (30) modifies the extremum search so that it has an exponential response characteristic. If the operating point should be moved to the neighborhood of the extremum even faster than the exponential response allows, the estimate of ũ may, for example, be taken as a power of the value obtained by equation (30), so that the operating point moves in larger steps.
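The idea of removing the unknown constant term with a high-pass filter can be sketched as follows. Equation (30) is not reproduced in the text, so the first-order discrete-time filter below, its time constant, and the class name are assumptions rather than the filter actually specified.

```python
class HighPassFilter:
    """First-order discrete high-pass filter (assumed form).

    y[k] = a * (y[k-1] + x[k] - x[k-1]),  a = tau / (tau + dt)
    """

    def __init__(self, time_constant: float, dt: float) -> None:
        self.a = time_constant / (time_constant + dt)
        self.prev_in = None
        self.prev_out = 0.0

    def update(self, x: float) -> float:
        if self.prev_in is None:
            self.prev_in = x  # first sample: no change observed yet
        out = self.a * (self.prev_out + x - self.prev_in)
        self.prev_in = x
        self.prev_out = out
        return out


if __name__ == "__main__":
    hpf = HighPassFilter(time_constant=1.0, dt=0.1)
    hpf.update(0.0)
    # A step to a constant level: the filter passes the change and
    # then lets the constant component decay toward zero.
    outputs = [hpf.update(5.0) for _ in range(300)]
    print(outputs[0], outputs[-1])
```

A constant offset in the input decays out of the filter output, which is the property relied on when the filtered manipulated variable is used as a stand-in for the deviation from the unknown optimum.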

FIGS. 13 and 14 are diagrams showing configuration examples of the extremum control system 1a of the second embodiment. Specifically, FIG. 13 shows an example of the extremum control system 1a obtained by adding the manipulated variable correction unit 3 to the extremum control system 1 shown in FIG. 10, which is one of the configuration examples of the first embodiment in which the regularization signal is generated by the second method. FIG. 14 shows an example of the extremum control system 1a obtained by adding the manipulated variable correction unit 3 to the extremum control system 1 of the first embodiment in which the regularization signal is generated by the third method (see FIG. 11).

The extremum control system 1a of the second embodiment configured in this way provides the same effects as the extremum control system 1 of the first embodiment and, in addition, allows the speed of the extremum search to be adjusted more finely. When the operating point is far from the extremum, it can be moved quickly to the neighborhood of the extremum, and near the extremum it can be moved in small steps, so both the speed and the accuracy of the extremum search can be improved.

(Application example)
FIG. 15 is a diagram showing an application example of the extremum control system of the first embodiment or the second embodiment. FIG. 15 shows an example in which the extremum control system of the embodiment is applied to a water treatment plant 4 that implements a biological wastewater treatment process. For example, the water treatment plant 4 shown in FIG. 15 includes an anaerobic tank 41, an anoxic tank 42, an aerobic tank 43, and a final settling tank 44. The anaerobic tank 41 is a facility for activating microorganisms. The anoxic tank 42 is a facility for removing nitrogen. The aerobic tank 43 is a facility for decomposing organic matter, removing phosphorus, and nitrifying ammonia. The final settling tank 44 is a facility for settling activated sludge.

The water treatment plant 4 is equipped with pumps that transport water and sludge between these facilities, a blower that supplies air to the tank, sensors that measure the concentration of substances in air or water, and similar equipment. The chemical dosing pump 411 feeds chemicals, such as a carbon source that activates microorganisms, into the anaerobic tank 41. The circulation pump 431 controls the amount of water under treatment that circulates between the aerobic tank 43 and the anoxic tank 42. The blower 432 supplies air to the aerobic tank 43 to control the aeration rate. The return sludge pump 441 returns sludge from the final settling tank 44 to the anoxic tank 42. The excess sludge withdrawal pump 442 withdraws excess sludge from the final settling tank 44. The sensors 412 and 443 measure the water quality of the discharged water in the anaerobic tank 41 and the final settling tank 44, respectively.

In general, in such a biological wastewater treatment process, the manipulated variable is the return rate of the return sludge, and the controlled variables are the concentration of nitrogen in the discharged water (hereinafter "discharged nitrogen concentration") and the concentration of phosphorus in the discharged water (hereinafter "discharged phosphorus concentration"). The return rate is obtained by dividing the discharge flow rate of the return sludge pump 441 by the inflow rate. The discharged nitrogen concentration and the discharged phosphorus concentration are acquired by the sensors 412 and 443. The controlled variables may instead be the amount of nitrogen in the discharged water (hereinafter "discharged nitrogen amount") and the amount of phosphorus in the discharged water (hereinafter "discharged phosphorus amount"). In this case, the discharged nitrogen amount and the discharged phosphorus amount are obtained by multiplying the discharged nitrogen concentration and the discharged phosphorus concentration, respectively, by the discharge flow rate.
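The two relationships stated above, the return rate as a flow ratio and the discharged amount as concentration times flow, can be sketched as small helper functions. The function names and the example units (m3/h for flow, g/m3 for concentration) are assumptions for illustration.

```python
def return_rate(return_sludge_flow: float, inflow: float) -> float:
    """Return rate = return sludge pump discharge flow / inflow."""
    if inflow <= 0.0:
        raise ValueError("inflow must be positive")
    return return_sludge_flow / inflow


def discharged_amount(concentration: float, discharge_flow: float) -> float:
    """Discharged amount = discharged concentration * discharge flow."""
    return concentration * discharge_flow


if __name__ == "__main__":
    # Example values only: flows in m3/h, concentrations in g/m3
    print(return_rate(500.0, 1000.0))
    print(discharged_amount(8.0, 1000.0))   # nitrogen, g/h
    print(discharged_amount(0.5, 1000.0))   # phosphorus, g/h
```

Working with amounts instead of concentrations makes the controlled variable reflect the total load released, which changes with the discharge flow even when the concentration stays the same.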

The extremum control system 1b of the application example receives from the water treatment plant 4 an evaluation quantity based on controlled variables such as the discharged nitrogen amount and the discharged phosphorus amount, and executes extremum control, updating the manipulated variable so that the evaluation quantity approaches its optimal value. As one example of the evaluation function used here, the evaluation quantity may be expressed as the sum (hereinafter "total cost") of a water quality cost based on the concept of a wastewater levy and the power cost of the return sludge pump 441. The power cost of the return sludge pump 441 can be calculated from the return sludge flow rate, the rated power of the return sludge pump 441, and so on. In general, under the wastewater levy concept, the water quality cost is expressed by the following formula.

In equation (31), COD denotes chemical oxygen demand, BOD denotes biochemical oxygen demand, TN denotes discharged nitrogen, and TP denotes discharged phosphorus. The conversion coefficient for each cost may be determined based on the actual wastewater levy or by another method. Since TN and TP are generally known to change significantly when the return rate is changed, the water quality cost J1 related to control of the return rate is defined here as in equation (32).

In addition to this water quality cost, an operating cost J2 may be defined as the sum of the electric power cost of the blower, which changes indirectly when the return flow rate is changed, and the electric power cost of the return pump. A function whose value is the total cost, that is, the sum of the operating cost J2 and the water quality cost J1, may then be defined as the evaluation function. For example, the operating cost J2 can be defined as in equation (33).
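As a rough illustration of how such a total cost could be assembled, the sketch below adds a water quality cost J1 and an operating cost J2. Equations (32) and (33) are not reproduced in this text, so the weighted-sum form, the coefficient names, and the units are all assumptions made for illustration; they follow only the prose description (a conversion coefficient per cost item, and J2 as the sum of blower and return pump power costs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostCoefficients:
    """Hypothetical conversion coefficients (currency per unit); not from the specification."""
    nitrogen_per_kg: float
    phosphorus_per_kg: float
    electricity_per_kwh: float


def water_quality_cost(tn_kg: float, tp_kg: float, c: CostCoefficients) -> float:
    """J1 (assumed form): levy-style cost of discharged nitrogen and phosphorus."""
    return c.nitrogen_per_kg * tn_kg + c.phosphorus_per_kg * tp_kg


def operating_cost(blower_kwh: float, pump_kwh: float, c: CostCoefficients) -> float:
    """J2 (assumed form): electric power cost of the blower plus the return pump."""
    return c.electricity_per_kwh * (blower_kwh + pump_kwh)


def total_cost(tn_kg: float, tp_kg: float, blower_kwh: float, pump_kwh: float,
               c: CostCoefficients) -> float:
    """Evaluation quantity: total cost J = J1 + J2."""
    return water_quality_cost(tn_kg, tp_kg, c) + operating_cost(blower_kwh, pump_kwh, c)


# Hypothetical numbers, chosen only to exercise the functions.
coeffs = CostCoefficients(nitrogen_per_kg=2.0, phosphorus_per_kg=10.0, electricity_per_kwh=0.5)
j = total_cost(tn_kg=3.0, tp_kg=1.0, blower_kwh=20.0, pump_kwh=4.0, c=coeffs)
```

Because the controller treats the evaluation function as unknown, it only needs the scalar value of the total cost at each sampling instant, not the individual terms.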

The extreme value control system 1b of the application example extracts a Jacobian signal from the evaluation quantity obtained with such an evaluation function and performs extreme value control by applying the above-described regularization signal to that Jacobian signal, so that the integral gain is updated adaptively to the dynamics of the water treatment plant 4. As a result, the extreme value control system 1b of the application example can search, with more stable operation, for the optimum manipulated variable that minimizes the total cost. The extreme value control system of the embodiment is applicable to control of any process that outputs a controlled variable in response to an input manipulated variable. For example, the controlled process may be a sewage treatment process, a combustion process, a petrochemical process, or the like.

According to at least one embodiment described above, extreme value control can be operated more stably, adapting to the dynamics of the controlled process, by providing: a first gradient estimation unit that estimates the Jacobian of the evaluation function; a manipulated variable determination unit that determines the direction and amount in which the manipulated variable should be moved by integrating the estimated value of the Jacobian; a second gradient estimation unit that estimates the Hessian of the evaluation function; and a parameter adjustment unit that adjusts the integral gain of the manipulated variable determination unit according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.
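The effect of dividing the Jacobian estimate by a regularization signal before integration can be sketched on a toy problem. The sketch makes several assumptions that are not taken from the specification: the evaluation function is a known scalar quadratic, its exact first and second derivatives stand in for the outputs of the two gradient estimation units, the regularization signal is the absolute Hessian value plus a small positive constant δ, and the integrator is a simple Euler step that descends toward a minimum.

```python
def regularization_signal(hessian_estimate: float, delta: float = 1e-6) -> float:
    """|H| + delta: strictly positive, so dividing by it is always defined."""
    return abs(hessian_estimate) + delta


def run(curvature: float, regularize: bool, gain: float = 1.0, dt: float = 0.1,
        steps: int = 100, u0: float = 5.0, u_opt: float = 2.0) -> float:
    """Integrate the (optionally regularized) Jacobian for J(u) = curvature * (u - u_opt)**2."""
    u = u0
    for _ in range(steps):
        jacobian = 2.0 * curvature * (u - u_opt)  # stand-in for the first gradient estimate
        hessian = 2.0 * curvature                 # stand-in for the second gradient estimate
        if regularize:
            jacobian = jacobian / regularization_signal(hessian)
        u -= gain * dt * jacobian                 # Euler integration step toward the minimum
    return u


# Same gain and step size, very different curvatures (hypothetical values).
flat_regularized = run(curvature=0.5, regularize=True)
steep_regularized = run(curvature=50.0, regularize=True)
steep_plain = run(curvature=50.0, regularize=False)
```

In this toy setting the regularized runs approach the optimum at nearly the same rate for both curvatures, while the unregularized run with the steep curvature diverges for the same fixed gain. That is the sense in which the division acts as an adjustment of the effective integral gain to the shape of the evaluation function.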

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention and, likewise, in the invention described in the claims and its equivalents.

Claims

An optimal control device that executes extreme value control to update a manipulated variable of a controlled process so that an evaluation quantity, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches an optimum value of the evaluation function, the optimal control device comprising:
a first gradient estimation unit that estimates the Jacobian of the evaluation function based on a signal indicating the evaluation quantity;
a manipulated variable determination unit that determines the direction and amount in which the manipulated variable should be moved by integrating the estimated value of the Jacobian;
a second gradient estimation unit that estimates the Hessian of the evaluation function based on the signal indicating the evaluation quantity; and
a parameter adjustment unit that adjusts the integral gain of the manipulated variable determination unit according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.

The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the Nth root of a value obtained by adding a small positive constant δ to the Nth power (N > 0) of the absolute value of the estimated value of the Hessian, or by a value obtained by dividing the (M - 1)th power (M ≥ 1) of the absolute value of the estimated value of the Hessian by a value obtained by adding a small positive constant δ to the Mth power of the absolute value of the estimated value of the Hessian.

The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the Nth root of a value obtained by adding a small positive constant δ to the Nth power (N > 0) of the absolute value of the estimated value of the Jacobian, or by a value obtained by dividing the (M - 1)th power (M ≥ 1) of the absolute value of the estimated value of the Jacobian by a value obtained by adding a small positive constant δ to the Mth power of the absolute value of the estimated value of the Jacobian.

The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the absolute value of the estimated value of the Jacobian.

The optimal control device according to claim 1, wherein the regularization signal is a signal represented by an approximate sign estimate obtained by approximating the sign estimate of the Jacobian with a continuous function.

The optimal control device according to any one of claims 3 to 5, further comprising a correction unit that feeds back the manipulated variable determined by the manipulated variable determination unit and multiplies it by the regularization signal, thereby correcting the manipulated variable so that an operating point that lies away from the extremum moves toward the extremum.

An optimal control method for executing extreme value control to update a manipulated variable of a controlled process so that an evaluation quantity, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches an optimum value of the evaluation function, the optimal control method comprising:
a step of estimating the Jacobian of the evaluation function based on a signal indicating the evaluation quantity;
a manipulated variable determination step of determining the direction and amount in which the manipulated variable should be moved by integrating the estimated value of the Jacobian;
a step of estimating the Hessian of the evaluation function based on the signal indicating the evaluation quantity; and
a step of adjusting the integral gain in the manipulated variable determination step according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination step by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.

A computer program for causing a computer, which functions as an optimal control device that executes extreme value control to update a manipulated variable of a controlled process so that an evaluation quantity, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches an optimum value of the evaluation function, to execute:
a step of estimating the Jacobian of the evaluation function based on a signal indicating the evaluation quantity;
a manipulated variable determination step of determining the direction and amount in which the manipulated variable should be moved by integrating the estimated value of the Jacobian;
a step of estimating the Hessian of the evaluation function based on the signal indicating the evaluation quantity; and
a step of adjusting the integral gain in the manipulated variable determination step according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination step by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.