WO2020241657A1

WO2020241657A1 - Optimum control device, optimum control method and computer program

Info

Publication number: WO2020241657A1
Application number: PCT/JP2020/020816
Authority: WO
Inventors: 理山中; 祐太大西; 由紀夫平岡
Original assignee: 東芝インフラシステムズ株式会社; 株式会社東芝
Priority date: 2019-05-29
Filing date: 2020-05-27
Publication date: 2020-12-03
Also published as: JPWO2020241657A1; CN113874794A; JP7183411B2; CN113874794B

Abstract

Provided are an optimum control device, an optimum control method, and a computer program which can adapt to the dynamics of a target control process and more stably operate extreme value control. The optimum control device according to the present embodiment comprises a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit. The first gradient estimation unit estimates the Jacobian of an evaluation function on the basis of a signal indicating an evaluation amount. The manipulated variable determination unit determines a direction and an amount to move a manipulated variable by means of integrating the estimated value of the Jacobian. The second gradient estimation unit estimates the Hessian of the evaluation function on the basis of the signal indicating the evaluation amount. The parameter adjustment unit adjusts integration gain in the manipulated variable determination unit in accordance with a change in the evaluation function by means of dividing the estimated value of the Jacobian, which was inputted into the manipulated variable determination unit, by a regularized signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and adjusted so as to not be 0.

Description

Optimal control device, optimal control method and computer program

The embodiment of the present invention relates to an optimum control device, an optimum control method, and a computer program.

In recent years, a technique called extreme value control has been attracting attention as a method of plant control. Extreme value control is a model-free, real-time optimal control technology that does not use a complex model of the plant. In outline, extreme value control forcibly varies the operation amount in order to search for the operation amount that optimizes the evaluation amount, which is based on the control amount of the controlled process. When such extreme value control is applied to plant control, the various parameters related to extreme value control (hereinafter referred to as "extreme value control parameters") must be set appropriately according to the characteristics of the controlled process. Several guidelines for designing extreme value control parameters have been proposed, but none of them has yet made it possible to operate extreme value control stably by adapting to temporal changes in the controlled process (hereinafter referred to as "dynamics").

Patent Document 1: Japanese Patent Application Laid-Open No. 2017-033104

An object to be solved by the present invention is to provide an optimum control device, an optimum control method, and a computer program capable of operating extreme value control more stably by adapting to the dynamics of the controlled process.

The optimum control device of the embodiment executes extreme value control, which updates the manipulated variable of a controlled process so that an evaluation amount approaches its optimum value. The evaluation amount is a value based on the control amount of the controlled process and is represented by an unknown evaluation function of the manipulated variable. The optimum control device includes a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit. The first gradient estimation unit estimates the Jacobian of the evaluation function based on the signal indicating the evaluation amount. The manipulated variable determination unit determines the direction and amount in which the manipulated variable should be moved by integrating the estimated value of the Jacobian. The second gradient estimation unit estimates the Hessian of the evaluation function based on the signal indicating the evaluation amount. The parameter adjustment unit adjusts the integral gain of the manipulated variable determination unit according to the change of the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularized signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to be 0.

FIG. 1A is a diagram illustrating a basic concept of extreme value control in the first embodiment.
FIG. 1B is a diagram illustrating a basic concept of extreme value control in the first embodiment.
FIG. 1C is a diagram illustrating a basic concept of extreme value control in the first embodiment.
FIG. 2 is a diagram showing a basic configuration example of an extreme value control system in the first embodiment.
FIG. 3 is a diagram showing a configuration example of the extreme value control system of the first embodiment.
FIG. 4 is a diagram showing a specific example of the method of adjusting the extreme value control parameter in the first embodiment.
FIG. 5 is a diagram showing a first configuration example of the gradient estimator according to the first embodiment.
FIG. 6 is a diagram showing a second configuration example of the gradient estimator according to the first embodiment.
FIG. 7 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the first method in the first embodiment.
FIG. 8 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the first method in the first embodiment.
FIG. 9 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the first embodiment.
FIG. 10 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the first embodiment.
FIG. 11 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the first embodiment.
FIG. 12 is a diagram for explaining an example of a regularized signal generated by the fourth method in the first embodiment.
FIG. 13 is a diagram for explaining another example of the regularization signal generated by the fourth method in the first embodiment.
FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using the regularization signal in the first embodiment.
FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the coded signal of the regularized signal gradient as the regularized signal in the first embodiment.
FIG. 14C is a diagram for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
FIG. 14D is a diagram for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
FIG. 15 is a diagram showing a configuration example of the extreme value control system of the second embodiment.
FIG. 16 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the second embodiment.
FIG. 17 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the second embodiment.
FIG. 18 is a diagram showing an application example of the extreme value control system of the first embodiment or the second embodiment.

Embodiment

Hereinafter, the optimum control device, the optimum control method, and the computer program of the embodiment will be described with reference to the drawings.

(First Embodiment) [Outline of extreme value control]
FIGS. 1A to 1C are diagrams illustrating a basic concept of extreme value control.
Extreme value control is a control method in which the manipulated variable is updated in the direction that approaches the optimum value while the change in the evaluation amount with respect to the manipulated variable is observed. The evaluation amount is a value that serves as an index for optimizing the process to be controlled (hereinafter referred to as “controlled process”), and is determined based on the controlled amount of the controlled process. For example, the evaluation quantity is represented by a predetermined evaluation function with the control quantity as a variable. The evaluation quantity may be defined based on any evaluation standard as long as it is a value based on the control quantity. For example, the evaluation quantity may be the control quantity itself. Generally, in extreme value control, the evaluation function of the controlled process may be an unknown function of the manipulated variable.

Specifically, in extreme value control, the manipulated variable is varied by applying a dither signal to the signal indicating the manipulated variable. The dither signal is a signal whose value changes periodically, and is usually given as a sine wave. In extreme value control, the operation amount is continuously oscillated by the dither signal, and the resulting change (increase or decrease) in the evaluation amount is observed. Then, based on the observed change in the evaluation amount, a new operation amount that moves the evaluation amount toward the optimum value (maximum or minimum) of the evaluation function is calculated, and the current operation amount is updated with the calculated new operation amount. Extreme value control searches for the optimum value of the evaluation function by repeating this observation of the evaluation quantity and update of the manipulated quantity.

For example, FIG. 1A shows an evaluation function curve EV that assumes a downwardly convex quadratic function as an example of an evaluation function that is unknown with respect to the manipulated variable. FIG. 1B shows a case where, as a result of oscillating the operation amount of the controlled process with the dither signal, the signal indicating the evaluation amount changes in opposite phase to the dither signal (for example, the evaluation amount decreases as the operation amount increases). Such a change occurs, for example, when the operating point moves within the region to the left of the minimum point P10 of the evaluation function curve EV (for example, when the operating point moves from the operating point P11 toward the minimum point P10).

On the other hand, FIG. 1C shows a case where, as a result of oscillating the operation amount of the controlled process with the same dither signal as in FIG. 1B, the signal indicating the evaluation amount changes in phase with the dither signal (for example, the evaluation amount also increases as the operation amount increases). Such a change occurs, for example, when the operating point moves within the region to the right of the minimum point P10 of the evaluation function curve EV (for example, when the operating point moves from the operating point P12 toward the minimum point P10).

Therefore, when the operation amount is periodically increased and decreased, the evaluation amount can be brought closer to the optimum value by decreasing the operation amount when the evaluation amount varies in phase with the operation amount, and by increasing the operation amount when the evaluation amount varies in opposite phase to the operation amount. PID control (Proportional-Integral-Derivative control), which has conventionally been the general control method for industrial plants, is a target-value-following control method that controls the operation amount so that the control amount follows a preset target value. Extreme value control, in contrast, is an optimum-value-search control method that optimizes the evaluation amount, and, as with PID control, no process model expressing the relationship between the operation amount and the control amount of the controlled process needs to be created in advance. Because it can function effectively even for a controlled process for which a target value cannot be set in advance, extreme value control has the potential to become widespread in the future. Moreover, an extreme value control system that realizes extreme value control can be built with a relatively simple configuration, as shown in FIG. 2.
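The phase relationship described for FIGS. 1B and 1C can be checked numerically. The following is a minimal sketch, not taken from the patent: the evaluation function `evaluation`, the helper name `dither_correlation`, and all numeric values are illustrative assumptions. It averages the product of the evaluation amount and the dither signal over one dither period; a negative result corresponds to the opposite-phase case (left of the minimum) and a positive result to the in-phase case (right of the minimum).

```python
import math

def dither_correlation(f, u0, a=0.1, omega=1.0, steps=1000):
    """Average of f(u0 + a*sin(wt)) * sin(wt) over one dither period."""
    period = 2.0 * math.pi / omega
    dt = period / steps
    total = 0.0
    for k in range(steps):
        s = math.sin(omega * k * dt)
        total += f(u0 + a * s) * s
    return total / steps

def evaluation(u):
    # Hypothetical downwardly convex evaluation function, minimum at u = 2.
    return (u - 2.0) ** 2 + 1.0

left = dither_correlation(evaluation, u0=1.0)   # operating point left of the minimum
right = dither_correlation(evaluation, u0=3.0)  # operating point right of the minimum

# Negative correlation: opposite phase, so the operation amount should increase.
# Positive correlation: same phase, so the operation amount should decrease.
print(left, right)
```

The sign of the correlation is all that the basic search needs in order to choose the update direction; its magnitude scales with the local slope and the dither amplitude.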

FIG. 2 is a diagram showing a basic configuration example of an extreme value control system.
The extreme value control system 9 of FIG. 2 includes a modulation dither signal output unit 11, a high-pass filter 12 (HPF: High-Pass Filter), a demodulation dither signal output unit 13, a low-pass filter 14 (LPF: Low-Pass Filter), and an integrator 15. The configuration of the extreme value control system 9 is thus comparable in complexity to that of a conventional PID controller. Therefore, like a PID controller, the extreme value control system 9 can be easily implemented by using hardware such as a PLC (Programmable Logic Controller). The outline of the operation of the extreme value control system 9 of FIG. 2 will be described below. Here, a case of searching for the minimum value of the evaluation function as the optimum value will be described as an example.

First, the modulation dither signal output unit 11 applies a dither signal to forcibly change the operation amount of the controlled process. For example, the modulation dither signal output unit 11 periodically changes the operation amount of the controlled target process by applying a dither signal such as a sine wave. Hereinafter, this operation is referred to as modulation, and the dither signal used for modulation is referred to as a modulation dither signal. The control amount changes according to the change in the operation amount due to this modulation. The controlled target process acquires an evaluation quantity based on the control quantity that changes in this way, and feeds back the acquired evaluation quantity to the extreme value control system 9.

In general, the controlled variable often changes with a certain time delay with respect to the change in the manipulated variable, so the evaluation quantity acquired based on the controlled variable also changes with a certain time delay with respect to the change in the manipulated variable. The function of acquiring the evaluation amount based on the control amount does not necessarily have to be included in the controlled process. For example, the function of acquiring the evaluation quantity may be included in the extreme value control system 9, or may be realized by another device interposed between the controlled process and the extreme value control system 9.

The extreme value control system 9 updates the operation amount, based on the evaluation amount fed back in this way, so that the evaluation amount approaches the extreme value of the evaluation function. Here it is assumed that the evaluation function of the controlled process has a minimum value, but, as described above, the evaluation function is an unknown function of the manipulated variable, so its extreme value is also unknown. Therefore, the extremum control system 9 observes, from the fed-back evaluation quantity signal, the magnitude and direction of the change in the evaluation quantity caused by the modulation, and determines a new operation amount based on the magnitude and direction of the observed change.

Specifically, the determination of this new manipulated variable is realized by the high-pass filter 12, the demodulation dither signal output unit 13, the low-pass filter 14, and the integrator 15 having the following functions.

The high-pass filter 12 removes, from the fed-back evaluation amount signal, the constant bias corresponding to the unknown minimum value. This process always shifts the unknown minimum value to zero, and is a preprocess for determining the direction (increase or decrease) in which the integrator 15 described later updates the manipulated variable.

The demodulation dither signal output unit 13 applies the demodulation dither signal to the evaluation amount signal adjusted in this way, thereby extracting, from the evaluation amount that changed according to the modulation of the operation amount, the component having the same frequency as the modulation dither signal. Hereinafter, this operation is referred to as demodulation, and the dither signal used for demodulation is referred to as a demodulation dither signal. The role of demodulation is as follows.

The evaluation function, which is unknown with respect to the manipulated variable, may contain a non-linear element. Here, the evaluation function is assumed to be a non-linear function that is convex downward (convex upward in the case of a search for the maximum value). Because of such a non-linear element, harmonic or subharmonic components of the frequency ω of the modulation dither signal are highly likely to appear in the evaluation quantity. Demodulation is a process for removing the effects of such harmonics and subharmonics. By this demodulation, the component having the same frequency ω as the modulation dither signal that varied the evaluation amount is extracted from the components included in the evaluation amount signal.
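As a hedged illustration of how such a harmonic arises, consider a purely quadratic evaluation function f evaluated at a sinusoidally dithered manipulated variable U_0 + a sin ωt. The quadratic form and the averaging notation are assumptions made here for simplicity and are not taken from the patent's own equations. The squared sine produces a component at 2ω, and multiplying by the dither signal and averaging over one period leaves only the term proportional to the first derivative:

```latex
\begin{align}
f(U_0 + a\sin\omega t)
  &= f(U_0) + a f'(U_0)\sin\omega t + \tfrac{a^2}{2} f''(U_0)\sin^2\omega t \\
  &= f(U_0) + \tfrac{a^2}{4} f''(U_0) + a f'(U_0)\sin\omega t
     - \tfrac{a^2}{4} f''(U_0)\cos 2\omega t, \\
\frac{\omega}{2\pi}\int_0^{2\pi/\omega} f(U_0 + a\sin\omega t)\,\sin\omega t\,dt
  &= \tfrac{a}{2} f'(U_0).
\end{align}
```

The constant and 2ω terms average to zero against sin ωt, which is why the demodulated, low-pass-filtered signal can be read as a scaled slope.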

The demodulated evaluation amount signal is input to the low-pass filter 14. The low-pass filter 14 extracts a steady component (low frequency component) from the signal of the evaluation amount. Specifically, the stationary component indicates the first derivative value (hereinafter referred to as "Jacobian") of the evaluation function, and is considered to indicate the direction (increase or decrease) of the change in the evaluation amount due to modulation.

The integrator 15 integrates the steady-state components extracted by the low-pass filter 14. The integrator 15 functions as an estimator that estimates the direction of the manipulated variable (hereinafter referred to as “search direction”) to be moved in order to bring the evaluated quantity closer to the minimum value based on the integrated value of the steady-state component. The method of estimating the search direction in this way is generally called the gradient method, and is one of the basic methods of estimating the search direction in the adaptive control system.

Specifically, the integrator 15 estimates the gradient of the evaluation function based on the integrated value of the steady component and, based on the estimated gradient, adjusts the search direction of the manipulated variable and the magnitude by which the manipulated variable is to be moved in that direction (the amount of movement of the operation amount). The manipulated variable adjusted in this way is modulated by the modulation dither signal and input to the controlled process.

Here, the configuration has been described on the assumption that the extreme value control system 9 searches for the minimum value. When the extreme value control system 9 searches for the maximum value, the sign of the gradient estimated by the integrator 15 may simply be reversed. Further, since an integrator generally has a low-pass characteristic, the extreme value control system 9 does not necessarily have to include the low-pass filter 14 when the integrator 15 has a sufficient low-pass characteristic.
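The signal path described above (modulation, high-pass filter, demodulation, low-pass filter, integrator) can be sketched as a small discrete-time simulation. This is only an illustrative sketch under stated assumptions: the first-order Euler filters, the function name `run_extremum_seeking`, the quadratic evaluation function, and every numeric parameter are hypothetical choices, not values from the patent, and the plant is treated as a static map with no dynamics.

```python
import math

def run_extremum_seeking(f, u_init, a=0.1, omega=5.0, k_i=2.0,
                         w_h=1.0, w_l=1.0, dt=0.01, t_end=100.0):
    """Minimum search with the basic loop: HPF -> demodulation -> LPF -> integrator."""
    u_hat = u_init            # integrator state (operation amount before modulation)
    slow = f(u_hat)           # slow component tracked by the high-pass filter
    grad = 0.0                # low-pass filter state (scaled gradient estimate)
    for k in range(int(round(t_end / dt))):
        s = math.sin(omega * k * dt)       # dither, used for modulation and demodulation
        j = f(u_hat + a * s)               # evaluation amount fed back from the process
        slow += dt * w_h * (j - slow)      # first-order tracking of the bias
        j_hp = j - slow                    # high-pass filter output (bias removed)
        demod = j_hp * s                   # demodulation
        grad += dt * w_l * (demod - grad)  # low-pass filter keeps the steady component
        u_hat -= dt * k_i * grad           # integrate against the gradient (minimum search)
    return u_hat

def evaluation(u):
    # Hypothetical evaluation function, unknown to the controller; minimum at u = 2.
    return (u - 2.0) ** 2 + 1.0

u_final = run_extremum_seeking(evaluation, u_init=0.0)
print(u_final)
```

For a maximum search, the sign of the integrator update would be reversed. In this sketch the integral gain `k_i` is fixed, so the convergence speed depends on the curvature of the evaluation function, which is the limitation the embodiment addresses.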

The extreme value control system 9 realized by such a configuration is comparable in complexity to the PID control systems generally used in conventional process control, and therefore, like a PID control system, it can be easily implemented using hardware such as a PLC (Programmable Logic Controller).

The basic configuration of the extreme value control system has been described above. Such a conventional extreme value control system, however, has the problem that it cannot always realize extreme value control adapted to the dynamics of the controlled process. Therefore, the configuration of the extreme value control system of the embodiment, which can solve this problem, is described in detail below.

[Details of Embodiment]
FIG. 3 is a diagram showing a configuration example of the extreme value control system 1 of the first embodiment.
The plant P shown in FIG. 3 is an example of a means for realizing a controlled process, and is, for example, a water treatment plant for realizing a biological wastewater treatment process. The plant P includes various process devices that realize the controlled process, and operates the process devices based on the operation amount input from the extremum control system 1. Further, the plant P includes various measuring devices for measuring the controlled amount of the controlled process, and outputs information indicating the measured values (hereinafter referred to as “measurement information”) to the extreme value control system 1. The extreme value control system 1 updates the operation amount in the direction (search direction) that brings the evaluation amount of the controlled target process closer to the optimum value based on the measurement information acquired from the plant P.

The basic operation of such extreme value control is realized because the extreme value control system 1 of the embodiment includes the same modulation dither signal output unit 11, high-pass filter 12, demodulation dither signal output unit 13, low-pass filter 14, and integrator 15 as the extreme value control system 9 of the conventional configuration. The extremum control system 1 of the embodiment differs from the conventional extremum control system 9 in that it includes a parameter adjusting unit 2 that adjusts the extremum control parameters based on the evaluation amount signal.

For example, the extremum control system 1 includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus, and executes an extremum control program. By executing the extreme value control program, the extreme value control system 1 functions as a device or system including each of the above-mentioned functional units. All or part of the functions of the extreme value control system 1 may be realized by using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The program may be recorded on a computer-readable recording medium. The computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The program may be transmitted over a telecommunication line.

The parameter adjustment unit 2 has a function of adaptively adjusting the extreme value control parameters with respect to the dynamics of the controlled process. Specifically, the parameter adjusting unit 2 adaptively adjusts the integral gain of the integrator 15 based on the estimated gradient of the evaluation function, which changes from moment to moment due to the dynamics.

FIG. 4 is a diagram showing a specific example of the method of adjusting the extreme value control parameter in the first embodiment. Specifically, FIG. 4 cites the adjustment method described in Patent Document 1.
As described in 4.5 of FIG. 4, the parameter adjusting unit 2 in the present embodiment estimates the second derivative (hereinafter referred to as “Hessian”) of the evaluation function based on the evaluation amount fed back from the controlled process, and uses the estimated Hessian to determine a new integral gain. For this adjustment of the integral gain, the parameter adjusting unit 2 includes a first multiplier 21, a gradient estimation unit 22, a regularized signal output unit 23, and a second multiplier 24.

The first multiplier 21 multiplies the evaluation amount signal input from the high-pass filter 12 by the dither signal (squared signal) and outputs the result to the gradient estimation unit 22. The gradient estimation unit 22 extracts the Hessian signal H(t) from the output signal of the first multiplier 21 and outputs it to the regularized signal output unit 23. In this case, for example, the gradient estimation unit 22 can estimate derivatives of the evaluation function of order 0 or higher by using the method described in Non-Patent Document 1. That is, the gradient estimation unit 22 can function as a first gradient estimation unit that estimates the Jacobian of the evaluation function or as a second gradient estimation unit that estimates the Hessian of the evaluation function. Specifically, Non-Patent Document 1 describes a configuration in which a low-pass filter is used to estimate derivatives of the evaluation function of order 0 or higher, and its basic concept is as follows.

Generally, the manipulated amount may include a harmonic component or a subharmonic component, but when the dither signal is given as a sine wave, the manipulated amount after modulation also changes sinusoidally at approximately the same frequency as the dither signal. Therefore, it is assumed that the manipulated variable U changes sinusoidally as U(t) = U0 + a sin ωt, and that the evaluation quantity, which changes according to the manipulated variable, is represented by the evaluation function J shown in equation (1).

Equation (1) defines the evaluation quantity J(t) as an (unknown) function f of the manipulated quantity U(t). Strictly speaking, when the dynamics of the plant are considered, f should be an operator of a dynamic system rather than a function. However, when the frequency ω of the dither signal is chosen so that the dither varies sufficiently slowly relative to the dynamics of the plant, f can be approximately regarded as a function. In this embodiment, f is regarded as a function under this premise. Equation (2) is obtained by Taylor-expanding equation (1).

Here, D^k f (k is an integer of 1 or more) means the k-th derivative of the function f with respect to U. Equation (3) is obtained by multiplying equation (2) by sin^n ωt (n is an integer of 1 or more). Further, when periodic averaging (or time integration) is applied to equation (3), only the component related to sin^n ωt remains because of the orthogonality of sine waves, and equation (4) is obtained.

Here, since the amplitude a of the dither signal and the exponent n are constants, equation (4) can be expressed as equations (5) and (6), assuming that the value of the n-th derivative D^n f does not change significantly within one control cycle. Then, by solving equation (5) backward, equation (7) representing the n-th derivative D^n f is obtained.
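Equations (1) to (7) are not reproduced in this text. As a hedged sketch of the same idea, the cases n = 1 and n = 2 can be worked out directly, assuming that the Taylor expansion of J(t) = f(U_0 + a sin ωt) is truncated at second order and writing ⟨·⟩ for the average over one dither period (this notation is introduced here and is not the patent's). The resulting factors 2/a, 16, 8, and 1/a² match those stated in the description of FIGS. 5 and 6.

```latex
\begin{align}
J(t) &\approx f(U_0) + a\,Df\,\sin\omega t + \tfrac{a^2}{2}\,D^2 f\,\sin^2\omega t, \\
\langle J \rangle &= f(U_0) + \tfrac{a^2}{4}\,D^2 f, \\
\langle J\sin\omega t \rangle &= \tfrac{a}{2}\,Df, \\
\langle J\sin^2\omega t \rangle &= \tfrac{1}{2} f(U_0) + \tfrac{3a^2}{16}\,D^2 f, \\
Df &= \tfrac{2}{a}\,\langle J\sin\omega t \rangle, \\
D^2 f &= \tfrac{1}{a^2}\bigl(16\,\langle J\sin^2\omega t \rangle - 8\,\langle J \rangle\bigr).
\end{align}
```

These follow from ⟨sin² ωt⟩ = 1/2, ⟨sin⁴ ωt⟩ = 3/8, and the vanishing of the odd-power averages; subtracting 8⟨J⟩ cancels the unknown offset f(U_0).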

FIGS. 5 and 6 are diagrams showing configuration examples of the gradient estimator represented by equation (7).
Specifically, FIG. 5 shows a configuration example of the Jacobian estimator (that is, the case n = 1), in which the Jacobian of the evaluation function is obtained by multiplying the evaluation quantity signal J(t) by the dither signal sin ωt, processing the product with a low-pass filter, and then multiplying by 2/a. Since this configuration corresponds to defining a new integral gain KImod = KI × (a/2) for the integrator 15 having the integral gain KI, the conventional basic extreme value control system shown in FIG. 2 can be regarded as a configuration in which the Jacobian of the evaluation function is estimated by the low-pass filter 14.

On the other hand, FIG. 6 shows a configuration example of the Hessian estimator (that is, the case n = 2). A first signal is obtained by multiplying the evaluation amount signal J(t) by the squared dither signal sin² ωt, processing the product with a low-pass filter, and multiplying by 16. A second signal is obtained by processing the evaluation amount signal J(t) with a low-pass filter and multiplying by 8. The Hessian of the evaluation function is obtained by subtracting the second signal from the first signal and multiplying the difference by 1/a².
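The two estimator structures can be tried out numerically. The sketch below is an illustration under assumptions, not the patent's implementation: the low-pass filter is replaced by an exact average over one dither period, the helper names are hypothetical, and the test function is a quadratic chosen so that the true derivatives are known.

```python
import math

def periodic_average(signal, omega=1.0, steps=2000):
    """Average of signal(t) over one dither period (stand-in for the low-pass filter)."""
    dt = 2.0 * math.pi / omega / steps
    return sum(signal(k * dt) for k in range(steps)) / steps

def estimate_jacobian(f, u0, a=0.1, omega=1.0):
    """FIG. 5 style: average J(t)*sin(wt), then multiply by 2/a."""
    avg = periodic_average(
        lambda t: f(u0 + a * math.sin(omega * t)) * math.sin(omega * t), omega)
    return 2.0 / a * avg

def estimate_hessian(f, u0, a=0.1, omega=1.0):
    """FIG. 6 style: (16*avg[J*sin^2] - 8*avg[J]) / a^2."""
    first = 16.0 * periodic_average(
        lambda t: f(u0 + a * math.sin(omega * t)) * math.sin(omega * t) ** 2, omega)
    second = 8.0 * periodic_average(
        lambda t: f(u0 + a * math.sin(omega * t)), omega)
    return (first - second) / a ** 2

def evaluation(u):
    # Hypothetical evaluation function: f'(u) = 6*(u - 1), f''(u) = 6.
    return 3.0 * (u - 1.0) ** 2 + 2.0

jac = estimate_jacobian(evaluation, u0=2.0)
hess = estimate_hessian(evaluation, u0=2.0)
print(jac, hess)
```

For a quadratic the estimates are exact up to rounding; for other functions the higher-order Taylor terms introduce an error that shrinks with the dither amplitude a.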

The parameter adjusting unit 2 could adjust the integral gain by using the Hessian of the evaluation function estimated in this way as it is, but in that case the extreme value control may become unstable for the reason described later. Therefore, in the extreme value control system 1 of the present embodiment, the Jacobian estimated by the low-pass filter 14 is regularized based on the estimated value of the Hessian, and the regularized Jacobian signal is supplied to the integrator 15. As a result, the extreme value control system 1 of the present embodiment can adaptively update the integral gain while avoiding instability of the extreme value control.

Specifically, the regularization signal output unit 23 generates a signal (hereinafter referred to as “regularization signal”) that regularizes the Jacobian signal output by the gradient estimation unit 22, and outputs it to the second multiplier 24. The second multiplier 24 receives the Jacobian signal G(t) from the low-pass filter 14 and the regularized signal from the regularized signal output unit 23, and regularizes the Jacobian signal by multiplying it by the regularized signal. The second multiplier 24 supplies the regularized Jacobian signal Gn(t) to the integrator 15.

[First method of generating a regularized signal]
In general, "regularization" of a signal means avoiding an ill condition in which, when some kind of inverse calculation is attempted on a target signal, the inverse does not exist and the calculation cannot be performed. A typical example of such an ill condition is division by zero.

On the other hand, as shown in FIG. 4, the extremum control system 1 of the present embodiment adaptively updates the integral gain of the integrator 15 by using the Hessian of the evaluation function. This is equivalent to fixing the integral gain at a constant value in each control cycle and performing extreme value control with the Jacobian of the evaluation function divided (normalized) by the Hessian. That is, the configuration of the extremum control system 1 in the present embodiment can be regarded as the basic configuration of the extreme value control system described above with a configuration for normalizing the Jacobian by the Hessian added to it.
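A short worked example may clarify why this normalization helps. It is a sketch under the assumption of a quadratic evaluation function with curvature c and minimizer U*; the symbols c, U*, f_0, and K_I are introduced here for illustration. With a fixed integral gain the convergence rate of the search scales with c, whereas dividing the Jacobian G by the Hessian H removes that dependence:

```latex
\begin{align}
f(U) &= f_0 + \tfrac{c}{2}\,(U - U^*)^2, \qquad G = c\,(U - U^*), \qquad H = c, \\
\frac{dU}{dt} &= -K_I\,G = -K_I\,c\,(U - U^*)
  \quad \text{(fixed gain: rate depends on } c\text{)}, \\
\frac{dU}{dt} &= -K_I\,\frac{G}{H} = -K_I\,(U - U^*)
  \quad \text{(normalized: rate independent of } c\text{)}.
\end{align}
```

The division by H is also where the ill condition enters: if the estimated Hessian passes through zero, G/H is undefined.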

Therefore, in the present embodiment, "regularization" is defined as avoiding ill conditions such as division by zero in the "normalization" of the Jacobian signal, and a signal that acts on the Jacobian signal to perform such regularization is generated as the regularized signal. Specifically, the regularized signal output unit 23 generates, as the regularized signal, a signal that realizes a signal conversion (⇔) satisfying each of the following conditions.

[Condition 1] G(t) = 0 ⇔ Gn(t) = 0
[Condition 2] G(t) is positive (negative) ⇔ Gn(t) is positive (negative)
[Condition 3] G(t) < ∞ ⇔ Gn(t) < ∞
[Condition 4] G(t) → ∞ ⇔ Gn(t) → k (0 < k < ∞)

G(t) represents the Jacobian signal, and Gn(t) represents the Jacobian signal that has been regularized (that is, divided by the Hessian) by the action of the regularization signal. [Condition 1] requires that Gn(t) be 0 only when G(t) is 0. [Condition 2] requires that G(t) and Gn(t) have the same sign. [Condition 3] requires that Gn(t) be finite whenever G(t) is finite (that is, division by zero does not occur). [Condition 4] requires that when G(t) diverges to ∞, Gn(t) does not diverge but converges to a positive finite value. A regularization signal having these properties is represented by, for example, equation (8).

δ in equation (8) is a positive constant (δ > 0), the so-called regularization constant. H(t) represents the estimated Hessian of the evaluation function. Since the regularization signal of equation (8) can be generated simply by taking the absolute value of the Hessian signal and adding the regularization constant δ to it, the stability of extreme value control can be improved while suppressing an increase in the size of the device.
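As a rough illustration of the first method, the sketch below assumes that equation (8), which is not reproduced in this text, has the form RS = 1/(|H(t)| + δ), as the description of adding δ to the absolute value of the Hessian signal suggests. The function names and the sample values are hypothetical.

```python
def regularization_signal_hessian(h: float, delta: float) -> float:
    """Assumed form of equation (8): RS = 1 / (|H| + delta), with delta > 0."""
    if delta <= 0.0:
        raise ValueError("delta must be positive to avoid division by zero")
    return 1.0 / (abs(h) + delta)


def regularize_jacobian(g: float, h: float, delta: float) -> float:
    """Role of the second multiplier 24: Gn = G * RS."""
    return g * regularization_signal_hessian(h, delta)


# Even when the Hessian estimate is exactly 0, the result stays finite.
gn_at_zero_hessian = regularize_jacobian(2.0, 0.0, 0.5)
```

With these sample values the denominator is δ alone, so the regularized Jacobian equals G/δ rather than diverging.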

For example, when searching for the minimum value by extreme value control, the extreme value can be searched for stably both for a downwardly convex evaluation function of order M (M > 1), whose Hessian takes a positive value near the extreme value, and for an evaluation function of order M (0 < M < 1), whose Hessian takes a negative value near the extreme value.

Note that the regularization signal generated using the Hessian does not necessarily have to be a first-order function of the Hessian as in equation (8). For example, it may be a function of the square of the Hessian as in equation (9) or (10), or a function of the M-th power of the Hessian (M = 1, 2, 3, ...) as in equation (11) or (12).

FIGS. 7 and 8 are diagrams showing configuration examples of an extremum control system that generates the regularization signal by the first method. Specifically, FIG. 7 shows a configuration example in which the Jacobian signal Gn(t) normalized by the regularization signal of equation (8) is input to the integrator 15. FIG. 8 shows a configuration example in which the Jacobian signal Gn(t) normalized by the regularization signal of equation (9) is input to the integrator 15.

[Second method of generating a regularized signal]
In the first method, since the Hessian signal is used to generate the regularization signal, some mechanism for estimating the Hessian of the evaluation function is required. For example, the extremum control system 1 shown in FIGS. 7 and 8 requires a first multiplier 21 and a gradient estimation unit 22 (for example, a low-pass filter) for generating the Hessian signal from the evaluation amount signal. In contrast, the second method generates the regularization signal without the first multiplier 21 and the gradient estimation unit 22 by using the estimated Jacobian instead of the estimated Hessian. For example, the regularization signal generated by the second method is represented by equation (13).

Here, G(t) represents the Jacobian of the evaluation function estimated by the gradient estimation unit 22, or an amount proportional to the Jacobian. The regularization signal generated by the second method does not necessarily have to be a first-order function of the Jacobian. As in the first method, it may be a function of the M-th power of the Jacobian (M = 1, 2, 3, ...), for example as in equations (14) and (15).

FIGS. 9 and 10 are diagrams showing configuration examples of an extreme value control system that generates the regularization signal by the second method. Specifically, FIG. 9 shows a configuration example in which the Jacobian signal Gn(t) normalized by the regularization signal of equation (13) is input to the integrator 15. FIG. 10 shows a configuration example in which the Jacobian signal Gn(t) normalized by the regularization signal of equation (14) (for M = 2) is input to the integrator 15. With the second method, the configuration of the extremum control system can be kept from becoming complicated by the regularization of the Jacobian signal.
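The second method can be sketched in the same way. The code below assumes that equation (13), which is not reproduced in this text, has the form RS = 1/(|G(t)| + δ), so that Gn(t) = G(t)/(|G(t)| + δ). The function name is hypothetical. Under that assumption, Conditions 1 to 4 can be checked directly.

```python
def regularize_by_jacobian(g: float, delta: float) -> float:
    """Assumed second method: Gn = G / (|G| + delta), with delta > 0."""
    if delta <= 0.0:
        raise ValueError("delta must be positive to avoid division by zero")
    return g / (abs(g) + delta)


# Condition 1: zero maps to zero.
gn_zero = regularize_by_jacobian(0.0, 0.1)
# Condition 2: the sign is preserved.
gn_negative = regularize_by_jacobian(-2.0, 0.1)
# Condition 4: a very large gradient maps to a value close to k = 1.
gn_large = regularize_by_jacobian(1.0e12, 0.1)
```

No Hessian estimate appears anywhere in this computation, which is the simplification the second method is aiming for.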

[Third method of generating a regularized signal]
The third method is a method in which the second method is further simplified by setting δ = 0 in the formulas (13) to (15). In this case, the regularized signal generated by the third method is represented by the equation (16).

In the first and second methods, δ (> 0) serves to avoid division by zero. If δ is set to 0, RS diverges to infinity when G(t) = 0, and the regularized Jacobian signal Gn(t) also diverges. Because such a signal does not satisfy Conditions 1 to 4, it would not ordinarily qualify as a regularization signal. It can nevertheless be used as one if the regularization of the Jacobian signal is defined as in equation (17).

Equation (17) defines the regularization of the Jacobian signal as dividing the Jacobian signal by the regularization signal RS of equation (16). Since this operation divides the Jacobian signal by its own absolute value, Gn(t) in equation (17) takes a value of either -1 or +1. In other words, the regularization signal generated by the third method acts on the Jacobian signal to extract its sign information (-1 or +1). Therefore, when the regularization defined in equation (17) is performed, the signal represented by equation (16) functions as a regularization signal in principle.

Since the purpose of this regularization is to extract the sign information, the process of actually dividing G(t) by its absolute value |G(t)| is not always necessary as long as the sign information of the Jacobian signal can be extracted. Therefore, the regularization signal output unit 23 in this case may be configured to calculate sgn(G(t)) directly. Adopting such a configuration also avoids division by zero. Here, sgn(x) represents a function that returns the sign of the value x.

FIG. 11 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the first embodiment.
The sign information extracted in this way is simply a signal indicating only the direction (increase or decrease) in which the manipulated variable should be moved. Therefore, by providing such a regularization configuration, a simple extremum control system such as the one shown in FIG. 11 can be configured. In this case, since the amount of change in the manipulated variable in each control cycle is a constant value, the amount of change may be adjusted to a desired amount by multiplying sgn(G(t)) by a coefficient.
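A minimal sketch of the third method follows. It computes the sign directly instead of dividing G(t) by |G(t)|, so no division occurs. Returning 0 when G(t) is exactly 0 is a choice made for this sketch, and the names and the coefficient value are hypothetical.

```python
def sgn(x: float) -> int:
    """Sign of x: -1, 0 or +1 (no division involved)."""
    return (x > 0) - (x < 0)


def scaled_sign(g: float, coefficient: float) -> float:
    """Regularized Jacobian of the third method, scaled to a desired step size."""
    return coefficient * sgn(g)


# The size of the move depends only on the coefficient, not on |G|.
small_gradient_step = scaled_sign(-0.003, 0.05)
large_gradient_step = scaled_sign(-300.0, 0.05)
```

Both sample calls return the same value, which illustrates why the change per control cycle is constant in this configuration.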

Since equation (16) is obtained by setting δ = 0 not only in equation (13) but also in equations (14) and (15), equations (13) to (15) can be regarded as equation (16) with its action adjusted by the parameter δ > 0. In fact, regularization by sgn() gives only the direction in which the manipulated variable should be moved, so chattering may occur near the extremum if the parameters are not adjusted sufficiently. In that case, the regularization signal generated using equations (13) to (15), in which equation (16) is adjusted by δ > 0, is considered to act on the Jacobian signal so as to suppress chattering near the extreme value.

The second multiplier 24 regularizes the Jacobian signal G(t) by multiplying it by the regularization signal generated in this way, and outputs the regularized Jacobian signal Gn(t) to the integrator 15. The regularization of the Jacobian signal by the parameter adjusting unit 2 is expressed by equation (18).

The parameter adjusting unit 2 adaptively updates the integral gain of the integrator 15 using the Jacobian signal regularized in this way. In general, the direction in which the manipulated variable should be moved in extreme value control is given by the sign of the gradient (Jacobian) estimated for the evaluation function, which is an unknown function of the manipulated variable, and the amount of movement is adjusted by the integral gain. Here, a method of adjusting the integral gain based on the concept of the average system described in Non-Patent Document 2 and elsewhere will be described.

The average system is a system that represents the dynamic behavior of the periodic average of a system when a periodic input is applied to it. The average system is generally used in the stability analysis of extreme value control systems. For example, Non-Patent Document 2 specifically describes the dynamics of the average system of an extreme value control system for a static plant having no dynamics. The average system is represented by equations (19) and (20).

In equation (19), G(u) represents the estimated Jacobian of the evaluation function, which is an unknown function of the manipulated variable u, and a represents the amplitude of the dither signal. P represents the power of the dither signal: P = 1/2 when the dither signal is a sine wave, P = 1/3 when it is a triangular wave, and P = 1 when it is a square wave. τ represents time scaled by the frequency ω of the dither signal (τ = ωt, where t is real time), and KI0 represents the integral gain on the τ time axis. KI0 is converted into the integral gain KI on the real-time axis t by equation (20).
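The three values of P quoted above can be checked numerically. The sketch below assumes that P is the mean square value of a unit-amplitude dither waveform over one period. The waveform definitions and function names are hypothetical.

```python
import math


def mean_square(waveform, samples: int = 10_000) -> float:
    """Average of waveform(phase)**2 over one period, with phase in [0, 1)."""
    total = 0.0
    for i in range(samples):
        phase = (i + 0.5) / samples  # midpoint of each sub-interval
        total += waveform(phase) ** 2
    return total / samples


def sine(phase: float) -> float:
    return math.sin(2.0 * math.pi * phase)


def triangle(phase: float) -> float:
    # Falls linearly from +1 to -1 over the first half period, then rises back.
    return 4.0 * abs(phase - 0.5) - 1.0


def square(phase: float) -> float:
    return 1.0 if phase < 0.5 else -1.0


p_sine = mean_square(sine)          # approximately 1/2
p_triangle = mean_square(triangle)  # approximately 1/3
p_square = mean_square(square)      # exactly 1
```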

Further, letting u* be the equilibrium point of the manipulated variable u and taking the deviation of the periodic average as u~ = u − u*, equation (21) is obtained from equation (19). Here, "u~" denotes the symbol "u" with "~" directly above it.

The average system represented by equations (19) and (21) expresses the convergence dynamics of extreme value control, that is, how quickly the evaluation amount converges to the minimum value as the manipulated variable is oscillated by the dither signal. Non-Patent Document 2 assumes that the controlled process is static; this corresponds to setting the period of the dither signal sufficiently longer than the time constant of the plant. In other words, if the dither signal frequency ω is set sufficiently lower than the cutoff frequency of the controlled process, a controlled process with dynamics can be regarded as approximately static. This is also supported by the singular perturbation theory used in the stability analysis of extremum control.

Further, even for an extreme value control system with the basic configuration including a high-pass filter and a low-pass filter as shown in FIG. 2, if the cutoff frequencies of these filters are set appropriately and the dominant (slowest) dynamics are those of the integrator 15, the average system of equation (19) can be used to characterize the overall behavior of the extreme value control system. In such a case, the integral gain can therefore be adjusted by using equation (19).

Since the Jacobian G(u) of the evaluation function in equation (19) is often a nonlinear function of u, equation (19) is generally a nonlinear differential equation. The average system obtained by linearizing equation (19) with respect to u around an appropriate operating point u0 can be expressed by equation (22).

Here, "u^" denotes the symbol "u" with "^" directly above it, and u^ = u − u0. H(u0) represents the Jacobian of G(u) at u0; that is, H(u0) is the Hessian of the evaluation function. Therefore, in the present embodiment, the convergence speed of extreme value control is adjusted by adaptively adjusting the integral gain KI0 so that it is proportional to the reciprocal of H(u0). In this adjustment of the integral gain, the regularization signal generated by the first method (see equations (8) to (12)) avoids division by zero by the estimated Hessian and suppresses sudden sign changes.

In addition, the regularization signal of the first method is defined as a new signal by moving the time-varying Hessian estimate out of the calculation formula for the integral gain. Removing the factor that makes the integral gain variable in this way eliminates the term 1/H(u0) from the formula, so that the integral gain KI0 can be adjusted as a fixed value. Further, the small constant δ (> 0) in the definition of the regularization signal serves to avoid division by zero by the Hessian, and KI0 takes its maximum value (1/δ times the constant term) when the Hessian estimate becomes 0. Therefore, the integral gain can be adjusted by determining δ from the maximum value assumed for KI0.
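The relation between δ and the maximum gain can be sketched as follows, assuming the effective gain has the form K/(|H| + δ), where K is the fixed constant term. That form follows from the assumed shape of equation (8) and is not stated explicitly in this text. The names and numbers are hypothetical.

```python
def effective_gain(k_const: float, h: float, delta: float) -> float:
    """Assumed effective integral gain: K / (|H| + delta)."""
    return k_const / (abs(h) + delta)


def delta_for_max_gain(k_const: float, k_max: float) -> float:
    """Choose delta so that the gain at H = 0 equals the assumed maximum k_max."""
    return k_const / k_max


# Example: constant term 2.0, largest gain the designer is willing to allow 40.0.
delta = delta_for_max_gain(2.0, 40.0)
gain_at_zero_hessian = effective_gain(2.0, 0.0, delta)
gain_at_unit_hessian = effective_gain(2.0, 1.0, delta)
```

The gain is largest when the Hessian estimate is 0 and falls as |H| grows, so δ acts as a cap on the gain.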

Equation (22) is a linear approximation of the nonlinear equation (19). A more direct way of suppressing the influence of G(u), the nonlinear element of equation (19), is to eliminate G(u) itself. The second and third methods described above generate the regularization signal according to this idea. Since the integral gain KI0 in equation (19) is a parameter to be adjusted and can be determined by the designer, it can be regarded as a kind of manipulated variable that transforms the differential equation (19). By newly defining an integral gain KI0' that satisfies equation (23), the nonlinear element in equation (19) can be eliminated.

However, when the equation (23) is applied, the information about the Jacobian is not included in the equation (19), so that the information necessary for determining the direction in which the manipulated variable should be moved is lost. Therefore, by taking the absolute value of the Jacobian as in the equation (24) instead of the equation (23), the sign information of the Jacobian can be left in the equation (19). In this case, equation (19) is expressed as equation (25).

In this way, the nonlinearity of the Jacobian can be eliminated while the information needed to determine the search direction of the manipulated variable is retained. That is, the regularization signal of the third method can be regarded as a signal that removes the nonlinearity of the Jacobian from the calculation formula for the integral gain. Since equation (25) contains only the sign information of the Jacobian, the new integral gain KI0' is easy to adjust. KI0' may be adjusted with reference to the adjustment method shown in FIG. 5, bearing in mind that the transient behavior of extreme value control based on equation (25) becomes linear.

While the regularization according to the third method is expressed very simply, chattering may occur near the extreme value because only the sign information of the Jacobian is used. The regularization according to the second method can be regarded as introducing a small constant δ > 0 into the third method in order to avoid this chattering. In this case, taking the integral gain to be converted as in equation (26), the average system corresponding to equation (25) is represented by equation (27).

When the Jacobian value is large, the effect of δ is small, so the average system of equation (27) behaves in the same way as the average system of equation (25). When the Jacobian value is small, the influence of δ becomes large, so the average system of equation (27) behaves in proportion to the Jacobian. Therefore, in the regularization according to the second method as well, the integral gain KI0' can basically be adjusted in the same way as for equation (25), simply by setting δ so as to prevent chattering near the extreme value.
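The difference between the two behaviors can be seen in a small discrete-time experiment. This is only a sketch under stated assumptions: the evaluation function J(u) = u² (gradient 2u), the step size, δ and the initial value are all hypothetical, and the loop is a plain Euler iteration of the averaged dynamics rather than the system of the embodiment.

```python
def sgn(x: float) -> int:
    return (x > 0) - (x < 0)


def search(u0: float, step: float, steps: int, regularize) -> float:
    """Iterate u <- u - step * regularize(G(u)) for the hypothetical J(u) = u**2."""
    u = u0
    for _ in range(steps):
        g = 2.0 * u  # gradient of J(u) = u**2
        u -= step * regularize(g)
    return u


# Third method: sign only. The fixed step overshoots the minimum back and forth.
u_sign = search(1.05, 0.1, 200, sgn)

# Second method (assumed form G / (|G| + delta)): steps shrink near the minimum.
u_soft = search(1.05, 0.1, 200, lambda g: g / (abs(g) + 0.5))
```

In this setup the sign-only search ends up oscillating at a distance of about half a step from the minimum, while the δ-regularized search keeps contracting toward it.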

[Fourth method of generating a regularized signal]
The regularization signal of the fourth method is a signal represented by an approximate sign estimate, obtained by approximating the sign estimate of the Jacobian signal with a continuous function.
When the Jacobian signal is regularized by the regularization signal of the fourth method, the regularized Jacobian signal is a smooth continuous approximation of the regularized Jacobian signal (sign signal) obtained by the third method, and corresponds to a generalization of the regularized Jacobian signal obtained by the second method.

Setting δ = 0 in the second method yields the regularization signal of the third method. Conversely, introducing a small constant δ > 0 into the third method yields the regularization signal of the second method. However, the regularization signal of the second method is not the only approximation function that satisfies Conditions 1 to 4 above while approximating the sign function, which is the regularization signal of the third method.

For example, if the sign function is approximated by a continuous function or a smooth continuous function, the definition of the regularized signal in Condition 1-4 is satisfied. There are many (smooth) approximation functions of such a sign function, but the approximation function is not limited to the regularization signal by the second method described above, and for example, the following approximation functions can be considered.

A. Saturation function

Here, m(G(t)) is a strictly monotonically increasing function of G(t) satisfying m(0) = 0; a typical example is a power function of G(t) such as −sgn(G(t))/α · |G(t)|^ρ with α > 0 and ρ > 0. For example, the case ρ = 1 is equivalent to cutting off the gradient G(t) at ±1.

B. Sigmoid function (hyperbolic tangent)

C. Arc tangent

In addition to examples A to C above, functions that are sigmoid functions in a broad sense, such as the cumulative normal distribution function, the Gompertz function, and the Gudermannian function, may also be used after being translated so that the origin is at the center and rescaled so that the range is ±1. Alternatively, one may take the step response of a stable transfer function that causes neither overshoot nor oscillation (for example, a higher-order lag system; for a first-order lag system the step response is 1 − exp(−t)), replace the time t with G(t) (for a first-order lag system, 1 − exp(−G(t))), and fold the result so that it is point-symmetric about the origin (for a first-order lag system, sgn(G(t))(1 − exp(−|G(t)|))). Since such functions also act as smooth approximations of the sign function, they can be used as regularization signals.
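The candidate approximation functions can be sketched as below. The equations for functions A to C are not reproduced in this text, so the forms used here are assumptions: the saturation function clips m(G) = sgn(G)·|G|^ρ/α to ±1 (the leading minus sign in the text is dropped so that m is increasing), the sigmoid is tanh(αG), and the arc tangent is scaled by 2/π. All names and default parameters are hypothetical.

```python
import math


def sgn(x: float) -> int:
    return (x > 0) - (x < 0)


def saturation(g: float, alpha: float = 1.0, rho: float = 1.0) -> float:
    """A (assumed form): clip m(G) = sgn(G) * |G|**rho / alpha to [-1, 1]."""
    m = sgn(g) * abs(g) ** rho / alpha
    return max(-1.0, min(1.0, m))


def sigmoid_tanh(g: float, alpha: float = 1.0) -> float:
    """B (assumed form): hyperbolic tangent with slope parameter alpha."""
    return math.tanh(alpha * g)


def arctan_scaled(g: float, alpha: float = 1.0) -> float:
    """C (assumed form): arc tangent rescaled so that the range is (-1, 1)."""
    return (2.0 / math.pi) * math.atan(alpha * g)


def folded_first_order(g: float) -> float:
    """Folded first-order-lag step response: sgn(G) * (1 - exp(-|G|))."""
    return sgn(g) * (1.0 - math.exp(-abs(g)))


CANDIDATES = (saturation, sigmoid_tanh, arctan_scaled, folded_first_order)
```

Each candidate is zero at the origin, preserves the sign of G(t), and stays within ±1, which is what Conditions 1 to 4 ask for.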

FIG. 12 is a diagram for explaining an example of the regularization signal generated by the fourth method in the first embodiment. FIG. 12 illustrates the relationship between the gradient (Jacobian signal) G(t) and the regularized Jacobian signal Gn(t); the origin G(t) = Gn(t) = 0 corresponds to the extremum of the extremum search.

Approximating the regularized Jacobian signal of the third method (the sign function) with a continuous function corresponds to adjusting the behavior of the regularization signal near the extremum. That is, if a saturation function that simply cuts off the gradient at ±1 is applied, conventional gradient-type extreme value control is applied as it is in the vicinity of the extreme value. In conventional extreme value control, when the evaluation function is exactly a quadratic function of the manipulated variable, the gradient obtained by differentiating it is a linear function. When the gradient method is applied in this case, it has the convergence characteristic of a linear system (exponential convergence) near the extremum. Such convergence is generally considered preferable.

Therefore, if the gradient of the evaluation function is proportional to a power of the manipulated variable u, G(t) ∝ u^n (that is, the evaluation function is proportional to u^(n+1)), setting Gn(t) = G(t)^(1/n) gives Gn(t) ∝ u, so that exponential convergence can be obtained in the vicinity of the extremum. In practice, the shape of the evaluation function near the extremum is unknown, so n cannot be obtained theoretically. However, the behavior near the extreme value can be fine-tuned by selecting the approximation function of the sign function and adjusting its parameters while checking the behavior near the extremum.
For example, as shown in FIG. 12, changing the value of α in the sigmoid function changes the value of G(t) at which the absolute value of the regularized Jacobian signal Gn(t) becomes smaller than 1, so that the range A of the "neighborhood" of the extreme value can itself be adjusted.
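The power-law argument can be checked with a short calculation. The sketch assumes a hypothetical evaluation function J(u) = u⁴, so that G = 4u³ and n = 3, and uses a signed root so that negative gradients are handled. The function name is hypothetical.

```python
def signed_root(g: float, n: float) -> float:
    """Gn = sgn(G) * |G|**(1/n), the signed n-th root of the gradient."""
    sign = (g > 0) - (g < 0)
    return sign * abs(g) ** (1.0 / n)


def gradient(u: float) -> float:
    """Gradient of the hypothetical evaluation function J(u) = u**4."""
    return 4.0 * u ** 3


# The ratio Gn / u is the same constant, 4**(1/3), at every operating point,
# so the regularized gradient is linear in u.
ratios = [signed_root(gradient(u), 3) / u for u in (-2.0, -0.1, 0.5, 3.0)]
```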

FIG. 13 is a diagram for explaining another example of the regularization signal generated by the fourth method in the first embodiment. Like FIG. 12, FIG. 13 illustrates the relationship between the Jacobian signal G(t) and the regularized Jacobian signal Gn(t), and the origin G(t) = Gn(t) = 0 corresponds to the extremum of the extremum search.

For example, as shown in FIG. 13, switching between an upwardly convex continuous function and a downwardly convex continuous function in the extremum neighborhood B makes it possible to suppress chattering near the extreme value and to mitigate the phenomenon in which the extremum search control stops in the vicinity of the extremum without converging to it.

If an upwardly convex shape (for example, α = 3 in the sigmoid function) is selected for the regularized Jacobian signal Gn(t), Gn(t) becomes closer to a pure sign function. Selecting a function of this shape when the converged value does not reach the extreme value therefore enables a search that comes closer to the true extremum. Note, however, that chattering may occur if the upward convexity is too strong (too close to the sign function).
On the other hand, if a downwardly convex shape (for example, α = 10, ρ = 1.5 in the saturation function) is selected for the regularized Jacobian signal Gn(t), chattering near the extreme value is suppressed. Note, however, that the downward convexity weakens the extremum search performance.

FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using the regularization signal in the first embodiment. The figures illustrate the simulated response of the manipulated variable when various regularization signals are applied to a virtual control target in which the function y = u⁵ is added to a first-order lag system.
The example shown in FIG. 14A is a comparative example to which no regularization signal is applied, and it is tuned so as to operate well when the initial value is 2. If the initial value is changed to 3, the extreme value control diverges. This is because the first derivative of y = u⁵ is y' = 5u⁴, so that y' = 5·2⁴ = 80 when the initial value is 2, whereas y' = 5·3⁴ = 405 when the initial value is 3, and the gradient changes significantly.

Further, even when the initial value was set to 2, the manipulated variable did not converge to the extreme value (= 1) within a predetermined time. This is because y' rapidly approaches 0 as u approaches 0. That is, for this virtual control target, the gradient becomes almost flat near the extreme value.

FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the sign signal of the gradient as the regularization signal in the first embodiment.
In contrast to the comparative example shown in FIG. 14A, the example shown in FIGS. 14B-14D applies a regularization signal. The example shown in FIG. 14B is a case where a sign function which is a regularized signal by the third method is applied. At this time, the extremum can be searched at the same convergence speed regardless of whether the initial value is 2 or 3, but chattering occurs near the extremum 1. This is because the sign function is the strongest discontinuous switching function.

FIGS. 14C and 14D are diagrams for explaining an example of the effect obtained when the regularization signal generated by the fourth method is used in the first embodiment.
In the example shown in FIG. 14C, the saturation function (ρ = 1) shown in A of the fourth method is applied as a continuous approximation function of the sign function. Similar to FIG. 14B, the extreme value can be searched at the same convergence speed regardless of the initial value, but the extreme value is not reached within a predetermined time as in FIG. 14A. This is because the gradient itself is used near the extreme value, so that the same phenomenon as in FIG. 14A occurs near the extreme value.

In the example shown in FIG. 14D, a saturation function whose argument is a power function of G(t), for example −sgn(G(t))/α · |G(t)|^ρ with α > 0, ρ > 0 and ρ ≠ 1 as shown in A of the fourth method, is applied as the continuous approximation of the sign function, with α = 1 and ρ = 1/5. In this example, the extremum can be searched for regardless of the initial value, and chattering can be suppressed. As this example shows, the response of the extreme value search control can be adjusted by making good use of a continuous function that approximates the sign function.

An extremum control system using the regularization signal generated by the fourth method can be realized, for example, by replacing the sign function sgn() in the configuration example shown in FIG. 11 with an approximation function of G(t) (for example, functions A to C above).
It can also be realized by having the regularization signal output unit 23 in the configuration examples shown in FIGS. 9 and 10 perform an operation corresponding to the approximation function. That is, the arithmetic expression in the regularization signal output unit 23 may be set so that its output is the regularized Jacobian signal divided by G(t).

According to the fourth method, the regularization function is not limited to that of the third method. By selecting a continuous approximation function and adjusting its parameters, the behavior near the searched extreme value can be adjusted easily, and finer extremum search control can be realized.

As described above, the regularization signal generated by the regularization signal output unit 23 serves to remove the complicated behavior caused by the nonlinear element of the evaluation function from the average system that determines the direction and amount by which the manipulated variable is moved, and the parameter adjusting unit 2 regularizes the Jacobian signal output by the gradient estimation unit 22 by using a regularization signal having this property. The integrator 15 can then control the search direction of extreme value control more appropriately by integrating the regularized Jacobian using the adjusted integral gain.

The extremum control system 1 of the first embodiment configured as described above includes a parameter adjusting unit that regularizes the Jacobian signal and adaptively adjusts the integral gain based on the regularized Jacobian signal. This makes it possible to adapt the extreme value control to the dynamics of the controlled process and to operate it more stably.

Specifically, by regularizing the Jacobian signal, the extremum control system 1 of the first embodiment achieves the following (1) and (2), which makes it possible to adjust the integral gain easily without impairing the stability of extreme value control. (1) Division by zero is avoided when calculating the integral gain. (2) The sign information of the Jacobian signal is maintained (sudden sign inversion is avoided).
As a result, the operator of the controlled process can adjust the convergence speed of extreme value control more easily and safely.

(Second Embodiment)
FIG. 12 is a diagram showing a configuration example of the extreme value control system 1a of the second embodiment. The extreme value control system 1a differs from the extreme value control system 1 of the first embodiment in that it further includes a manipulated variable correction unit 3. The extreme value control system 1a shown in FIG. 12 is configured by adding the manipulated variable correction unit 3 to the extreme value control system 1 shown in FIG. 9, which is one of the configuration examples of the first embodiment that generate the regularization signal by the second method. Since the other components of the extreme value control system 1a are the same as those of the extreme value control system 1 of the first embodiment, they are given the same reference numerals as in FIG. 3, and their description is omitted.

When a regularization signal based on the Hessian of the evaluation function is used, a linear approximation system such as equation (22) is obtained. Therefore, if the estimation accuracy of the Hessian is high, the manipulated variable and the evaluation amount are considered to converge exponentially to the extreme value. This mode of convergence is a characteristic of a (first-order) linear system, and it is considered that this characteristic can be obtained more reliably by increasing the estimation accuracy of the Hessian.

On the other hand, when a regularization signal based on the Jacobian of the evaluation function is used, an average system such as equation (25) is obtained, so the convergence to the extreme value is considered to be linear. Such a linear search may sometimes be required, but in general it is often required that the manipulated variable or evaluation amount approach the extreme value quickly when it is far from the target extremum and gradually when it is near.

In extreme value control represented by an average system such as equation (25), chattering is highly likely to occur near the extreme value. Such chattering can be suppressed by adjusting equation (25) with a small constant δ (> 0) as in equation (27) above, but this adjusts only the behavior of the extremum search near the extremum, not the movement of the extremum search as a whole.

Therefore, in order to adjust the behavior of the average system of equation (19) or (22) as a whole, it is sufficient for the manipulated variable u or the manipulated variable deviation u~ to appear on the right side. That is, it suffices if equation (22) can be transformed as in equation (29).

Here, F(u~) is a function that satisfies F(0) = 0 and whose sign matches the sign of u~. That is, F(u~) satisfies u~ × F(u~) > 0. The simplest example is F(u~) = u~, in which case equation (29) is a linear differential equation and u~ converges exponentially to zero. The reason for considering u~ instead of u is that the optimum value u* of u is unknown, whereas u~ needs to converge to zero, so the equilibrium point of equation (29) can simply be set to 0. Further, by using an appropriate power function of u~, the operating point can be moved toward the extremum faster when it is far from the extremum and more gently when it is near the extremum.

The manipulated variable correction unit 3 is provided to obtain an average system that behaves as in equation (29). By feeding back the manipulated variable u and multiplying it with the regularization signal, it corrects the manipulated variable so that the farther the operating point is from the extremum, the more strongly it moves toward the extremum. For example, when the regularization signal is generated by the third method, the average system of equation (25) is obtained. To obtain the system of equation (29) from it, noting that the derivative of u and the derivative of u~ are equal, the right side of equation (25) may be multiplied by F(u~).

However, since u~ = u − u* and the optimum value u* of the manipulated variable is unknown, u~ cannot be used directly. The manipulated variable u itself is available, however, so an estimate of u~ can be obtained approximately by applying a high-pass filter to u and thereby removing the unknown constant term u*. That is, u~ can be estimated by equation (30).
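The removal of the unknown constant term by a high-pass filter can be sketched as follows. Equation (30) is not reproduced here, so the sketch assumes a first-order high-pass filter of the form s / (s + ω) discretized by a backward difference. The cutoff frequency, the test signal, and the function name high_pass are hypothetical.

```python
import math


def high_pass(samples, cutoff, dt):
    """First-order high-pass filter: backward-difference form of s / (s + cutoff)."""
    alpha = 1.0 / (1.0 + cutoff * dt)
    out = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        out.append(alpha * (out[-1] + cur - prev))
    return out


dt = 0.01
u_opt = 3.0  # unknown to the controller; used here only to build a test signal
# Assumed manipulated variable: constant optimum plus a 1 Hz variation around it.
u = [u_opt + 0.5 * math.sin(2.0 * math.pi * i * dt) for i in range(2000)]

u_tilde_hat = high_pass(u, cutoff=0.5, dt=dt)

tail = u_tilde_hat[-500:]             # five full periods after the transient
tail_mean = sum(tail) / len(tail)     # residual constant offset
tail_peak = max(abs(v) for v in tail) # amplitude of the retained variation
print(tail_mean, tail_peak)
```

In this sketch the constant term is removed while the variation around it is retained almost unchanged. Components of u~ that vary more slowly than the cutoff are attenuated as well, which is why the estimate is only approximate.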

By multiplying the signal of u~ estimated by equation (30) with the normalized Jacobian signal, the extremum search can be modified to have an exponential response characteristic. To move the operating point to the vicinity of the extremum even faster than the exponential response characteristic, a power of the value obtained by equation (30) may be used as the estimate of the manipulated variable deviation, for example, so that the operating point moves in larger steps.

FIGS. 13 and 14 are diagrams showing configuration examples of the extreme value control system 1a according to the second embodiment. Specifically, FIG. 13 shows an example of the extreme value control system 1a configured by adding the manipulated variable correction unit 3 to the extreme value control system 1 shown in FIG. 10, which is the configuration example of the first embodiment in which the regularization signal is generated by the second method. FIG. 14 shows an example of the extreme value control system 1a configured by adding the manipulated variable correction unit 3 to the extreme value control system 1 (see FIG. 11) of the first embodiment in which the regularization signal is generated by the third method.

The extreme value control system 1a of the second embodiment configured in this way provides the same effects as the extreme value control system 1 of the first embodiment and, in addition, makes it possible to finely adjust the speed of the extremum search. As a result, the operating point can be moved quickly to the vicinity of the extremum when it is far from the extremum and moved finely near the extremum, which makes it possible to improve both the speed and the accuracy of the extremum search.

(Application example)
FIG. 15 is a diagram showing an application example of the extreme value control system of the first embodiment or the extreme value control system of the second embodiment. FIG. 15 shows an example in which the extreme value control system of the embodiment is applied to a water treatment plant 4 that realizes a biological wastewater treatment process. For example, the water treatment plant 4 shown in FIG. 15 includes an anaerobic tank 41, an anoxic tank 42, an aerobic tank 43, and a final sedimentation pond 44. The anaerobic tank 41 is a facility for activating microorganisms. The anoxic tank 42 is a facility for removing nitrogen. The aerobic tank 43 is a facility for decomposing organic substances, removing phosphorus, and nitrifying ammonia. The final sedimentation pond 44 is a facility for settling activated sludge.

The water treatment plant 4 is equipped with devices such as pumps for transporting water and sludge between the above facilities, a blower for supplying air into the tank, and sensors for measuring the concentration of substances in the air or water. The chemical input pump 411 is a pump that feeds a chemical, such as a carbon source that activates microorganisms, into the anaerobic tank 41. The circulation pump 431 is a pump that controls the circulation amount of the water to be treated that circulates between the aerobic tank 43 and the anoxic tank 42. The blower 432 supplies air to the aerobic tank 43 to control the amount of aeration. The return sludge pump 441 is a pump that returns sludge from the final sedimentation pond 44 to the anoxic tank 42. The excess sludge extraction pump 442 is a pump that extracts excess sludge from the final sedimentation pond 44. The sensor 412 and the sensor 443 measure the water quality in the anaerobic tank 41 and the quality of the discharged water in the final sedimentation pond 44, respectively.

Generally, in such a biological wastewater treatment process, the manipulated variable is the return rate of the return sludge, and the controlled variables are the concentration of nitrogen contained in the discharged water (hereinafter referred to as "discharged nitrogen concentration") and the concentration of phosphorus contained in the discharged water (hereinafter referred to as "discharged phosphorus concentration"). The return rate is obtained by dividing the discharge flow rate of the return sludge pump 441 by the inflow rate. The discharged nitrogen concentration and the discharged phosphorus concentration are acquired by the sensor 412 and the sensor 443. The controlled variables may instead be the amount of nitrogen contained in the discharged water (hereinafter referred to as "the amount of discharged nitrogen") and the amount of phosphorus contained in the discharged water (hereinafter referred to as "the amount of discharged phosphorus"). In this case, the amount of discharged nitrogen and the amount of discharged phosphorus are obtained by multiplying the discharged nitrogen concentration and the discharged phosphorus concentration, respectively, by the discharged flow rate.
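The two calculations described above can be written out as a short sketch. The function names, the units (m³/h for flow rates, mg/L for concentrations), and the numerical values are hypothetical and are not taken from the embodiment.

```python
def return_rate(return_sludge_flow, inflow):
    """Return rate = discharge flow rate of the return sludge pump / inflow rate."""
    if inflow <= 0:
        raise ValueError("inflow must be positive")
    return return_sludge_flow / inflow


def discharged_amount(concentration, discharged_flow):
    """Discharged amount = concentration in the discharged water x discharged flow rate.

    With concentration in mg/L (equal to g/m^3) and flow in m^3/h, the result is in g/h.
    """
    return concentration * discharged_flow


rate = return_rate(return_sludge_flow=50.0, inflow=200.0)                     # dimensionless
nitrogen_amount = discharged_amount(concentration=8.0, discharged_flow=200.0) # g/h
print(rate, nitrogen_amount)
```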

The extreme value control system 1b of the application example receives from the water treatment plant 4 an evaluation amount based on controlled variables such as the amount of discharged nitrogen and the amount of discharged phosphorus, and executes extreme value control to update the manipulated variable so that the evaluation amount approaches its optimum value. As an example of the evaluation function used in this case, the evaluation amount may be expressed as the sum (hereinafter referred to as "total cost") of a water quality cost based on the concept of a wastewater levy and the electric power cost of the return sludge pump 441. The electric power cost of the return sludge pump 441 can be calculated from the return sludge flow rate, the rated power of the return sludge pump 441, and the like. Generally, under the concept of a wastewater levy, the water quality cost is expressed by the following formula.

In formula (31), COD means chemical oxygen demand, BOD means biochemical oxygen demand, TN means discharged nitrogen, and TP means discharged phosphorus. The conversion factor for each cost may be determined based on the actual wastewater levy or may be determined by other methods. In general, it is known that TN and TP change significantly when the return rate is changed. Therefore, here, the water quality cost J1 relating to the control of the return rate is defined as in equation (32).

In addition to this water quality cost, an operating cost J2 may be defined as the sum of the power cost of the blower, which changes indirectly when the return flow rate is changed, and the power cost of the return sludge pump, and a function whose value is the total cost obtained by summing the operating cost J2 and the water quality cost J1 may be defined as the evaluation function. For example, the operating cost J2 can be defined as in equation (33).
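The structure of such a total-cost evaluation amount can be sketched as follows. Equations (32) and (33) are not reproduced here, so the sketch assumes that J1 is a weighted sum of the amounts of discharged nitrogen and phosphorus and that J2 is the electricity price multiplied by the combined power of the blower and the return sludge pump. The function names, conversion factors, and numerical values are hypothetical.

```python
def water_quality_cost(tn_amount, tp_amount, cost_per_tn, cost_per_tp):
    """Assumed form of J1: weighted sum of discharged nitrogen (TN) and phosphorus (TP)."""
    return cost_per_tn * tn_amount + cost_per_tp * tp_amount


def operating_cost(blower_power_kw, pump_power_kw, price_per_kwh):
    """Assumed form of J2: power cost of the blower plus power cost of the return sludge pump."""
    return price_per_kwh * (blower_power_kw + pump_power_kw)


def total_cost(j1, j2):
    """Evaluation amount: sum of the water quality cost J1 and the operating cost J2."""
    return j1 + j2


j1 = water_quality_cost(tn_amount=2.0, tp_amount=0.5, cost_per_tn=3.0, cost_per_tp=10.0)
j2 = operating_cost(blower_power_kw=30.0, pump_power_kw=10.0, price_per_kwh=0.25)
j_total = total_cost(j1, j2)
print(j1, j2, j_total)
```

In this sketch the evaluation amount is a single scalar. A single scalar evaluation amount is what allows the extreme value control to search for the return rate that minimizes the total cost without a model of the plant.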

The extreme value control system 1b of the application example extracts a Jacobian signal from the evaluation amount obtained with such an evaluation function and performs extreme value control by applying the above-described regularization signal to the Jacobian signal, so that the integral gain can be updated adaptively to the dynamics of the water treatment plant 4. As a result, the extreme value control system 1b of the application example can search for the optimum manipulated variable that minimizes the total cost with more stable operation. The extreme value control system of the embodiment can be applied to the control of any process that outputs a controlled variable in response to the input of a manipulated variable. For example, the controlled process may be a sewage treatment process, a combustion process, a petrochemical process, or the like.

According to at least one embodiment described above, the optimal control device includes: a first gradient estimation unit that estimates the Jacobian of the evaluation function; a manipulated variable determination unit that determines the direction and amount by which to move the manipulated variable by integrating the estimated value of the Jacobian; a second gradient estimation unit that estimates the Hessian of the evaluation function; and a parameter adjustment unit that adjusts the integral gain of the manipulated variable determination unit according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0. With this configuration, extreme value control can be operated more stably while adapting to the dynamics of the controlled process.

Although some embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and modifications thereof are included in the scope and gist of the invention, as well as in the scope of the invention described in the claims and the equivalent scope thereof.

Claims

An optimal control device that executes extreme value control to update a manipulated variable of a controlled process so that an evaluation amount, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches the optimum value of the evaluation function, the optimal control device comprising:
a first gradient estimation unit that estimates the Jacobian of the evaluation function based on a signal indicating the evaluation amount;
a manipulated variable determination unit that determines the direction and amount by which to move the manipulated variable by integrating the estimated value of the Jacobian;
a second gradient estimation unit that estimates the Hessian of the evaluation function based on the signal indicating the evaluation amount; and
a parameter adjustment unit that adjusts the integral gain of the manipulated variable determination unit according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.
The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the N-th root of the value obtained by adding a minute positive constant δ to the N-th root value (N > 0) of the absolute value of the estimated value of the Hessian, or by the value obtained by dividing the (M−1)-th power (M ≥ 1) of the absolute value of the estimated value of the Hessian by the value obtained by adding a minute positive constant δ to the M-th power of the absolute value of the estimated value of the Hessian.
The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the N-th root of the value obtained by adding a minute positive constant δ to the N-th root value (N > 0) of the absolute value of the estimated value of the Jacobian, or by the value obtained by dividing the (M−1)-th power (M ≥ 1) of the absolute value of the estimated value of the Jacobian by the value obtained by adding a minute positive constant δ to the M-th power of the absolute value of the estimated value of the Jacobian.
The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the absolute value of the estimated value of the Jacobian.
The optimal control device according to claim 1, wherein the regularization signal is a signal represented by an approximate sign estimate obtained by approximating the sign of the estimated value of the Jacobian with a continuous function.
The optimal control device according to any one of claims 3 to 5, further comprising a correction unit that feeds back the manipulated variable determined by the manipulated variable determination unit and multiplies it by the regularization signal, thereby correcting the manipulated variable so that the farther the operating point is from the extremum, the more it moves toward the extremum.
An optimal control method for executing extreme value control to update a manipulated variable of a controlled process so that an evaluation amount, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches the optimum value of the evaluation function, the optimal control method comprising:
a step of estimating the Jacobian of the evaluation function based on a signal indicating the evaluation amount;
a manipulated variable determination step of determining the direction and amount by which to move the manipulated variable by integrating the estimated value of the Jacobian;
a step of estimating the Hessian of the evaluation function based on the signal indicating the evaluation amount; and
a step of adjusting the integral gain in the manipulated variable determination step according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination step by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.
A computer program for causing a computer, which functions as an optimal control device that executes extreme value control to update a manipulated variable of a controlled process so that an evaluation amount, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches the optimum value of the evaluation function, to execute:
a step of estimating the Jacobian of the evaluation function based on a signal indicating the evaluation amount;
a manipulated variable determination step of determining the direction and amount by which to move the manipulated variable by integrating the estimated value of the Jacobian;
a step of estimating the Hessian of the evaluation function based on the signal indicating the evaluation amount; and
a step of adjusting the integral gain in the manipulated variable determination step according to changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination step by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.