WO2020241657A1 - Optimum control device, optimum control method and computer program - Google Patents

Publication number
WO2020241657A1
WO2020241657A1 PCT/JP2020/020816 JP2020020816W WO2020241657A1 WO 2020241657 A1 WO2020241657 A1 WO 2020241657A1 JP 2020020816 W JP2020020816 W JP 2020020816W WO 2020241657 A1 WO2020241657 A1 WO 2020241657A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
signal
jacobian
evaluation
evaluation function
Prior art date
Application number
PCT/JP2020/020816
Other languages
French (fr)
Japanese (ja)
Inventor
理 山中
祐太 大西
由紀夫 平岡
Original Assignee
東芝インフラシステムズ株式会社
株式会社 東芝
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 東芝インフラシステムズ株式会社, 株式会社 東芝 filed Critical 東芝インフラシステムズ株式会社
Priority to JP2021522795A priority Critical patent/JP7183411B2/en
Priority to CN202080039159.8A priority patent/CN113874794A/en
Publication of WO2020241657A1 publication Critical patent/WO2020241657A1/en

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric

Definitions

  • Embodiments of the present invention relate to an optimum control device, an optimum control method, and a computer program.
  • Extreme value control is a model-free, real-time optimal control technology that does not use a complex model of the plant.
  • In outline, extreme value control forcibly varies the operation amount and searches for the operation amount that optimizes the evaluation amount, which is based on the control amount of the controlled process.
  • An object of the present invention is to provide an optimum control device, an optimum control method, and a computer program capable of operating extreme value control more stably by adapting to the dynamics of the controlled process.
  • The optimum control device of the embodiment executes extreme value control, which updates the manipulated variable of a controlled process so that an evaluation amount approaches its optimum value. The evaluation amount is a value based on the control amount of the controlled process and is represented by an evaluation function that is unknown with respect to the manipulated variable.
  • The optimum control device includes a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit.
  • The first gradient estimation unit estimates the Jacobian of the evaluation function based on the signal indicating the evaluation amount.
  • The manipulated variable determination unit determines the direction and amount by which the manipulated variable should be moved by integrating the estimated value of the Jacobian.
  • The second gradient estimation unit estimates the Hessian of the evaluation function based on the signal indicating the evaluation amount.
  • The parameter adjustment unit divides the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal, which is based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0. In this way, the integral gain of the manipulated variable determination unit is adjusted according to changes in the evaluation function.
  • FIG. 1A is a diagram illustrating a basic concept of extreme value control in the first embodiment.
  • FIG. 1B is a diagram illustrating a basic concept of extreme value control in the first embodiment.
  • FIG. 1C is a diagram illustrating a basic concept of extreme value control in the first embodiment.
  • FIG. 2 is a diagram showing a basic configuration example of an extreme value control system in the first embodiment.
  • FIG. 3 is a diagram showing a configuration example of the extreme value control system of the first embodiment.
  • FIG. 4 is a diagram showing a specific example of the method of adjusting the extreme value control parameter in the first embodiment.
  • FIG. 5 is a diagram showing a first configuration example of the gradient estimator according to the first embodiment.
  • FIG. 6 is a diagram showing a second configuration example of the gradient estimator according to the first embodiment.
  • FIG. 7 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the first method in the first embodiment.
  • FIG. 8 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the first method in the first embodiment.
  • FIG. 9 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the first embodiment.
  • FIG. 10 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the first embodiment.
  • FIG. 11 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the first embodiment.
  • FIG. 12 is a diagram for explaining an example of a regularized signal generated by the fourth method in the first embodiment.
  • FIG. 13 is a diagram for explaining another example of the regularization signal generated by the fourth method in the first embodiment.
  • FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using the regularization signal in the first embodiment.
  • FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the sign signal of the gradient as the regularization signal in the first embodiment.
  • FIG. 14C is a diagram for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
  • FIG. 14D is a diagram for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
  • FIG. 15 is a diagram showing a configuration example of the extreme value control system of the second embodiment.
  • FIG. 16 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the second embodiment.
  • FIG. 17 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the second embodiment.
  • FIG. 18 is a diagram showing an application example of the extreme value control system of the first embodiment or the second embodiment.
  • Extreme value control is a control method in which the manipulated variable is updated in the direction of approaching the optimum value while observing the change in the evaluated variable with respect to the manipulated variable.
  • The evaluation amount is a value that serves as an index for optimizing the process to be controlled (hereinafter referred to as “controlled process”), and is determined based on the controlled amount of the controlled process.
  • The evaluation quantity is represented by a predetermined evaluation function with the control quantity as a variable.
  • The evaluation quantity may be defined based on any evaluation standard as long as it is a value based on the control quantity.
  • The evaluation quantity may be the control quantity itself.
  • The evaluation function of the controlled process may be unknown as a function of the manipulated variable.
  • The manipulated variable is changed by applying a dither signal to the signal indicating the manipulated variable.
  • The dither signal is a signal whose value changes periodically, and is usually given as a sine wave.
  • The operation amount is continuously oscillated by the dither signal, and the resulting change (increase or decrease) in the evaluation amount is observed. Then, based on the observed change, a new operation amount that moves the evaluation amount toward the optimum value (maximum or minimum) of the evaluation function is calculated, and the current operation amount is updated to this new value.
  • Extreme value control is a control method that searches for the optimum value of the evaluation function by repeating such observation of the evaluation quantity and update of the manipulated quantity.
  • FIG. 1A shows an evaluation function curve EV assuming a downwardly convex quadratic function as an example of an evaluation function unknown with respect to the manipulated variable.
  • FIG. 1B shows a case where the signal indicating the evaluation amount changes in the opposite phase to the dither signal as a result of oscillating the operation amount of the controlled process with the dither signal (for example, the evaluation amount decreases as the operation amount increases).
  • Such a change occurs, for example, when the operating point moves in the region to the left of the minimum point P10 of the evaluation function curve EV (for example, when the operating point moves from the operating point P11 toward the minimum point P10).
  • FIG. 1C shows a case where the signal indicating the evaluation amount changes in the same phase as the dither signal as a result of oscillating the operation amount of the controlled process with the same dither signal as in FIG. 1B (for example, the evaluation amount also increases as the operation amount increases).
  • Such a change occurs, for example, when the operating point moves in the region to the right of the minimum point P10 of the evaluation function curve EV (for example, when the operating point moves from the operating point P12 toward the minimum point P10).
  • Accordingly, when searching for the minimum, the operation amount is decreased when the evaluation amount varies in the same phase as the operation amount, and increased when the evaluation amount varies in the opposite phase.
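  • The phase rule can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `search_direction` and `observe`, the quadratic evaluation function with its minimum at 2, and the dither amplitude are assumptions introduced here and do not come from the source.

```python
import math


def search_direction(dither, evaluation):
    """Return +1 (increase), -1 (decrease) or 0 for a minimum search.

    The sign of the correlation between the dither signal and the observed
    evaluation amount tells whether the two vary in phase or in anti-phase.
    """
    n = len(dither)
    mean_d = sum(dither) / n
    mean_e = sum(evaluation) / n
    corr = sum((d - mean_d) * (e - mean_e) for d, e in zip(dither, evaluation))
    if corr > 0:   # same phase: operating point is right of the minimum
        return -1
    if corr < 0:   # opposite phase: operating point is left of the minimum
        return 1
    return 0


def observe(f, u0, amplitude=0.1, samples=100):
    """Oscillate the operation amount around u0 over one dither period."""
    dither = [amplitude * math.sin(2 * math.pi * k / samples) for k in range(samples)]
    evaluation = [f(u0 + d) for d in dither]
    return dither, evaluation


# Hypothetical evaluation function (unknown to the controller), minimum at u = 2.
f = lambda u: (u - 2.0) ** 2

left = search_direction(*observe(f, u0=1.0))   # left of the minimum, like P11
right = search_direction(*observe(f, u0=3.0))  # right of the minimum, like P12
```

  • Here `left` corresponds to the situation of FIG. 1B and `right` to that of FIG. 1C; for a maximum search the returned signs would simply be swapped.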
  • PID (Proportional-Integral-Derivative) control, which has been widely used as a control method for industrial plants, is a target-value-tracking control method: it controls the operation amount so that the control amount follows a preset target value.
  • In contrast, extreme value control is an optimum-value-search control method that optimizes the evaluation amount.
  • As with PID control, there is no need to create in advance a process model that expresses the relationship between the operation amount and the control amount of the controlled process. Because of these properties, extreme value control can function effectively even for a controlled process for which a target value cannot be set in advance, and it therefore has the potential to become widespread in the future.
  • An extreme value control system that realizes extreme value control can be realized with a relatively simple configuration, as shown in FIG. 2.
  • FIG. 2 is a diagram showing a basic configuration example of an extreme value control system.
  • The extreme value control system 9 of FIG. 2 includes a modulation dither signal output unit 11, a high-pass filter (HPF) 12, a demodulation dither signal output unit 13, a low-pass filter (LPF) 14, and an integrator 15.
  • The configuration of the extreme value control system 9 is comparable in complexity to that of a conventional PID controller. Therefore, like a PID controller, the extreme value control system 9 can be easily implemented using hardware such as a PLC (Programmable Logic Controller).
  • The modulation dither signal output unit 11 applies a dither signal to forcibly change the operation amount of the controlled process.
  • Specifically, the modulation dither signal output unit 11 periodically changes the operation amount of the controlled process by applying a dither signal such as a sine wave.
  • Hereinafter, this operation is referred to as modulation, and the dither signal used for modulation is referred to as a modulation dither signal.
  • The control amount changes according to the change in the operation amount caused by this modulation.
  • The controlled process acquires an evaluation quantity based on the control quantity that changes in this way, and feeds the acquired evaluation quantity back to the extreme value control system 9.
  • The controlled variable often responds to a change in the manipulated variable with a certain time delay, so the evaluation quantity acquired from the controlled variable is also delayed relative to the change in the manipulated variable.
  • The function of acquiring the evaluation amount based on the control amount does not necessarily have to be included in the controlled process.
  • The function of acquiring the evaluation quantity may be included in the extreme value control system 9, or may be realized by another device placed between the controlled process and the extreme value control system 9.
  • The extreme value control system 9 updates the operation amount, based on the evaluation amount fed back in this way, so that the evaluation amount approaches the extreme value of the evaluation function.
  • Here, the evaluation function of the controlled process is assumed to have a minimum value. As described above, since the evaluation function is unknown with respect to the manipulated variable, its extreme value is also unknown. Therefore, the extreme value control system 9 observes, from the fed-back evaluation quantity signal, the magnitude and direction of the change in the evaluation quantity caused by the modulation, and determines a new operation amount based on the observed magnitude and direction.
  • The determination of this new manipulated variable is realized by the high-pass filter 12, the demodulation dither signal output unit 13, the low-pass filter 14, and the integrator 15, which have the following functions.
  • The high-pass filter 12 removes from the fed-back evaluation amount signal the constant bias corresponding to the unknown minimum value.
  • This process always shifts the unknown minimum value to zero, and is a preprocess for determining the direction (increase or decrease) in which the integrator 15 described later updates the manipulated variable.
  • The demodulation dither signal output unit 13 applies the demodulation dither signal to the evaluation amount signal adjusted in this way, thereby extracting from the evaluation amount, which has changed in response to the modulation of the operation amount, the component having the same frequency as the modulation dither signal.
  • Hereinafter, this operation is referred to as demodulation, and the dither signal used for demodulation is referred to as a demodulation dither signal.
  • The role of demodulation is as follows.
  • The evaluation function, which is unknown with respect to the manipulated variable, may contain a non-linear element.
  • Here, the evaluation function is assumed to be a non-linear function that is convex downward (convex upward in the case of a maximum value search). Because of such a non-linear element, harmonic or subharmonic components of the frequency ω of the modulation dither signal are highly likely to appear in the evaluation quantity. Demodulation is a process for removing the effects of such harmonics and subharmonics. By this demodulation, the component having the same frequency ω as the modulation dither signal is extracted from the components included in the evaluation amount signal.
  • The demodulated evaluation amount signal is input to the low-pass filter 14.
  • The low-pass filter 14 extracts the steady component (low-frequency component) from the evaluation amount signal.
  • This steady component indicates the first derivative (hereinafter referred to as “Jacobian”) of the evaluation function, and is considered to indicate the direction (increase or decrease) of the change in the evaluation amount caused by modulation.
  • The integrator 15 integrates the steady components extracted by the low-pass filter 14.
  • The integrator 15 functions as an estimator that, based on the integrated value of the steady component, estimates the direction in which the manipulated variable should be moved (hereinafter referred to as “search direction”) in order to bring the evaluation quantity closer to the minimum value.
  • The method of estimating the search direction in this way is generally called the gradient method, and is one of the basic methods of estimating the search direction in adaptive control systems.
  • The integrator 15 estimates the gradient of the evaluation function based on the integrated value of the steady component and, based on the estimated gradient, adjusts the search direction of the manipulated variable and the magnitude by which it is moved in that direction (the amount of movement of the operation amount).
  • The manipulated variable adjusted in this way is modulated by the modulation dither signal and input to the controlled process.
  • The configuration above has been described assuming that the extreme value control system 9 searches for the minimum value; when it searches for the maximum value, the sign of the gradient estimated by the integrator 15 may simply be reversed. Further, since an integrator generally has a low-pass characteristic, the extreme value control system 9 does not necessarily have to include the low-pass filter 14 when the integrator 15 provides a sufficient low-pass characteristic.
  • The extreme value control system 9 realized by such a configuration is comparable in complexity to the PID control systems commonly used in conventional process control, and can therefore, like a PID control system, be easily implemented using hardware such as a PLC (Programmable Logic Controller).
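  • The signal chain of the basic extreme value control system 9 (modulation, high-pass filter, demodulation, low-pass filter, integrator) can be sketched as a small discrete-time simulation. This is a minimal sketch under stated assumptions, not the implementation of the source: the plant is treated as a static map with no dynamics, the filters are first-order and discretized with the Euler method, and the function name `run_extreme_value_control` and all numerical parameters (dither amplitude `a`, frequencies `omega`, `omega_h`, `omega_l`, gain `k`) are hypothetical.

```python
import math


def run_extreme_value_control(f, u0, a=0.1, omega=10.0, omega_h=1.0,
                              omega_l=1.0, k=1.0, dt=0.005, t_end=100.0):
    """Search for the minimum of f with a dither-based gradient estimate."""
    u_hat = u0            # integrator state: current operation amount
    j_mean = f(u0)        # internal state of the high-pass filter (slow mean of J)
    g = 0.0               # low-pass filter state: Jacobian estimate
    for i in range(int(t_end / dt)):
        s = math.sin(omega * i * dt)         # dither signal
        u = u_hat + a * s                    # modulation
        j = f(u)                             # evaluation amount fed back
        j_mean += dt * omega_h * (j - j_mean)
        j_hp = j - j_mean                    # high-pass filter: remove constant bias
        demod = j_hp * s                     # demodulation
        g += dt * omega_l * (demod - g)      # low-pass filter: steady component
        u_hat -= dt * k * g                  # integrator (minimum search)
    return u_hat


# Hypothetical evaluation function, unknown to the controller, minimum at u = 2.
u_final = run_extreme_value_control(lambda u: (u - 2.0) ** 2 + 1.0, u0=0.0)
```

  • For a maximum search, the minus sign in the integrator update would be replaced by a plus sign. The parameters are chosen so that the dither is faster than the filters and the filters are faster than the integrator, which is the usual time-scale separation for this kind of loop.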
  • FIG. 3 is a diagram showing a configuration example of the extreme value control system 1 of the first embodiment.
  • The plant P shown in FIG. 3 is an example of a means for realizing a controlled process, and is, for example, a water treatment plant that realizes a biological wastewater treatment process.
  • The plant P includes various process devices that realize the controlled process, and operates them based on the operation amount input from the extreme value control system 1. The plant P also includes various measuring devices for measuring the controlled amount of the controlled process, and outputs information indicating the measured values (hereinafter referred to as “measurement information”) to the extreme value control system 1.
  • The extreme value control system 1 updates the operation amount in the direction (search direction) that brings the evaluation amount of the controlled process closer to the optimum value, based on the measurement information acquired from the plant P.
  • Like the conventional extreme value control system 9, the extreme value control system 1 of the embodiment includes the modulation dither signal output unit 11, the high-pass filter 12, the demodulation dither signal output unit 13, the low-pass filter 14, and the integrator 15.
  • The extreme value control system 1 of the embodiment differs from the conventional extreme value control system 9 in that it includes a parameter adjustment unit 2 that adjusts the extreme value control parameters based on the evaluation amount signal.
  • The extreme value control system 1 includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus, and executes an extreme value control program.
  • By executing the extreme value control program, the extreme value control system 1 functions as a device or system including each of the above-mentioned functional units. All or part of the functions of the extreme value control system 1 may instead be realized using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array).
  • The program may be recorded on a computer-readable recording medium.
  • The computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system.
  • The program may be transmitted over a telecommunication line.
  • The parameter adjustment unit 2 has a function of adaptively adjusting the extreme value control parameters to the dynamics of the controlled process. Specifically, the parameter adjustment unit 2 adaptively adjusts the integral gain of the integrator 15 based on the estimated gradient of the evaluation function, which changes from moment to moment because of the dynamics.
  • FIG. 4 is a diagram showing a specific example of the method of adjusting the extreme value control parameter in the first embodiment. Specifically, FIG. 4 cites the adjustment method described in Patent Document 1. As shown in FIG. 4, the parameter adjustment unit 2 in the present embodiment estimates the second derivative (hereinafter referred to as “Hessian”) of the evaluation function based on the evaluation amount fed back from the controlled process, and uses the estimated Hessian to determine the new integral gain. To adjust the integral gain in this way, the parameter adjustment unit 2 includes a first multiplier 21, a gradient estimation unit 22, a regularization signal output unit 23, and a second multiplier 24.
  • The first multiplier 21 multiplies the evaluation amount signal input from the high-pass filter 12 by the dither signal (squared signal) and outputs the result to the gradient estimation unit 22.
  • The gradient estimation unit 22 extracts the Hessian signal H(t) from the output signal of the first multiplier 21 and outputs it to the regularization signal output unit 23.
  • The gradient estimation unit 22 can estimate derivatives of the evaluation function of order 0 or higher by using the method described in Non-Patent Document 1. That is, the gradient estimation unit 22 can function as a first gradient estimation unit that estimates the Jacobian of the evaluation function or as a second gradient estimation unit that estimates the Hessian of the evaluation function.
  • Non-Patent Document 1 describes a configuration in which a low-pass filter is used to estimate the differential value of the 0th or higher order of the evaluation function, and the basic concept thereof is as follows.
  • Equation (1) defines the evaluation quantity J(t) as an (unknown) function f of the manipulated quantity U(t).
  • Strictly speaking, f should be an operator of a dynamic system rather than a function, but when the dither signal of frequency ω varies sufficiently slowly relative to the dynamics of the plant, f can be regarded approximately as a function.
  • Equation (2) is obtained by Taylor-expanding equation (1).
  • Equation (3) is obtained by multiplying equation (2) by sin nωt (n is an integer of 1 or more). Further, when periodic averaging (or time integration) is applied to equation (3), only the component related to sin nωt remains because of the orthogonality of sine waves, and equation (4) is obtained.
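  • The equations themselves are not reproduced in this text. As a hedged illustration of the averaging step only, assume a static map and a sinusoidal dither, U(t) = U_0 + a sin ωt, where U_0 (the mean operation amount) and a (the dither amplitude) are symbols introduced here and are not taken from the source. For n = 1, a second-order Taylor sketch gives:

```latex
\begin{align*}
J(t) &= f\bigl(U_0 + a\sin\omega t\bigr)
      \approx f(U_0) + a\,f'(U_0)\sin\omega t
      + \frac{a^{2}}{2}\,f''(U_0)\sin^{2}\omega t, \\
\frac{1}{T}\int_{0}^{T} J(t)\,\sin\omega t\,dt
     &\approx \frac{a}{2}\,f'(U_0), \qquad T = \frac{2\pi}{\omega}.
\end{align*}
```

  • The constant term and the sin² term average to zero against sin ωt over one period, so the averaged product is proportional to the first derivative f'(U_0). The exact form of equations (1) to (4) in the source may differ from this sketch.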
  • FIGS. 5 and 6 are diagrams showing configuration examples of the gradient estimator represented by equation (7).
  • The extreme value control system can be regarded as a configuration in which the Jacobian of the evaluation function is estimated by the low-pass filter 14.
  • The parameter adjustment unit 2 could adjust the integral gain by using the Hessian of the evaluation function estimated by such a method as it is, but in that case the extreme value control may become unstable for reasons described later. Therefore, in the extreme value control system 1 of the present embodiment, the Jacobian estimated by the low-pass filter 14 is regularized based on the estimated Hessian, and the regularized Jacobian signal is supplied to the integrator 15. As a result, the extreme value control system 1 of the present embodiment can adaptively update the integral gain while avoiding instability of the extreme value control.
  • The regularization signal output unit 23 generates a signal that regularizes the Jacobian signal output by the gradient estimation unit 22 (hereinafter referred to as “regularization signal”), and outputs it to the second multiplier 24.
  • The second multiplier 24 receives the Jacobian signal G(t) from the low-pass filter 14 and the regularization signal from the regularization signal output unit 23, and regularizes the Jacobian signal by multiplying it by the regularization signal.
  • The second multiplier 24 supplies the regularized Jacobian signal Gn(t) to the integrator 15.
  • In general, “regularization” of a signal means avoiding an ill-conditioned situation in which an inverse operation on the target signal cannot be performed because the inverse does not exist.
  • A typical example of such an ill condition is division by zero.
  • The extreme value control system 1 of the present embodiment adaptively updates the integral gain of the integrator 15 by using the Hessian of the evaluation function. This is equivalent to fixing the integral gain in each control period to a constant value and performing extreme value control with the Jacobian of the evaluation function divided (normalized) by the Hessian. That is, the configuration of the extreme value control system 1 in the present embodiment can be regarded as the basic configuration of the extreme value control system 9 with an added configuration for normalizing the Jacobian by the Hessian.
  • In the present embodiment, avoiding ill conditions such as division by zero in this “normalization” of the Jacobian signal is defined as “regularization”, and a signal that acts on the Jacobian signal to perform such regularization is defined as a “regularization signal”.
  • The regularization signal output unit 23 generates, as the regularization signal, a signal that realizes a signal conversion from G(t) to Gn(t) satisfying each of the following conditions.
  • Here, G(t) represents the Jacobian signal.
  • Gn(t) represents the Jacobian signal that has been regularized (that is, divided by the Hessian) by the action of the regularization signal.
  • [Condition 1] represents the condition that Gn(t) is 0 only when G(t) is 0.
  • [Condition 2] represents the condition that the signs of G(t) and Gn(t) are the same.
  • [Condition 3] represents the condition that when G(t) is finite, Gn(t) is also finite (that is, division by zero does not occur).
  • [Condition 4] represents the condition that when G(t) diverges to ∞, Gn(t) does not diverge to ∞ but converges to a certain positive finite value.
  • A regularization signal having such properties is represented by, for example, equation (8).
  • The regularization constant in equation (8) is a positive constant (greater than 0).
  • H(t) represents the estimated Hessian of the evaluation function. Since the regularization signal represented by equation (8) can be generated by a process of adding the regularization constant to the Hessian signal and a process of taking the absolute value of the Hessian signal, the stability of extreme value control can be improved while suppressing an increase in the size of the device.
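  • Equation (8) is not reproduced in this text. The following sketch assumes that the regularization signal has the form |H(t)| + eps, with `eps` standing in for the regularization constant, and that the Jacobian is divided by it; this form, the function names, and the quadratic test function are assumptions made for illustration only.

```python
import math


def regularized_jacobian(g, h, eps=1e-3):
    """Divide the Jacobian estimate g by the assumed signal |h| + eps."""
    rs = abs(h) + eps      # never 0 because eps > 0
    return g / rs


def quadratic_estimates(curvature, u, u_opt=2.0):
    """Exact Jacobian and Hessian of a hypothetical
    f(u) = curvature / 2 * (u - u_opt) ** 2."""
    return curvature * (u - u_opt), curvature


# The same distance from the optimum (1.0) on a flat and on a steep function.
flat = regularized_jacobian(*quadratic_estimates(0.1, u=3.0))
steep = regularized_jacobian(*quadratic_estimates(10.0, u=3.0))

# A Hessian estimate of exactly 0 does not cause a division by zero.
zero_hessian = regularized_jacobian(0.5, 0.0)
```

  • In this sketch both `flat` and `steep` come out close to the distance from the optimum, even though the raw Jacobians differ by a factor of 100. This is one way to read the statement that dividing by the Hessian adapts the integral gain to the shape of the evaluation function.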
  • With such a regularization signal, extreme values can be searched for stably both for a downwardly convex evaluation function of the M-th order (M > 1), whose Hessian is positive near the extreme value, and for a convex evaluation function of the M-th order with M between 0 and 1, whose Hessian is negative near the extreme value.
  • The regularization signal generated using the Hessian does not necessarily have to be a first-order function of the Hessian as in equation (8).
  • FIGS. 7 and 8 are diagrams showing configuration examples of an extreme value control system that generates a regularization signal by the first method.
  • FIG. 7 shows a configuration example in which the Jacobian signal Gn (t) normalized by the regularized signal represented by the equation (8) is input to the integrator 15.
  • FIG. 8 shows a configuration example in which the Jacobian signal Gn (t) normalized by the regularized signal represented by the equation (9) is input to the integrator 15.
  • The second method generates a regularization signal by using the estimated Jacobian instead of the estimated Hessian, and therefore requires neither the first multiplier 21 nor the gradient estimation unit 22.
  • The regularization signal generated by the second method is represented by equation (13).
  • Here, G(t) represents the Jacobian of the evaluation function estimated by the gradient estimation unit 22, or an amount proportional to the Jacobian.
  • The regularization signal generated by the second method does not necessarily have to be a first-power function of the Jacobian (equations (14) and (15) are examples).
  • FIGS. 9 and 10 are diagrams showing configuration examples of an extreme value control system that generates a regularization signal by the second method.
  • FIG. 9 shows a configuration example in which the Jacobian signal Gn(t) normalized by the regularization signal represented by equation (13) is input to the integrator 15.
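  • Equation (13) is likewise not reproduced here. Assuming, by analogy with the first method, a regularization signal of the form |G(t)| + eps, the four conditions on the conversion from G(t) to Gn(t) can be checked numerically. The form of the signal, the name `regularize_by_jacobian`, and the value of `eps` are assumptions of this sketch.

```python
def regularize_by_jacobian(g, eps=1e-3):
    """Divide the Jacobian estimate by the assumed signal |g| + eps."""
    return g / (abs(g) + eps)


samples = [-1e12, -5.0, -1e-6, 0.0, 1e-6, 5.0, 1e12]
converted = [regularize_by_jacobian(g) for g in samples]

# Condition 1: the output is 0 only when the input is 0.
cond1 = all((gn == 0.0) == (g == 0.0) for g, gn in zip(samples, converted))
# Condition 2: input and output have the same sign.
cond2 = all(g * gn >= 0.0 for g, gn in zip(samples, converted))
# Condition 3: a finite input gives a finite output (no division by zero).
cond3 = all(abs(gn) <= 1.0 for gn in converted)
# Condition 4: a very large input gives an output close to a finite value (1).
cond4 = abs(regularize_by_jacobian(1e12) - 1.0) < 1e-9
```

  • Under this assumed form the output saturates at ±1 for large Jacobians and stays roughly proportional to the Jacobian when it is small compared with `eps`.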
  • the regularized signal generated by the third method is represented by the equation (16).
  • > 0
  • the Jacobian signal Gn (t) after regularization also diverges. Such a signal does not satisfy conditions 1 to 4 for a regularization signal and therefore should not normally be used as one. However, if the regularization of the Jacobian signal is defined as in equation (17), it can in some cases be used as a regularization signal.
  • Equation (17) indicates that dividing the Jacobian signal by the regularization signal RS of equation (16) is defined as regularization of the Jacobian signal. Since this is an operation of dividing the Jacobian signal by the absolute value of the Jacobian signal, Gn (t) in equation (17) takes a value of either -1 or +1. That is, the regularized signal generated by the third method is a signal that acts on the Jacobian signal and extracts its sign information (-1 or +1). Therefore, when the regularization defined in equation (17) is performed, the signal represented by equation (16) functions as a regularization signal in principle.
  • the regularized signal output unit 23 may be configured to directly calculate sgn (G (t)). Further, by adopting such a configuration, it is possible to avoid the occurrence of zero division.
  • sgn (x) represents a function that returns the sign of the value x.
  • FIG. 11 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the first embodiment.
  • the sign information extracted in this way is simply a signal indicating only the direction (increase or decrease) in which the manipulated variable should be moved. Therefore, by providing such a regularization configuration, it is possible to configure a simple extremum control system as shown in FIG. 11, for example. In this case, since the amount of change in the operation amount in each control cycle is a constant value, the amount of change may be adjusted to a desired amount by multiplying sgn (G (t)) by a coefficient.
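A minimal sketch of the third method follows: the sign is computed directly instead of dividing the Jacobian by its absolute value, which avoids zero division, and a coefficient sets the fixed change per control cycle. The names `sign`, `sign_step`, and `step` are hypothetical, and moving against the gradient assumes that the extreme value being searched for is a minimum.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x, without any division."""
    return (x > 0) - (x < 0)


def sign_step(u, g, step=0.05):
    """Move the manipulated variable u by a fixed amount per control cycle.

    g    : Jacobian estimate; only its sign is used
    step : coefficient multiplying sgn(g); sets the change per cycle (assumed value)
    Assumes a minimization problem, so u moves opposite to the gradient sign.
    """
    return u - step * sign(g)
```

With this update the size of each move is independent of the gradient magnitude, so the coefficient alone determines how fast the operating point travels.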
  • the second multiplier 24 regularizes the Jacobian signal G (t) by multiplying the regularized signal generated in this way, and outputs the regularized Jacobian signal Gn (t) to the integrator 15.
  • the regularization of the Jacobian signal by the parameter adjusting unit 2 is expressed by the equation (18).
  • the parameter adjusting unit 2 adaptively updates the integral gain of the integrator 15 using the Jacobian signal regularized in this way.
  • the direction in which the manipulated variable should be moved in extreme value control is represented by the sign of the gradient (Jacobian) estimated for the evaluation function unknown to the manipulated variable, and the amount of movement is adjusted by the integral gain.
  • as for the integral gain, a method of adjusting it based on the concept of the average system described in Non-Patent Document 2 and the like will be described.
  • the average system is a system that represents the dynamic behavior of a system that takes the periodic average (average) when a periodic input is applied to a certain system.
  • the average system is used in the stability analysis of the extreme value control system.
  • Non-Patent Document 2 specifically describes the dynamics of the average system of an extreme value control system of a static plant having no dynamics.
  • the averaging system is represented by equations (19) and (20).
  • G (u) represents the Jacobian estimate of the evaluation function that is unknown for the manipulated variable u.
  • a represents the amplitude of the dither signal.
  • KI0 represents the integral gain on the time axis of ⁇ . KI0 is converted into an integral gain KI on the time axis of real time t by the equation (20).
  • U- means a symbol with "-" directly above "u”.
  • Non-Patent Document 2 assumes that the controlled process is static, but the period of the dither signal is set to be sufficiently longer than the time constant of the plant. This means that if the dither signal frequency ⁇ is set sufficiently lower than the cutoff frequency of the controlled process, a controlled process with dynamics can be regarded as an approximately static controlled process. This is also supported by the singular perturbation theory used in the stability analysis of extremum control.
  • the equation (19) is generally a nonlinear differential equation.
  • the average system linearized around the appropriate operating point u0 with respect to u in the equation (19) can be expressed by the equation (22).
  • u ⁇ means a symbol with " ⁇ " directly above "u”.
  • u ⁇ u ⁇ u0, where H (u0) represents the Jacobian of G (u). That is, H (u0) is the Hessian of the evaluation function. Therefore, in the present embodiment, the convergence speed of the extreme value control is adjusted by adaptively adjusting the integral gain KI0 so as to be proportional to the reciprocal of H (u0). In such adjustment of the integral gain, the regularized signal (see equations (8) to (12)) generated by the first method acts to avoid zero division by the estimated value of the Hessian and to suppress sudden sign changes.
  • the regularized signal by the first method is defined as a new signal by expelling the estimated value of Hessian that fluctuates with time from the calculation formula of the integral gain.
  • by expelling the factor that makes the integral gain variable from the calculation formula in this way, the term 1 / H (u0) is removed from the calculation formula, and the integral gain KI0 can be adjusted as a fixed value.
  • the minute constant ⁇ (> 0) included in the definition formula of the regularized signal is for avoiding zero division by the Hessian, and KI0 takes its maximum value (1 / ⁇ times the constant term) when the estimated value of the Hessian becomes 0. Therefore, the integral gain can be adjusted by determining ⁇ based on the maximum value assumed for KI0.
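The choice of the minute constant can be written out as a short worked bound. This is a sketch under stated assumptions: the constant is written here as \varepsilon, the regularization signal is taken to be |\hat{H}(t)| + \varepsilon as described for the first method, and K denotes the fixed (constant-term) gain; K_{\max} is a hypothetical name for the largest gain the designer is willing to allow.

```latex
% Effective integral gain under the assumed first-method regularization
K_{I0}(t) = \frac{K}{|\hat{H}(t)| + \varepsilon}
\;\le\; \frac{K}{\varepsilon}
\quad \text{(equality when } \hat{H}(t) = 0\text{)}

% Choosing the constant from an assumed upper limit K_max on the gain
\frac{K}{\varepsilon} = K_{\max}
\;\Longrightarrow\;
\varepsilon = \frac{K}{K_{\max}}
```

For instance, with K = 1 and an allowed maximum gain of 100, this gives \varepsilon = 0.01.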
  • Equation (22) is a linear approximation of the nonlinear equation (19). As a way of suppressing the influence of G (u), which is the nonlinear element of equation (19), a more direct method of eliminating G (u) itself can also be considered.
  • the above-mentioned second and third methods are methods for generating a regularized signal according to such an idea. Since the integral gain KI0 in equation (19) is a parameter to be adjusted and can be determined by the designer, it can be regarded as a kind of operation amount that transforms the differential equation of equation (19). By newly defining an integral gain KI0' that satisfies equation (23), the non-linear element in equation (19) can be eliminated.
  • however, when equation (23) is applied, information about the Jacobian is no longer included in equation (19), so the information necessary for determining the direction in which the manipulated variable should be moved is lost. Therefore, by taking the absolute value of the Jacobian as in equation (24) instead of equation (23), the sign information of the Jacobian can be left in equation (19). In this case, equation (19) is expressed as equation (25).
  • the regularized signal by the third method is a signal for expelling the Jacobian non-linearity from the calculation formula of the integral gain. Since equation (25) includes only the sign information of the Jacobian, the adjustment of the new integral gain KI0' is easy. The adjustment of KI0' may be performed with reference to the adjustment method shown in FIG. 5, considering that the transient behavior of the extreme value control based on equation (25) becomes linear.
  • the average system of equation (27) behaves in the same way as the average system of equation (25) when the Jacobian value is large, because the effect of ⁇ is then small.
  • on the other hand, when the Jacobian value is small, the influence of ⁇ becomes large, so that the average system of equation (27) behaves in proportion to the Jacobian. Therefore, even in the regularization according to the second method, the integral gain KI0' can basically be adjusted in the same way as in equation (25), simply by setting ⁇ so as to prevent chattering near the extreme value.
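The two regimes described above can be checked numerically with a small sketch. It assumes that the second-method regularization signal is the absolute value of the Jacobian estimate plus a small positive constant; the function name and the value of `eps` are illustrative and not taken from the source.

```python
def regularize_with_jacobian(g, eps=1e-2):
    """Divide the Jacobian estimate g by an assumed regularization signal |g| + eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return g / (abs(g) + eps)


# Large gradient: the result is close to the sign of g (like the third method).
far_from_extremum = regularize_with_jacobian(100.0)

# Small gradient: the result is close to g / eps, i.e. proportional to the gradient,
# which removes the discontinuous switching responsible for chattering.
near_extremum = regularize_with_jacobian(1e-4)
```

The constant therefore sets the gradient magnitude at which the update law changes from a fixed-size step to a proportional one.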
  • the regularized signal by the fourth method is a signal represented by an approximate sign estimate, obtained by approximating the sign estimate of the Jacobian signal with a continuous function.
  • when the Jacobian signal is regularized by the regularized signal generated by the fourth method, the Jacobian signal after the regularization is a smooth continuous-function approximation of the Jacobian signal (sign signal) regularized by the third method.
  • the signal introduced by the fine constant ⁇ > 0 in the third method is the regularized signal by the second method.
  • the approximation function that satisfies the above-mentioned conditions 1-4 of the regularization signal while approximating the sign function that is the regularization signal by the third method is not limited to the regularization signal by the second method.
  • the sign function is approximated by a continuous function or a smooth continuous function
  • the definition of the regularized signal in Condition 1-4 is satisfied.
  • the approximation function is not limited to the regularization signal by the second method described above, and for example, the following approximation functions can be considered.
  • a power function of G (t) such as ⁇ , ⁇ > 0, ⁇ > 0. For example, when ⁇ 1, the gradient G (t) is cut off by ⁇ 1. Equivalent to.
  • functions obtained by translating the cumulative normal distribution function, the Gompertz function, the Gudermannian function, and other functions included in the sigmoid functions in a broad sense so that the origin is at the center, and appropriately rescaling their value range to ±1, are also included.
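Two of the sigmoid-type functions named above can be turned into smooth sign approximations as in the following sketch. The rescaling to the range (-1, +1) follows the description; the steepness parameter `k` and the function names are assumptions introduced for illustration.

```python
import math


def smooth_sign_normal(g, k=1.0):
    """Cumulative normal distribution, centered and rescaled: 2 * Phi(k * g) - 1."""
    return math.erf(k * g / math.sqrt(2.0))


def smooth_sign_gudermannian(g, k=1.0):
    """Gudermannian function rescaled from (-pi/2, pi/2) to (-1, 1).

    Uses gd(x) = 2 * atan(tanh(x / 2)), which does not overflow for large |x|.
    """
    return (2.0 / math.pi) * 2.0 * math.atan(math.tanh(k * g / 2.0))
```

Both functions pass through the origin, are odd, and saturate near ±1 for large gradients; a larger `k` makes them resemble the sign function more closely near the extreme value.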
  • FIG. 12 is a diagram for explaining an example of a regularized signal generated by the fourth method in the first embodiment.
  • FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using the regularization signal in the first embodiment.
  • the example shown in FIG. 14A is a comparative example to which the regularization signal is not applied, and is adjusted so as to operate well when the initial value is 2. At this time, if the initial value is changed to 3, the extreme value control diverges.
  • FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the coded signal of the gradient as the regularized signal in the first embodiment.
  • the example shown in FIGS. 14B-14D applies a regularization signal.
  • the example shown in FIG. 14B is a case where a sign function which is a regularized signal by the third method is applied. At this time, the extremum can be searched at the same convergence speed regardless of whether the initial value is 2 or 3, but chattering occurs near the extremum 1. This is because the sign function is the strongest discontinuous switching function.
  • FIG. 14C and 14D are diagrams for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
  • the extreme value can be searched at the same convergence speed regardless of the initial value, but the extreme value is not reached within a predetermined time as in FIG. 14A. This is because the gradient itself is used near the extreme value, so that the same phenomenon as in FIG. 14A occurs near the extreme value.
  • the example shown in FIG. 14D is, for example, -sgn (G (t)) / ⁇ ⁇
  • the extremum can be searched regardless of the initial value, and chattering can be suppressed.
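The qualitative behaviors described for FIGS. 14A to 14D can be illustrated with a toy discrete-time update. This is not a reproduction of the simulations in the figures: the evaluation function J(u) = (u - 1)^4, the gain, the constant `eps`, and the step count are all hypothetical choices made only to show divergence without regularization, chattering with the sign function, and smooth convergence with a continuous approximation.

```python
def jacobian(u):
    """Gradient of a hypothetical evaluation function J(u) = (u - 1)**4 (minimum at u = 1)."""
    return 4.0 * (u - 1.0) ** 3


def sign(x):
    return (x > 0) - (x < 0)


def raw(g):
    """No regularization: the gradient itself drives the update."""
    return g


def smooth(g, eps=0.5):
    """Continuous approximation of the sign function (assumed form g / (|g| + eps))."""
    return g / (abs(g) + eps)


def simulate(u0, shape, gain=0.2, steps=200, limit=1e6):
    """Iterate u <- u - gain * shape(jacobian(u)); report whether u stayed bounded."""
    u = u0
    for _ in range(steps):
        u = u - gain * shape(jacobian(u))
        if abs(u) > limit:
            return u, False  # treated as divergence
    return u, True
```

In this toy setting, `simulate(2.0, raw)` stays bounded while `simulate(3.0, raw)` diverges; `simulate(u0, sign)` approaches u = 1 from either start but then switches within one step width of it; `simulate(u0, smooth)` approaches u = 1 from either start without switching.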
  • the sign function sgn () is approximated by G (t) (as an example, the above-mentioned function A).
  • -C can be realized.
  • this can be realized by having the regularized signal output unit 23 perform an operation corresponding to the approximate function. That is, the arithmetic expression in the regularized signal output unit 23 may be set so that the output of the regularized signal output unit 23 is a value obtained by dividing the Jacobian signal after the regularization by G (t).
  • the regularization function is not limited to that of the third method; by selecting a continuous approximation function and adjusting the parameters of the function, the behavior near the searched extreme value can be easily adjusted, and more detailed extremum search control can be realized.
  • the regularization signal generated by the regularization signal output unit 23 has the role of expelling complicated behavior due to the non-linear element of the evaluation function from the average system that gives the direction and amount to move the operation amount, and the parameter adjusting unit 2 regularizes the Jacobian signal output by the gradient estimation unit 22 by using the regularization signal having such a property. Then, the integrator 15 can more appropriately control the search direction of the extreme value control by integrating the regularized Jacobian using the adjusted integral gain.
  • the extremum control system 1 of the first embodiment configured as described above is controlled by providing a parameter adjusting unit that regularizes the Jacobian signal and adaptively adjusts the integral gain based on the regularized Jacobian signal. It is possible to adapt the extreme value control of the target process to its dynamics and operate it more stably.
  • the extremum control system 1 of the first embodiment realizes the following (1) and (2) by regularizing the Jacobian signal, thereby making it possible to easily adjust the integral gain without impairing the stability of the extremum control. (1) Avoid zero division when calculating the integral gain. (2) Maintain the sign information of the Jacobian signal (avoid sudden sign inversion). As a result, the operator of the controlled process can more easily and safely adjust the convergence speed of the extreme value control.
  • FIG. 12 is a diagram showing a configuration example of the extreme value control system 1a of the second embodiment.
  • the extreme value control system 1a is different from the extreme value control system 1 of the first embodiment in that it further includes an operation amount correction unit 3.
  • the extreme value control system 1a shown in FIG. 12 is configured by adding the operation amount correction unit 3 to the extreme value control system 1 shown in FIG. 9, which is one of the configuration examples of the extreme value control system that generates the regularized signal by the second method in the first embodiment. Since the other configurations of the extreme value control system 1a are the same as those of the extreme value control system 1 of the first embodiment, the same reference numerals as those in FIG. 3 are used here, and the description of those similar functional parts is omitted.
  • the operation amount correction unit 3 is for obtaining an average system having the behavior shown in equation (29). By feeding back the operation amount u and multiplying it with the regularization signal, it corrects the operation amount so that an operating point away from the extreme value moves toward the extreme value more moderately.
  • the regularization signal is generated by the third method
  • the average system of Eq. (25) can be obtained, but in order to obtain the system of Eq. (29), the right side of equation (25) may be multiplied by F (u ⁇ ), focusing on the fact that the derivative of u and the derivative of u ⁇ are equal.
  • u ⁇ u-u *
  • the optimum value of the manipulated variable is unknown, so u ⁇ cannot be used directly.
  • u ⁇ can be approximately obtained by applying a high-pass filter to u as an estimated value of u ⁇ and removing the unknown constant term u *. That is, u ⁇ can be estimated by the equation (30).
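The idea of removing the unknown constant term with a high-pass filter can be sketched in discrete time. Equation (30) is not reproduced here; the first-order filter below, its time constant `tau`, and the sampling interval `dt` are assumptions chosen for illustration only.

```python
def high_pass(samples, tau=1.0, dt=0.1):
    """First-order discrete high-pass filter (a standard RC-type discretization).

    samples : sequence of manipulated-variable values u[k]
    tau     : filter time constant (assumed value)
    dt      : sampling interval (assumed value)
    Returns the filtered sequence, in which a constant offset is removed.
    """
    alpha = tau / (tau + dt)
    filtered = []
    previous_input = samples[0]
    previous_output = 0.0
    for value in samples:
        output = alpha * (previous_output + value - previous_input)
        filtered.append(output)
        previous_input, previous_output = value, output
    return filtered
```

A constant input produces zero output, which corresponds to discarding the unknown optimum value, while changes in the manipulated variable pass through and then decay.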
  • the extremum search can be modified so as to have an exponential response characteristic. To move the operating point to the vicinity of the extreme value even faster than the exponential response characteristic, the movement width of the operating point may be made larger, for example by setting the estimated value of the operation amount u to a power of the value obtained by equation (30).
  • FIG. 13 and 14 are diagrams showing a configuration example of the extreme value control system 1a according to the second embodiment.
  • FIG. 13 shows an example of the extreme value control system 1a configured by adding the operation amount correction unit 3 to the extreme value control system 1 shown in FIG. 10, which is one of the configuration examples of the extreme value control system 1 that generates the regularized signal by the second method in the first embodiment.
  • FIG. 14 shows the extreme value control system 1a configured by adding the manipulated variable correction unit 3 to the extreme value control system 1 (see FIG. 11) that generated the regularized signal by the third method in the first embodiment. An example is shown.
  • the extreme value control system 1a of the second embodiment configured in this way has the same effect as the extreme value control system 1 of the first embodiment and, in addition, makes it possible to finely adjust the speed of the extreme value search.
  • when the operating point is far away from the extremum, the operating point can be quickly moved to the vicinity of the extremum, and the operating point can be finely moved near the extremum, which makes it possible to improve the speed and accuracy of the extremum search.
  • FIG. 15 is a diagram showing an application example of the extreme value control system of the first embodiment or the extreme value control system of the second embodiment.
  • FIG. 15 shows an example in which the extremum control system of the embodiment is applied to a water treatment plant 4 that realizes a biological wastewater treatment process.
  • the water treatment plant 4 shown in FIG. 15 includes facilities of an anaerobic tank 41, an anoxic tank 42, an aerobic tank 43, and a final sedimentation pond 44.
  • the anaerobic tank 41 is a facility for activating microorganisms.
  • the anoxic tank 42 is a facility for removing nitrogen.
  • the aerobic tank 43 is a facility for decomposing organic substances, removing phosphorus, and nitrifying ammonia.
  • the final sedimentation pond 44 is a facility for precipitating activated sludge.
  • the water treatment plant 4 is equipped with equipment such as a pump for transporting water and sludge between the above equipment, a blower for supplying air into the tank, and a sensor for measuring the concentration of substances in the air or water.
  • the chemical input pump 411 is a pump that charges a chemical such as a carbon source that activates microorganisms into the anaerobic tank 41.
  • the circulation pump 431 is a pump that controls the circulation amount of the water to be treated that circulates between the aerobic tank 43 and the anoxic tank 42.
  • the blower 432 supplies air to the aerobic tank 43 to control the amount of aeration.
  • the return sludge pump 441 is a pump that returns sludge from the final sedimentation pond 44 to the anoxic tank 42.
  • the excess sludge extraction pump 442 is a pump that extracts excess sludge from the final sedimentation pond 44.
  • the sensor 412 and the sensor 443 measure the quality of the discharged water in the anaerobic tank 41 and the final sedimentation pond 44, respectively.
  • the manipulated amount is the return rate of the returned sludge
  • the controlled amount is the concentration of nitrogen contained in the discharged water (hereinafter referred to as “discharged nitrogen concentration”) and the concentration of phosphorus (hereinafter referred to as “discharged phosphorus concentration”).
  • the controlled amount may be the amount of nitrogen contained in the discharged water (hereinafter referred to as “the amount of discharged nitrogen”) and the amount of phosphorus (hereinafter referred to as “the amount of discharged phosphorus").
  • the amount of discharged nitrogen and the amount of discharged phosphorus are obtained by multiplying the discharged nitrogen concentration and the discharged phosphorus concentration by the discharged flow rate, respectively.
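The conversion from concentration to discharged amount is a simple product, sketched below. The units (mg/L for concentration, m^3/day for flow, kg/day for the result) and the function name are assumptions for illustration; the source does not specify units.

```python
def discharged_amount(concentration_mg_per_l, flow_m3_per_day):
    """Discharged amount = discharged concentration x discharged flow rate.

    Assumed units: mg/L is the same as g/m^3, so the product is in g/day;
    dividing by 1000 converts it to kg/day.
    """
    return concentration_mg_per_l * flow_m3_per_day / 1000.0


# Example with hypothetical values: 10 mg/L nitrogen at 5000 m^3/day
nitrogen_kg_per_day = discharged_amount(10.0, 5000.0)
```

The same function applies to phosphorus by passing the discharged phosphorus concentration instead.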
  • the extreme value control system 1b of the application example receives from the water treatment plant 4 an evaluation amount based on control amounts such as the amount of discharged nitrogen and the amount of discharged phosphorus, executes the extreme value control, and updates the operation amount so that the evaluation amount approaches its optimum value.
  • a method of expressing the evaluation amount as the sum of the water quality cost based on the concept of wastewater levy and the electric power cost of the return sludge pump 441 (hereinafter referred to as "total cost”) can be considered.
  • the electric power cost of the return sludge pump 441 can be calculated from the return sludge flow rate, the rated power of the return sludge pump 441, and the like.
  • the water quality cost is expressed by the following formula.
  • COD means chemical oxygen demand
  • BOD means biochemical oxygen demand
  • TN means released nitrogen
  • TP means released phosphorus.
  • the conversion factor for each cost may be determined based on the actual wastewater levy or may be determined by other methods. In general, it is known that TN and TP change significantly by changing the return rate. Therefore, here, the water quality cost J1 relating to the control of the return rate is defined as in the equation (32).
  • the operating cost J2 which is the sum of the power cost of the blower indirectly changed by changing the return flow rate and the power cost of the return pump, is defined, and the operating cost J2 and the water quality cost J1 are defined.
  • the operating cost J2 can be defined as in the equation (33).
  • the extreme value control system 1b of the application example extracts a Jacobian signal from the evaluation amount acquired by such an evaluation function and, by applying the above-mentioned regularized signal to the Jacobian signal in the extreme value control, can update the integral gain adaptively to the dynamics of the water treatment plant 4.
  • the extreme value control system 1b of the application example can search for the optimum operation amount that minimizes the total cost with more stable operation.
  • the extreme value control system of the embodiment can be applied to the control of an arbitrary process that outputs a control amount with respect to the input of the operation amount.
  • the controlled process may be a sewage treatment process, a combustion process, a petrochemical process, or the like.
  • the embodiment has a first gradient estimation unit that estimates the Jacobian of the evaluation function, an operation amount determination unit that determines the direction and amount to move the operation amount by integrating the estimated value of the Jacobian, a second gradient estimation unit that estimates the Hessian of the evaluation function, and a parameter adjusting unit that divides the Jacobian estimation value input to the operation amount determination unit by a regularized signal, which is a value based on the Jacobian or Hessian estimation value of the evaluation function and is adjusted so as not to become 0.

Abstract

Provided are an optimum control device, an optimum control method, and a computer program which can adapt to the dynamics of a target control process and more stably operate extreme value control. The optimum control device according to the present embodiment comprises a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit. The first gradient estimation unit estimates the Jacobian of an evaluation function on the basis of a signal indicating an evaluation amount. The manipulated variable determination unit determines a direction and an amount to move a manipulated variable by means of integrating the estimated value of the Jacobian. The second gradient estimation unit estimates the Hessian of the evaluation function on the basis of the signal indicating the evaluation amount. The parameter adjustment unit adjusts integration gain in the manipulated variable determination unit in accordance with a change in the evaluation function by means of dividing the estimated value of the Jacobian, which was inputted into the manipulated variable determination unit, by a regularized signal, which is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and adjusted so as to not be 0.

Description

Optimal control device, optimal control method and computer program
The embodiment of the present invention relates to an optimum control device, an optimum control method, and a computer program.
In recent years, a technique called extreme value control has been attracting attention as a method of plant control. Extreme value control is a model-free real-time optimal control technology that does not use a complex model of the plant. In outline, extreme value control forcibly varies the manipulated variable in order to search for the manipulated variable that optimizes an evaluation amount based on the control amount of the controlled process. When such extreme value control is applied to plant control, the various parameters related to extreme value control (hereinafter referred to as "extreme value control parameters") must be set appropriately according to the characteristics of the controlled process. Several guidelines for designing extreme value control parameters have been presented, but none of them has made it possible to operate extreme value control stably by adapting to temporal changes in the controlled process (hereinafter referred to as "dynamics").
Japanese Patent Application Laid-Open No. 2017-033104
An object to be solved by the present invention is to provide an optimum control device, an optimum control method, and a computer program capable of operating extreme value control more stably by adapting to the dynamics of the controlled process.
The optimum control device of the embodiment is an optimum control device that executes extreme value control, which updates the manipulated variable of a controlled process so that an evaluation amount approaches the optimum value of an evaluation function. The evaluation amount is a value based on the control amount of the controlled process and is represented by the evaluation function, which is unknown with respect to the manipulated variable of the controlled process. The optimum control device includes a first gradient estimation unit, a manipulated variable determination unit, a second gradient estimation unit, and a parameter adjustment unit. The first gradient estimation unit estimates the Jacobian of the evaluation function based on a signal indicating the evaluation amount. The manipulated variable determination unit determines the direction and amount in which the manipulated variable should be moved by integrating the estimated value of the Jacobian. The second gradient estimation unit estimates the Hessian of the evaluation function based on the signal indicating the evaluation amount. The parameter adjustment unit divides the estimated value of the Jacobian input to the manipulated variable determination unit by a regularized signal, which is a value based on the estimated value of the Jacobian or Hessian of the evaluation function and is adjusted so as not to be 0, thereby adjusting the integral gain of the manipulated variable determination unit according to the change of the evaluation function.
FIG. 1A is a diagram illustrating a basic concept of extreme value control in the first embodiment.
FIG. 1B is a diagram illustrating a basic concept of extreme value control in the first embodiment.
FIG. 1C is a diagram illustrating a basic concept of extreme value control in the first embodiment.
FIG. 2 is a diagram showing a basic configuration example of an extreme value control system in the first embodiment.
FIG. 3 is a diagram showing a configuration example of the extreme value control system of the first embodiment.
FIG. 4 is a diagram showing a specific example of the method of adjusting the extreme value control parameter in the first embodiment.
FIG. 5 is a diagram showing a first configuration example of the gradient estimator according to the first embodiment.
FIG. 6 is a diagram showing a second configuration example of the gradient estimator according to the first embodiment.
FIG. 7 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the first method in the first embodiment.
FIG. 8 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the first method in the first embodiment.
FIG. 9 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the first embodiment.
FIG. 10 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the first embodiment.
FIG. 11 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the first embodiment.
FIG. 12 is a diagram for explaining an example of a regularized signal generated by the fourth method in the first embodiment.
FIG. 13 is a diagram for explaining another example of the regularized signal generated by the fourth method in the first embodiment.
FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using the regularized signal in the first embodiment.
FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the sign signal of the gradient as the regularized signal in the first embodiment.
FIG. 14C is a diagram for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
FIG. 14D is a diagram for explaining an example of the effect when the regularized signal generated by the fourth method is used in the first embodiment.
FIG. 15 is a diagram showing a configuration example of the extreme value control system of the second embodiment.
FIG. 16 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the second method in the second embodiment.
FIG. 17 is a diagram showing a configuration example of an extremum control system that generates a regularized signal by the third method in the second embodiment.
FIG. 18 is a diagram showing an application example of the extreme value control system of the first embodiment or the second embodiment.
Embodiment
The optimum control device, optimum control method, and computer program of the embodiments are described below with reference to the drawings.
(First Embodiment) [Outline of extreme value control]
FIGS. 1A to 1C are diagrams illustrating the basic concept of extreme value control.
Extreme value control is a control method in which the manipulated variable is updated in the direction that brings the evaluation quantity closer to its optimum value, while observing how the evaluation quantity changes in response to the manipulated variable. The evaluation quantity is an index for optimizing the process to be controlled (hereinafter referred to as the "controlled process") and is determined from the controlled variable of that process. For example, the evaluation quantity is given by a predetermined evaluation function whose argument is the controlled variable. Any evaluation criterion may be used as long as the resulting value is based on the controlled variable; for example, the evaluation quantity may be the controlled variable itself. In general, in extreme value control, the evaluation function of the controlled process may be unknown as a function of the manipulated variable.
Specifically, extreme value control changes the manipulated variable by applying a dither signal to the signal representing the manipulated variable. A dither signal is a signal whose value varies periodically, and it is usually given as a sine wave. The dither signal makes the manipulated variable oscillate continuously, and the resulting change (increase or decrease) in the evaluation quantity is observed. Based on the observed change, a new manipulated variable that moves the evaluation quantity toward the optimum value (maximum or minimum) of the evaluation function is calculated, and the current manipulated variable is updated with it. Extreme value control searches for the optimum value of the evaluation function by repeating this observation of the evaluation quantity and update of the manipulated variable.
For example, FIG. 1A shows an evaluation function curve EV that assumes a downward-convex quadratic function as an example of an evaluation function that is unknown with respect to the manipulated variable. FIG. 1B shows the case in which, when the manipulated variable of the controlled process is oscillated by the dither signal, the signal representing the evaluation quantity varies in opposite phase to the dither signal (for example, the evaluation quantity decreases as the manipulated variable increases). This occurs when the operating point moves in the region to the left of the minimum point P10 of the evaluation function curve EV (for example, from the operating point P11 toward the minimum point P10).
FIG. 1C, on the other hand, shows the case in which, when the manipulated variable of the controlled process is oscillated by the same dither signal as in FIG. 1B, the signal representing the evaluation quantity varies in phase with the dither signal (for example, the evaluation quantity increases as the manipulated variable increases). This occurs when the operating point moves in the region to the right of the minimum point P10 of the evaluation function curve EV (for example, from the operating point P12 toward the minimum point P10).
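The opposite-phase and in-phase behavior of FIGS. 1B and 1C can be checked numerically. The following is a minimal sketch, not part of the source: the quadratic evaluation function, its minimum at u = 2, and the names `dither_correlation`, `left`, and `right` are all assumptions chosen for illustration. It averages the product of the bias-removed evaluation quantity and the dither over one dither period, so a negative result indicates opposite phase and a positive result indicates the same phase.

```python
import math


def dither_correlation(f, u0, a=0.1, omega=1.0, steps=1000):
    """Average of (J - mean(J)) * sin(omega * t) over one dither period."""
    period = 2.0 * math.pi / omega
    ts = [period * i / steps for i in range(steps)]
    # Evaluation quantity observed while the manipulated variable oscillates.
    js = [f(u0 + a * math.sin(omega * t)) for t in ts]
    j_mean = sum(js) / steps
    return sum((j - j_mean) * math.sin(omega * t) for j, t in zip(js, ts)) / steps


def evaluation_function(u):
    """Hypothetical downward-convex evaluation function, minimum at u = 2."""
    return (u - 2.0) ** 2


left = dither_correlation(evaluation_function, u0=1.0)   # left of the minimum
right = dither_correlation(evaluation_function, u0=3.0)  # right of the minimum
print(left, right)
```

Under these assumptions the correlation is negative to the left of the minimum and positive to the right of it, which is the information the update rule uses to choose its direction.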
Therefore, when the manipulated variable is periodically increased and decreased, the evaluation quantity can be brought closer to the optimum value by decreasing the manipulated variable if the evaluation quantity varies in phase with it, and by increasing the manipulated variable if the evaluation quantity varies in opposite phase. PID control (Proportional-Integral-Derivative Control), which has conventionally been the common control method for industrial plants, is a target-value-tracking method that controls the manipulated variable so that the controlled variable follows a preset target value. Extreme value control, in contrast, is an optimum-value-search method that optimizes the evaluation quantity, so there is no need to prepare in advance, as is done for PID control, a process model that expresses the relationship between the manipulated variable and the controlled variable of the controlled process. Because of this property, extreme value control can also work effectively for controlled processes whose target value cannot be set in advance, and it therefore has the potential to become widespread. At the same time, an extreme value control system that implements extreme value control can be realized with a relatively simple configuration, as shown in FIG. 2.
FIG. 2 is a diagram showing a basic configuration example of an extreme value control system.
The extreme value control system 9 of FIG. 2 includes a modulation dither signal output unit 11, a high-pass filter 12 (HPF: High-Pass Filter), a demodulation dither signal output unit 13, a low-pass filter 14 (LPF: Low-Pass Filter), and an integrator 15. The configuration of the extreme value control system 9 is thus of about the same complexity as a conventional PID controller. Like a PID controller, the extreme value control system 9 can therefore be implemented easily using hardware such as a PLC (Programmable Logic Controller). An outline of the operation of the extreme value control system 9 of FIG. 2 is given below, taking as an example the case of searching for the minimum value of the evaluation function as the optimum value.
First, the modulation dither signal output unit 11 applies a dither signal to force a change in the manipulated variable of the controlled process. For example, the modulation dither signal output unit 11 applies a dither signal such as a sine wave to vary the manipulated variable of the controlled process periodically. Hereinafter, this operation is referred to as modulation, and the dither signal used for modulation is referred to as the modulation dither signal. The controlled variable changes in response to the change in the manipulated variable caused by the modulation. The controlled process obtains the evaluation quantity from the controlled variable that changes in this way and feeds the obtained evaluation quantity back to the extreme value control system 9.
In general, the controlled variable often follows a change in the manipulated variable with some time delay, so the evaluation quantity obtained from the controlled variable also follows the manipulated variable with some time delay. The function of obtaining the evaluation quantity from the controlled variable does not necessarily have to be part of the controlled process. For example, it may be included in the extreme value control system 9, or it may be realized by another device interposed between the controlled process and the extreme value control system 9.
Based on the evaluation quantity fed back in this way, the extreme value control system 9 updates the manipulated variable so that the evaluation quantity approaches the extremum of the evaluation function. This presupposes that the evaluation function of the controlled process has a minimum value; as described above, however, the evaluation function is unknown with respect to the manipulated variable, so its extremum is also unknown. The extreme value control system 9 therefore observes, from the fed-back evaluation quantity signal, the magnitude and direction of the change in the evaluation quantity caused by the modulation, and determines a new manipulated variable based on the observed magnitude and direction.
Specifically, the new manipulated variable is determined by the high-pass filter 12, the demodulation dither signal output unit 13, the low-pass filter 14, and the integrator 15, which have the following functions.
The high-pass filter 12 removes from the fed-back evaluation quantity signal the constant bias corresponding to the unknown minimum value. In other words, this processing always adjusts the unknown minimum value to zero, and it is preprocessing that allows the integrator 15, described later, to determine the direction (increase or decrease) in which to update the manipulated variable.
The demodulation dither signal output unit 13 applies a demodulation dither signal to the evaluation quantity signal adjusted in this way, and thereby extracts, from the evaluation quantity that varied in response to the modulation of the manipulated variable, the component at the same frequency as the modulation dither signal. Hereinafter, this operation is referred to as demodulation, and the dither signal used for demodulation is referred to as the demodulation dither signal. The role of demodulation is as follows.
An evaluation function that is unknown with respect to the manipulated variable may contain nonlinear elements. In that case, the evaluation function is assumed to be a downward-convex nonlinear function (upward-convex when searching for a maximum). Because of such nonlinear elements, harmonic and subharmonic components related to the frequency ω of the modulation dither signal are likely to appear in the evaluation quantity. Demodulation is processing that removes the influence of these harmonics and subharmonics. Through demodulation, the component at the same frequency ω as the modulation dither signal that caused the evaluation quantity to vary is extracted from the components contained in the evaluation quantity signal.
The demodulated evaluation quantity signal is input to the low-pass filter 14, which extracts the steady (low-frequency) component from it. Specifically, the steady component represents the first derivative of the evaluation function (hereinafter referred to as the "Jacobian") and is considered to indicate the direction (increase or decrease) of the change in the evaluation quantity caused by the modulation.
The integrator 15 integrates the steady component extracted by the low-pass filter 14. Based on the integrated value of the steady component, the integrator 15 functions as an estimator of the direction in which the manipulated variable should be moved to bring the evaluation quantity closer to the minimum value (hereinafter referred to as the "search direction"). This way of estimating the search direction is generally called the gradient method and is one of the basic methods of estimating the search direction in adaptive control systems.
Specifically, the integrator 15 estimates the gradient of the evaluation function from the integrated value of the steady component and, based on the estimated gradient, adjusts the search direction of the manipulated variable and the amount by which the manipulated variable is moved in that direction. The manipulated variable adjusted in this way is modulated by the modulation dither signal and input to the controlled process.
The configuration has been described here on the assumption that the extreme value control system 9 searches for a minimum value; to search for a maximum value, the sign of the gradient estimated by the integrator 15 is simply inverted. In addition, an integrator generally has low-pass characteristics, so if the integrator 15 has sufficient low-pass characteristics, the extreme value control system 9 does not necessarily need to include the low-pass filter 14.
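The signal path described above (modulation, high-pass filter, demodulation, low-pass filter, integrator) can be sketched as a short discrete-time simulation. This is only an illustration under stated assumptions: the controlled process is replaced by a hypothetical static quadratic map with no dynamics, the filters are simple first-order discretizations, and the function name, gains, cutoff frequencies, and step size are all illustrative values that do not come from the source.

```python
import math


def run_extremum_seeking(f, u0, a=0.1, omega=5.0, gain=0.5,
                         hpf_cut=1.0, lpf_cut=0.5, dt=0.01, t_end=200.0):
    """Basic minimum-seeking loop on a static map f (assumed, no plant dynamics)."""
    u_hat = u0                  # integrator state: current manipulated variable
    j_prev = f(u0)              # previous evaluation quantity (high-pass filter input)
    hp = 0.0                    # high-pass filter output
    lp = 0.0                    # low-pass filter output (proportional to the Jacobian)
    alpha_h = 1.0 / (1.0 + hpf_cut * dt)
    alpha_l = lpf_cut * dt / (1.0 + lpf_cut * dt)
    for i in range(round(t_end / dt)):
        dither = math.sin(omega * i * dt)
        u = u_hat + a * dither              # modulation
        j = f(u)                            # evaluation quantity fed back
        hp = alpha_h * (hp + j - j_prev)    # high-pass filter removes the constant bias
        j_prev = j
        demod = hp * dither                 # demodulation
        lp += alpha_l * (demod - lp)        # low-pass filter keeps the steady component
        u_hat -= gain * lp * dt             # integrator; minus sign for a minimum search
    return u_hat


def evaluation_function(u):
    """Hypothetical evaluation function: minimum value 1 at u = 2."""
    return (u - 2.0) ** 2 + 1.0


u_final = run_extremum_seeking(evaluation_function, u0=0.0)
print(u_final)
```

With these assumed settings the manipulated variable drifts from 0 toward the minimizer at 2. Changing `u_hat -= ...` to `u_hat += ...` corresponds to the sign inversion used for a maximum search.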
The extreme value control system 9 realized by this configuration is of about the same complexity as the PID control systems that have been common in conventional process control, so, like a PID control system, it can be implemented easily using hardware such as a PLC (Programmable Logic Controller).
The basic configuration of an extreme value control system has been described above. Such a conventional extreme value control system, however, has the problem that it cannot always realize extreme value control adapted to the dynamics of the controlled process. The configuration of the extreme value control system of the embodiment, which can solve this problem, is described in detail below.
[Details of Embodiment]
FIG. 3 is a diagram showing a configuration example of the extreme value control system 1 of the first embodiment.
The plant P shown in FIG. 3 is an example of a means for realizing the controlled process and is, for example, a water treatment plant that realizes a biological wastewater treatment process. The plant P includes various process devices that realize the controlled process and operates them based on the manipulated variable input from the extreme value control system 1. The plant P also includes various measuring devices that measure the controlled variable of the controlled process, and it outputs information indicating the measured values (hereinafter referred to as "measurement information") to the extreme value control system 1. Based on the measurement information acquired from the plant P, the extreme value control system 1 updates the manipulated variable in the direction (search direction) that brings the evaluation quantity of the controlled process closer to the optimum value.
This basic operation of extreme value control is realized in the extreme value control system 1 of the embodiment by the same modulation dither signal output unit 11, high-pass filter 12, demodulation dither signal output unit 13, low-pass filter 14, and integrator 15 as in the conventional extreme value control system 9. The extreme value control system 1 of the embodiment differs from the conventional extreme value control system 9 in that it includes a parameter adjusting unit 2 that adjusts the extreme value control parameters based on the evaluation quantity signal.
For example, the extreme value control system 1 includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus, and executes an extreme value control program. By executing the extreme value control program, the extreme value control system 1 functions as a device or system having the functional units described above. All or some of the functions of the extreme value control system 1 may be realized using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The program may also be transmitted over a telecommunication line.
The parameter adjusting unit 2 has a function of adaptively adjusting the extreme value control parameters to the dynamics of the controlled process. Specifically, the parameter adjusting unit 2 adaptively adjusts the integral gain of the integrator 15 based on the estimated gradient of the evaluation function, which changes from moment to moment because of the dynamics.
FIG. 4 is a diagram showing a specific example of the method of adjusting the extreme value control parameters in the first embodiment. Specifically, FIG. 4 is quoted from the adjustment method described in Patent Document 1.
As described in No. 5 of FIG. 4, the parameter adjusting unit 2 of this embodiment estimates the second derivative of the evaluation function (hereinafter referred to as the "Hessian") from the evaluation quantity fed back from the controlled process, and determines a new integral gain using the estimated Hessian. To adjust the integral gain in this way, the parameter adjusting unit 2 includes a first multiplier 21, a gradient estimation unit 22, a regularization signal output unit 23, and a second multiplier 24.
The first multiplier 21 multiplies the evaluation quantity signal input from the high-pass filter 12 by the dither signal (its squared signal) and outputs the product to the gradient estimation unit 22. The gradient estimation unit 22 extracts the Hessian signal H(t) from the output signal of the first multiplier 21 and outputs it to the regularization signal output unit 23. Here, the gradient estimation unit 22 can estimate derivatives of order zero or higher of the evaluation function using, for example, the method described in Non-Patent Document 1. That is, the gradient estimation unit 22 can function as a first gradient estimation unit that estimates the Jacobian of the evaluation function or as a second gradient estimation unit that estimates the Hessian of the evaluation function. Specifically, Non-Patent Document 1 describes a configuration that estimates derivatives of order zero or higher of the evaluation function using low-pass filters, and its basic idea is as follows.
In general, the manipulated variable may contain harmonic and subharmonic components, but when the dither signal is a sine wave, the modulated manipulated variable also varies sinusoidally at approximately the same frequency as the dither signal. Accordingly, assume that the manipulated variable U varies sinusoidally as U(t) = U0 + a × sin ωt, and that the evaluation quantity that varies with it is expressed by the evaluation function J shown in equation (1).
Figure JPOXMLDOC01-appb-M000001
Equation (1) defines the evaluation quantity J(t) as an (unknown) function of the manipulated variable U(t). Strictly speaking, once the dynamics of the plant are taken into account, f should be an operator of a dynamic system rather than a function; however, when the dither frequency ω produces changes that are sufficiently slow relative to the plant dynamics, f can be approximated as a function. This embodiment treats f as a function under that premise. Taylor expansion of equation (1) gives equation (2).
Figure JPOXMLDOC01-appb-M000002
Here, D^k f (k is an integer of 1 or more) denotes the k-th derivative of the function f with respect to U. Multiplying equation (2) by sin^n ωt (n is an integer of 1 or more) gives equation (3). Applying period averaging (or time integration) to equation (3) then leaves only the components related to sin^n ωt, owing to the orthogonality of sine waves, and gives equation (4).
Figure JPOXMLDOC01-appb-M000003
Figure JPOXMLDOC01-appb-M000004
Here, the amplitude a of the dither signal and the exponent n are constants, so assuming that the value of the n-th derivative D^n f does not change significantly within one control cycle, equation (4) can be expressed as equations (5) and (6). Solving equation (5) backward then gives equation (7), which expresses the n-th derivative D^n f.
Figure JPOXMLDOC01-appb-M000005
Figure JPOXMLDOC01-appb-M000006
Figure JPOXMLDOC01-appb-M000007
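Equations (1) to (7) survive in this text only as image references. As a hedged illustration of the reasoning, and not a reproduction of those equations, the cases n = 1 and n = 2 can be worked out under two assumptions made here: J(t) = f(U(t)) with U(t) = U0 + a sin ωt, and a Taylor expansion truncated at second order. The angle brackets denote the average over one dither period, which stands in for the low-pass filter.

```latex
% Assumed second-order model of the evaluation quantity (not the source's equation (2)):
J(t) \approx f(U_0) + a\,Df(U_0)\,\sin\omega t + \tfrac{a^2}{2}\,D^2 f(U_0)\,\sin^2\omega t

% Period averages used below:
\langle \sin\omega t\rangle = \langle \sin^3\omega t\rangle = 0,\qquad
\langle \sin^2\omega t\rangle = \tfrac12,\qquad
\langle \sin^4\omega t\rangle = \tfrac38

% n = 1 (Jacobian):
\langle J\,\sin\omega t\rangle = \tfrac{a}{2}\,Df(U_0)
\;\Longrightarrow\;
Df(U_0) = \tfrac{2}{a}\,\langle J\,\sin\omega t\rangle

% n = 2 (Hessian):
\langle J\rangle = f(U_0) + \tfrac{a^2}{4}\,D^2 f(U_0),\qquad
\langle J\,\sin^2\omega t\rangle = \tfrac12 f(U_0) + \tfrac{3a^2}{16}\,D^2 f(U_0)
\;\Longrightarrow\;
D^2 f(U_0) = \frac{16\,\langle J\,\sin^2\omega t\rangle - 8\,\langle J\rangle}{a^2}
```

The resulting factors 2/a, 16, 8, and 1/a^2 agree with the gains stated in the text for the estimators of FIG. 5 and FIG. 6, which is the only check this sketch claims.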
FIGS. 5 and 6 are diagrams showing configuration examples of the gradient estimator expressed by equation (7).
Specifically, FIG. 5 shows a configuration example of a Jacobian estimator (that is, the case n = 1): the evaluation quantity signal J(t) multiplied by the dither signal sin ωt is processed by a low-pass filter and then multiplied by 2/a to obtain the Jacobian of the evaluation function. This configuration corresponds to defining a new integral gain KImod = KI × (a/2) for the integrator 15 having the integral gain KI, so the conventional basic extreme value control system shown in FIG. 2 can be regarded as a configuration that estimates the Jacobian of the evaluation function with the low-pass filter 14.
FIG. 6, on the other hand, shows a configuration example of a Hessian estimator (that is, the case n = 2). The evaluation quantity signal J(t) multiplied by the squared dither signal sin^2 ωt is processed by a low-pass filter and multiplied by 16 to give a first signal. The evaluation quantity signal J(t) itself is processed by a low-pass filter and multiplied by 8 to give a second signal. The second signal is subtracted from the first, and the result is multiplied by 1/a^2 to obtain the Hessian of the evaluation function.
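The two estimator structures can be tried on a known function. The following is a minimal sketch under assumptions made here and not stated in the source: an ideal average over one dither period replaces the low-pass filters, the evaluation function is a hypothetical quadratic, and all function and variable names are illustrative.

```python
import math


def period_average(values):
    """Ideal stand-in for the low-pass filter: mean over one dither period."""
    return sum(values) / len(values)


def estimate_jacobian_and_hessian(f, u0, a=0.1, steps=2000):
    """Apply the FIG. 5 (n = 1) and FIG. 6 (n = 2) gains to samples of J(t)."""
    s = [math.sin(2.0 * math.pi * i / steps) for i in range(steps)]  # dither sin(wt)
    j = [f(u0 + a * si) for si in s]                                 # evaluation quantity
    avg_j = period_average(j)
    avg_j_s = period_average([ji * si for ji, si in zip(j, s)])
    avg_j_s2 = period_average([ji * si * si for ji, si in zip(j, s)])
    jacobian = (2.0 / a) * avg_j_s                           # low-pass output times 2/a
    hessian = (16.0 * avg_j_s2 - 8.0 * avg_j) / (a * a)      # (16 x first - 8 x second) / a^2
    return jacobian, hessian


def evaluation_function(u):
    """Hypothetical evaluation function with Df = 6 (u - 2) and D2f = 6."""
    return 3.0 * (u - 2.0) ** 2 + 1.0


g, h = estimate_jacobian_and_hessian(evaluation_function, u0=1.5)
print(g, h)
```

For this quadratic the estimates reproduce the analytic first and second derivatives at u0 = 1.5. With real low-pass filters and plant dynamics the estimates would only be approximate.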
The parameter adjusting unit 2 could adjust the integral gain using the Hessian of the evaluation function estimated in this way as it is, but in that case extreme value control may become unstable for reasons described later. In the extreme value control system 1 of this embodiment, therefore, the Jacobian estimated by the low-pass filter 14 is regularized based on the estimated Hessian, and the regularized Jacobian signal is supplied to the integrator 15. This allows the extreme value control system 1 of this embodiment to update the integral gain adaptively while avoiding destabilization of extreme value control.
Specifically, the regularization signal output unit 23 generates a signal that regularizes the Jacobian signal output by the gradient estimation unit 22 (hereinafter referred to as the "regularization signal") and outputs it to the second multiplier 24. The second multiplier 24 receives the Jacobian signal G(t) from the low-pass filter 14 and the regularization signal from the regularization signal output unit 23, and regularizes the Jacobian signal by multiplying it by the regularization signal. The second multiplier 24 supplies the regularized Jacobian signal Gn(t) to the integrator 15.
[First method of generating a regularization signal]
In general, "regularization" of a signal means avoiding an ill condition in which, when some inverse operation is attempted on the signal, the inverse does not exist and the operation cannot be carried out. A typical example of such an ill condition is division by zero.
 一方で、図4に示したとおり、本実施形態の極値制御システム1は評価関数のヘシアンを用いて積分器15の積分ゲインを適応的に更新していくものであるが、これは各制御周期の積分ゲインを、評価関数のヤコビアンをヘシアンで割る(正規化する)ことによって得られる一定値に固定して極値制御を実行することと等価的に置き換えることができる。すなわち、本実施形態における極値制御システム1の構成は、図2に示した基本的な構成に対して、ヤコビアンをヘシアンで正規化する構成を付加したものとみなすことができる。 On the other hand, as shown in FIG. 4, the extremum control system 1 of the present embodiment adaptively updates the integral gain of the integrator 15 by using the evaluation function Hessian, and this is each control. The integral gain of the period can be replaced with the equivalent of performing extreme value control by fixing the Jacobian of the evaluation function to a constant value obtained by dividing (normalizing) it by Hessian. That is, the configuration of the extremum control system 1 in the present embodiment can be regarded as adding a configuration for normalizing the Jacobian with Hessian to the basic configuration shown in FIG.
Accordingly, in the present embodiment, "regularization" is defined as avoiding ill-conditioning such as division by zero in the "normalization" of the Jacobian signal, and a signal that acts on the Jacobian signal to achieve such regularization is generated as the regularization signal. Specifically, the regularization signal output unit 23 generates, as the regularization signal, a signal that realizes a signal conversion (⇔) satisfying each of the following conditions.
[Condition 1] G(t) = 0 ⇔ Gn(t) = 0
[Condition 2] G(t) is positive (negative) ⇔ Gn(t) is positive (negative)
[Condition 3] G(t) < ∞ ⇔ Gn(t) < ∞
[Condition 4] G(t) → ∞ ⇔ Gn(t) → k (0 < k < ∞)
G(t) denotes the Jacobian signal, and Gn(t) denotes the Jacobian signal regularized (that is, divided by the Hessian) through the action of the regularization signal. [Condition 1] states that Gn(t) is 0 if and only if G(t) is 0. [Condition 2] states that G(t) and Gn(t) have the same sign. [Condition 3] states that Gn(t) is finite whenever G(t) is finite (that is, no division by zero occurs). [Condition 4] states that when G(t) diverges to ∞, Gn(t) does not diverge but converges to some positive finite value. A regularization signal with these properties is given, for example, by equation (8).
Figure JPOXMLDOC01-appb-M000008
In equation (8), δ is a positive constant (δ > 0), the so-called regularization constant, and H(t) is the estimated Hessian of the evaluation function. The regularization signal of equation (8) can be generated simply by taking the absolute value of the Hessian signal and adding the regularization constant δ, so the stability of extremum control can be improved while suppressing an increase in the size of the device.
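The image for equation (8) is not reproduced in this text, so the following is only a minimal sketch. It assumes the form RS(t) = 1/(|H(t)| + δ), which is consistent with the description above (absolute value of the Hessian signal plus δ, applied to the Jacobian signal by multiplication). The function names are hypothetical.

```python
def regularization_signal(h: float, delta: float) -> float:
    """Assumed form RS = 1 / (|H| + delta); not taken verbatim from equation (8)."""
    if delta <= 0.0:
        raise ValueError("delta must be positive to rule out division by zero")
    return 1.0 / (abs(h) + delta)


def regularize(g: float, h: float, delta: float) -> float:
    """Multiply the Jacobian signal by the regularization signal."""
    return g * regularization_signal(h, delta)


# Even when the Hessian estimate is exactly zero, the result stays finite.
gn_flat = regularize(2.0, 0.0, 0.5)      # 2.0 / (0.0 + 0.5)
gn_curved = regularize(-3.0, -1.0, 0.5)  # -3.0 / (1.0 + 0.5)
```

Because the denominator is never smaller than δ, the sign of the regularized signal always follows the sign of G(t), regardless of the sign of the Hessian estimate.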
For example, when searching for a minimum (local minimum) by extremum control, the extremum can be searched for stably both for a downward-convex evaluation function of order M (M > 1), whose Hessian is positive near the extremum, and for an upward-convex evaluation function of order M (0 < M < 1), whose Hessian is negative near the extremum.
Note that the regularization signal generated using the Hessian need not be expressed as a first-power function of the Hessian as in equation (8). For example, the regularization signal may be expressed as a function of the square of the Hessian, as in equations (9) and (10), or as a function of the M-th power of the Hessian (M = 1, 2, 3, ...), as in equations (11) and (12).
Figure JPOXMLDOC01-appb-M000009
Figure JPOXMLDOC01-appb-M000010
Figure JPOXMLDOC01-appb-M000011
Figure JPOXMLDOC01-appb-M000012
FIGS. 7 and 8 show configuration examples of an extremum control system that generates the regularization signal by the first method. Specifically, FIG. 7 shows a configuration in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (8) is input to the integrator 15, and FIG. 8 shows a configuration in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (9) is input to the integrator 15.
[Second method of generating the regularization signal]
Because the first method uses the Hessian signal to generate the regularization signal, it requires some mechanism for estimating the Hessian of the evaluation function. For example, the extremum control system 1 shown in FIGS. 7 and 8 requires the first multiplier 21 and the gradient estimation unit 22 (for example, a low-pass filter) to generate the Hessian signal from the evaluation-quantity signal. The second method, in contrast, uses the estimated Jacobian in place of the estimated Hessian and thus generates the regularization signal without the first multiplier 21 and the gradient estimation unit 22. The regularization signal generated by the second method is given, for example, by equation (13).
Figure JPOXMLDOC01-appb-M000013
Here, G(t) denotes the Jacobian of the evaluation function estimated by the gradient estimation unit 22, or a quantity proportional to the Jacobian. The regularization signal generated by the second method need not be a first-power function of the Jacobian. As in the first method, it may be expressed as a function of the M-th power of the Jacobian (M = 1, 2, 3, ...) (for example, equations (14) and (15)).
Figure JPOXMLDOC01-appb-M000014
Figure JPOXMLDOC01-appb-M000015
FIGS. 9 and 10 show configuration examples of an extremum control system that generates the regularization signal by the second method. Specifically, FIG. 9 shows a configuration in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (13) is input to the integrator 15, and FIG. 10 shows a configuration in which the Jacobian signal Gn(t) regularized by the regularization signal of equation (14) (with M = 2) is input to the integrator 15. The second method thus keeps the regularization of the Jacobian signal from complicating the configuration of the extremum control system.
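As an illustration of the second method, the sketch below assumes that the regularized Jacobian takes the form Gn = G/(|G| + δ); the image for equation (13) is not reproduced here, so this form and the function name are assumptions. The loop checks Conditions 1 to 4 on a handful of sample values.

```python
def regularize_by_jacobian(g: float, delta: float) -> float:
    """Assumed second-method form Gn = G / (|G| + delta), with delta > 0."""
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    return g / (abs(g) + delta)


delta = 0.1
samples = [-1e6, -2.0, -1e-3, 0.0, 1e-3, 2.0, 1e6]
regularized = [regularize_by_jacobian(g, delta) for g in samples]

# Condition 1: zero maps to zero.  Condition 2: the sign is preserved.
# Condition 3: finite inputs give finite outputs.
# Condition 4: |Gn| stays below, and approaches, the finite value k = 1.
for g, gn in zip(samples, regularized):
    assert (g == 0.0) == (gn == 0.0)
    assert (g > 0.0) == (gn > 0.0)
    assert abs(gn) < 1.0
```

No Hessian estimate appears anywhere in this computation, which is the structural simplification the second method is after.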
[Third method of generating the regularization signal]
The third method simplifies the second method further by setting δ = 0 in equations (13) to (15). In this case, the regularization signal generated by the third method is given by equation (16).
Figure JPOXMLDOC01-appb-M000016
In the first and second methods, δ (> 0) serves to avoid division by zero. If δ is set to 0, RS diverges to infinity when G(t) = 0, and the regularized Jacobian signal Gn(t) diverges as well. Such a signal does not satisfy Conditions 1 to 4 and would not ordinarily be used as a regularization signal. However, it can be used as one if the regularization of the Jacobian signal is defined as in equation (17).
Figure JPOXMLDOC01-appb-M000017
Equation (17) defines regularization of the Jacobian signal as dividing the Jacobian signal by the regularization signal RS of equation (16). Because this divides the Jacobian signal by its own absolute value, Gn(t) in equation (17) takes the value -1 or +1. In other words, the regularization signal generated by the third method acts on the Jacobian signal to extract its sign information (-1 or +1). Therefore, when the regularization defined by equation (17) is performed, the signal of equation (16) functions, in principle, as a regularization signal.
Since the purpose of this regularization is to extract sign information, actually dividing G(t) by its absolute value |G(t)| is not strictly necessary as long as the sign of the Jacobian signal can be extracted. The regularization signal output unit 23 may therefore be configured to compute sgn(G(t)) directly, which also avoids division by zero. Here, sgn(x) denotes a function that returns the sign of the value x.
FIG. 11 shows a configuration example of an extremum control system that, in the first embodiment, generates the regularization signal by the third method.
The sign information extracted in this way is a signal that indicates only the direction (increase or decrease) in which the manipulated variable should be moved. Providing this form of regularization therefore makes it possible to build a simple extremum control system such as that shown in FIG. 11. In this case, the amount by which the manipulated variable changes in each control cycle is constant, so the change may be adjusted to a desired amount, for example by multiplying sgn(G(t)) by a coefficient.
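A minimal sketch of the sign-based update follows. The helper names, the use of a plain discrete update in place of the integrator 15, and the minus sign (which assumes a minimum is being sought) are all illustrative assumptions.

```python
def sgn(x: float) -> int:
    """Return -1, 0 or +1 using comparisons only, so no division by |x| is needed."""
    return (x > 0) - (x < 0)


def sign_step(u: float, g: float, step: float) -> float:
    """Move the manipulated variable by a fixed amount against the gradient sign.

    `step` plays the role of the coefficient applied to sgn(G(t)) that sets
    the change per control cycle.
    """
    return u - step * sgn(g)


u_next = sign_step(1.0, 3.0, 0.1)  # positive gradient, so u decreases by `step`
```

Computing the sign by comparison returns 0 for G(t) = 0, so the update simply holds its value there instead of dividing by zero.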
Since equation (16) is obtained by setting δ = 0 not only in equation (13) but also in equations (14) and (15), equations (13) to (15) can be regarded as adjusting the action of equation (16) with δ > 0 as a parameter. Indeed, regularization by sgn() gives only the direction in which the manipulated variable should be moved, so chattering may occur near the extremum if the parameters are not tuned well enough. In that case, a regularization signal generated using equations (13) to (15), that is, equation (16) adjusted by δ > 0, is considered to act on the Jacobian signal so as to suppress chattering near the extremum.
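The chattering effect can be seen in a toy iteration. This sketch is not the extremum control loop of the embodiment (there is no dither signal or filtering). It applies the two update rules directly to the known gradient G = 2u of a hypothetical cost J(u) = u², and it assumes the second-method form G/(|G| + δ).

```python
def run(update, u0=1.0, steps=200):
    """Iterate u <- u - update(G) on the toy cost J(u) = u**2, where G = 2u."""
    u = u0
    for _ in range(steps):
        u -= update(2.0 * u)
    return u


GAIN, DELTA = 0.3, 1.0


def sign_only(g):
    # Third method: only the direction of the gradient is used.
    return GAIN * ((g > 0) - (g < 0))


def with_delta(g):
    # Second method (assumed form): G / (|G| + delta) shrinks the step near G = 0.
    return GAIN * g / (abs(g) + DELTA)


u_sign = run(sign_only)    # keeps hopping across the minimum at u = 0
u_delta = run(with_delta)  # settles onto the minimum
```

With the sign-only rule the step size never shrinks, so the iterate keeps jumping from one side of the minimum to the other. With δ > 0 the step scales with the gradient near the minimum and the iterate converges.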
The second multiplier 24 regularizes the Jacobian signal G(t) by multiplying it by the regularization signal generated in this way, and outputs the regularized Jacobian signal Gn(t) to the integrator 15. This regularization of the Jacobian signal by the parameter adjustment unit 2 is expressed by equation (18).
Figure JPOXMLDOC01-appb-M000018
The parameter adjustment unit 2 adaptively updates the integral gain of the integrator 15 using the Jacobian signal regularized in this way. In general, the direction in which the manipulated variable should be moved in extremum control is given by the sign of the estimated gradient (Jacobian) of the evaluation function, which is unknown with respect to the manipulated variable, and the amount of movement is adjusted by the integral gain. The following describes a method of adjusting the integral gain based on the concept of the average system described in Non-Patent Document 2 and elsewhere.
The average system describes the dynamic behavior of a system subjected to a periodic input after averaging over one period. Average systems are commonly used, for example, in stability analysis of extremum control systems. Non-Patent Document 2, for instance, gives the concrete dynamics of the average system for an extremum control system applied to a static plant with no dynamics. That average system is given by equations (19) and (20).
Figure JPOXMLDOC01-appb-M000019
Figure JPOXMLDOC01-appb-M000020
In equation (19), G(u) denotes the estimated Jacobian of the evaluation function, which is unknown with respect to the manipulated variable u, and a denotes the amplitude of the dither signal. P denotes the power of the dither signal: P = 1/2 for a sinusoidal dither, P = 1/3 for a triangular-wave dither, and P = 1 for a square-wave dither. τ is the time obtained by scaling real time t by the dither frequency ω (τ = ωt), and KI0 is the integral gain on the τ time axis. KI0 is converted to the integral gain KI on the real-time axis t by equation (20).
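The three values of P quoted above are the mean-square values of unit-amplitude waveforms over one period. The following sketch checks them numerically; the waveform definitions and function names are illustrative assumptions.

```python
import math


def mean_square(wave, samples: int = 100_000) -> float:
    """Average of wave(phase)**2 over one period, phase in [0, 1), by the midpoint rule."""
    total = 0.0
    for i in range(samples):
        total += wave((i + 0.5) / samples) ** 2
    return total / samples


def sine(phase: float) -> float:
    return math.sin(2.0 * math.pi * phase)


def triangle(phase: float) -> float:
    # Falls linearly from +1 to -1 over the first half period, then rises back.
    return 4.0 * abs(phase - 0.5) - 1.0


def square(phase: float) -> float:
    return 1.0 if phase < 0.5 else -1.0


powers = {w.__name__: mean_square(w) for w in (sine, triangle, square)}
```

The computed values agree with 1/2, 1/3 and 1 to within the numerical integration error.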
Further, taking u* as the equilibrium point of the manipulated variable u in equation (19) and taking the periodic average u~ = u − u* yields equation (21). Here, "u~" denotes the symbol "u" with "~" placed directly above it.
Figure JPOXMLDOC01-appb-M000021
The average system given by equations (19) and (21) expresses the convergence dynamics of extremum control, namely how quickly the evaluation quantity converges to its minimum (local minimum) in response to the oscillation of the manipulated variable caused by the dither signal. Non-Patent Document 2 assumes that the controlled process is static. However, when the period of the dither signal is set sufficiently longer than the time constant of the plant, that is, when the dither frequency ω is set sufficiently lower than the cutoff frequency of the controlled process, a controlled process with dynamics can be treated approximately as a static one. This is also supported by the singular perturbation theory used in stability analysis of extremum control.
Even for an extremum control system with the basic configuration of FIG. 2, which includes a high-pass filter and a low-pass filter, the average system of equation (19) characterizes the overall behavior of the extremum control system, provided that the cutoff frequencies of these filters are set appropriately and the dominant (slowest) dynamics are those of the integrator 15. In such cases, the integral gain can be adjusted using equation (19).
Because the Jacobian G(u) of the evaluation function in equation (19) is often a nonlinear function of u, equation (19) is in general a nonlinear differential equation. The average system obtained by linearizing equation (19) with respect to u around a suitable operating point u0 is given by equation (22).
Figure JPOXMLDOC01-appb-M000022
Here, "u^" denotes the symbol "u" with "^" placed directly above it, with u^ = u − u0, and H(u0) denotes the Jacobian of G(u); that is, H(u0) is the Hessian of the evaluation function. In the present embodiment, the convergence speed of extremum control is therefore adjusted by adaptively adjusting the integral gain KI0 so that it is proportional to the reciprocal of H(u0). In this adjustment of the integral gain, the regularization signal generated by the first method (see equations (8) to (12)) avoids division by zero by the estimated Hessian and also acts to suppress abrupt sign changes.
In addition, the first method moves the estimated Hessian, which varies with time, out of the formula for the integral gain and defines it as a separate signal. Moving the factor that makes the integral gain variable out of the formula removes the 1/H(u0) term, so the integral gain KI0 can be tuned as a fixed value. The small constant δ (> 0) in the definition of the regularization signal is there to avoid division by zero by the Hessian, and KI0 takes its maximum value (1/δ times the constant term) when the estimated Hessian becomes 0. The integral gain can therefore be tuned by choosing δ based on the maximum value assumed for KI0.
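The choice of δ from an assumed maximum gain can be sketched as follows. The sketch assumes that the effective gain has the form k_const/(|H| + δ), in line with the first-power regularization signal described earlier. The names `k_const` and `k_max` are hypothetical.

```python
def effective_gain(k_const: float, h: float, delta: float) -> float:
    """Gain seen by the integrator under the assumed form k_const / (|H| + delta)."""
    return k_const / (abs(h) + delta)


def delta_for_max_gain(k_const: float, k_max: float) -> float:
    """Pick delta so that the gain at H = 0, namely k_const / delta, equals k_max."""
    if k_const <= 0.0 or k_max <= 0.0:
        raise ValueError("k_const and k_max must be positive")
    return k_const / k_max


delta = delta_for_max_gain(2.0, 40.0)          # delta = 0.05
gain_at_zero = effective_gain(2.0, 0.0, delta)  # the largest gain that can occur
gain_elsewhere = effective_gain(2.0, 3.0, delta)
```

Any nonzero Hessian estimate gives a smaller gain than the one at H = 0, so `k_max` acts as an upper bound that the designer fixes in advance.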
Equation (22) is a linear approximation of the nonlinear equation (19). A more direct way to suppress the influence of G(u), the nonlinear element of equation (19), is to eliminate G(u) altogether. The second and third methods described above generate regularization signals along these lines. The integral gain KI0 in equation (19) is a tuning parameter that the designer is free to choose, so it can be regarded as a kind of manipulated variable that reshapes the differential equation (19). By newly defining an integral gain KI0' that satisfies equation (23), the nonlinear element in equation (19) can be eliminated.
Figure JPOXMLDOC01-appb-M000023
However, applying equation (23) removes all information about the Jacobian from equation (19), so the information needed to determine the direction in which the manipulated variable should be moved is lost. Instead of equation (23), the absolute value of the Jacobian is therefore used as in equation (24), which leaves the sign information of the Jacobian in equation (19). In this case, equation (19) becomes equation (25).
Figure JPOXMLDOC01-appb-M000024
Figure JPOXMLDOC01-appb-M000025
In this way, the nonlinearity of the Jacobian can be eliminated while retaining the information needed to determine the search direction of the manipulated variable. In other words, the regularization signal of the third method is a signal for moving the nonlinearity of the Jacobian out of the formula for the integral gain. Because equation (25) contains only the sign information of the Jacobian, tuning the new integral gain KI0' is simple. Since the transient behavior of extremum control based on equation (25) follows a straight line, KI0' may be tuned with reference to the adjustment method shown in FIG. 5.
While the regularization of the third method is very simple to express, it retains only the sign information of the Jacobian, so chattering may occur near the extremum. The regularization of the second method can be regarded as introducing a small constant δ > 0 into the third method to avoid this chattering. In this case, if the integral gain is regarded as being converted as in equation (26), the average system corresponding to equation (25) is given by equation (27).
Figure JPOXMLDOC01-appb-M000026
Figure JPOXMLDOC01-appb-M000027
When the Jacobian is large, the effect of δ is small, so the average system of equation (27) behaves like the average system of equation (25). When the Jacobian is small, the effect of δ is large, so the average system of equation (27) behaves approximately in proportion to the Jacobian. Thus, in the regularization of the second method as well, the integral gain KI0' can be tuned in essentially the same way as for equation (25), simply by setting δ so as to prevent chattering near the extremum.
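The two regimes can be written out explicitly. The image for equation (27) is not reproduced in this text, so the sketch below assumes that the regularized Jacobian has the form G/(|G| + δ) with δ > 0:

```latex
\[
\frac{G}{|G|+\delta}
  = \operatorname{sgn}(G)\,\frac{|G|}{|G|+\delta}
  \approx
  \begin{cases}
    \operatorname{sgn}(G), & |G| \gg \delta \quad \text{(sign-only behavior)},\\[4pt]
    \dfrac{G}{\delta},     & |G| \ll \delta \quad \text{(proportional to the Jacobian)}.
  \end{cases}
\]
```

Under this assumption, δ marks the scale of |G| at which the behavior changes from sign-only to proportional.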
[Fourth method of generating the regularization signal]
The regularization signal of the fourth method is a signal given by an approximate sign estimate, obtained by approximating the sign estimate of the Jacobian signal with a continuous function.
Regularizing the Jacobian signal with the regularization signal of the fourth method yields a signal that approximates, by a smooth continuous function, the Jacobian signal regularized by the third method (the sign signal), and that generalizes the Jacobian signal regularized by the second method.
The second method with δ = 0 gives the regularization signal of the third method. Put differently, introducing a small constant δ > 0 into the third method gives the regularization signal of the second method. However, the regularization signal of the second method is not the only function that approximates the sign function, which is the regularization signal of the third method, while satisfying Conditions 1 to 4 above.
For example, any function that approximates the sign function by a continuous function, or by a smooth continuous function, satisfies the definition of a regularization signal given by Conditions 1 to 4. Many such (smooth) approximations of the sign function exist. Besides the approximation given by the second method, the following approximations, for example, are conceivable.
A. Saturation function
Figure JPOXMLDOC01-appb-M000028
Here, m(G(t)) is a strictly monotonically increasing function of G(t) satisfying m(0) = 0. A typical example is a power function of G(t) such as -sgn(G(t))/α·|G(t)|^ρ with α > 0 and ρ > 0. With ρ = 1, for example, this corresponds to the gradient G(t) clipped at ±1.
B. Sigmoid function (hyperbolic tangent)
Figure JPOXMLDOC01-appb-M000029
C. Arctangent
Figure JPOXMLDOC01-appb-M000030
In addition to examples A to C above, other functions belonging to the sigmoid family in the broad sense may be used, such as the cumulative normal distribution function, the Gompertz function, and the Gudermannian function, translated so that they are centered on the origin and scaled so that their range is ±1. Alternatively, take the step response of a stable transfer function that exhibits no overshoot or oscillation (for example, a higher-order lag system); for a first-order lag system this is 1 − exp(−t). Replacing the time t with G(t) (giving 1 − exp(−G(t)) for a first-order lag system) and mirroring the result so that it is point-symmetric about the origin gives a function such as sgn(G(t))(1 − exp(−|G(t)|)) for a first-order lag system. Such functions also act as smooth approximations of the sign function and can be used as regularization signals.
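The candidate approximations can be sketched in a few lines. The equation images for A to C are not reproduced in this text, so the exact forms below are assumptions: a clipped power function with positive sign for A, tanh(αG) for B, (2/π)·arctan(αG) for C, and a mirrored decaying-exponential step response for the first-order lag example. All function names are hypothetical.

```python
import math


def saturation(g: float, alpha: float = 1.0, rho: float = 1.0) -> float:
    """Clip m(G) = sgn(G) * |G|**rho / alpha to the interval [-1, 1]."""
    m = math.copysign(abs(g) ** rho / alpha, g)
    return max(-1.0, min(1.0, m))


def tanh_sign(g: float, alpha: float = 1.0) -> float:
    """Hyperbolic-tangent approximation; a larger alpha is closer to sgn."""
    return math.tanh(alpha * g)


def arctan_sign(g: float, alpha: float = 1.0) -> float:
    """Arctangent approximation scaled so that the range is (-1, 1)."""
    return (2.0 / math.pi) * math.atan(alpha * g)


def lag_sign(g: float) -> float:
    """Mirrored first-order-lag step response: sgn(G) * (1 - exp(-|G|))."""
    return math.copysign(1.0 - math.exp(-abs(g)), g)


approximations = (saturation, tanh_sign, arctan_sign, lag_sign)
```

Each of these functions passes through the origin, is odd and non-decreasing, and is bounded by ±1, which is what Conditions 1 to 4 ask of a regularized Jacobian signal.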
FIG. 12 illustrates an example of the regularization signal generated by the fourth method in the first embodiment. It plots the relationship between the gradient (Jacobian signal) G(t) and the regularized Jacobian signal Gn(t); the origin, G(t) = Gn(t) = 0, corresponds to the extremum sought by the extremum search.
Approximating the Jacobian signal regularized by the third method (the sign function) with a continuous function amounts to adjusting the behavior of the regularization signal near the extremum. For example, applying a saturation function that simply clips the gradient at ±1 means that conventional gradient-type extremum control is applied unchanged near the extremum. In conventional extremum control, when the evaluation function is exactly quadratic in the manipulated variable, its derivative, the gradient, is a first-order (linear) function. Applying the gradient method then gives the convergence characteristic of a linear system (exponential convergence) near the extremum. Such convergence is generally considered desirable.
Suppose, then, that the gradient of the evaluation function is proportional to a power of the manipulated variable u, G(t) ∝ u^n (so that the evaluation function is proportional to u^(n+1)). Setting Gn(t) = G(t)^(1/n) gives Gn(t) ∝ u, so the manipulated variable can be made to converge exponentially near the extremum. In practice, the shape of the evaluation function near the extremum is unknown, so n cannot be determined theoretically. Nevertheless, the behavior near the extremum can be fine-tuned by choosing the approximation of the sign function and adjusting its parameters while observing that behavior.
For example, as shown in FIG. 12, changing the value of α in the sigmoid function changes the value of G(t) at which the absolute value of the regularized signal Gn(t) falls below 1, which makes it possible to adjust the extent of range A, the "neighborhood" of the extremum, itself.
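The power-law argument can be illustrated with a toy iteration. This is a sketch under simplifying assumptions, not the extremum control loop of the embodiment: the gradient is taken to be exactly G = u^n with n = 3, the update is a plain discrete step with a hypothetical gain, and the root is taken in a sign-preserving way.

```python
import math


def root_compensated(g: float, n: int) -> float:
    """sgn(g) * |g|**(1/n): a sign-preserving n-th root of the gradient."""
    return math.copysign(abs(g) ** (1.0 / n), g)


def simulate(u0: float, n: int, steps: int, gain: float = 0.1,
             compensate: bool = True) -> float:
    """Iterate u <- u - gain * Gn, where the gradient is assumed to be G = u**n."""
    u = u0
    for _ in range(steps):
        g = u ** n
        gn = root_compensated(g, n) if compensate else g
        u -= gain * gn
    return u


u_comp = simulate(0.5, 3, 100, compensate=True)   # contracts by about 0.9 per step
u_raw = simulate(0.5, 3, 100, compensate=False)   # slows down as u**3 flattens out
```

With the compensation, each step multiplies u by roughly (1 − gain), which is the geometric (exponential) convergence described above. Without it, the steps shrink with u³ and the iterate is still far from zero after the same number of steps.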
FIG. 13 illustrates another example of the regularization signal generated by the fourth method in the first embodiment. Like FIG. 12, it plots the relationship between the Jacobian signal G(t) and the regularized Jacobian signal Gn(t); the origin, G(t) = Gn(t) = 0, corresponds to the extremum sought by the extremum search.
For example, as shown in FIG. 13, switching between an upwardly convex continuous function and a downwardly convex continuous function in the neighborhood B of the extremum can suppress chattering near the extremum, or can mitigate the phenomenon in which the extremum search control stalls near the extremum without converging to it.
If a regularized Jacobian signal Gn(t) with an upwardly convex shape (for example, the sigmoid function with α = 3) is selected, Gn(t) approaches a pure sign function. Selecting a function of this shape when the converged value falls short of the extremum therefore allows the search to get closer to the true extremum. Note, however, that chattering may occur if the upward convexity is too strong (that is, if the function is too close to the sign function).
On the other hand, if a regularized signal Gn(t) with a downwardly convex shape (for example, the saturation function with α = 10, ρ = 1.5) is selected, it has the effect of suppressing chattering that occurs near the extremum. Note that the extremum search performance deteriorates as the downward convexity weakens.
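The role of α in a sigmoid approximation of the sign function can be illustrated as follows. This is a sketch under an assumption: the exact sigmoid used in the list of approximating functions is not reproduced in this section, so the logistic form 2/(1 + exp(-αG)) - 1 (equal to tanh(αG/2)) is used here as a stand-in, and the saturation level 0.99 is a hypothetical threshold for "close to ±1".

```python
import math

def sigmoid_sign(g: float, alpha: float) -> float:
    """Smooth stand-in for sgn(g): 2 / (1 + exp(-alpha*g)) - 1 == tanh(alpha*g/2)."""
    return math.tanh(0.5 * alpha * g)

def neighborhood_half_width(alpha: float, level: float = 0.99) -> float:
    """|G| below which |Gn| stays under `level`, i.e. where Gn is not yet saturated."""
    return 2.0 * math.atanh(level) / alpha

# Larger alpha -> narrower unsaturated range -> closer to a pure sign function.
for alpha in (1.0, 3.0, 10.0):
    print(alpha, neighborhood_half_width(alpha))
```

Under this assumed form, increasing α shrinks the range of G(t) in which the signal is graded rather than saturated, which is the trade-off described above: a sharper function searches closer to the extremum but behaves more like a switching function.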
FIG. 14A is a diagram for explaining an example of the result of simulating the response of the manipulated variable without using a regularized signal in the first embodiment. Here, the simulated response of the manipulated variable is illustrated for the case where various regularized signals are applied to a virtual controlled object in which the function y = u^5 is appended to a first-order lag system.
The example shown in FIG. 14A is a comparative example in which no regularized signal is applied, and it is tuned to operate well when the initial value is 2. If the initial value is then changed to 3, the extremum control diverges. This is because the first derivative of y = u^5 is y' = 5u^4; y' = 5·2^5 = 160 when the initial value is 2, whereas y' = 5·3^5 = 1215 when the initial value is 3, so the gradient changes greatly.
Even with an initial value of 2, the manipulated variable did not converge to the extremum (= 1) within the predetermined time. This is because y' rapidly approaches 0 as u approaches 0; that is, in the virtual controlled object the gradient is almost flat near the extremum.
FIG. 14B is a diagram for explaining an example of the result of simulating the response of the manipulated variable using the sign signal of the gradient as the regularized signal in the first embodiment.
In contrast to the comparative example of FIG. 14A, the examples shown in FIGS. 14B to 14D apply a regularized signal. FIG. 14B shows the case where the sign function, which is the regularized signal of the third method, is applied. The extremum can now be searched at a similar convergence speed whether the initial value is 2 or 3, but chattering occurs near the extremum of 1. This is because the sign function is the most strongly discontinuous switching function.
FIGS. 14C and 14D are diagrams for explaining an example of the effect of using a regularized signal generated by the fourth method in the first embodiment.
In the example of FIG. 14C, the saturation function (ρ = 1) shown in A of the fourth method is applied as the continuous approximation of the sign function. As in FIG. 14B, the extremum can be searched at a similar convergence speed regardless of the initial value, but, as in FIG. 14A, the extremum is not reached within the predetermined time. This is because the gradient itself is used near the extremum, so the same phenomenon as in FIG. 14A occurs there.
In the example of FIG. 14D, the saturation function shown in A of the fourth method, which is a power function of G(t) of the form -sgn(G(t))/α·|G(t)|^ρ with α > 0, ρ > 0, ρ ≠ 1, is applied as the continuous approximation of the sign function, with α = 1 and ρ = 1/5. In this example, the extremum could be searched regardless of the initial value, and chattering was suppressed. As this example shows, the response of the extremum search control can be tuned by making good use of a continuous function that approximates the sign function.
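The qualitative differences among FIGS. 14B to 14D can be reproduced with a much simpler model. The sketch below is not the simulation described above: it assumes a hypothetical static plant J(u) = (u - 1)^6 / 6 with its extremum at u = 1, uses the exact gradient instead of an estimated Jacobian, omits the first-order lag, and uses hypothetical values for the gain, step size, and initial value.

```python
def sgn(x: float) -> int:
    """Sign of x, with sgn(0) == 0."""
    return (x > 0) - (x < 0)

def gradient(u: float) -> float:
    """Hypothetical plant J(u) = (u - 1)**6 / 6, so G = (u - 1)**5; extremum at u = 1."""
    return (u - 1.0) ** 5

REGULARIZERS = {
    "sign":       lambda g: sgn(g),                           # third method
    "saturation": lambda g: max(-1.0, min(1.0, g)),           # rho = 1, clipped at +/-1
    "power":      lambda g: sgn(g) * abs(g) ** (1.0 / 5.0),   # alpha = 1, rho = 1/5
}

def search(u0: float, name: str, k: float = 1.0, dt: float = 0.01, steps: int = 1000):
    """Integrate u <- u - k*dt*R(G(u)) and return the trajectory of u."""
    reg = REGULARIZERS[name]
    u, path = u0, []
    for _ in range(steps):
        u -= k * dt * reg(gradient(u))
        path.append(u)
    return path

def tail_error(path, n: int = 50) -> float:
    """Largest distance from the extremum over the last n samples."""
    return max(abs(u - 1.0) for u in path[-n:])

errors = {name: tail_error(search(3.005, name)) for name in REGULARIZERS}
print(errors)
```

In this toy setting the sign function keeps stepping back and forth across the extremum with amplitude of about k·dt (chattering), the clipped gradient stalls well short of the extremum because the raw gradient is nearly flat there, and the power-law function with ρ = 1/5 approaches the extremum geometrically without chattering.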
An extremum control system using the regularized signal generated by the fourth method can be realized, for example, by replacing the sign function sgn() in the configuration example of FIG. 11 with an approximating function of G(t) (for example, the functions A to C described above).
It can also be realized, for example, in the configuration examples of FIGS. 9 and 10 by having the regularization signal output unit 23 perform the operation corresponding to the approximating function. That is, the arithmetic expression in the regularization signal output unit 23 may be set so that its output is the regularized Jacobian signal divided by G(t).
According to the fourth method, the regularization function is not limited to that of the third method; by selecting the continuous approximation function and adjusting its parameters, the behavior near the found extremum can be adjusted easily, enabling finer extremum search control.
As described above, the regularization signal generated by the regularization signal output unit 23 serves to remove the complicated behavior caused by the nonlinear elements of the evaluation function from the average system that gives the direction and amount by which the manipulated variable should be moved. The parameter adjustment unit 2 regularizes the Jacobian signal output by the gradient estimation unit 22 using a regularization signal having this property. The integrator 15 then integrates the regularized Jacobian using the adjusted integral gain, which makes it possible to control the search direction of the extremum control more appropriately.
The extremum control system 1 of the first embodiment configured in this way includes a parameter adjustment unit that regularizes the Jacobian signal and adaptively adjusts the integral gain based on the regularized Jacobian signal. This allows the extremum control of the controlled process to adapt to the process dynamics and operate more stably.
Specifically, by regularizing the Jacobian signal, the extremum control system 1 of the first embodiment achieves the following (1) and (2), making it possible to adjust the integral gain easily without impairing the stability of the extremum control. (1) Division by zero is avoided when calculating the integral gain. (2) The sign information of the Jacobian signal is maintained (abrupt sign reversals are avoided).
As a result, the operator of the controlled process can adjust the convergence speed of the extremum control more easily and safely.
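Points (1) and (2) can be illustrated with one of the regularization signals recited in the claims: the N-th root of the N-th power of the absolute value of the Jacobian estimate plus a small positive constant δ. The sketch below assumes scalar signals; the function names and the default values of `n` and `delta` are hypothetical.

```python
def regularization_signal(g: float, n: float = 1.0, delta: float = 1e-3) -> float:
    """(|G|**N + delta)**(1/N): strictly positive, so dividing by it is always safe."""
    return (abs(g) ** n + delta) ** (1.0 / n)

def regularized_jacobian(g: float, n: float = 1.0, delta: float = 1e-3) -> float:
    """Jacobian estimate divided by the regularization signal; the sign of g is kept."""
    return g / regularization_signal(g, n, delta)

# Large gradients are scaled to roughly +/-1; a zero gradient maps to zero, not an error.
for g in (-1215.0, -1e-6, 0.0, 1e-6, 160.0):
    print(g, regularized_jacobian(g))
```

Because the denominator never reaches zero, the division is well defined even when the Jacobian estimate vanishes, and because the denominator is positive, the regularized value always has the same sign as the estimate.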
(Second Embodiment)
FIG. 12 is a diagram showing a configuration example of the extremum control system 1a of the second embodiment. The extremum control system 1a differs from the extremum control system 1 of the first embodiment in that it further includes a manipulated variable correction unit 3. The extremum control system 1a shown in FIG. 12 is obtained by adding the manipulated variable correction unit 3 to the extremum control system 1 shown in FIG. 9, which is one of the configuration examples in which the regularization signal is generated by the second method in the first embodiment. The rest of the configuration is the same as that of the extremum control system 1 of the first embodiment, so the same reference numerals as in FIG. 3 are used and the description of those functional units is omitted.
When a regularization signal based on the Hessian of the evaluation function is used, a linear approximation system such as equation (22) is obtained, so if the Hessian is estimated accurately, the manipulated variable and the evaluation amount can be expected to converge exponentially to the extremum. This mode of convergence is characteristic of a (first-order) linear system, and improving the accuracy of the Hessian estimate makes it more likely that this characteristic is obtained as expected.
On the other hand, when a regularization signal based on the Jacobian of the evaluation function is used, an average system such as equation (25) is obtained, so convergence to the extremum is expected to be linear. Such a linear search is sometimes what is required, but in general it is more often desirable to approach the extremum quickly when the manipulated variable or evaluation amount is far from the target extremum, and to approach it gradually once in its vicinity.
In extremum control represented by an average system such as equation (25), chattering is likely to occur near the extremum. Such chattering can be suppressed by adjusting equation (25) with a small constant δ (> 0), as in equation (27) above, but this adjusts the behavior of the extremum search only near the extremum and does not adjust the search as a whole.
Therefore, to adjust the behavior of the average system of equation (19) or (22) as a whole, the right-hand side should contain the manipulated variable u or the manipulated variable deviation u~. That is, it suffices if equation (22) can be transformed into the form of equation (29).
Figure JPOXMLDOC01-appb-M000031
Here, F(u~) is a function that satisfies F(0) = 0 and whose sign matches the sign of u~; that is, u~ × F(u~) > 0 for u~ ≠ 0. The simplest example is F(u~) = u~, in which case equation (29) becomes a linear differential equation and u~ converges exponentially to zero. The deviation u~ is considered instead of u because the optimum value u* of u is unknown, whereas u~ only needs to converge to zero, so the equilibrium point of equation (29) can be taken as 0. Furthermore, by using a suitable power function of u~, the operating point can be moved toward the extremum more quickly when it is far from the extremum and more gently when it is near the extremum.
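The conditions on F(u~) and the effect of a power function can be checked with a few lines of code. This is only a sketch: the exponent `p = 1.5` is a hypothetical tuning value, and the admissibility check samples a handful of points rather than proving the sign condition for all u~.

```python
import math

def f_linear(u_tilde: float) -> float:
    """Simplest choice, F(u~) = u~."""
    return u_tilde

def f_power(u_tilde: float, p: float = 1.5) -> float:
    """sgn(u~) * |u~|**p; p > 1 is a hypothetical tuning exponent."""
    return math.copysign(abs(u_tilde) ** p, u_tilde)

def is_admissible(f, samples=(-4.0, -0.25, 0.25, 4.0)) -> bool:
    """Check F(0) = 0 and u~ * F(u~) > 0 at a few nonzero sample points."""
    return f(0.0) == 0.0 and all(u * f(u) > 0 for u in samples)

print(is_admissible(f_linear), is_admissible(f_power))
print(f_power(4.0), f_linear(4.0))     # far from the extremum: larger drive
print(f_power(0.25), f_linear(0.25))   # near the extremum: smaller drive
```

With p > 1 the drive is larger than the linear one for |u~| > 1 and smaller for |u~| < 1, matching the behavior described above of moving quickly when far from the extremum and gently when close to it.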
The manipulated variable correction unit 3 is provided to obtain an average system that behaves like equation (29). It feeds back the manipulated variable u and multiplies it by the regularization signal, thereby correcting the manipulated variable so that operating points farther from the extremum are moved more strongly toward the extremum. For example, when the regularization signal is generated by the third method, the average system of equation (25) is obtained; to obtain a system like equation (29), note that the derivative of u equals the derivative of u~ and multiply the right-hand side of equation (25) by F(u~).
However, since u~ = u - u* and the optimum value of the manipulated variable is unknown, u~ cannot be used directly. The manipulated variable u is available, though, so u~ can be obtained approximately by applying a high-pass filter to u to remove the unknown constant term u*. That is, u~ can be estimated by equation (30).
Figure JPOXMLDOC01-appb-M000032
Multiplying the regularized Jacobian signal by the u~ signal estimated by equation (30) modifies the extremum search so that it has an exponential response characteristic. If the operating point should be moved to the vicinity of the extremum even faster than with an exponential response, the estimate of u~ may, for example, be replaced by a power of the value obtained from equation (30) so that the operating point moves in larger steps.
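The idea of removing the unknown constant u* with a high-pass filter can be sketched in discrete time. Equation (30) itself is not reproduced here, so the first-order filter below (continuous-time counterpart τs/(τs + 1)) is an assumption, and the class name, time constant, and sample period are hypothetical.

```python
class HighPassFilter:
    """Discrete first-order high-pass filter: y[k] = a * (y[k-1] + x[k] - x[k-1])."""

    def __init__(self, time_constant: float, dt: float):
        self.a = time_constant / (time_constant + dt)
        self.prev_x = None   # no input seen yet
        self.prev_y = 0.0

    def update(self, x: float) -> float:
        if self.prev_x is None:          # first sample: treat the input as unchanged
            self.prev_x = x
        y = self.a * (self.prev_y + x - self.prev_x)
        self.prev_x, self.prev_y = x, y
        return y

# A constant input (such as an offset u*) is rejected; a change in u passes through
# at first and then decays, which is what makes the output usable as a rough u~.
hpf = HighPassFilter(time_constant=1.0, dt=0.01)
print([hpf.update(5.0) for _ in range(3)])
```

The filter output depends only on changes in u, so any constant component is removed without knowing its value; the choice of time constant determines how long a change in u remains visible in the estimate.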
FIGS. 13 and 14 are diagrams showing configuration examples of the extremum control system 1a in the second embodiment. Specifically, FIG. 13 shows an example of the extremum control system 1a obtained by adding the manipulated variable correction unit 3 to the extremum control system 1 shown in FIG. 10, which is one of the configuration examples in which the regularization signal is generated by the second method in the first embodiment. FIG. 14 shows an example of the extremum control system 1a obtained by adding the manipulated variable correction unit 3 to the extremum control system 1 in which the regularization signal is generated by the third method in the first embodiment (see FIG. 11).
In addition to providing the same effects as the extremum control system 1 of the first embodiment, the extremum control system 1a of the second embodiment configured in this way allows the speed of the extremum search to be adjusted more finely. When the operating point is far from the extremum it can be moved quickly to the vicinity of the extremum, and near the extremum it can be moved in fine steps, which improves both the speed and the accuracy of the extremum search.
(Application example)
FIG. 15 is a diagram showing an application example of the extremum control system of the first or second embodiment. FIG. 15 shows an example in which the extremum control system of the embodiment is applied to a water treatment plant 4 that implements a biological wastewater treatment process. For example, the water treatment plant 4 shown in FIG. 15 includes an anaerobic tank 41, an anoxic tank 42, an aerobic tank 43, and a final settling tank 44. The anaerobic tank 41 is a facility for activating microorganisms. The anoxic tank 42 is a facility for removing nitrogen. The aerobic tank 43 is a facility for decomposing organic matter, removing phosphorus, and nitrifying ammonia. The final settling tank 44 is a facility for settling activated sludge.
The water treatment plant 4 is equipped with pumps that transport water and sludge between these facilities, a blower that supplies air to the tank, sensors that measure the concentration of substances in air or water, and the like. The chemical dosing pump 411 feeds chemicals, such as a carbon source that activates microorganisms, into the anaerobic tank 41. The circulation pump 431 controls the amount of water to be treated that circulates between the aerobic tank 43 and the anoxic tank 42. The blower 432 supplies air to the aerobic tank 43 to control the aeration rate. The return sludge pump 441 returns sludge from the final settling tank 44 to the anoxic tank 42. The excess sludge withdrawal pump 442 withdraws excess sludge from the final settling tank 44. The sensors 412 and 443 measure the quality of the discharged water in the anaerobic tank 41 and the final settling tank 44, respectively.
In general, in such a biological wastewater treatment process, the manipulated variable is the return rate of the return sludge, and the controlled variables are the concentration of nitrogen in the discharged water (hereinafter "discharged nitrogen concentration") and the concentration of phosphorus in the discharged water (hereinafter "discharged phosphorus concentration"). The return rate is obtained by dividing the discharge flow rate of the return sludge pump 441 by the inflow rate. The discharged nitrogen concentration and the discharged phosphorus concentration are acquired by the sensors 412 and 443. The controlled variables may instead be the amount of nitrogen in the discharged water (hereinafter "discharged nitrogen amount") and the amount of phosphorus in the discharged water (hereinafter "discharged phosphorus amount"). In this case, the discharged nitrogen amount and the discharged phosphorus amount are obtained by multiplying the discharged nitrogen concentration and the discharged phosphorus concentration, respectively, by the discharge flow rate.
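The two conversions described above (return rate from flows, discharged amount from concentration and flow) are simple enough to state directly in code. The function names and the units in the comments are assumptions chosen for illustration; the source does not specify units.

```python
def return_rate(return_sludge_flow: float, inflow: float) -> float:
    """Return rate = return sludge pump flow / plant inflow (same flow units, e.g. m^3/h)."""
    if inflow <= 0:
        raise ValueError("inflow must be positive")
    return return_sludge_flow / inflow

def discharged_amount(concentration: float, discharge_flow: float) -> float:
    """Discharged amount = concentration * discharge flow (e.g. g/m^3 * m^3/h = g/h)."""
    return concentration * discharge_flow

# Illustrative numbers only.
print(return_rate(300.0, 1000.0))        # fraction of the inflow that is returned
print(discharged_amount(8.0, 1000.0))    # nitrogen amount for 8 g/m^3 at 1000 m^3/h
```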
The extremum control system 1b of the application example receives from the water treatment plant 4 an evaluation amount based on controlled variables such as the discharged nitrogen amount and the discharged phosphorus amount and executes extremum control, thereby updating the manipulated variable so that the evaluation amount approaches its optimum value. One possible evaluation function expresses the evaluation amount as the sum (hereinafter "total cost") of a water quality cost based on the concept of a wastewater levy and the electric power cost of the return sludge pump 441. The electric power cost of the return sludge pump 441 can be calculated from the return sludge flow rate, the rated power of the return sludge pump 441, and so on. In general, under the wastewater levy concept, the water quality cost is expressed by the following equation.
Figure JPOXMLDOC01-appb-M000033
In equation (31), COD denotes chemical oxygen demand, BOD denotes biochemical oxygen demand, TN denotes discharged nitrogen, and TP denotes discharged phosphorus. The conversion factor for each cost may be determined from the actual wastewater levy or by another method. Since TN and TP are generally known to change greatly when the return rate is changed, the water quality cost J1 relevant to control of the return rate is defined here as in equation (32).
Figure JPOXMLDOC01-appb-M000034
In addition to this water quality cost, an operating cost J2 may be defined as the sum of the blower power cost, which changes indirectly when the return flow rate is changed, and the return pump power cost, and the evaluation function may be defined as the function whose value is the total cost, that is, the sum of the operating cost J2 and the water quality cost J1. For example, the operating cost J2 can be defined as in equation (33).
Figure JPOXMLDOC01-appb-M000035
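The structure of the evaluation function (total cost = water quality cost J1 + operating cost J2) can be sketched as follows. Equations (31) to (33) are not reproduced in this text, so the weighted-sum form of J1 and the weight values below are hypothetical placeholders, not the source's definitions; only the decomposition into J1 (depending on TN and TP) and J2 (blower power cost plus return pump power cost) is taken from the description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CostWeights:
    """Hypothetical conversion factors (cost per unit amount); not taken from the source."""
    tn: float = 1.0
    tp: float = 10.0

def water_quality_cost(tn_amount: float, tp_amount: float,
                       w: CostWeights = CostWeights()) -> float:
    """Assumed form of J1: weighted sum of discharged nitrogen and phosphorus."""
    return w.tn * tn_amount + w.tp * tp_amount

def operating_cost(blower_power_cost: float, return_pump_power_cost: float) -> float:
    """J2: blower power cost plus return pump power cost."""
    return blower_power_cost + return_pump_power_cost

def total_cost(tn_amount: float, tp_amount: float,
               blower_power_cost: float, return_pump_power_cost: float,
               w: CostWeights = CostWeights()) -> float:
    """Evaluation amount supplied to the extremum controller: J1 + J2."""
    return (water_quality_cost(tn_amount, tp_amount, w)
            + operating_cost(blower_power_cost, return_pump_power_cost))

print(total_cost(2.0, 0.5, 3.0, 1.0))
```

Because the controller only needs the scalar evaluation amount, the weights can be changed (for example, to reflect an actual levy schedule) without changing the extremum control itself.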
The extremum control system 1b of the application example extracts a Jacobian signal from the evaluation amount obtained with such an evaluation function and performs extremum control by applying the above regularization signal to that Jacobian signal, so that the integral gain is updated adaptively with respect to the dynamics of the water treatment plant 4. The extremum control system 1b can thus search, with more stable operation, for the optimum manipulated variable that minimizes the total cost. The extremum control system of the embodiments is applicable to the control of any process that outputs a controlled variable in response to an input manipulated variable. For example, the controlled process may be a sewage treatment process, a combustion process, a petrochemical process, or the like.
According to at least one embodiment described above, the device includes a first gradient estimation unit that estimates the Jacobian of the evaluation function; a manipulated variable determination unit that determines the direction and amount by which the manipulated variable should be moved by integrating the Jacobian estimate; a second gradient estimation unit that estimates the Hessian of the evaluation function; and a parameter adjustment unit that adjusts the integral gain of the manipulated variable determination unit in accordance with changes in the evaluation function by dividing the Jacobian estimate input to the manipulated variable determination unit by a regularization signal that is a value based on the Jacobian or Hessian estimate of the evaluation function and is adjusted so as not to become 0. This allows the extremum control to adapt to the dynamics of the controlled process and operate more stably.
Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and likewise in the invention described in the claims and its equivalents.

Claims (8)

  1.  An optimal control device that executes extremum control in which a manipulated variable of a controlled process is updated so that an evaluation amount, which is a value based on a controlled variable of the controlled process and is expressed by an evaluation function that is unknown with respect to the manipulated variable, approaches the optimum value of the evaluation function, the optimal control device comprising:
     a first gradient estimation unit that estimates the Jacobian of the evaluation function based on a signal indicating the evaluation amount;
     a manipulated variable determination unit that determines the direction and amount by which the manipulated variable should be moved by integrating the estimated value of the Jacobian;
     a second gradient estimation unit that estimates the Hessian of the evaluation function based on the signal indicating the evaluation amount; and
     a parameter adjustment unit that adjusts the integral gain of the manipulated variable determination unit in accordance with changes in the evaluation function by dividing the estimated value of the Jacobian input to the manipulated variable determination unit by a regularization signal that is a value based on the estimated value of the Jacobian or the Hessian of the evaluation function and is adjusted so as not to become 0.
  2.  The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the N-th root of the value obtained by adding a small positive constant δ to the N-th power (N > 0) of the absolute value of the estimated value of the Hessian, or by the value obtained by dividing the (M-1)-th power (M ≥ 1) of the absolute value of the estimated value of the Hessian by the value obtained by adding a small positive constant δ to the M-th power of the absolute value of the estimated value of the Hessian.
  3.  The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the N-th root of the value obtained by adding a small positive constant δ to the N-th power (N > 0) of the absolute value of the estimated value of the Jacobian, or by the value obtained by dividing the (M-1)-th power (M ≥ 1) of the absolute value of the estimated value of the Jacobian by the value obtained by adding a small positive constant δ to the M-th power of the absolute value of the estimated value of the Jacobian.
  4.  The optimal control device according to claim 1, wherein the regularization signal is a signal represented by the absolute value of the estimated value of the Jacobian.
  5.  The optimal control device according to claim 1, wherein the regularization signal is a signal represented by an approximate sign estimate obtained by approximating the sign estimate of the Jacobian with a continuous function.
  6.  The optimal control device according to any one of claims 3 to 5, further comprising a manipulated variable correction unit that corrects the manipulated variable so that operating points farther from the extremum are moved more strongly toward the extremum, by feeding back the manipulated variable determined by the manipulated variable determination unit and multiplying it by the regularization signal.
  7.  An optimal control method for executing extremum control that updates a manipulated variable of a controlled process so that an evaluation quantity, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches the optimum value of the evaluation function, the method comprising:
      a step of estimating the Jacobian of the evaluation function based on a signal indicating the evaluation quantity;
      a manipulated variable determination step of determining the direction and amount by which to move the manipulated variable by integrating the Jacobian estimate;
      a step of estimating the Hessian of the evaluation function based on the signal indicating the evaluation quantity; and
      a step of adjusting the integral gain in the manipulated variable determination step according to changes in the evaluation function, by dividing the Jacobian estimate input to the manipulated variable determination step by a regularization signal that is a value based on the Jacobian or Hessian estimate of the evaluation function and is adjusted so as not to become 0.
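The steps of the method in claim 7 can be sketched as a simple discrete-time loop. This is a minimal sketch under stated assumptions, not the implementation described in the application: the Jacobian and Hessian are estimated here by central finite differences as a stand-in for whatever estimator is actually used, the optimum is assumed to be a minimum, the regularization signal uses the N-th root form applied to the Hessian estimate, and all names (`extremum_step`, `run_extremum_control`, `gain`, `probe`) are hypothetical.

```python
from typing import Callable


def extremum_step(evaluate: Callable[[float], float], u: float,
                  gain: float = 0.5, probe: float = 1e-2,
                  n: float = 2.0, delta: float = 1e-6) -> float:
    """Return the next manipulated variable after one update (minimization assumed)."""
    # Sample the unknown evaluation function around the current operating point.
    j_minus, j_zero, j_plus = evaluate(u - probe), evaluate(u), evaluate(u + probe)
    # Finite-difference estimates (assumed stand-in for the real estimators).
    jacobian = (j_plus - j_minus) / (2.0 * probe)
    hessian = (j_plus - 2.0 * j_zero + j_minus) / probe ** 2
    # Regularization signal: N-th root of |H|**N + delta, never 0 for delta > 0.
    regularizer = (abs(hessian) ** n + delta) ** (1.0 / n)
    # Discrete integration of the Jacobian estimate divided by the regularizer.
    return u - gain * jacobian / regularizer


def run_extremum_control(evaluate: Callable[[float], float], u0: float,
                         steps: int = 50, **kwargs: float) -> float:
    """Apply extremum_step repeatedly, starting from u0."""
    u = u0
    for _ in range(steps):
        u = extremum_step(evaluate, u, **kwargs)
    return u


if __name__ == "__main__":
    shallow = lambda u: 3.0 * (u - 2.0) ** 2 + 1.0    # gentle curvature
    steep = lambda u: 300.0 * (u - 2.0) ** 2 + 1.0    # 100 times steeper
    print(run_extremum_control(shallow, -5.0))
    print(run_extremum_control(steep, -5.0))
```

In this sketch, dividing by the regularization signal makes the effective integral gain track the curvature of the evaluation function: for both quadratic test functions a single step from the same starting point moves the manipulated variable by almost the same distance, even though their Jacobians differ by a factor of 100.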
  8.  A computer program for causing a computer, which functions as an optimal control device that executes extremum control that updates a manipulated variable of a controlled process so that an evaluation quantity, which is a value based on a controlled variable of the controlled process and is represented by an unknown evaluation function of the manipulated variable, approaches the optimum value of the evaluation function, to execute:
      a step of estimating the Jacobian of the evaluation function based on a signal indicating the evaluation quantity;
      a manipulated variable determination step of determining the direction and amount by which to move the manipulated variable by integrating the Jacobian estimate;
      a step of estimating the Hessian of the evaluation function based on the signal indicating the evaluation quantity; and
      a step of adjusting the integral gain in the manipulated variable determination step according to changes in the evaluation function, by dividing the Jacobian estimate input to the manipulated variable determination step by a regularization signal that is a value based on the Jacobian or Hessian estimate of the evaluation function and is adjusted so as not to become 0.
PCT/JP2020/020816 2019-05-29 2020-05-27 Optimum control device, optimum control method and computer program WO2020241657A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021522795A JP7183411B2 (en) 2019-05-29 2020-05-27 Optimal control device, optimal control method and computer program
CN202080039159.8A CN113874794A (en) 2019-05-29 2020-05-27 Optimal control device, optimal control method, and computer program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019100408 2019-05-29
JP2019-100408 2019-05-29

Publications (1)

Publication Number Publication Date
WO2020241657A1 (en) 2020-12-03

Family

ID=73552241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/020816 WO2020241657A1 (en) 2019-05-29 2020-05-27 Optimum control device, optimum control method and computer program

Country Status (3)

Country Link
JP (1) JP7183411B2 (en)
CN (1) CN113874794A (en)
WO (1) WO2020241657A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001016775A (en) * 1999-06-24 2001-01-19 Takeo Kawamura Optimum power flow calculating system based on nonlinear programming method
JP2017033104A (en) * 2015-07-29 2017-02-09 株式会社東芝 Optimum control device, optimal control method, computer program and optimal control system
JP2017224176A (en) * 2016-06-15 2017-12-21 株式会社東芝 Controller, control method, and computer program

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101009533A (en) * 2006-01-27 2007-08-01 松下电器产业株式会社 Detection method for MIMO system
JP6228079B2 (en) * 2014-07-16 2017-11-08 本田技研工業株式会社 Mobile robot motion target generator
EP3136192A1 (en) * 2015-08-24 2017-03-01 Siemens Aktiengesellschaft Control method for the movement of a tool and control device
CN106873379B (en) * 2017-03-31 2019-12-27 北京工业大学 Sewage treatment optimal control method based on iterative ADP algorithm
CN107742885B (en) * 2017-11-13 2020-11-20 中国南方电网有限责任公司电网技术研究中心 Power distribution network voltage power sensitivity estimation method based on regular matching pursuit
CN109740117B (en) * 2019-01-31 2021-03-23 广东工业大学 Robust and fast magnetic positioning algorithm

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112737454A (en) * 2020-12-15 2021-04-30 武汉船用电力推进装置研究所(中国船舶重工集团公司第七一二研究所) Automatic optimization control method for permanent magnet synchronous motor
CN112737454B (en) * 2020-12-15 2022-05-17 武汉船用电力推进装置研究所(中国船舶重工集团公司第七一二研究所) Automatic optimization control method for permanent magnet synchronous motor
CN115276687A (en) * 2022-06-02 2022-11-01 智己汽车科技有限公司 Signal control method and system
CN116382066A (en) * 2023-03-23 2023-07-04 哈尔滨工业大学 Signal differential estimation method based on sigmoid type integral enhancement differentiator
CN116382066B (en) * 2023-03-23 2024-05-10 哈尔滨工业大学 Signal differential estimation method based on sigmoid type integral enhancement differentiator

Also Published As

Publication number Publication date
JP7183411B2 (en) 2022-12-05
JPWO2020241657A1 (en) 2021-11-18
CN113874794A (en) 2021-12-31

Similar Documents

Publication Publication Date Title
WO2020241657A1 (en) Optimum control device, optimum control method and computer program
CN107924162B (en) Optimal control device, optimal control method, recording medium, and optimal control system
Li et al. Two-degree-of-freedom fractional order-PID controllers design for fractional order processes with dead-time
Kim et al. Disturbance-observer-based model predictive control for output voltage regulation of three-phase inverter for uninterruptible-power-supply applications
Bodson Performance of an adaptive algorithm for sinusoidal disturbance rejection in high noise
Bianchin et al. Online optimization of switched LTI systems using continuous-time and hybrid accelerated gradient flows
Atta et al. Adaptive amplitude fast proportional integral phasor extremum seeking control for a class of nonlinear system
Roy et al. Level control of two tank system by fractional order integral state feedback controller tuned by PSO with experimental validation
Steinboeck et al. A design technique for fast sampled-data nonlinear model predictive control with convergence and stability results
Wang Synthesis of PID controllers for high-order plants with time-delay
JP7154774B2 (en) Optimal control device, control method and computer program
JP6744145B2 (en) Control device, control method, and computer program
Yadbantung et al. Tube-based robust output feedback MPC for constrained LTV systems with applications in chemical processes
CN112400142B (en) Control device, control method, and computer storage medium
JP6290115B2 (en) Control system, control device, control method, and computer program
Kotzapetros et al. Design of a modern automatic control system for the activated sludge process in wastewater treatment
Yamanaka et al. Adaptive Extremum Seeking Controller Design Utilizing A Simple Nonlinear Benchmark Process Model
Atta et al. Fast proportional integral phasor extremum seeking control for a class of nonlinear system
JP7267779B2 (en) Optimal control device, optimal control method and computer program
Messineo et al. Adaptive feedforward disturbance rejection in nonlinear systems
Zhao et al. Terminal sliding mode control with adaptive law for uncertain nonlinear system
Rascón et al. Improving first order sliding mode control on second order mechanical systems
Junior et al. Design and stability analysis of a variable structure adaptive pole placement controller for first order systems
Praly et al. Dynamic versus static weighting of Lyapunov functions
US11947323B2 (en) Reward to risk ratio maximization in operational control problems

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20814603; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021522795; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20814603; Country of ref document: EP; Kind code of ref document: A1)