TWI809592B - Model prediction control system and method thereof

Info

Publication number: TWI809592B
Application number: TW110145943A
Authority: TW
Inventors: 鄭儀誠; 陳俊彥
Original assignee: 財團法人工業技術研究院
Priority date: 2021-12-08
Filing date: 2021-12-08
Publication date: 2023-07-21
Also published as: CN116243599A; TW202324329A

Abstract

A model prediction control (MPC) system and a method thereof are disclosed. The model prediction control system includes a model simulation unit, an observation unit and a proxy model prediction control unit. The model simulation unit is configured to simulate dynamic behaviors of a target object according to a behavior model of the target object and a plurality of control decision parameters, and to correspondingly generate a plurality of real state parameters. The observation unit is configured to observe the control decision parameters and the real state parameters, and to generate a plurality of estimated state parameters. The proxy model prediction control unit is configured to perform approximate model prediction control operations based on the estimated state parameters and a plurality of target parameters, in place of model prediction control operations with higher computational complexity.

Description

Model Predictive Control System and Its Method

The present disclosure relates to a control system and a control method, and in particular to a model predictive control system and method for predicting and controlling the behavior of a target object.

A model predictive control mechanism establishes a dynamic model of a target object according to the dynamic behavior of the target object, predicts the dynamic changes of the target object according to that dynamic model, and then uses an optimization algorithm to compute the optimal solution for the control decision input values, so that the behavior of the target object satisfies a predetermined target.

However, the optimization algorithm of a model predictive control mechanism involves a high-dimensional optimization problem. When the target object changes quickly or its dynamic model is complex, the optimization algorithm consumes so much computation time that the optimization cannot be completed within a decision period of limited duration. The control decisions for the target object are then delayed, which degrades the performance of the target object and may even cause accidents.

Therefore, those skilled in the related industries of this technical field are working to improve the model predictive control mechanism so that control decision input values suited to the control target can be computed quickly in various application scenarios (for example, financial trading, self-driving cars, drones, and factory production lines), thereby avoiding delayed control decisions. An improved model predictive control mechanism can also greatly reduce the amount of computation and thus lower the hardware requirements.

The present disclosure provides a model predictive control system including a model simulation unit, an observation unit, and a proxy model predictive control unit. The model simulation unit simulates the dynamic behavior of a target object according to a behavior model of the target object and a plurality of control decision parameters of a previous state, and correspondingly generates a plurality of actual state parameters. The observation unit observes the control decision parameters of the previous state and the actual state parameters, and generates a plurality of estimated state parameters from them. The proxy model predictive control unit performs an approximate model predictive control operation according to the estimated state parameters and a plurality of target parameters to obtain a plurality of control decision parameters of the current state, and returns the control decision parameters of the current state to the model simulation unit. The proxy model predictive control unit performs the approximate model predictive control operation in place of a model predictive control operation with higher computational complexity.

The present disclosure also provides a model predictive control method including the following steps. A behavior model of a target object is established. The dynamic behavior of the target object is simulated according to the behavior model and a plurality of control decision parameters of a previous state. A plurality of actual state parameters are correspondingly generated. The control decision parameters of the previous state and the actual state parameters are observed. A plurality of estimated state parameters are generated according to the control decision parameters of the previous state and the actual state parameters. A proxy model predictive control unit is provided. According to the estimated state parameters and a plurality of target parameters, the proxy model predictive control unit performs an approximate model predictive control operation in place of a model predictive control operation with higher computational complexity. Finally, the control decision parameters of the current state are returned to the model simulation unit.

Other aspects and advantages of the present disclosure will become apparent from the following drawings, detailed description, and claims.

1000: model predictive control system

100: model simulation unit

200: observation unit

300: optimal model predictive control unit

400: proxy model predictive control unit

500: update unit

700: target object

2000: factory production line

u1, u2, u3: control decision parameters

y1, y2: actual state parameters

x1, x2: estimated state parameters

S: target parameter set

s1~sN: target parameters

track1: target movement trajectory

track2: actual movement trajectory

u1_mean, u2_mean: mean values

u1_var, u2_var: variances

2100: feed amount

2200: steam amount

2300: reboiler temperature or pressure

S110~S180: steps

FIG. 1 is a block diagram of a model predictive control system according to an embodiment of the present disclosure.

FIG. 2 is a schematic diagram of an embodiment in which the model predictive control system of the present disclosure is applied to a self-driving car.

FIG. 3 is a schematic diagram of the proxy model predictive control unit of the present disclosure performing a model predictive control operation.

FIG. 4 is a schematic diagram of an embodiment in which the model predictive control system of the present disclosure is applied to a factory production line.

FIG. 5 is a flowchart of a model predictive control method according to an embodiment of the present disclosure.

The technical terms in this specification follow the customary usage of this technical field. Where this specification explains or defines a term, that explanation or definition governs its interpretation. Each embodiment of the present disclosure has one or more technical features. Where implementation is possible, a person having ordinary skill in the art may selectively implement some or all of the technical features of any embodiment, or selectively combine some or all of the technical features of these embodiments.

FIG. 1 is a block diagram of a model predictive control (MPC) system 1000 according to an embodiment of the present disclosure. Referring to FIG. 1, the model predictive control system 1000 includes a model simulation unit 100, an observation unit 200, an optimal model predictive control unit 300, a proxy model predictive control unit 400, and an update unit 500.

The model simulation unit 100 can establish a behavior model of a target object 700 to simulate the dynamic behavior of the target object 700. In one example, the target object 700 (which may also be called the "target" or "operating target", and generally refers to whatever the model predictive control system 1000 predictively controls) is a self-driving car, and the model simulation unit 100 can establish a behavior model of the self-driving car to simulate dynamic behaviors such as its "vehicle speed" and "direction of travel". The control object associated with the behavior "vehicle speed" is the "throttle" of the self-driving car, and the control object associated with the "direction of travel" is its "steering wheel". In operation, the model simulation unit 100 can receive a plurality of control decision parameters. One of them, the control decision parameter u1, is for example the "throttle amount", whose associated control object is the "throttle"; the model simulation unit 100 can simulate controlling the "throttle" of the self-driving car according to the control decision parameter u1 to adjust the behavior "vehicle speed". The model simulation unit 100 can also compute, by simulation, the actual measurement of the "vehicle speed", that is, the actual state parameter y1; in other words, the actual state parameter y1 represents the "actual vehicle speed" of the self-driving car.

Similarly, another control decision parameter u2 received by the model simulation unit 100 is for example the "steering wheel rotation amount", whose associated control object is the "steering wheel"; the model simulation unit 100 can simulate controlling the "steering wheel" of the self-driving car according to the control decision parameter u2 to adjust the behavior "direction of travel". The model simulation unit 100 can also compute, by simulation, the actual measurement of the "direction of travel", that is, the actual state parameter y2; in other words, the actual state parameter y2 represents the "actual direction of travel" of the self-driving car.

In summary, the model simulation unit 100 can simulate controlling the control object "throttle" according to the control decision parameter u1 ("throttle amount") to adjust the behavior "vehicle speed", and correspondingly outputs the actual state parameter y1 ("actual vehicle speed"). Likewise, the model simulation unit 100 can simulate controlling the control object "steering wheel" according to the control decision parameter u2 ("steering wheel rotation amount") to adjust the behavior "direction of travel", and correspondingly outputs the actual state parameter y2 ("actual direction of travel").
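
The mapping from (u1, u2) to (y1, y2) can be illustrated with a minimal sketch. The disclosure does not specify the behavior model, so the first-order speed lag, the steering gain, and every numeric constant below are assumptions chosen only for illustration; `VehicleState` and `simulate_step` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    speed: float    # y1: actual vehicle speed (m/s)
    heading: float  # y2: actual direction of travel (rad)


def simulate_step(state, u1, u2, dt=0.1, max_speed=30.0, lag=2.0, steer_gain=0.5):
    """Advance a hypothetical behavior model by one decision period.

    u1 is the throttle amount in [0, 1]; u2 is the steering wheel rotation.
    The first-order lag and the gains are assumptions, not from the source.
    """
    target_speed = max_speed * u1
    # Speed relaxes toward the throttle-dependent target speed.
    speed = state.speed + dt * (target_speed - state.speed) / lag
    # Heading integrates the steering input.
    heading = state.heading + dt * steer_gain * u2
    return VehicleState(speed=speed, heading=heading)


state = VehicleState(speed=10.0, heading=0.0)
state = simulate_step(state, u1=0.5, u2=0.2)
print(state)
```

Each call corresponds to one iteration: the control decision parameters go in, and the simulated "actual" state parameters come out.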

The observation unit 200 is connected to the model simulation unit 100. The observation unit 200 can receive the actual state parameters y1, y2 from the model simulation unit 100 and generate the estimated state parameters x1, x2 according to the control decision parameters u1, u2. The estimated state parameter x1 represents the "estimated vehicle speed" of the self-driving car, and the estimated state parameter x2 represents its "estimated direction of travel". More specifically, the observation unit 200 receives the actual state parameters y1(k), y2(k) of the current state (also called the current iteration, that is, the k-th iteration) and the control decision parameters u1(k-1), u2(k-1) of the previous state (also called the previous iteration, that is, the (k-1)-th iteration). From the actual state parameters y1(k), y2(k) of the current iteration and the control decision parameters u1(k-1), u2(k-1) of the previous iteration, the observation unit 200 generates the estimated state parameters x1(k), x2(k) of the current iteration.
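
One common way to combine y(k) with u(k-1) is a predict-then-correct observer. The source does not name the estimator it uses, so the fixed-gain scheme below is a stand-in, and the model constants and the function name `observe` are assumptions for illustration.

```python
def observe(x_prev, u_prev, y_meas, dt=0.1, max_speed=30.0, lag=2.0,
            steer_gain=0.5, gain=0.6):
    """Return the estimated state (x1, x2) for iteration k.

    x_prev: (x1, x2) estimate from iteration k-1
    u_prev: (u1, u2) control decision parameters from iteration k-1
    y_meas: (y1, y2) actual state parameters at iteration k
    """
    # Predict with a hypothetical first-order model (an assumption).
    x1_pred = x_prev[0] + dt * (max_speed * u_prev[0] - x_prev[0]) / lag
    x2_pred = x_prev[1] + dt * steer_gain * u_prev[1]
    # Correct the prediction toward the measurement with a fixed gain.
    x1 = x1_pred + gain * (y_meas[0] - x1_pred)
    x2 = x2_pred + gain * (y_meas[1] - x2_pred)
    return x1, x2


x_k = observe(x_prev=(10.0, 0.0), u_prev=(0.5, 0.2), y_meas=(10.5, 0.02))
print(x_k)
```

A gain of 1 would trust the measurement completely and a gain of 0 would trust the model completely; a Kalman filter would choose this gain from noise statistics instead of fixing it.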

In addition, the optimal model predictive control unit 300 is connected to the observation unit 200 and can receive the estimated state parameters x1(k), x2(k) from it. The optimal model predictive control unit 300 can also receive a target parameter set S, which includes a plurality of target parameters s1, s2, ..., sN and can be written as S = {s1, s2, ..., sN}. Referring also to FIG. 2, which is a schematic diagram of an embodiment in which the model predictive control system 1000 is applied to a self-driving car, the target parameter set S represents the target behavior that the user expects the target object 700 (the self-driving car) to achieve: the target movement trajectory track1 shown in FIG. 2 is the target parameter set S, and the coordinates of each point on track1 are the individual target parameters s1, s2, ..., sN. According to the target parameter set S and the estimated state parameters x1(k), x2(k), the optimal model predictive control unit 300 can perform model predictive control to compute the corresponding control decision parameters u1(k), u2(k). In other words, so that the behavior of the self-driving car satisfies the target parameter set S (that is, so that the actual movement trajectory track2 follows the target movement trajectory track1), the optimal model predictive control unit 300 computes, from the estimated state parameters x1(k), x2(k) (the "estimated vehicle speed" and "estimated direction of travel"), the corresponding control decision parameters u1(k), u2(k) (the "throttle amount" and "steering wheel rotation amount"), which control the "throttle" and "steering wheel" respectively and thereby adjust the "vehicle speed" and "direction of travel". The adjusted "vehicle speed" and "direction of travel" are in turn reflected, through the simulation of the model simulation unit 100, in the actual state parameters y1, y2 (the "actual vehicle speed" and "actual direction of travel").
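
To make the optimization step concrete, the sketch below searches every control sequence over a short horizon and keeps the one whose predicted path stays closest to the target points. Exhaustive grid search is used here only because it is easy to read; the source does not say which optimizer unit 300 uses, and the kinematic model, grids, and function names are assumptions.

```python
import itertools
import math


def rollout(state, controls, dt=0.1):
    """Simulate a hypothetical kinematic model; return the visited positions."""
    px, py, v, th = state
    path = []
    for u1, u2 in controls:
        v += dt * (30.0 * u1 - v) / 2.0   # throttle drives speed (assumed lag)
        th += dt * 0.5 * u2               # steering changes heading (assumed gain)
        px += dt * v * math.cos(th)
        py += dt * v * math.sin(th)
        path.append((px, py))
    return path


def optimal_mpc(state, targets, u1_grid, u2_grid):
    """Exhaustively search all control sequences over the horizon len(targets)."""
    best_cost, best_seq = math.inf, None
    step_choices = list(itertools.product(u1_grid, u2_grid))
    for seq in itertools.product(step_choices, repeat=len(targets)):
        path = rollout(state, seq)
        # Squared distance between each predicted point and its target point.
        cost = sum((px - sx) ** 2 + (py - sy) ** 2
                   for (px, py), (sx, sy) in zip(path, targets))
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq[0], best_cost  # apply only the first move (receding horizon)


state = (0.0, 0.0, 10.0, 0.0)                      # px, py, speed, heading
targets = [(1.0, 0.0), (2.0, 0.05), (3.0, 0.15)]   # points s1..s3 on track1
u_k, cost = optimal_mpc(state, targets,
                        u1_grid=[0.0, 0.25, 0.5, 0.75, 1.0],
                        u2_grid=[-1.0, -0.5, 0.0, 0.5, 1.0])
print(u_k, cost)
```

With 5 candidate values for each of 2 parameters and a horizon of 3, the search already evaluates 25^3 = 15,625 sequences, and the count grows exponentially with both the number of control decision parameters and the horizon length.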

In this embodiment, the optimal model predictive control unit 300 performs model predictive control with an optimization algorithm to compute the optimal solution for the control decision parameters u1, u2 that satisfies the target parameter set S. However, if more control objects of the self-driving car are considered (for example, the gear selector and the engine temperature), the optimal model predictive control unit 300 must optimize over a larger number of control decision parameters u1, u2, u3, ..., uM. This consumes so much computation time that the optimal solution for u1, u2, u3, ..., uM cannot be obtained within a short decision period (for example, the decision period for driving a self-driving car can be as short as about 0.1 second). Real-time control therefore cannot be achieved, which may degrade the performance of the self-driving car or cause safety incidents.

To address this, a proxy model predictive control unit 400 established in advance can be used in place of the optimal model predictive control unit 300. The proxy model predictive control unit 400 performs a model predictive control operation with lower computational complexity (and therefore a shorter decision period), replacing the operation with higher computational complexity (and therefore a longer decision period) performed by the optimal model predictive control unit 300. Although the proxy model predictive control unit 400 does not perform an optimization, it can still obtain a result close to the optimal solution and can therefore still adjust the behavior of the controlled target effectively. In this embodiment, the model of the proxy model predictive control unit 400 can be established in advance in an off-line manner (that is, offline modeling), using machine learning methods such as imitation learning. Referring to FIG. 3, which is a schematic diagram of the proxy model predictive control unit 400 performing a model predictive control operation, the proxy model predictive control unit 400 can perform model predictive control based on a stochastic process, for example a Gaussian process (GP). Taking again the two control decision parameters u1, u2 of the self-driving car (the "throttle amount" and "steering wheel rotation amount") as an example, the proxy model predictive control unit 400 performs a Gaussian process computation on the estimated state parameters x1, x2 and the target parameter set S to obtain the mean values u1_mean, u2_mean and the variances u1_var, u2_var of the control decision parameters u1, u2. In one example, the mean values u1_mean, u2_mean are used as the final output control decision parameters u1, u2 (that is, u1 = u1_mean and u2 = u2_mean). The variances u1_var, u2_var are used as a model reliability index rindex for deciding whether the proxy model predictive control unit 400 needs to be updated.
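
A Gaussian process regressor returns both a mean and a variance for each prediction, which is what makes the variance usable as a reliability signal. The sketch below is a minimal pure-Python GP with a squared-exponential kernel and a zero-mean prior; the kernel choice, noise level, training pairs, and function names are assumptions, since the source only states that a Gaussian process is used. For brevity the target parameter set S is left out of the input.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two state vectors."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * length ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z


def gp_predict(X, y, x_new, noise=1e-4):
    """Return (mean, variance) of the GP posterior at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k_star = [rbf(a, x_new) for a in X]
    mean = sum(k * a for k, a in zip(k_star, solve(K, y)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)


# Hypothetical training pairs: estimated state (x1, x2) -> throttle u1 chosen
# offline by the optimal controller (imitation-learning data).
X = [(8.0, 0.0), (10.0, 0.0), (12.0, 0.1), (14.0, 0.1)]
u1_targets = [0.8, 0.6, 0.4, 0.2]

u1_mean, u1_var = gp_predict(X, u1_targets, (10.0, 0.0))    # state seen in training
far_mean, far_var = gp_predict(X, u1_targets, (30.0, 1.0))  # state far from the data
print(u1_mean, u1_var, far_mean, far_var)
```

Near the training data the variance is close to the noise level, while far from it the variance returns to the prior value and the mean falls back to the prior mean, so the prediction there should not be trusted.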

The update unit 500 can set a threshold t1 for the model reliability index rindex and compare rindex with t1. When rindex is greater than t1 (that is, when the variances u1_var, u2_var of the control decision parameters u1, u2 are greater than t1), the reliability of the proxy model predictive control unit 400 currently in use is low and an adaptive update is required. In one example, operating data (for example, operating data associated with the "vehicle speed" and "direction of travel" of the self-driving car) can be collected on-line and used to adaptively update the proxy model predictive control unit 400.
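
The threshold comparison can be sketched in a few lines. How several variances are combined into a single rindex is not stated in the source; taking the largest one is an assumption, as are the function name and the example numbers.

```python
def needs_update(variances, t1):
    """Return True when the proxy controller should be adaptively updated.

    variances: predictive variances (u1_var, u2_var, ...) from the proxy.
    Using the largest variance as rindex is an assumption; the source only
    says the variances serve as the reliability index.
    """
    rindex = max(variances)
    return rindex > t1


print(needs_update((0.002, 0.004), t1=0.01))  # False
print(needs_update((0.002, 0.050), t1=0.01))  # True
```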

Thus, the model of the proxy model predictive control unit 400 can be established in advance off-line (offline modeling), and the proxy model predictive control unit 400 can be adaptively updated on-line (online updating). While the proxy model predictive control unit 400 is being adaptively updated, it cannot compute the control decision parameters u1, u2 in real time; in that case the system falls back to the optimal model predictive control unit 300, which performs the optimal model predictive control (that is, model predictive control with higher computational complexity and therefore a longer decision period) to compute the control decision parameters u1, u2.

FIG. 4 is a schematic diagram of an embodiment in which the model predictive control system 1000 of the present disclosure is applied to a factory production line 2000. As shown in FIG. 4, the target object to be controlled is a "distillation column" in the factory production line 2000, and the control objects of the model predictive control system 1000 include the control variables of the PID controller set points throughout the distillation column system, for example the "feed amount" 2100, the "steam amount" 2200, and the "reboiler temperature or pressure" 2300. The control decision parameters u1, u2, u3 of the model predictive control system 1000 correspond to the PID controller set points of these control objects: u1 is the PID controller set point for the "feed amount" 2100, u2 is the PID controller set point for the "steam amount" 2200, and u3 is the PID controller set point for the "reboiler temperature or pressure" 2300.
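
The correspondence between u1, u2, u3 and the three set points can be written as a small lookup. The tag names and the numeric values below are placeholders invented for illustration; the source gives only the reference numerals 2100, 2200, and 2300.

```python
# Hypothetical tag names; the source only gives reference numerals 2100-2300.
SETPOINT_TAGS = {
    "u1": "feed_amount_2100",
    "u2": "steam_amount_2200",
    "u3": "reboiler_temp_or_pressure_2300",
}


def to_setpoints(u):
    """Map control decision parameters (u1, u2, u3) to PID set-point tags."""
    if len(u) != len(SETPOINT_TAGS):
        raise ValueError("expected one value per PID set point")
    return dict(zip(SETPOINT_TAGS.values(), u))


# Placeholder values with no physical meaning.
setpoints = to_setpoints((120.0, 45.0, 98.5))
print(setpoints)
```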

The target parameter s1 of the target object "distillation column" is, for example, the "distillate output per unit time". To achieve the output per unit time desired by the user, the proxy model predictive control unit 400 of the model predictive control system 1000 performs approximate model predictive control to compute control decision parameters u1, u2, u3 close to the optimal solution, and these are applied to the corresponding PID controller set points of the distillation column system in the actual factory production line 2000. The decision period of the factory production line 2000 (taking a chemical plant with a distillation column system as an example) is about 15 to 30 seconds; because the proxy model predictive control unit 400 performs a faster, approximate model predictive control, control decision parameters u1, u2, u3 close to the optimal solution can be computed within that 15 to 30 second decision period.

In addition, in the factory production line 2000, the model predictive control system 1000 of this embodiment can optionally work together with a control loop performance assessment (CLPA) system, a plant-wide disturbance source tracing (PDT) system, a controller parameter tuning (PID tuner) system, or a system identification (modeling) system to control and adjust the variables in the factory production line 2000.

FIG. 5 is a flowchart of a model predictive control method according to an embodiment of the present disclosure. Referring to FIG. 5, the model predictive control method can be implemented with the model predictive control system 1000 of FIG. 1 and the proxy model predictive control unit 400 of FIG. 3. First, in step S110, the model simulation unit 100 receives a plurality of control decision parameters, for example the two control decision parameters of the self-driving car, u1 ("throttle amount") and u2 ("steering wheel rotation amount"). The model simulation unit 100 then simulates, according to the control decision parameters u1, u2, the actual state parameter y1 ("actual vehicle speed") and the actual state parameter y2 ("actual direction of travel").

Next, in step S120, the model simulation unit 100 sends the actual state parameters y1(k), y2(k) of the current state to the observation unit 200, and the observation unit 200 also receives the control decision parameters u1(k-1), u2(k-1) of the previous state. The observation unit 200 then generates, from the actual state parameters y1(k), y2(k) and the control decision parameters u1(k-1), u2(k-1), the estimated state parameter x1(k) ("estimated vehicle speed") and the estimated state parameter x2(k) ("estimated direction of travel").

Next, in step S130, it is determined whether the system is still within the decision period of the model predictive control system 1000; for a self-driving car, the decision period is about 0.1 second. If so, step S135 is performed, in which the proxy model predictive control unit 400 performs approximate model predictive control: it receives the target parameter set S and the estimated state parameters x1(k), x2(k), generates from them the mean values u1_mean, u2_mean and the variances u1_var, u2_var of the control decision parameters u1, u2, and uses the variances u1_var, u2_var as the model reliability index rindex. The resulting control decision parameters u1, u2 can then be supplied to the model simulation unit 100 in step S110 for simulation.

Next, in step S140, the threshold t1 of the model reliability index rindex is set, and it is determined whether rindex is greater than t1. If rindex is less than t1 (that is, the variances u1_var, u2_var are less than t1), step S150 is performed, in which the mean values u1_mean, u2_mean are used as the final output control decision parameters u1, u2 (that is, u1 = u1_mean and u2 = u2_mean).

If rindex is greater than t1 (that is, the variances u1_var, u2_var are greater than t1), the proxy model predictive control unit 400 must be updated, and step S160 is performed to prepare for the update. Step S170 is then performed, in which the optimal model predictive control unit 300 computes the control decision parameters u1, u2. Step S180 follows, in which the proxy model predictive control unit 400 is updated; for example, an on-line adaptive update can be used, in which operating data associated with the behavior model of the target object are collected on-line and used to update the proxy model predictive control unit 400. After the update, step S150 can be performed, with the updated proxy model predictive control unit 400 computing the control decision parameters u1, u2.
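
Steps S135 through S180 can be summarized as one decision function. In this sketch the three callables are hypothetical stand-ins for units 400, 300, and 500, the largest variance is assumed to serve as rindex, and the case rindex equal to t1 (which the source leaves open) is treated as reliable.

```python
def control_iteration(x_est, targets, proxy, optimal_mpc, update_proxy, t1):
    """One pass through steps S135-S180 for the estimated state x_est.

    proxy(x_est, targets)       -> (means, variances)   # unit 400
    optimal_mpc(x_est, targets) -> controls             # unit 300
    update_proxy(x_est, controls)                       # unit 500
    """
    means, variances = proxy(x_est, targets)             # S135
    if max(variances) <= t1:                             # S140
        return means, "proxy"                            # S150
    controls = optimal_mpc(x_est, targets)               # S160-S170: fall back
    update_proxy(x_est, controls)                        # S180: adapt on-line
    return controls, "optimal"


# Toy stand-ins: the "proxy" is a lookup table that is only confident about
# states it has already been updated with.
memory = {}


def proxy(x_est, targets):
    if x_est in memory:
        return memory[x_est], (0.001, 0.001)
    return (0.0, 0.0), (1.0, 1.0)      # unknown state: high variance


def optimal_mpc(x_est, targets):
    return (0.5, 0.1)                  # placeholder for the slow optimizer


def update_proxy(x_est, controls):
    memory[x_est] = controls


first = control_iteration((10.0, 0.0), [], proxy, optimal_mpc, update_proxy, t1=0.01)
second = control_iteration((10.0, 0.0), [], proxy, optimal_mpc, update_proxy, t1=0.01)
print(first, second)
```

The first call falls back to the optimizer and updates the proxy; the second call, for the same state, is answered by the updated proxy.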

In summary, in the model predictive control system 1000 of the present disclosure, implemented together with the model predictive control method, the proxy model predictive control unit 400 is established in advance by offline modeling using imitation learning or machine learning. The proxy model predictive control unit 400 performs an approximate model predictive control operation in place of the optimized model predictive control operation of the optimized model predictive control unit 300, which has higher computational complexity (and therefore a longer decision cycle). The proxy model predictive control unit 400 can therefore greatly reduce computation time and can calculate control decision parameters that approximate the optimal solution within a short decision cycle (in particular, the short decision cycle of about 0.1 seconds for self-driving vehicles). In addition, the proxy model predictive control unit 400 performs the model predictive control operation according to a Gaussian process model to obtain the mean values and variance values of the control decision parameters, and uses the variance values of the control decision parameters as the model reliability index. If the model reliability index is greater than the threshold, an online adaptive update is performed on the proxy model predictive control unit 400.
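How a Gaussian process model yields both a mean value and a variance value for one control decision parameter can be sketched with the standard zero-mean Gaussian process regression formulas, mean = k*ᵀK⁻¹y and variance = k(x*, x*) − k*ᵀK⁻¹k*. The squared-exponential kernel, the noise term, the function names, and the training data below are all assumptions chosen for illustration; the disclosure does not specify them. The sketch uses only the Python standard library.

```python
import math
from typing import List, Sequence, Tuple


def rbf(a: Sequence[float], b: Sequence[float], length: float = 1.0) -> float:
    """Squared-exponential kernel with unit prior variance (assumed kernel)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq / length ** 2)


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def gp_predict(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    x_new: Sequence[float],
    noise: float = 1e-6,
) -> Tuple[float, float]:
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
        for i, a in enumerate(X)
    ]
    k_star = [rbf(a, x_new) for a in X]
    alpha = solve(K, list(y))   # K^-1 y
    v = solve(K, k_star)        # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)  # clamp tiny negative rounding errors


# Hypothetical offline data: estimated states and the control values that an
# optimized MPC produced for them (imitation-learning style).
X_train = [(0.0,), (1.0,), (2.0,)]
u_train = [0.0, 0.5, 1.0]
mean_near, var_near = gp_predict(X_train, u_train, (1.0,))
mean_far, var_far = gp_predict(X_train, u_train, (10.0,))
```

Near a training state the variance is close to zero, while far from the data it returns toward the prior variance of 1, which is the property that makes the variance usable as a reliability signal.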

Although the present invention has been disclosed in detail above by way of preferred embodiments and examples, it should be understood that these examples are intended to be illustrative rather than limiting. It is contemplated that those of ordinary skill in the art may readily conceive of various modifications and combinations, and that such modifications and combinations fall within the spirit of the present invention and the scope of the appended claims.

1000: model predictive control system

100: model simulation unit

200: monitoring unit

400: proxy model predictive control unit

500: updating unit

700: target object

u1, u2: control decision parameters

y1, y2: actual state parameters

x1, x2: estimated state parameters

S: set of target parameters

s1~sN: target parameters

Claims

A model predictive control system, comprising: a model simulation unit configured to simulate a dynamic behavior of a target object according to a behavior model of the target object and a plurality of control decision parameters of a previous state, and to correspondingly generate a plurality of actual state parameters; a monitoring unit configured to monitor the control decision parameters of the previous state and the actual state parameters, and to generate a plurality of estimated state parameters according to the control decision parameters of the previous state and the actual state parameters; and a proxy model predictive control unit configured to perform an approximate model predictive control operation according to the estimated state parameters and a plurality of target parameters to obtain a plurality of control decision parameters of a current state, and to send the control decision parameters of the current state back to the model simulation unit; wherein the proxy model predictive control unit performs the approximate model predictive control operation in place of a model predictive control operation with higher computational complexity.

The model predictive control system of claim 1, wherein the target object has a plurality of control objects, and the control objects are simulated and controlled according to the control decision parameters so that the actual state parameters of the target object satisfy the target parameters.

The model predictive control system of claim 1, wherein the proxy model predictive control unit performs a stochastic process operation to carry out the approximate model predictive control operation.

The model predictive control system of claim 3, wherein the stochastic process operation performed by the proxy model predictive control unit is a Gaussian process operation, and the proxy model predictive control unit performs the Gaussian process operation to obtain a mean value and a variance value of each of the control decision parameters.

The model predictive control system of claim 4, wherein the mean values are used as the finally output control decision parameters, and the variance values are used as a model reliability index.

The model predictive control system of claim 5, further comprising: an updating unit configured to determine whether the model reliability index is greater than a threshold, wherein the proxy model predictive control unit is updated if the model reliability index is greater than the threshold.

The model predictive control system of claim 6, further comprising: an optimized model predictive control unit, wherein, when the proxy model predictive control unit is updated, the optimized model predictive control unit performs the model predictive control operation with higher computational complexity to obtain control decision parameters that satisfy an optimal solution.

The model predictive control system of claim 6, wherein the proxy model predictive control unit is updated in an online adaptive manner.

The model predictive control system of claim 1, wherein the proxy model predictive control unit pre-establishes the behavior model of the target object in an offline manner.

The model predictive control system of claim 9, wherein the proxy model predictive control unit pre-establishes the behavior model by imitation learning or machine learning.

A model predictive control method, comprising: establishing a behavior model of a target object; simulating a dynamic behavior of the target object according to the behavior model of the target object and a plurality of control decision parameters of a previous state; correspondingly generating a plurality of actual state parameters; monitoring the control decision parameters of the previous state and the actual state parameters; generating a plurality of estimated state parameters according to the control decision parameters of the previous state and the actual state parameters; providing a proxy model predictive control unit; performing, by the proxy model predictive control unit, an approximate model predictive control operation according to the estimated state parameters and a plurality of target parameters, in place of a model predictive control operation with higher computational complexity; and sending the control decision parameters of the current state back to a model simulation unit.

The model predictive control method of claim 11, wherein the target object has a plurality of control objects, and the model predictive control method further comprises: simulating and controlling the control objects according to the control decision parameters so that the actual state parameters satisfy the target parameters.

The model predictive control method of claim 11, wherein the proxy model predictive control unit performs a stochastic process operation to carry out the approximate model predictive control operation.

The model predictive control method of claim 13, wherein the stochastic process operation performed by the proxy model predictive control unit is a Gaussian process operation, and the model predictive control method further comprises: performing the Gaussian process operation by the proxy model predictive control unit to obtain a mean value and a variance value of each of the control decision parameters.

The model predictive control method of claim 14, further comprising: using the mean values as the finally output control decision parameters; and using the variance values as a model reliability index.

The model predictive control method of claim 15, further comprising: determining whether the model reliability index is greater than a threshold; and updating the proxy model predictive control unit if the model reliability index is greater than the threshold.

The model predictive control method of claim 16, further comprising: providing an optimized model predictive control unit; and when the proxy model predictive control unit is updated, performing, by the optimized model predictive control unit, the model predictive control operation with higher computational complexity to obtain control decision parameters that satisfy an optimal solution.

The model predictive control method of claim 16, wherein the proxy model predictive control unit is updated in an online adaptive manner.

The model predictive control method of claim 11, wherein the behavior model of the target object is pre-established by the proxy model predictive control unit in an offline manner.

The model predictive control method of claim 19, wherein the behavior model is pre-established by the proxy model predictive control unit by imitation learning or machine learning.