TW202338651A

TW202338651A - Method for optimizing resource allocation based on prediction with reinforcement learning

Info

Publication number: TW202338651A
Application number: TW111108098A
Authority: TW
Inventors: 陳文賢; 銘杰許; 弘毅曾
Original assignee: 先智雲端數據股份有限公司
Priority date: 2022-03-24
Filing date: 2022-03-24
Publication date: 2023-10-01
Also published as: TWI805257B

Abstract

A method for optimizing resource allocation based on prediction with reinforcement learning is disclosed, comprising the steps of: (a) providing, to the processor, a prediction of the number of units of the resource required by a workload at more than N timepoints after a 0-th timepoint; (b) calculating at least one 0-th possible operation cost (POC_0) based on at least one possible provisioned number (PPN) at the 1st timepoint; (c) repeating, by the processor, the following sub-steps for the i-th timepoint, with i from 1 to N: (c1) calculating at least one i-th possible operation cost (POC_i); (c2) finding the smallest and the second smallest POC_i; and (c3) setting the PPN_i used to calculate the smallest POC_i as an i-th assigned number; and (d) provisioning, by the processor, 1 unit of the resource at the 0-th timepoint and the i-th assigned number of units of the resource at the i-th timepoint for the workload.

Description

Methods to optimize resource allocation based on reinforcement learning predictions

The present invention relates to a method for optimizing allocation. More specifically, it relates to a method for optimizing resource allocation based on prediction with reinforcement learning.

In a computer cluster, hardware resources are dynamically allocated to workloads. Because a workload's demand for resources such as RAM modules or CPUs changes over time, many forecasting methods exist to give rough estimates of future demand. Although the construction cost of a computer cluster is generally fixed, the operation cost, which includes lifetime amortization and power consumption, increases as the quantity of hardware resources changes. Making large changes to a workload's hardware resources according to any forecast may therefore be uneconomical in terms of total cost. Without a forecast, the quantity of hardware resources can only be adjusted reactively, which leads to long response delays.

Therefore, to lessen the impact of these problems, workloads need to be predicted more accurately. Reinforcement learning (RL) methods introduced in the past use a learning process to explore the possible actions at any future time point and make optimized decisions that meet the goal of a task or game. When an RL method is applied to workload prediction, it runs into the all-possible-paths problem. That is, if a workload may choose among n resource quantities at a time and m future time points must be predicted, the RL method has to consider resource deployments along n^m paths over the long-term learning journey. Clearly, the computational complexity of the decision process needed to compute the optimized result is exponential. This computational cost is another headache for administrators, because it demands more software development manpower.
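To make the growth of n^m concrete, the following minimal Python sketch counts the paths an exhaustive search would have to visit. The function name `count_paths` is hypothetical, and the sample values n = 6 and m = 9 are illustrative choices rather than figures stated in this paragraph.

```python
def count_paths(n: int, m: int) -> int:
    """Paths to evaluate when one of n quantities is chosen at each of m time points."""
    return n ** m


# Illustrative values only: 6 choices per time point, 9 future time points.
print(count_paths(6, 9))  # 10077696
```

Even at this small scale an exhaustive search visits about ten million paths, which is why the method described later prunes candidates at every time point.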

From the standpoint of balance, the total cost of a workload can be minimized if the quantity of hardware resources is adjusted as little as possible while the computational cost of the RL method is kept bounded. However, the prior art provides no method that offers this capability for computer clusters.

This paragraph extracts and compiles certain features of the present invention; other features are disclosed in the subsequent paragraphs. It is intended to cover various modifications and similar arrangements included within the spirit and scope of the appended claims.

According to one aspect of the present invention, a method for optimizing resource allocation based on prediction with reinforcement learning is disclosed, in which a processor determines the number of units of a resource in a computer cluster to be utilized. The method comprises the steps of: a) providing, to the processor, a prediction of the number of units of the resource required by a workload at more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers; b) calculating, by the processor, at least one 0th possible operation cost (POC_0) based on at least one possible provisioned number (PPN) at the 1st time point (PPN_1) in the range U_1 to M, wherein POC_0 = K + RF × |PPN_1 - K| + PPN_1, RF is a rebalancing factor between 0 and 1, and K is a real number; c) repeating, by the processor, the following sub-steps in sequence for the i-th time point, with i from 1 to N: c1) calculating at least one i-th possible operation cost (POC_i), wherein POC_i = POC_(i-1) + RF × |PPN_(i+1) - PPN_i| + PPN_(i+1), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+1) is a PPN at the (i+1)-th time point in the range U_(i+1) to M, PPN_i is a PPN at the i-th time point in the range U_i to M, and the PPN_i used to calculate POC_i and POC_(i-1) has the same value; c2) finding the smallest and the second smallest POC_i; and c3) if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_i as the i-th assigned number, and removing every POC_i not calculated from the i-th assigned number from the calculation for the next time point; and d) provisioning, by the processor, 1 unit of the resource at the 0th time point and the i-th assigned number of units of the resource at the i-th time point for the workload.

The method may further comprise the sub-step of: c4) if the smallest and the second smallest POC_i are not calculated from the same PPN_i, using each PPN_i individually to calculate POC_(i+1), setting the PPN_i used to calculate the smallest POC_(i+1) as the i-th assigned number, and removing every POC_i not calculated from the i-th assigned number from the calculation for the next time point, wherein POC_(i+1) is the possible operation cost calculated for the (i+1)-th time point.

Preferably, the resource is a memory module, CPU, I/O throughput, response time, requests per second, or latency.

The present invention also discloses another method for optimizing resource allocation based on prediction with reinforcement learning, in which a processor determines the number of units of a resource in a computer cluster to be utilized. The method comprises the steps of: a) providing, to the processor, a prediction of the number of units of the resource required by a workload at more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers; b) calculating, by the processor, at least one 0th possible operation cost (POC_0) based on at least one possible provisioned number (PPN) at the 1st time point (PPN_1) in the range U_1 to the smaller of U_1 + A and M, wherein POC_0 = K + RF × |PPN_1 - K| + PPN_1, RF is a rebalancing factor between 0 and 1, A is an integer, and K is a real number; c) repeating, by the processor, the following sub-steps in sequence for the i-th time point, with i from 1 to N: c1) calculating at least one i-th possible operation cost (POC_i), wherein POC_i = POC_(i-1) + RF × |PPN_(i+1) - PPN_i| + PPN_(i+1), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+1) is a PPN at the (i+1)-th time point in the range U_(i+1) to the smaller of U_(i+1) + A and M, PPN_i is a PPN at the i-th time point in the range U_i to the smaller of U_i + A and M, and the PPN_i used to calculate POC_i and POC_(i-1) has the same value; c2) finding the smallest and the second smallest POC_i; and c3) if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_i as the i-th assigned number, and removing every POC_i not calculated from the i-th assigned number from the calculation for the next time point; and d) provisioning, by the processor, 1 unit of the resource at the 0th time point and the i-th assigned number of units of the resource at the i-th time point for the workload.

Another method for optimizing resource allocation based on prediction with reinforcement learning, implemented by a processor determining the number of units of a resource in a computer cluster to be utilized, comprises the steps of: a) providing, to the processor, a prediction of the number of units of the resource required by a workload at more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers; b) calculating, by the processor, at least one 1st possible operation cost (POC_1) based on at least one possible provisioned number (PPN) at the 1st time point (PPN_1) in the range U_1 to M and at least one PPN at the 2nd time point (PPN_2) in the range U_2 to M, wherein POC_1 = K + RF × |PPN_1 - K| + PPN_1 + RF × |PPN_2 - PPN_1| + PPN_2, RF is a rebalancing factor between 0 and 1, and K is a real number; c) setting the PPN_1 used to calculate the smallest POC_1 as the 1st assigned number; d) repeating, by the processor, the following sub-steps in sequence for the time points, with i being the even numbers from 2 to 2 × [N/2]: d1) calculating at least one (i+1)-th possible operation cost POC_(i+1), wherein POC_(i+1) = POC_(i-1) + RF × |PPN_(i+1) - PPN_i| + PPN_(i+1) + W_i, W_i = RF × |PPN_(i+2) - PPN_(i+1)| + PPN_(i+2), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+2) is a PPN at the (i+2)-th time point in the range U_(i+2) to M, PPN_(i+1) is a PPN at the (i+1)-th time point in the range U_(i+1) to M, PPN_i is a PPN at the i-th time point in the range U_i to M, the PPN_i used to calculate POC_(i+1) and POC_(i-1) has the same value, and W_i is omitted from the calculation if (i+2) is greater than N; d2) finding the smallest and the second smallest POC_i; and d3) if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_(i+1) as the i-th assigned number and the PPN_(i+1) used to calculate the smallest POC_(i+1) as the (i+1)-th assigned number, and removing every POC_(i+1) not calculated from the i-th assigned number from the calculation for the next time point; and e) provisioning, by the processor, 1 unit of the resource at the 0th time point and the j-th assigned number of units of the resource at the j-th time point for the workload, where j ranges from 1 to N.
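Steps b) and c) of this pairwise method can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `poc_1` and `first_assigned_number` are hypothetical, and the sample values (RF = 0.6, K = 1, U_1 = 5, U_2 = 3, M = 6) are borrowed from the first embodiment described in this document, which may not be the values intended for this variant.

```python
def poc_1(ppn_1, ppn_2, rf=0.6, k=1.0):
    """POC_1 = K + RF*|PPN_1 - K| + PPN_1 + RF*|PPN_2 - PPN_1| + PPN_2."""
    return k + rf * abs(ppn_1 - k) + ppn_1 + rf * abs(ppn_2 - ppn_1) + ppn_2


def first_assigned_number(u_1, u_2, m, rf=0.6, k=1.0):
    """Step c): return the PPN_1 that yields the smallest POC_1."""
    # Every (PPN_1, PPN_2) pair with PPN_1 in U_1..M and PPN_2 in U_2..M.
    pairs = [(p1, p2) for p1 in range(u_1, m + 1) for p2 in range(u_2, m + 1)]
    best = min(pairs, key=lambda pair: poc_1(pair[0], pair[1], rf, k))
    return best[0]


# Assumed sample values: U_1 = 5, U_2 = 3, M = 6.
print(round(poc_1(5, 3), 1))           # 12.6
print(first_assigned_number(5, 3, 6))  # 5
```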

The present invention will now be described in more detail with reference to the following embodiments.

Please refer to Figure 1, which shows the hardware architecture to which the present invention applies. A computer cluster 10 (e.g., a data center) has many computing units 100, each with multiple CPUs 101 and memory modules 102 (such as SDRAM), which share data stored in a storage device 103 (such as a hard disk). The computing units 100 cooperate via an internal data network 110 to support the operation of workloads. The computer cluster 10 is further connected to clients 30 over the Internet 20 through a network communication interface 120. For example, the computing units 100 provide a streaming service to the clients 30 as a workload. The number of CPUs 101 and memory modules 102 provisioned for a workload determines how quickly the workload's requests are answered. CPUs and memory modules are called resources in the embodiments. Although dedicating more resources to a workload improves its performance, it also burdens the whole system: the lifetime of the resources is consumed, and other workloads cannot share the same resources. The present invention is therefore a method for optimizing resource allocation. The allocation is based on a prediction of resource demand but is modified by reinforcement learning. The method is carried out by a processor 101a (an ASIC or one of the CPUs 101) in one of the computing units 100. The processor 101a determines the number of units of the resource in the computer cluster 10 to be utilized, and utilization follows the result of the method. The prediction may be calculated and provided by another CPU 101b, or input from an external system outside the computer cluster 10.

Please refer to Figure 2, a flow chart of the method disclosed in the first embodiment of the present invention. The first step of the method is providing, to the processor, a prediction of the number of units of the resource required by a workload at more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers (S01). The prediction may come from any method, whether based on an algorithm, a formula, experience, or even artificial intelligence, as long as it gives the number of units of the resource the workload will require in the future. According to the present invention, not all prediction results are used in the calculation; only a limited number N is used. N is a number defined by the user of the present invention. In this embodiment, N is set to 9. The prediction may provide resource demand for more than N time points, and the additional data can be used in another round of resource allocation calculation. In this embodiment, M is 6. Given these values of M and N, the prediction results used in the next steps are shown in Figure 3. The solid dots represent the predicted numbers of units of the resource (e.g., CPUs) at different time points. For example, according to the prediction, 6 units of the resource will be required at the 3rd time point. Broadly speaking, any controllable operational outcome of the system that can improve workload performance is regarded as a resource, including but not limited to I/O throughput, response time, requests per second, and latency.

The second step of the method is calculating, by the processor, at least one 0th possible operation cost (POC_0) based on at least one possible provisioned number (PPN) at the 1st time point (PPN_1) in the range U_1 to M, wherein POC_0 = K + RF × |PPN_1 - K| + PPN_1, RF is a rebalancing factor between 0 and 1, and K is a real number (S02). In this step the processor 101a performs the initial calculation. For clarity, please refer to Figure 5, which lists the calculations for some of the time points. Each calculation is split into three matrices. The left matrix lists all PPNs at the previous time point. The middle matrix, to be compared with Figure 4, shows part of the calculation of the transition costs. The right matrix adds the POC to every transition cost in the same row and shows the result from which the assigned number for the current time point is found. The basic idea of the present invention is to determine a suitable number of resource units for a time point: taking into account the operation cost accumulated over all combinations of resource numbers provisioned for the workload at previous time points, it finds the smallest accumulated cost and uses the corresponding possible provisioned number as the assigned number of resource units for the workload. If necessary, the calculation for the next time point is also considered. The whole procedure extends the logic of reinforcement learning (dynamic programming) and uses similar notions such as a cost function, maximum reward (minimum cost), and iteration. However, to shorten the lengthy calculation, calculations that can reasonably be deemed unnecessary are omitted. PPN denotes all candidate numbers at a time point: PPN_i is any number from the predicted number U_i to the maximum number M the system can provide. For example, at the 2nd time point the predicted number of required units is 3 and at most 6 units are available, so the PPN_2 used in the calculations for the 1st and 2nd time points is 3, 4, 5, or 6. If U_i and M are equal, only one number is used. At any time point, utilizing one unit of the resource shortens the resource's lifetime and consumes power. To simplify and quantify the results, all subsequent calculations use the minimum cost, 1, as K; for ease of explanation, K is set to 1 in all embodiments. Changing the number of resource units between two consecutive time points is also costly, though less so than utilizing one more unit, so the change is scaled down by the rebalancing factor. RF varies with the kind of resource and is chosen according to the characteristics of the resource. In this embodiment, RF is 0.6.
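The candidate range described above (every integer from U_i to M) can be sketched in a few lines of Python. The helper name `ppn_candidates` is hypothetical; the sample inputs are the values quoted in the text (U_2 = 3 and M = 6), plus the degenerate case where U_i equals M.

```python
def ppn_candidates(u_i: int, m: int) -> list[int]:
    """All possible provisioned numbers at a time point: U_i, U_i + 1, ..., M."""
    return list(range(u_i, m + 1))


print(ppn_candidates(3, 6))  # [3, 4, 5, 6]
print(ppn_candidates(6, 6))  # [6]
```

The sketch assumes U_i never exceeds M; if it did, the list would be empty and the method would have no candidate to assign.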

The calculation for the 0th time point is an exception. Because there is no previous time point, an initial value must be specified as the accumulated cost. According to the present invention, that value is 1. In the formula above, |PPN_1 - 1| gives the change in the number of resource units between the 0th and 1st time points, and RF × |PPN_1 - 1| reduces |PPN_1 - 1| by 40% (RF = 0.6). The calculation for the 1st time point yields two POC_0 values: 8.4 and 10. Note that in other cases the number of POC_0 values may be greater than 2, or may be just 1.
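As a quick check of the two POC_0 values quoted above, the formula POC_0 = K + RF × |PPN_1 - K| + PPN_1 can be evaluated directly. This is a minimal sketch; the function name `poc_0` is hypothetical, and the defaults RF = 0.6 and K = 1 are the values of this embodiment.

```python
def poc_0(ppn_1, rf=0.6, k=1.0):
    """POC_0 = K + RF * |PPN_1 - K| + PPN_1."""
    return k + rf * abs(ppn_1 - k) + ppn_1


# PPN_1 candidates in this embodiment are 5 and 6 (U_1 = 5, M = 6).
for ppn_1 in (5, 6):
    print(ppn_1, round(poc_0(ppn_1), 1))
# 5 8.4
# 6 10.0
```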

The third step of the method is repeating, by the processor, the following sub-steps in sequence for the i-th time point, with i from 1 to N (S03). This step covers the calculations for all N time points. The sub-steps are detailed below.

The first sub-step is calculating at least one i-th possible operation cost (POC_i), wherein POC_i = POC_(i-1) + RF × |PPN_(i+1) - PPN_i| + PPN_(i+1), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+1) is a PPN at the (i+1)-th time point in the range U_(i+1) to M, PPN_i is a PPN at the i-th time point in the range U_i to M, and the PPN_i used to calculate POC_i and POC_(i-1) has the same value (S03-1). Take the calculation for the 1st time point as an example. In Figure 5, the two POC_0 values serve as POC_(i-1) in the formula above. Here, PPN_(i+1) is PPN_2, which comprises 3, 4, 5, and 6, and PPN_i is PPN_1, which comprises 5 and 6 as described above. To simplify the evaluation of RF × |PPN_(i+1) - PPN_i|, the lookup table in Figure 4 shows the calculations for U_i and U_(i+1) from 1 to 6, with M kept at 6 for both PPN_(i+1) and PPN_i and RF equal to 0.6. Because the right matrix only adds a POC to the transition costs in the same row, 8.4 is added to 4.2, 4.6, 5, and 6.6 but not to 4.8, 5.2, 5.6, and 6. The resulting row of POC_1 (POC_i) is therefore 12.6, 13, 13.4, and 15. Likewise, 10 is added to 4.8, 5.2, 5.6, and 6 but not to 4.2, 4.6, 5, and 6.6, and the other row of POC_1 is 14.8, 15.2, 15.6, and 16. The 5 in PPN_1 is used to calculate both 8.4 in POC_0 and 12.6, 13, 13.4, and 15 in POC_1.

The second sub-step is finding the smallest and the second smallest POC_i (S03-2). In Figure 5, the smallest and the second smallest POC_1 are 12.6 and 13, respectively. Both come from the same PPN_1 but from different PPN_2.
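The first two sub-steps for the 1st time point can be reproduced with a short Python sketch. It starts from the POC_0 values and PPN_2 candidates given in the text (RF = 0.6), builds every POC_1 with POC_i = POC_(i-1) + RF × |PPN_(i+1) - PPN_i| + PPN_(i+1), and then ranks them. The variable names and the dictionary layout keyed by (PPN_1, PPN_2) are illustrative choices, not part of the disclosure.

```python
RF = 0.6

poc_0 = {5: 8.4, 6: 10.0}  # POC_0 keyed by PPN_1, as given in the text
ppn_2 = [3, 4, 5, 6]       # PPN candidates at the 2nd time point

# Sub-step S03-1: POC_1 for every (PPN_1, PPN_2) pair.
poc_1 = {
    (p1, p2): round(cost + RF * abs(p2 - p1) + p2, 1)
    for p1, cost in poc_0.items()
    for p2 in ppn_2
}

# Sub-step S03-2: the smallest and second smallest POC_1.
ranked = sorted(poc_1.items(), key=lambda item: item[1])
(smallest_key, smallest), (second_key, second) = ranked[:2]
print(smallest_key, smallest)  # (5, 3) 12.6
print(second_key, second)      # (5, 4) 13.0
```

Both keys start with PPN_1 = 5, which is the condition the third sub-step tests.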

The third sub-step is, if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_i as the i-th assigned number, and removing every POC_i not calculated from the i-th assigned number from the calculation for the next time point (S03-3). As shown above, the smallest and the second smallest POC_1 come from the same PPN_1, so the condition is met. The PPN_1 used to calculate the smallest POC_1 (12.6) is 5 (connected to 12.6 by an arrow), and it is set as the 1st assigned number. The POC_1 values not calculated from the 1st assigned number 5 (namely 14.8, 15.2, 15.6, and 16) are removed from the calculation for the next time point. Only 12.6, 13, 13.4, and 15 are carried over to the 2nd time point. The same calculations can be seen in Figures 5 to 7.
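The assign-and-prune rule of the third sub-step can be sketched as below. The function name `assign_and_prune` and its return convention (None when the two smallest POC_i come from different PPN_i) are assumptions made for illustration; the input values are the POC_1 figures quoted in the text.

```python
def assign_and_prune(poc_i):
    """poc_i maps (PPN_i, PPN_(i+1)) -> POC_i.

    Returns (assigned PPN_i, surviving POC_i keyed by PPN_(i+1)),
    or None when the two smallest POC_i come from different PPN_i.
    """
    ranked = sorted(poc_i, key=poc_i.get)
    if len(ranked) > 1 and ranked[0][0] != ranked[1][0]:
        return None  # condition of sub-step S03-3 is not met
    assigned = ranked[0][0]
    # Keep only the POC_i calculated from the assigned number.
    kept = {nxt: cost for (cur, nxt), cost in poc_i.items() if cur == assigned}
    return assigned, kept


poc_1 = {
    (5, 3): 12.6, (5, 4): 13.0, (5, 5): 13.4, (5, 6): 15.0,
    (6, 3): 14.8, (6, 4): 15.2, (6, 5): 15.6, (6, 6): 16.0,
}
print(assign_and_prune(poc_1))
# (5, {3: 12.6, 4: 13.0, 5: 13.4, 6: 15.0})
```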

It should be noted that the calculations in steps S01 to S03 only serve the resource allocation to be applied when the 1st time point arrives; nothing is provisioned at this moment. The calculations are prepared before the 1st time point, or even before the 0th time point.

The last step of the method is provisioning, by the processor, 1 unit of the resource at the 0th time point and the i-th assigned number of units of the resource at the i-th time point for the workload (S04). This is the resource allocation step. According to the data calculated in the previous steps, the numbers of resource units for the 1st to 9th time points are 5, 5, 6, 5, 5, 5, 5, 5, and 6, respectively. This result differs from the prediction, which calls for 5, 2, 6, 1, 5, 6, 5, 2, and 9 units for the 1st to 9th time points, respectively. The provisioned numbers of resource units minimize the operation cost.
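The accumulated cost along a sequence of assigned numbers can be traced with a small Python sketch, assuming the same cost model as the formulas above (RF = 0.6, K = 1, starting from 1 unit at the 0th time point). The function name `cumulative_costs` is hypothetical. Feeding in the first five assigned numbers (5, 5, 6, 5, 5) gives the running values POC_0 to POC_4.

```python
def cumulative_costs(assigned, rf=0.6, k=1.0):
    """Running cost K + sum(RF * |p_j - p_(j-1)| + p_j), starting from K units at time point 0."""
    costs, total, prev = [], k, k
    for units in assigned:
        total += rf * abs(units - prev) + units
        costs.append(round(total, 1))
        prev = units
    return costs


print(cumulative_costs([5, 5, 6, 5, 5]))
# [8.4, 13.4, 20.0, 25.6, 30.6]
```

The first value matches the POC_0 of 8.4 for PPN_1 = 5, and the last matches the smallest POC_4 of 30.6 cited in the discussion of Figure 6.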

Although the same calculations are performed for the 2nd to 9th time points, two exceptions arise under different conditions.

The first exception is shown in Figure 6. The condition of the third sub-step requires the smallest and the second smallest POC_i to be calculated from the same PPN_i. In the calculation for the 4th time point, however, the smallest and the second smallest POC_4 are calculated from different PPN_4: the smallest POC_4 (30.6) is calculated from 5, whereas the second smallest POC_4 (32) is calculated from 6. This makes it hard to know which PPN_4 leads to the lowest operation cost at the 5th time point. According to the present invention, the remedy is to use both PPN_4 values individually in the calculation for the 5th time point and to decide the 4th assigned number from the two results. A fourth sub-step is therefore needed: if the smallest and the second smallest POC_i are not calculated from the same PPN_i, using each PPN_i individually to calculate POC_(i+1), setting the PPN_i used to calculate the smallest POC_(i+1) as the i-th assigned number, and removing every POC_i not calculated from the i-th assigned number from the calculation for the next time point, wherein POC_(i+1) is the possible operation cost calculated for the (i+1)-th time point. In the example of Figure 6, the 5 in PPN_4 and its POC_4 values (30.6 and 32.2) are used in version 1 of the calculation for the 5th time point, and the 6 in PPN_4 and its POC_4 values (32.2 and 32) are used in version 2. In this case, the two versions show that 5 is the PPN_4 from which the smallest POC_5 is calculated (34.4 and 35.4). 5 is then set as the 4th assigned number. The rest is the same as the third sub-step and is not repeated here.

The second exception occurs in the calculation for the 9th time point. Because the method only accepts predictions for 9 time points, a prediction for the 10th time point is not used even if it is available. Therefore, PPN_10 is set to 0. The calculation is shown in Figure 7.
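Putting the sub-steps and both exceptions together, the whole procedure might be sketched in Python as follows. This is a rough illustration under several assumptions, not the patented implementation: the names `transition`, `candidates`, and `allocate` are hypothetical; the look-ahead of the fourth sub-step is applied to all remaining PPN_i rather than only the two that produced the smallest costs; and at the last time point the sketch simply keeps the PPN_N with the smallest POC_N, since no further look-ahead is possible once PPN_(N+1) is fixed at 0. It also assumes every U_i is at most M. The sample input uses U_1 = 5, U_2 = 3, and U_3 = 6 from the text but truncates the horizon to N = 3, so it is not a reproduction of Figures 5 to 7.

```python
def transition(prev, nxt, rf):
    """Cost of moving from prev units to nxt units: RF * |nxt - prev| + nxt."""
    return rf * abs(nxt - prev) + nxt


def candidates(i, u, m):
    """PPN candidates at time point i (1-based); beyond N the only candidate is 0."""
    return list(range(u[i - 1], m + 1)) if i <= len(u) else [0]


def allocate(u, m, rf=0.6, k=1.0):
    """Return the assigned numbers for time points 1..N given predictions u."""
    n = len(u)
    # POC_0 keyed by PPN_1.
    poc_prev = {p: round(k + transition(k, p, rf), 9) for p in candidates(1, u, m)}
    assigned = []
    for i in range(1, n + 1):
        nxt = candidates(i + 1, u, m)
        # c1) POC_i keyed by (PPN_i, PPN_(i+1)).
        poc = {(p, q): round(poc_prev[p] + transition(p, q, rf), 9)
               for p in poc_prev for q in nxt}
        # c2) rank all POC_i.
        ranked = sorted(poc, key=poc.get)
        if len(ranked) == 1 or i == n or ranked[0][0] == ranked[1][0]:
            # c3) or the final time point (assumption: keep the smallest).
            chosen = ranked[0][0]
        else:
            # c4) look one time point further and keep the PPN_i with the smallest POC_(i+1).
            nxt2 = candidates(i + 2, u, m)
            chosen = min((round(poc[(p, q)] + transition(q, r, rf), 9), p)
                         for (p, q) in poc for r in nxt2)[1]
        assigned.append(chosen)
        # Drop every POC_i not calculated from the assigned number.
        poc_prev = {q: cost for (p, q), cost in poc.items() if p == chosen}
    return assigned


print(allocate([5, 3, 6], m=6))  # [5, 5, 6]
```

In this run the 2nd time point triggers the look-ahead branch, because the two smallest POC_2 values (20.0 and 20.2) come from PPN_2 = 5 and PPN_2 = 4 respectively.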

From the viewpoint of conventional reinforcement learning, all entries of the right matrices in Figures 5 to 7 would have to be calculated in order not to miss any chance of finding the minimum cost. For example, the right matrix of the calculation for the 1st time point contains only 8 entries, whereas 36 calculations would have to be done to obtain 36 entries. For this time point, 28 calculations are saved; from the 0th to the 9th time point, the whole process saves 257 calculations. Real-world cases involve thousands of resource units and hundreds of time points, where millions of calculations can be saved. With this method, the exponential complexity is reduced. A near-optimal and acceptable result can be achieved within a limited time, which greatly increases usability and lowers the operation cost.
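The saving quoted for the 1st time point can be checked with a few lines of Python. The sketch reads the figure of 36 as one entry for every pair of quantities from 1 to M (6 × 6), which is an interpretation rather than something the text states outright; the helper name `candidate_pairs` is hypothetical.

```python
def candidate_pairs(u_cur, u_next, m):
    """(PPN_i, PPN_(i+1)) pairs actually evaluated when candidates start at U_i and U_(i+1)."""
    return [(p, q) for p in range(u_cur, m + 1) for q in range(u_next, m + 1)]


M = 6
exhaustive = M * M                       # assumed: all pairs of 1..M
pruned = len(candidate_pairs(5, 3, M))   # U_1 = 5, U_2 = 3
print(exhaustive, pruned, exhaustive - pruned)  # 36 8 28
```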

In the spirit of the present invention, the amount of calculation can be reduced further by an improved method, which is presented in the second embodiment.

Please refer to Figure 8, which is a flowchart of the improved method disclosed in the second embodiment of the present invention. To simplify the description, the resource demand prediction of the first embodiment is reused here, and the second embodiment is likewise applied to the architecture of Figure 1. The maximum number of resource units that can be provided, M, remains 6. The only difference between the method of the first embodiment and the improved method is the definition of PPN: the number of PPN values at each time point is reduced further. The PPN values comprise the number of units required at the i-th time point, U_i, and some larger quantities, which need not include M. The improved method is illustrated in Figures 10 and 11, which list the calculations for all time points.

The first step of the improved method is to provide the processor with a prediction of the number of resource units required by the workload in more than N time points after the 0th time point, where at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers (S11). This step is the same as step S01. As noted above, N and M are kept the same to illustrate the second embodiment.

The second step of the improved method is for the processor to calculate at least one 0th possible operation cost (POC₀) based on at least one possible provided quantity (PPN) at the 1st time point (PPN₁), ranging from U₁ to the smaller of U₁ + A and M, where POC₀ is obtained from POC₀ = K + RF × |PPN₁ − K| + PPN₁, RF is a rebalancing factor between 0 and 1, A is an integer, and K may be any real number; here K is set to 1 (S12). RF is again set to 0.6. The difference between step S02 and step S12 is that PPN₁ is bounded by its upper limit. To see the difference, please refer to Figure 9. It shows the same prediction results as Figure 3, but with different possible provided quantities. In this embodiment, A is 2. In the calculation for each time point, the lower limit is U_i and the upper limit is the smaller of U_i + A and M. For the calculation at the 0th time point, U₁ + A equals 7 and M is 6. Since at most 6 units of the resource are available, the upper limit of PPN₁ is 6; it cannot be 7. Because 1 unit is taken at the start to give the minimum cost, the calculated values of POC₀ at the 0th time point are again 8.4 and 10. Please return to Figure 7. If, for every i-th time point, the upper limits of PPN_i are connected by a dotted line and the lower limits of PPN_i by a solid line, a "tunnel" is formed. The specified quantities of resource units at all time points fall within the tunnel.
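A minimal sketch of step S12 follows, using the values stated for this embodiment (RF = 0.6, K = 1, A = 2, M = 6) and U₁ = 5, which is implied by U₁ + A = 7. The function names are illustrative and not part of the disclosure; results are rounded to one decimal to avoid floating-point noise.

```python
RF = 0.6   # rebalancing factor used in this embodiment
K = 1      # units provisioned at the 0th time point
A = 2      # head-room allowed above the predicted need
M = 6      # maximum units that can be provided

def candidates(u: int, a: int = A, m: int = M) -> list[int]:
    """PPN values from u up to min(u + a, m), inclusive."""
    return list(range(u, min(u + a, m) + 1))

def poc0(ppn1: int, k: float = K, rf: float = RF) -> float:
    """POC_0 = K + RF * |PPN_1 - K| + PPN_1, rounded to one decimal."""
    return round(k + rf * abs(ppn1 - k) + ppn1, 1)

U1 = 5  # implied by U_1 + A = 7 with A = 2
ppn1 = candidates(U1)
costs = {p: poc0(p) for p in ppn1}

print(ppn1)   # [5, 6]
print(costs)  # {5: 8.4, 6: 10.0}
```

The cap `min(u + a, m)` is what removes 7 from the candidates for PPN₁ in this example.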

The third step of the improved method is for the processor to repeat the following sub-steps in sequence for the i-th time point, with i from 1 to N (S13). This appears to be the same as step S03, but the sub-steps differ. They are described below.

The first sub-step is to calculate at least one i-th possible operation cost (POC_i), where POC_i is obtained from POC_i = POC_(i-1) + RF × |PPN_(i+1) − PPN_i| + PPN_(i+1), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+1) is the PPN at the (i+1)-th time point, ranging from U_(i+1) to the smaller of U_(i+1) + A and M, PPN_i is the PPN at the i-th time point, ranging from U_i to the smaller of U_i + A and M, and the PPN_i used to calculate POC_i and POC_(i-1) has the same value (S13-1). Clearly, the upper limits of PPN_(i+1) and PPN_i have changed. For the calculation at the 1st time point, PPN₂ comprises 3, 4, and 5. U₂ is 3 according to the prediction. Since the smaller of U₂ + 2 and 6 is U₂ + 2 (that is, 5), 6 is no longer used. Likewise, PPN₃ comprises only 6; PPN₄ comprises 1, 2, and 3; PPN₅ comprises 5 and 6; PPN₆ comprises 2, 3, and 4; PPN₇ comprises 5 and 6; PPN₈ comprises 2, 3, and 4; and PPN₉ comprises 6. Because PPN is defined differently, the calculated values of POC_i change accordingly. For example, in the calculation at the 1st time point, 8.4 is added to 4.2, 4.6, and 5, but not to 4.8, 5.2, and 5.6, so one row of POC₁ (POC_i) is 12.6, 13, and 13.4. Likewise, 10 is added to 4.8, 5.2, and 5.6, but not to 4.2, 4.6, and 5, so the other row of POC₁ is 14.8, 15.2, and 15.6. The value "5" of PPN₁ is used to calculate both "8.4" in POC₀ and 12.6, 13, and 13.4 in POC₁.
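The two rows of POC₁ quoted above can be reproduced with the sketch below. It takes the POC₀ values (8.4 and 10) and the PPN₂ candidates (3, 4, 5) from the text; the function and variable names are illustrative.

```python
RF = 0.6

def step_cost(prev_ppn: int, next_ppn: int, rf: float = RF) -> float:
    """RF * |PPN_(i+1) - PPN_i| + PPN_(i+1): the cost added when moving one time point."""
    return rf * abs(next_ppn - prev_ppn) + next_ppn

poc0 = {5: 8.4, 6: 10.0}   # POC_0 keyed by PPN_1
ppn2 = [3, 4, 5]           # U_2 = 3, capped at min(U_2 + 2, 6) = 5

# POC_1 = POC_0 + RF * |PPN_2 - PPN_1| + PPN_2, one row per PPN_1
poc1 = {
    p1: [round(c0 + step_cost(p1, p2), 1) for p2 in ppn2]
    for p1, c0 in poc0.items()
}

print(poc1)  # {5: [12.6, 13.0, 13.4], 6: [14.8, 15.2, 15.6]}
```

Each POC₀ value is combined only with the increments computed from its own PPN₁, which is why 8.4 is never added to 4.8, 5.2, or 5.6.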

The second sub-step is to find the smallest and the second smallest POC_i (S13-2). This step is the same as sub-step S03-2. In Figure 10, the smallest and second smallest POC₁ are 12.6 and 13, respectively. They come from the same PPN₁ but from different PPN₂.

The third sub-step is, if the smallest and the second smallest POC_i are calculated from the same PPN_i, to set the PPN_i used to calculate the smallest POC_i as the i-th specified quantity and to exclude from the calculation for the next time point the POC_i that were not calculated from the i-th specified quantity (S13-3). This step is the same as sub-step S03-3. Because the second step and the first sub-step of the two methods differ, the subsequent results also differ. For example, with the same prediction, the 4th specified quantity is 3 in the second embodiment but 5 in the first embodiment.
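Sub-steps S13-2 and S13-3 can be sketched as follows, using the POC₁ values given for the 1st time point. The data layout (a table keyed by PPN_i, then by PPN_(i+1)) and the function names are assumptions made for illustration; the exception case is only detected here, not resolved.

```python
def two_smallest(poc: dict[int, dict[int, float]]) -> list[tuple[float, int, int]]:
    """Return the two smallest (cost, PPN_i, PPN_(i+1)) entries of a POC_i table."""
    flat = [(c, p_i, p_next) for p_i, row in poc.items() for p_next, c in row.items()]
    return sorted(flat)[:2]

def assign_if_same_source(poc: dict[int, dict[int, float]]):
    """Return (specified quantity, surviving POC_i row), or None when the two
    smallest costs come from different PPN_i (the exception case)."""
    (_, first_src, _), (_, second_src, _) = two_smallest(poc)
    if first_src != second_src:
        return None
    # Rows computed from other PPN_i are dropped from the next calculation.
    return first_src, poc[first_src]

# POC_1 keyed by PPN_1, then by PPN_2 (values from the text)
poc1 = {
    5: {3: 12.6, 4: 13.0, 5: 13.4},
    6: {3: 14.8, 4: 15.2, 5: 15.6},
}
result = assign_if_same_source(poc1)

print(two_smallest(poc1))  # [(12.6, 5, 3), (13.0, 5, 4)]
print(result)              # (5, {3: 12.6, 4: 13.0, 5: 13.4})
```

Both of the smallest costs come from PPN₁ = 5, so 5 becomes the 1st specified quantity and only its row is carried into the calculation for the 2nd time point.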

Likewise, the final step of the improved method is for the processor to provision 1 unit of the resource at the 0th time point and the i-th specified quantity of units of the resource at the i-th time point for the workload (S14). This is the same as step S04 and is not described again.

As in the first embodiment, the improved method can encounter two kinds of exceptions. The first exception is not seen in the second embodiment, but it may arise in other cases. When it does, the following sub-step is performed: if the smallest and the second smallest POC_i are not calculated from the same PPN_i, use each of those PPN_i separately to calculate POC_(i+1), set the PPN_i used to calculate the smallest POC_(i+1) as the i-th specified quantity, and exclude from the calculation for the next time point the POC_i not calculated from the i-th specified quantity, where POC_(i+1) is the possible operation cost calculated for the (i+1)-th time point.

The second exception also occurs in the second embodiment, in the calculation for the 9th time point. PPN₁₀ is handled in the same way as in the first embodiment: it is set to "0". The calculation is shown in Figure 11.

The method of the first embodiment requires 67 calculations in total (not counting version 2 of the calculation for the 5th time point, which does not occur in the second embodiment), whereas the improved method of the second embodiment requires 41. The improved method therefore needs 26 fewer calculations. Although the numbers of resource units provisioned at each time point are not identical, they do not differ significantly, and both results are valid for the same prediction.

In the embodiments above, one specified quantity is obtained from the calculation over two time points. According to the present invention, more than one specified quantity can be obtained from such a calculation. Viewed another way, the previous embodiments collect data over one "window" between two time points to decide the resources to provide. More windows can be used to collect data. The number of calculations increases, but time can be saved. Another method for optimizing resource allocation based on reinforcement learning predictions is shown in the third embodiment.

Please refer to Figure 12, which is a flowchart of the method disclosed in the third embodiment of the present invention. This embodiment again uses the resource demand prediction of the previous embodiments, and it can likewise be applied to the architecture of Figure 1. The maximum number of resource units that can be provided, M, remains 6. The method is illustrated in Figures 15 to 19, which list the calculations for all time points.

The first step of the method is to provide the processor with a prediction of the number of resource units required by the workload in more than N time points after the 0th time point, where at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers (S21). This step is the same as steps S01 and S11. As noted above, N and M are kept the same to illustrate this embodiment.

The second step of the method is for the processor to calculate at least one 1st possible operation cost (POC₁) based on at least one possible provided quantity (PPN) at the 1st time point (PPN₁), ranging from U₁ to M, and at least one PPN at the 2nd time point (PPN₂), ranging from U₂ to M, where POC₁ is obtained from POC₁ = K + RF × |PPN₁ − K| + PPN₁ + RF × |PPN₂ − PPN₁| + PPN₂, RF is a rebalancing factor between 0 and 1, and K is a real number (S22). RF is again set to 0.6 and K to 1. To see the difference, please refer to Figures 13 and 15. Figure 13 shows the same prediction results as Figure 3. Figure 14 contains two calculation tables. The upper table lists all values of RF × |PPN₁ − 1| + PPN₁ + RF × |PPN₂ − PPN₁| + PPN₂, and the lower table lists all values of POC₁ = 1 + RF × |PPN₁ − 1| + PPN₁ + RF × |PPN₂ − PPN₁| + PPN₂. The "1" (other than those in PPN₁ and POC₁) is the specified quantity in the lower table. The initial operation cost corresponds to provisioning only one unit of the resource. The formula for POC₁ is similar to the formula for POC_i, but the calculation spans three time points (two windows).
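The two-window formula of step S22 can be sketched as below. RF = 0.6, K = 1, and M = 6 come from the text; U₁ = 5 and U₂ = 3 are assumed from the examples in this description, and all names are illustrative.

```python
RF, K, M = 0.6, 1, 6
U1, U2 = 5, 3   # assumed predicted needs at time points 1 and 2

def poc1(ppn1: int, ppn2: int) -> float:
    """POC_1 = K + RF*|PPN_1 - K| + PPN_1 + RF*|PPN_2 - PPN_1| + PPN_2."""
    first_window = K + RF * abs(ppn1 - K) + ppn1
    second_window = RF * abs(ppn2 - ppn1) + ppn2
    return round(first_window + second_window, 1)

# One entry per (PPN_1, PPN_2) pair, with PPN_1 in U_1..M and PPN_2 in U_2..M
table = {
    (p1, p2): poc1(p1, p2)
    for p1 in range(U1, M + 1)
    for p2 in range(U2, M + 1)
}
best_pair = min(table, key=table.get)
row_for_5 = [table[(5, p2)] for p2 in range(U2, M + 1)]

print(row_for_5)   # [12.6, 13.0, 13.4, 15.0]
print(best_pair)   # (5, 3)
```

Under these assumptions the smallest POC₁ is 12.6, obtained with PPN₁ = 5, and the row for PPN₁ = 5 matches the POC₁ values 12.6, 13, 13.4, and 15 cited for Figure 15.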

The third step of the method is to set the PPN₁ used to calculate the smallest POC₁ as the 1st specified quantity (S23). The specified quantity serves the same function as in the first embodiment. Because that POC₁ is the smallest, its PPN₁ leads to the result and is chosen as the 1st specified quantity.

The fourth step of the method is for the processor to repeat the following sub-steps in sequence for the time points, with i being the even numbers from 2 to 2 × [N/2] (S24). The definition of "i" differs from that in the previous embodiments. First, "i" is even, for example 2, 4, 6, and so on. [N/2] is evaluated with the Gauss bracket, that is, rounded down to an integer. In this embodiment, N is 9, so [N/2] is 4 and 2 × [N/2] is 8. That is, 2, 4, 6, and 8 are taken as "i" in successive iterations. The sub-steps are described below.
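As a small illustration of the index range in step S24, the sketch below lists the even values of i from 2 to 2 × [N/2], reading the Gauss bracket as integer (floor) division. The function name is illustrative.

```python
def even_indices(n: int) -> list[int]:
    """Even i from 2 to 2 * floor(n / 2), inclusive."""
    return list(range(2, 2 * (n // 2) + 1, 2))

print(even_indices(9))  # [2, 4, 6, 8]
```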

The first sub-step is to calculate at least one (i+1)-th possible operation cost POC_(i+1), where POC_(i+1) is obtained from POC_(i+1) = POC_(i-1) + RF × |PPN_(i+1) − PPN_i| + PPN_(i+1) + Wi, Wi is RF × |PPN_(i+2) − PPN_(i+1)| + PPN_(i+2), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+2) is the PPN at the (i+2)-th time point, ranging from U_(i+2) to M, PPN_(i+1) is the PPN at the (i+1)-th time point, ranging from U_(i+1) to M, PPN_i is the PPN at the i-th time point, ranging from U_i to M, and the PPN_i used to calculate POC_(i+1) and POC_(i-1) has the same value; if (i+2) is greater than N, Wi is omitted from the calculation (S24-1). For convenience, the results of POC_(i+1) = POC_(i-1) + RF × |PPN_(i+1) − PPN_i| + PPN_(i+1) + RF × |PPN_(i+2) − PPN_(i+1)| + PPN_(i+2) with RF = 0.6 are given in Figure 14. For example, let i = 2 and see Figure 16. Figure 16 lists the calculations for all time points. Here, POC₃ is calculated from POC₁. In Figure 15, POC₁ is 12.6, 13, 13.4, and 15. The formula becomes POC₃ = POC₁ + RF × |PPN₃ − PPN₂| + PPN₃ + RF × |PPN₄ − PPN₃| + PPN₄. The upper table shows the values of RF × |PPN₃ − PPN₂| + PPN₃ + RF × |PPN₄ − PPN₃| + PPN₄. In Figure 16, PPN₂ ranges from 3 to 6, PPN₃ is only 6, and PPN₄ ranges from 1 to 6. Therefore only 24 values are calculated and shown in the upper table. The POC₃ results are at the bottom right of the lower table.
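The calculation for i = 2 can be sketched as follows. The POC₁ values and the PPN ranges are taken from the text; keying POC₁ by PPN₂ and the function names are assumptions made for illustration. The Wi term is skipped when i + 2 exceeds N, as the sub-step specifies.

```python
RF = 0.6
N = 9

def window(prev_ppn: int, next_ppn: int) -> float:
    """RF * |next - prev| + next: the cost added by one window."""
    return RF * abs(next_ppn - prev_ppn) + next_ppn

def poc_next(poc_prev: float, ppn_i: int, ppn_i1: int, ppn_i2: int, i: int, n: int = N) -> float:
    """POC_(i+1) = POC_(i-1) + window(PPN_i, PPN_(i+1)) + Wi, with Wi omitted if i + 2 > n."""
    cost = poc_prev + window(ppn_i, ppn_i1)
    if i + 2 <= n:
        cost += window(ppn_i1, ppn_i2)   # Wi
    return round(cost, 1)

poc1 = {3: 12.6, 4: 13.0, 5: 13.4, 6: 15.0}   # POC_1 keyed by PPN_2
ppn3 = [6]                                    # PPN_3 is only 6
ppn4 = [1, 2, 3, 4, 5, 6]                     # PPN_4 ranges from 1 to 6

poc3 = {
    (p2, p3, p4): poc_next(c, p2, p3, p4, i=2)
    for p2, c in poc1.items()
    for p3 in ppn3
    for p4 in ppn4
}
ranked = sorted(poc3.items(), key=lambda kv: kv[1])

print(len(poc3))    # 24
print(ranked[:2])   # [((5, 6, 1), 24.0), ((4, 6, 1), 24.2)]
```

The two smallest values, 24 and 24.2, come from PPN₂ = 5 and PPN₂ = 4 respectively, which agrees with the statement that they come from different PPN₂.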

The second sub-step is to find the smallest and the second smallest POC_i (S24-2). This step is the same as sub-steps S03-2 and S13-2. In Figure 16, the smallest and second smallest POC₃ are 24 and 24.2, respectively. They come from different PPN₂.

The third sub-step is, if the smallest and the second smallest POC_i are calculated from the same PPN_i, to set the PPN_i used to calculate the smallest POC_(i+1) as the i-th specified quantity and the PPN_(i+1) used to calculate the smallest POC_(i+1) as the (i+1)-th specified quantity, and to exclude from the calculation for the next time point the POC_(i+1) not calculated from the i-th specified quantity (S24-3). In this embodiment, although the smallest and second smallest POC₃ are not calculated from the same PPN₂, the PPN₂ to use can be found by the same procedure as in the second embodiment, which is not repeated here. The value 24 belongs to the selected PPN₂ (dotted background). The 2nd specified quantity is therefore 5 and the 3rd specified quantity is 6 (dotted background). The POC₃ values used in the calculation for the next iteration are 24, 24.4, 24.8, 25.2, 25.6, and 26. The calculation results for i = 4 and i = 6 are shown in Figures 17 and 18.

A different calculation occurs when i is 8. There is no PPN₁₀, which matches the case where (i+2) is greater than N (10 > 9), so the Wi term is omitted from the calculation. The calculation results are shown in Figure 19. The simplified formula becomes POC₉ = POC₇ + RF × |PPN₉ − PPN₈| + PPN₉, which has the same form as in the previous embodiments. The 8th and 9th specified quantities are nevertheless selected by the same method: the 8th specified quantity is 5 and the 9th specified quantity is 6.

The final step of the method is for the processor to provision 1 unit of the resource at the 0th time point and the j-th specified quantity of units of the resource at the j-th time point for the workload, where j ranges from 1 to N (S25). This is the same as steps S04 and S14 except for the variable symbol.

The method of the third embodiment requires 137 calculations in total, more than the first and second embodiments. Although the numbers of resource units provisioned at each time point may not be identical, they do not differ much, and the time needed to obtain a result can be reduced.

It should be noted that all the mathematical formulas above are for illustration only and do not limit the application of the present invention. Any other mathematical formula that expresses the same computational logic is also within the scope of the present invention.

While the invention has been described in terms of what are presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not necessarily limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements included within the spirit and scope of the appended claims, which are to be accorded the broadest interpretation so as to cover all such modifications and similar structures.

10: Computer cluster; 20: Internet; 30: Client; 100: Computing unit; 101: CPU; 101a: Processor; 101b: CPU; 102: Memory module; 103: Storage device; 120: Network communication interface

Figure 1 shows the hardware architecture to which the present invention is applied. Figure 2 is a flowchart of the method disclosed in the first embodiment of the present invention. Figure 3 shows the prediction results. Figure 4 is a lookup table. Figures 5 to 7 list the calculations for all time points. Figure 8 is a flowchart of the improved method disclosed in the second embodiment of the present invention. Figure 9 shows the same prediction results as Figure 3, with different possible provided quantities. Figures 10 and 11 list the calculations for all time points. Figure 12 is a flowchart of the method disclosed in the third embodiment of the present invention. Figure 13 shows the prediction results. Figure 14 is a lookup table. Figures 15 to 19 list the calculations for all time points.


Claims

A method for optimizing resource allocation based on prediction with reinforcement learning, implemented by a processor to determine the number of units of a resource in a computer cluster to be used, comprising the steps of: a) providing the processor with a prediction of the number of units of the resource required by a workload in more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers; b) calculating, by the processor, at least one 0th possible operation cost (POC₀) based on at least one possible provided quantity (PPN) at the 1st time point (PPN₁) ranging from U₁ to M, wherein POC₀ is obtained from POC₀ = K + RF × |PPN₁ − K| + PPN₁, RF is a rebalancing factor between 0 and 1, and K is a real number; c) repeating, by the processor, the following sub-steps in sequence for the i-th time point, with i from 1 to N: c1) calculating at least one i-th possible operation cost (POC_i), wherein POC_i is obtained from POC_i = POC_(i-1) + RF × |PPN_(i+1) − PPN_i| + PPN_(i+1), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+1) is the PPN at the (i+1)-th time point ranging from U_(i+1) to M, PPN_i is the PPN at the i-th time point ranging from U_i to M, and the PPN_i used to calculate POC_i and POC_(i-1) has the same value; c2) finding the smallest and the second smallest POC_i; and c3) if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_i as the i-th specified quantity, and excluding from the calculation for the next time point the POC_i not calculated from the i-th specified quantity; and d) provisioning, by the processor, 1 unit of the resource at the 0th time point and the i-th specified quantity of units of the resource at the i-th time point for the workload.

The method of claim 1, further comprising the following sub-step: c4) if the smallest and the second smallest POC_i are not calculated from the same PPN_i, using each PPN_i separately to calculate POC_(i+1), setting the PPN_i used to calculate the smallest POC_(i+1) as the i-th specified quantity, and excluding from the calculation for the next time point the POC_i not calculated from the i-th specified quantity, wherein POC_(i+1) is the possible operation cost calculated for the (i+1)-th time point.

The method of claim 1, wherein the resource is a memory module, CPU, I/O throughput, response time, requests per second, or latency.

A method for optimizing resource allocation based on prediction with reinforcement learning, implemented by a processor to determine the number of units of a resource in a computer cluster to be used, comprising the steps of: a) providing the processor with a prediction of the number of units of the resource required by a workload in more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers; b) calculating, by the processor, at least one 0th possible operation cost (POC₀) based on at least one possible provided quantity (PPN) at the 1st time point (PPN₁) ranging from U₁ to the smaller of U₁ + A and M, wherein POC₀ is obtained from POC₀ = K + RF × |PPN₁ − K| + PPN₁, RF is a rebalancing factor between 0 and 1, A is an integer, and K is a real number; c) repeating, by the processor, the following sub-steps in sequence for the i-th time point, with i from 1 to N: c1) calculating at least one i-th possible operation cost (POC_i), wherein POC_i is obtained from POC_i = POC_(i-1) + RF × |PPN_(i+1) − PPN_i| + PPN_(i+1), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+1) is the PPN at the (i+1)-th time point ranging from U_(i+1) to the smaller of U_(i+1) + A and M, PPN_i is the PPN at the i-th time point ranging from U_i to the smaller of U_i + A and M, and the PPN_i used to calculate POC_i and POC_(i-1) has the same value; c2) finding the smallest and the second smallest POC_i; and c3) if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_i as the i-th specified quantity, and excluding from the calculation for the next time point the POC_i not calculated from the i-th specified quantity; and d) provisioning, by the processor, 1 unit of the resource at the 0th time point and the i-th specified quantity of units of the resource at the i-th time point for the workload.

The method of claim 4, further comprising the following sub-step: c4) if the smallest and the second smallest POC_i are not calculated from the same PPN_i, using each PPN_i separately to calculate POC_(i+1), setting the PPN_i used to calculate the smallest POC_(i+1) as the i-th specified quantity, and excluding from the calculation for the next time point the POC_i not calculated from the i-th specified quantity, wherein POC_(i+1) is the possible operation cost calculated for the (i+1)-th time point.

The method of claim 4, wherein the resource is a memory module, CPU, I/O throughput, response time, requests per second, or latency.

A method for optimizing resource allocation based on prediction with reinforcement learning, implemented by a processor to determine the number of units of a resource in a computer cluster to be used, comprising the steps of: a) providing the processor with a prediction of the number of units of the resource required by a workload in more than N time points after a 0th time point, wherein at most M units of the resource can be provided, U_i is the number of units required at the i-th time point according to the prediction, and N, M, and i are positive integers; b) calculating, by the processor, at least one 1st possible operation cost (POC₁) based on at least one possible provided quantity (PPN) at the 1st time point (PPN₁) ranging from U₁ to M and at least one PPN at the 2nd time point (PPN₂) ranging from U₂ to M, wherein POC₁ is obtained from POC₁ = K + RF × |PPN₁ − K| + PPN₁ + RF × |PPN₂ − PPN₁| + PPN₂, RF is a rebalancing factor between 0 and 1, and K is a real number; c) setting the PPN₁ used to calculate the smallest POC₁ as the 1st specified quantity; d) repeating, by the processor, the following sub-steps in sequence for the time points, with i being the even numbers from 2 to 2 × [N/2]: d1) calculating at least one (i+1)-th possible operation cost POC_(i+1), wherein POC_(i+1) is obtained from POC_(i+1) = POC_(i-1) + RF × |PPN_(i+1) − PPN_i| + PPN_(i+1) + Wi, Wi is RF × |PPN_(i+2) − PPN_(i+1)| + PPN_(i+2), POC_(i-1) is the possible operation cost calculated for the (i-1)-th time point, PPN_(i+2) is the PPN at the (i+2)-th time point ranging from U_(i+2) to M, PPN_(i+1) is the PPN at the (i+1)-th time point ranging from U_(i+1) to M, PPN_i is the PPN at the i-th time point ranging from U_i to M, and the PPN_i used to calculate POC_(i+1) and POC_(i-1) has the same value, and wherein Wi is omitted from the calculation if (i+2) is greater than N; d2) finding the smallest and the second smallest POC_i; and d3) if the smallest and the second smallest POC_i are calculated from the same PPN_i, setting the PPN_i used to calculate the smallest POC_(i+1) as the i-th specified quantity and the PPN_(i+1) used to calculate the smallest POC_(i+1) as the (i+1)-th specified quantity, and excluding from the calculation for the next time point the POC_(i+1) not calculated from the i-th specified quantity; and e) provisioning, by the processor, 1 unit of the resource at the 0th time point and the j-th specified quantity of units of the resource at the j-th time point for the workload, wherein j ranges from 1 to N.

The method of claim 7, wherein the resource is a memory module, CPU, I/O throughput, response time, requests per second, or latency.