TWI768554B - Computing system and performance adjustment method thereof - Google Patents

Computing system and performance adjustment method thereof

Info

Publication number
TWI768554B
Authority
TW
Taiwan
Prior art keywords
performance
gradient
test
processors
configuration
Prior art date
Application number
TW109140913A
Other languages
Chinese (zh)
Other versions
TW202221523A (en)
Inventor
蘇凱農
Original Assignee
宏碁股份有限公司
Priority date
Filing date
Publication date
Application filed by 宏碁股份有限公司
Priority to TW109140913A
Publication of TW202221523A
Application granted
Publication of TWI768554B

Abstract

A computing system and a performance adjustment method thereof are provided. The computing system includes two or more processors. In the method, a performance test is executed according to a performance configuration. The performance test targets a load scenario, and the performance configuration relates to the proportion of performance that each processor contributes under that load scenario. A gradient corresponding to the test result of the performance test is determined. The test result relates to a multivariable function that the performance test uses to determine performance, and the gradient is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result. The performance configuration is modified according to the gradient. Better overall performance can thereby be obtained within the system's limits.

Description

Computing system and performance adjustment method thereof

The present invention relates to multiprocessor performance tuning, and more particularly to a multiprocessor computing system and a performance adjustment method thereof.

To suit applications in many fields, today's electronic products, from high-performance computing (HPC) servers and personal computers to smartphones, are increasingly built as heterogeneous computing systems in order to achieve better computing performance or energy efficiency. For example, an HPC server may carry a central processing unit (CPU), a graphics processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a special-purpose accelerator card, and/or a neural network accelerator. In personal computers and mobile phones, besides the established design of a discrete CPU paired with a GPU, CPU designs are also trending toward integrating large and small cores with different computing capabilities or functions on a single chip, so that the computing unit best suited to each workload can be used to achieve the best computing efficiency.

A common design problem in systems with multiple computing units is that the system design makes the units constrain one another's computing capability. For example, a CPU and a GPU connected by a heat pipe share thermal capacity; if one computing unit runs at high speed, the computing capability of the other is squeezed. Another common problem is that, in a product designed with insufficient wattage, the module that starts computing first seizes the power budget and thereby limits the performance of the other modules. Both situations erode the advantage of heterogeneous computing.

In view of this, embodiments of the present invention provide a computing system and a performance adjustment method thereof that adjust the performance configuration of multiple processors (i.e., computing units) based on gradient characteristics, so that a better configuration can be found quickly under the system's limits.

The performance adjustment method of a computing system according to an embodiment of the present invention includes (but is not limited to) the following steps. A performance test is executed according to a performance configuration. The computing system includes two or more processors. The performance test targets a load scenario, and the performance configuration relates to the proportion of performance that each of the processors contributes under that load scenario. A gradient corresponding to the test result of the performance test is determined. The test result relates to a multivariable function that the performance test uses to determine performance. The gradient is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result. The performance configuration is modified according to the gradient.

The computing system according to an embodiment of the present invention includes (but is not limited to) two or more processors. The processors are configured to execute a performance test according to a performance configuration, determine a gradient corresponding to the test result of the performance test, and modify the performance configuration according to the gradient. The performance test targets a load scenario, and the performance configuration relates to the proportion of performance that each of the processors contributes under that load scenario. The test result relates to a multivariable function that the performance test uses to determine performance. The gradient is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result.

Based on the above, in the computing system and the performance adjustment method thereof according to the embodiments of the present invention, a gradient operation is performed on the multivariable performance function for a specific load scenario (i.e., workload) to learn the direction in which performance increases, and the performance configuration of the processors is adjusted accordingly. Better overall performance can thereby be obtained.

To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram of a computing system 100 according to an embodiment of the present invention. Referring to FIG. 1, the computing system 100 may be a desktop computer, a notebook computer, an all-in-one (AIO) computer, a smartphone, a tablet computer, a server, or a similar device. The computing system 100 includes, but is not limited to, a memory 110 and two or more processors 130.

The memory 110 may be any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, hard disk drive (HDD), solid-state drive (SSD), or similar component. In one embodiment, the memory 110 stores program code, software modules, configuration settings, data, or files.

The processor 130 is coupled to the memory 110. The processor 130 may be a CPU, a GPU, another programmable general-purpose or special-purpose microprocessor, a digital signal processor (DSP), a programmable controller, an FPGA, an ASIC, a neural network accelerator, another similar component, or a combination of these components. In one embodiment, the processor 130 performs all or part of the operations of the computing system 100 and can load and execute the software modules, files, and data stored in the memory 110.

In one embodiment, to implement a heterogeneous computing system, the processors 130 may be of different types. For example, the computing system 100 includes a CPU and a GPU. In some embodiments, some or all of the processors 130 may be of the same type. For example, the computing system 100 includes three GPUs.

The method of the embodiments of the present invention is described below with reference to the devices, components, and modules of the computing system 100. The steps of the method may be adjusted according to the implementation and are not limited to those described here.

FIG. 2 is a flowchart of a performance adjustment method according to an embodiment of the present invention. Referring to FIG. 2, the processor 130 may execute a performance test according to a performance configuration (step S210). Specifically, the performance test is, for example, Cinebench, 3DMark, 3DMark11, AnTuTu, or another performance or benchmark test for evaluating load, stress, breakpoints, and the like. The performance test targets one specific load scenario (for example, Time Spy, server cloud processing, neural network training or inference, video playback, gaming, video editing, web browsing, video calls, or floating-point computation). The performance configuration relates to the proportion of performance that each of the processors 130 contributes under this load scenario. In other words, the proportions express how much each processor 130 affects overall performance when the processors 130 operate together. The performance configuration can be changed by varying factors such as the operating frequency, voltage, power, or fan speed of the processors 130. For example, a 3D model editing load scenario calls for a higher proportion for the GPU.

The processor 130 may determine a gradient corresponding to the test result of the performance test (step S230). Specifically, the test result relates to a multivariable function that the performance test uses to determine performance. This multivariable function relates the performance of the individual processors 130 under the load scenario to the overall performance of the computing system 100; for example, the individual performances are the variables and the overall performance is the function value. In some embodiments, the computing system 100 may obtain the test result from another computing system 100.

In one embodiment, the test result includes individual test scores of the processors 130 and an overall test score. The individual test score of each processor 130 serves as a variable, and the overall test score serves as the function value; that is, the overall test score is obtained by inputting the individual test scores of the processors 130 into the multivariable function. Taking a weighted harmonic mean performance function P as an example of the multivariable function:

$$P(S_1,\dots,S_n)=\frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n}\dfrac{w_i}{S_i}} \tag{1}$$

where $S_i$ is the individual test score (or individual test performance) of the i-th processor 130 under the current load scenario, $w_i$ is the weight of that processor 130 under this load scenario, n is the number of processors 130, and i is a positive integer.
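As a minimal sketch of function (1), the weighted harmonic mean can be computed as follows. The function name, the three-processor scores, and the weights are illustrative assumptions, not values from the patent.

```python
from typing import Sequence


def weighted_harmonic_mean(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Overall score P = sum(w_i) / sum(w_i / S_i), the form of function (1)."""
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and of equal length")
    if any(s <= 0 for s in scores):
        raise ValueError("individual scores must be positive")
    return sum(weights) / sum(w / s for w, s in zip(weights, scores))


# Hypothetical three-processor example (scores and weights are made up).
overall = weighted_harmonic_mean([9000.0, 6000.0, 3000.0], [0.6, 0.3, 0.1])
print(f"overall score: {overall:.1f}")
```

Because the mean is harmonic, a low individual score pulls the overall score down more than a high score lifts it, which is why over-favoring one processor can reduce the total.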

Note that the multivariable function may differ between load scenarios; the embodiments of the present invention place no limit on it.

The gradient, in turn, is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result. In vector calculus, the gradient generalizes the derivative to functions of several variables, and the gradient of a multivariable function is a vector-valued function. The gradient of a differentiable multivariable function at a point in a multidimensional vector space is the vector whose components are the partial derivatives of the function at that point. For a multivariable performance function, the direction of this vector is the direction in which performance increases fastest at that point, and its magnitude is the rate of increase in that direction. The gradient therefore shows how to increase the function value (i.e., the overall performance) of the multivariable function for a specific load scenario.

In one embodiment, the processor 130 may determine the gradient vector corresponding to the processors 130. The gradient vector includes a component magnitude and a direction for each of the processors 130. Taking function (1) as an example, performing the gradient operation on function (1) gives:

$$\nabla P=\sum_{i=1}^{n}\frac{\partial P}{\partial S_i}\,\hat{e}_i,\qquad \frac{\partial P}{\partial S_i}=\frac{w_i}{S_i^{2}}\cdot\frac{\sum_{j=1}^{n} w_j}{\left(\sum_{j=1}^{n}\dfrac{w_j}{S_j}\right)^{2}} \tag{2}$$

where $\hat{e}_i$ is the unit vector, in the n-dimensional vector space formed by the processors 130, pointing in the direction in which the performance of the i-th processor 130 increases, and $\partial P/\partial S_i$ is the component magnitude of the vector corresponding to the i-th processor 130.
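The partial derivatives of a weighted harmonic mean can be checked numerically. The sketch below assumes the form P = Σw / Σ(w/S) and compares the analytic gradient with a central finite difference. All function names and the sample scores and weights are illustrative.

```python
from typing import List, Sequence


def whm(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted harmonic mean P = sum(w) / sum(w / S)."""
    return sum(weights) / sum(w / s for w, s in zip(weights, scores))


def whm_gradient(scores: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Analytic partials: dP/dS_i = (w_i / S_i**2) * sum(w) / (sum(w_j / S_j))**2."""
    total_w = sum(weights)
    denom = sum(w / s for w, s in zip(weights, scores))
    return [total_w * (w / s ** 2) / denom ** 2 for w, s in zip(weights, scores)]


def numeric_gradient(scores: Sequence[float], weights: Sequence[float],
                     h: float = 1e-3) -> List[float]:
    """Central finite differences, used only to cross-check the analytic form."""
    grads = []
    for i in range(len(scores)):
        up, down = list(scores), list(scores)
        up[i] += h
        down[i] -= h
        grads.append((whm(up, weights) - whm(down, weights)) / (2 * h))
    return grads


sample_scores = [4000.0, 2500.0, 1000.0]   # made-up individual scores
sample_weights = [0.6, 0.3, 0.1]           # made-up weights
print(whm_gradient(sample_scores, sample_weights))
print(numeric_gradient(sample_scores, sample_weights))
```

Every component is positive, so raising any individual score raises the overall score; the components differ only in how much each processor's improvement is worth.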

Taking the 3DMark Time Spy load scenario as an example, the multivariable performance function is:

$$P(S_1,S_2)=\frac{1}{\dfrac{0.85}{S_1}+\dfrac{0.15}{S_2}} \tag{3}$$

where $S_1$ and $S_2$ are the individual test scores of the GPU and the CPU under the current load scenario, and 0.85:0.15 are the weights of the GPU and the CPU under this load scenario (the weights sum to 1).

A multivariable function can be represented as a multidimensional surface in a multidimensional vector space. Taking function (3) as an example, FIG. 3A is an example illustrating the performance surface. Referring to FIG. 3A, given the individual test scores of the GPU and the CPU (first score, second score), the overall performance is the overall-score coordinate of the point on the surface directly above the coordinate (first score, second score). The arrow direction is defined as the direction of increasing performance.

The coordinates of each point represent one possible set of configuration rules. For example, point T1 in FIG. 3A represents the result of one performance test. In this test, both the GPU and the CPU are limited, with a performance configuration ratio of roughly 1.3:1; assume the individual test scores S1 and S2 of the GPU and the CPU are 7,354 and 5,657, respectively, and the overall test score S3 is 7,037. Now suppose that, on the same computing system 100, the GPU's performance is greatly relaxed from the position of point T1 in FIG. 3A while the CPU's capability is limited.

FIG. 3B is another example illustrating the performance surface. Referring to FIG. 3B, assume this test result is obtained with a GPU-to-CPU performance configuration ratio of roughly 5:1. With this modified configuration (at point T2), the GPU's individual test score S1 rises substantially (from 7,354 to 10,041), yet the overall test score S3 falls from 7,037 to 6,437.
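The scores quoted for points T1 and T2 can be checked against the Time Spy function (3). The sketch below recomputes the overall score at T1 and, because the text does not state the CPU score at T2, solves function (3) for it; that CPU score is therefore an inferred value, not a figure from the patent. The function names are illustrative.

```python
GPU_WEIGHT, CPU_WEIGHT = 0.85, 0.15  # weights of function (3)


def time_spy_overall(gpu_score: float, cpu_score: float) -> float:
    """Overall score from function (3): 1 / (0.85 / S1 + 0.15 / S2)."""
    return 1.0 / (GPU_WEIGHT / gpu_score + CPU_WEIGHT / cpu_score)


def implied_cpu_score(overall_score: float, gpu_score: float) -> float:
    """Solve function (3) for S2 when the overall and GPU scores are known."""
    return CPU_WEIGHT / (1.0 / overall_score - GPU_WEIGHT / gpu_score)


t1_overall = time_spy_overall(7354, 5657)   # point T1
t2_cpu = implied_cpu_score(6437, 10041)     # point T2, CPU score inferred
print(round(t1_overall))  # 7037
print(round(t2_cpu))      # 2122
```

Under these assumptions the GPU score gained about 2,700 points between T1 and T2 while the implied CPU score lost about 3,500, and the harmonic mean penalizes the weaker CPU enough that the overall score drops.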

Notably, a common existing approach to system design is to set rules for each processor 130 from experience or from the results of repeated experiments with a particular benchmark. For example, the 3DMark Time Spy load scenario makes heavy use of the DirectX 12 API for graphics computation, so configuration optimization broadly consists of limiting the CPU's wattage and yielding thermal capacity to the GPU. In practice, however, existing trial-and-error fine-tuning offers no systematic rule for reaching the best balance point and is quite time-consuming.

In the embodiments of the present invention, by contrast, the processor 130 may modify the performance configuration according to the gradient (step S250). Specifically, the result (i.e., the test result) obtained under a given set of configuration rules (i.e., the performance configuration) is substituted into the gradient function (e.g., function (2)). Better overall performance (e.g., a higher overall test score) is then obtained by raising the capability of the processor 130 with the largest vector component in the gradient function while lowering the capability of the other processors 130.
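The rule in step S250 (raise the processor with the largest gradient component, lower the others) can be sketched as a small helper. The function name, the dictionary layout, and the "raise"/"lower" labels are assumptions made for illustration; the patent does not prescribe a data structure.

```python
from typing import Dict


def plan_adjustment(gradient: Dict[str, float]) -> Dict[str, str]:
    """Mark the processor with the largest gradient component for an increase
    and every other processor for a decrease."""
    if not gradient:
        raise ValueError("gradient must contain at least one processor")
    favored = max(gradient, key=gradient.get)
    return {name: ("raise" if name == favored else "lower") for name in gradient}


# Hypothetical gradient components for a GPU/CPU pair.
print(plan_adjustment({"GPU": 0.78, "CPU": 0.23}))
```

How far to raise or lower each processor is left open here; the ratio of the components is one natural choice.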

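For example, performing the gradient operation on function (3) gives function (4):

$$\nabla P=\frac{\dfrac{0.85}{S_1^{2}}\,\hat{e}_1+\dfrac{0.15}{S_2^{2}}\,\hat{e}_2}{\left(\dfrac{0.85}{S_1}+\dfrac{0.15}{S_2}\right)^{2}} \tag{4}$$

where $\hat{e}_1$ and $\hat{e}_2$ are the unit vectors corresponding to the GPU and the CPU, respectively.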

Notably, the examples of FIG. 3A and FIG. 3B show that, for a specific load scenario and given the known test result of FIG. 3A, the performance configuration of FIG. 3B should not be pursued, because such a change does not help overall performance. To obtain the steepest rise along the vertical axis (i.e., the largest increase in the overall test score) from any point on the multidimensional surface, the proportions of GPU and CPU performance can be determined from the gradient function (e.g., function (4)).

FIG. 4 is an example illustrating gradient vectors. Referring to FIG. 4, the gradient vector field of function (4) is a two-dimensional vector field. FIG. 4 is obtained by projecting this vector field onto the horizontal plane where the vertical axis of FIG. 3A or FIG. 3B is zero and then normalizing it. The arrow direction (i.e., the vector direction) at each position indicates how the performance configuration should be changed locally to obtain the largest improvement in the overall test score.

In one embodiment, the processor 130 may modify the performance configuration according to the component magnitudes corresponding to the processors 130. In other words, the component magnitude along each processor's unit vector can serve as the basis for adjusting the proportions in the performance configuration. Taking function (4) as an example, the performance configuration of the GPU and the CPU may be set to the ratio of the gradient vector's components along the unit vectors $\hat{e}_1$ and $\hat{e}_2$, which yields the largest improvement in overall performance. The gradient vector at point T1 in FIG. 3A is approximately $0.778\,\hat{e}_1+0.232\,\hat{e}_2$, and its normalized vector ($\nabla P/\lVert\nabla P\rVert$) is approximately $0.958\,\hat{e}_1+0.286\,\hat{e}_2$. This normalized vector corresponds approximately to point T1 in FIG. 4. In this configuration, the component along $\hat{e}_1$ is larger than the component along $\hat{e}_2$, so the gradient vector points to the upper left but leans upward. The proportion for the performance configuration derived from the gradient is 0.958:0.286 ≈ 3.3:1. The processor 130 may modify the original performance configuration according to this proportion, for example by setting it directly as the new performance configuration or by suitably increasing part of the proportion.
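The normalized gradient at point T1 can be reproduced from the scores given for FIG. 3A (GPU 7,354, CPU 5,657). This is a minimal sketch assuming the Time Spy weights 0.85 and 0.15; the function names are illustrative.

```python
import math
from typing import Tuple


def time_spy_gradient(gpu_score: float, cpu_score: float) -> Tuple[float, float]:
    """Partial derivatives of P = 1 / (0.85 / S1 + 0.15 / S2) with respect to S1 and S2."""
    denom = (0.85 / gpu_score + 0.15 / cpu_score) ** 2
    return (0.85 / gpu_score ** 2 / denom, 0.15 / cpu_score ** 2 / denom)


def normalize(vec: Tuple[float, float]) -> Tuple[float, float]:
    """Scale a 2-D vector to unit length."""
    length = math.hypot(*vec)
    return (vec[0] / length, vec[1] / length)


grad_t1 = time_spy_gradient(7354, 5657)
unit_t1 = normalize(grad_t1)
print([round(c, 3) for c in grad_t1])  # [0.778, 0.232]
print([round(c, 3) for c in unit_t1])  # [0.958, 0.286]
```

The ratio of the two normalized components is the GPU-to-CPU proportion of roughly 3.3:1 quoted in the text.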

In one embodiment, the processor 130 may determine the direction of the processors' gradient vector in a multidimensional vector space. This multidimensional vector space is formed by the dimensions corresponding to the individual test scores of the processors 130. Taking FIG. 4 as an example, the individual test scores S1 and S2 of the GPU and the CPU (i.e., the first score and the second score) each define one of two dimensions.

The processor 130 may modify the performance configuration according to where the vector points, for example by increasing or decreasing the proportion of the corresponding dimension toward that direction. For instance, to raise the overall test score (overall performance) further at point T1 in FIG. 4, the proportion of the GPU, which corresponds to the first score, should be increased further, while the proportion of the CPU, which corresponds to the second score, can be maintained (matching a vector that points to the upper left but leans upward).

For the example of FIG. 3B (GPU performance relaxed too far), the gradient vector at point T2 in FIG. 3B is approximately $0.349\,\hat{e}_1+1.38\,\hat{e}_2$, and its normalized vector ($\nabla P/\lVert\nabla P\rVert$) is approximately $0.245\,\hat{e}_1+0.969\,\hat{e}_2$. This normalized vector corresponds approximately to point T2 in FIG. 4. In this configuration, the component along $\hat{e}_1$ (GPU) is smaller than the component along $\hat{e}_2$ (CPU), so the gradient vector points to the upper left but leans to the left. To raise the overall test score (overall performance) further at point T2 in FIG. 4, the proportion of the GPU, which corresponds to the first score, should be reduced, while the proportion of the CPU, which corresponds to the second score, should be increased. The proportion for the performance configuration derived from the gradient is 0.245:0.969 ≈ 1:3.96. The processor 130 may modify the original performance configuration according to this proportion.
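As a worked check of the T2 figures, the arithmetic can be written out from function (3). The CPU score at T2 is not stated in the text; the value $S_2 \approx 2{,}122$ used below is the one implied by $S_1 = 10{,}041$ and $S_3 = 6{,}437$, and should be read as an inference.

```latex
\begin{aligned}
\frac{1}{6437} &= \frac{0.85}{10041} + \frac{0.15}{S_2}
  \;\Longrightarrow\; S_2 \approx 2122 \\[4pt]
\frac{\partial P}{\partial S_1} &= 0.85\left(\frac{P}{S_1}\right)^{2}
  \approx 0.85\left(\frac{6437}{10041}\right)^{2} \approx 0.349 \\[4pt]
\frac{\partial P}{\partial S_2} &= 0.15\left(\frac{P}{S_2}\right)^{2}
  \approx 0.15\left(\frac{6437}{2122}\right)^{2} \approx 1.38 \\[4pt]
\lVert \nabla P \rVert &\approx \sqrt{0.349^{2} + 1.38^{2}} \approx 1.42,
\qquad
\frac{\nabla P}{\lVert \nabla P \rVert} \approx 0.245\,\hat{e}_1 + 0.969\,\hat{e}_2
\end{aligned}
```

The second and third lines use the identity $\partial P/\partial S_i = w_i\,(P/S_i)^2$, which follows from function (3) when the weights sum to 1.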

FIG. 5 is a flowchart illustrating the system-limit decision according to an embodiment of the present invention. Referring to FIG. 5, the processor 130 may determine whether the gradient has reached a system limit (step S510). Specifically, the system limit relates to the performance configurations that the computing system 100 allows, such as supply wattage, heat-dissipation capability, or other factors that affect performance. For example, if the CPU's component in the gradient is about 5, but a proportion above 4 would require more supply wattage than the computing system 100 can provide, the system limit is judged to have been reached. If the gradient has not reached the system limit, the processor 130 may modify the performance configuration based on the gradient vector (step S530), execute the performance test again according to the modified performance configuration (e.g., step S210), and determine the corresponding gradient from the test result of that test (e.g., step S230). For example, if the proportion derived from the gradient above is 1:3.96, the processor 130 changes the GPU-to-CPU performance configuration to 1:3.96, executes the performance test for the same load scenario again to obtain a test result, and derives the corresponding gradient from it. This process repeats (the values should converge) until the performance adjustment reaches the system design limit, at which point the processor 130 disables adjustment of the performance configuration (i.e., stops adjusting it) and determines the final performance configuration accordingly (step S550). The final performance configuration should then be the one that provides the best overall performance for this specific load scenario.
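The loop of FIG. 5 can be sketched as follows. Everything here is illustrative: `simulated_benchmark` is a made-up stand-in for a real benchmark run (it does not model real hardware), the configuration is expressed as GPU and CPU shares that sum to 1, `within_limit` is a hypothetical callback for the system limit, and `max_rounds` is an added safety stop that the patent does not mention.

```python
from typing import Callable, Tuple

Config = Tuple[float, float]   # (GPU share, CPU share), shares sum to 1
Scores = Tuple[float, float]   # (GPU score, CPU score)


def gradient(scores: Scores) -> Tuple[float, float]:
    """Gradient of the Time Spy function P = 1 / (0.85 / S1 + 0.15 / S2)."""
    gpu, cpu = scores
    denom = (0.85 / gpu + 0.15 / cpu) ** 2
    return (0.85 / gpu ** 2 / denom, 0.15 / cpu ** 2 / denom)


def simulated_benchmark(config: Config) -> Scores:
    """Made-up score model: each score grows with the cube root of its share."""
    gpu_share, cpu_share = config
    return (12000.0 * gpu_share ** (1 / 3), 8000.0 * cpu_share ** (1 / 3))


def tune(config: Config,
         run_test: Callable[[Config], Scores],
         within_limit: Callable[[Config], bool],
         max_rounds: int = 20) -> Config:
    """Repeat test -> gradient -> update until the system limit is reached."""
    for _ in range(max_rounds):
        scores = run_test(config)                      # step S210
        grad = gradient(scores)                        # step S230
        total = sum(grad)
        candidate = (grad[0] / total, grad[1] / total)  # ratio of components
        if not within_limit(candidate):                # step S510
            break                                      # step S550: stop adjusting
        config = candidate                             # step S530
    return config


final = tune((0.5, 0.5), simulated_benchmark, lambda c: True)
print(tuple(round(x, 3) for x in final))
```

With this particular toy model the shares settle toward a fixed ratio; with a different score model the plain update could oscillate, in which case a damped step (moving only part of the way toward the candidate) would be a reasonable variation.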

In practical applications, the processor 130 may record the gradient-based performance configurations for different load scenarios. In addition, the processor 130 may detect the current activity of the computing system 100 (e.g., gaming, video playback, or web browsing) and apply the performance configuration for the corresponding load scenario. The computing system 100 can thereby deliver better performance in a specific application.
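Recording one tuned configuration per load scenario and applying it when that scenario is detected can be sketched with a simple lookup. The class name, the scenario labels, and the stored ratios are hypothetical; how the running scenario is detected is outside this sketch.

```python
from typing import Dict, Tuple

Config = Tuple[float, float]  # (GPU proportion, CPU proportion)


class ProfileStore:
    """Keeps one tuned performance configuration per load scenario."""

    def __init__(self, default: Config) -> None:
        self._default = default
        self._profiles: Dict[str, Config] = {}

    def record(self, scenario: str, config: Config) -> None:
        """Store the configuration obtained from tuning for this scenario."""
        self._profiles[scenario] = config

    def config_for(self, scenario: str) -> Config:
        """Return the recorded configuration, or the default if none exists."""
        return self._profiles.get(scenario, self._default)


store = ProfileStore(default=(1.0, 1.0))
store.record("gaming", (3.3, 1.0))       # hypothetical tuned ratio
print(store.config_for("gaming"))
print(store.config_for("web browsing"))  # falls back to the default
```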

In summary, with the computing system and the performance adjustment method thereof according to the present invention, a suitable performance configuration for a specific load scenario can be derived from the gradient corresponding to the test result of a performance test. The best configuration under the system's limits can thereby be found quickly.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Anyone with ordinary knowledge in the technical field may make minor changes and refinements without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

100: computing system
110: memory
130: processor
S210~S250, S510~S550: steps
T1, T2: points
S1, S2: individual test scores
S3: overall test score

FIG. 1 is a block diagram of a computing system according to an embodiment of the present invention.
FIG. 2 is a flowchart of a performance adjustment method according to an embodiment of the present invention.
FIG. 3A is an example illustrating a performance surface.
FIG. 3B is another example illustrating the performance surface.
FIG. 4 is an example illustrating gradient vectors.
FIG. 5 is a flowchart illustrating the system-limit decision according to an embodiment of the present invention.

S210~S250: steps

Claims (8)

一種計算系統的效能調整方法,其中該計算系統包括至少二處理器,且該效能調整方法包括:依據一效能配置執行一效能測試,其中該效能測試是針對一負載情境,且該效能配置相關於該至少二處理器在該負載情境下付出效能的比重;決定該效能測試的測試結果對應的一梯度(gradient),其中該測試結果相關於用於該效能測試決定效能表現的一多變數函數,且該梯度是對基於該效能配置及該測試結果的該多變數函數進行梯度運算所得出;依據該梯度修改該效能配置;依據修改的效能配置再次執行該效能測試;以及依據再次執行的該效能測試的測試結果決定對應的梯度。 A performance adjustment method of a computing system, wherein the computing system includes at least two processors, and the performance adjustment method includes: executing a performance test according to a performance configuration, wherein the performance test is for a load situation, and the performance configuration is related to The at least two processors pay a proportion of performance under the load situation; determine a gradient corresponding to a test result of the performance test, wherein the test result is related to a multivariate function used for the performance test to determine performance, And the gradient is obtained by performing gradient operation on the multivariate function based on the performance configuration and the test result; modifying the performance configuration according to the gradient; re-executing the performance test according to the modified performance configuration; and re-executing the performance The test result of the test determines the corresponding gradient. 如請求項1所述的效能調整方法,其中決定該效能測試的測試結果對應的該梯度的步驟包括:決定該梯度中對應於該至少二處理器的梯度向量,其中該梯度向量包括分別對應於該至少二處理器的向量的分量大小;以及依據該至少二處理器對應的分量大小修改該效能配置。 The performance adjustment method of claim 1, wherein the step of determining the gradient corresponding to the test result of the performance test comprises: determining a gradient vector in the gradient corresponding to the at least two processors, wherein the gradient vector includes component sizes of the vector of the at least two processors; and modifying the performance configuration according to the component sizes corresponding to the at least two processors. 
3. The performance adjustment method of claim 2, wherein the test result comprises individual test scores of the at least two processors and an overall test score, the overall test score is obtained by inputting the individual test scores of the at least two processors into the multivariable function, and the step of modifying the performance configuration according to the gradient comprises: determining a vector direction of the gradient vector of the at least two processors in a multidimensional vector space, wherein the multidimensional vector space is formed by a plurality of dimensions corresponding to the individual test scores of the at least two processors; and modifying the performance configuration according to where the vector direction points, wherein the proportion of the corresponding dimension is increased or decreased toward that direction.
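The iterative adjustment recited in the method claims above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent: the stand-in benchmark `run_performance_test`, the made-up `CAPACITY` values, the choice of a plain sum as the multivariable function `overall_score`, the central-difference gradient estimate, and the step size. It reads the claimed gradient as the sensitivity of the overall score to each processor's weight, which is one possible interpretation.

```python
import math
from typing import List, Sequence

# Hypothetical per-processor capability (for example a CPU and a GPU).
CAPACITY = [100.0, 400.0]


def run_performance_test(weights: Sequence[float]) -> List[float]:
    """Stand-in benchmark: individual score grows with the weight, with diminishing returns."""
    return [c * math.sqrt(w) for c, w in zip(CAPACITY, weights)]


def overall_score(individual_scores: Sequence[float]) -> float:
    """Hypothetical multivariable function: the overall score is the sum of individual scores."""
    return sum(individual_scores)


def estimate_gradient(weights: Sequence[float], eps: float = 1e-4) -> List[float]:
    """Central-difference estimate of d(overall score) / d(weight of processor i)."""
    grad = []
    for i in range(len(weights)):
        up, down = list(weights), list(weights)
        up[i] += eps
        down[i] -= eps
        diff = overall_score(run_performance_test(up)) - overall_score(run_performance_test(down))
        grad.append(diff / (2 * eps))
    return grad


def modify_configuration(weights: Sequence[float], grad: Sequence[float],
                         step: float = 0.001, floor: float = 0.01) -> List[float]:
    """Shift weight toward the components the gradient favors, then renormalize to sum to 1."""
    mean = sum(grad) / len(grad)
    # Subtracting the mean keeps only the relative direction between processors.
    moved = [max(floor, w + step * (g - mean)) for w, g in zip(weights, grad)]
    total = sum(moved)
    return [w / total for w in moved]


def tune(weights: Sequence[float], rounds: int = 20) -> List[float]:
    """Repeat: test, determine gradient, modify configuration, test again."""
    current = list(weights)
    for _ in range(rounds):
        grad = estimate_gradient(current)
        current = modify_configuration(current, grad)
    return current


initial = [0.5, 0.5]
tuned = tune(initial)
```

In this toy setup the weight of the more capable processor rises and the overall score improves relative to the even split. A real system would replace `run_performance_test` with an actual benchmark for the load scenario.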
4. A performance adjustment method for a computing system, wherein the computing system comprises at least two processors, the performance adjustment method comprising: executing a performance test according to a performance configuration, wherein the performance test targets a load scenario, and the performance configuration relates to the proportions of performance contributed by the at least two processors under the load scenario; determining a gradient corresponding to a test result of the performance test, wherein the test result relates to a multivariable function that the performance test uses to determine performance, and the gradient is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result; modifying the performance configuration according to the gradient; determining whether the gradient reaches a system limit, wherein the system limit relates to the performance configurations allowed by the computing system; and, for a gradient that reaches the system limit, disabling its use in adjusting the performance configuration.
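The system limit step of the claim above can be sketched as a filter applied to the gradient before the configuration is modified. This is a hedged illustration only: the per-processor `(low, high)` weight bounds, the component-wise masking, the helper names, and the numbers are all assumptions, since the claim does not say how the limit is represented.

```python
from typing import List, Sequence, Tuple


def mask_gradient_at_limits(weights: Sequence[float], grad: Sequence[float],
                            limits: Sequence[Tuple[float, float]],
                            step: float = 0.1) -> List[float]:
    """Zero every gradient component whose update would leave the allowed range."""
    masked = []
    for w, g, (low, high) in zip(weights, grad, limits):
        candidate = w + step * g
        if candidate < low or candidate > high:
            masked.append(0.0)  # system limit reached: this component is disabled
        else:
            masked.append(g)
    return masked


def apply_gradient(weights: Sequence[float], grad: Sequence[float],
                   step: float = 0.1) -> List[float]:
    """Move each weight along its (possibly masked) gradient component."""
    return [w + step * g for w, g in zip(weights, grad)]


weights = [0.3, 0.7]
grad = [-0.5, 0.5]
# Hypothetical limits: the second processor may not exceed a weight of 0.7.
limits = [(0.1, 0.9), (0.1, 0.7)]

masked = mask_gradient_at_limits(weights, grad, limits)
new_weights = apply_gradient(weights, masked)
```

Here the second component would push its weight past the assumed cap, so it is disabled and only the first weight changes. Renormalizing the weights afterwards is left out to keep the sketch short.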
5. A computing system, comprising: at least two processors, one of which is configured to perform: executing a performance test according to a performance configuration, wherein the performance test targets a load scenario, and the performance configuration relates to the proportions of performance contributed by the at least two processors under the load scenario; determining a gradient corresponding to a test result of the performance test, wherein the test result relates to a multivariable function that the performance test uses to determine performance, and the gradient is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result; modifying the performance configuration according to the gradient; executing the performance test again according to the modified performance configuration; and determining a corresponding gradient according to the test result of the re-executed performance test.
6. The computing system of claim 5, wherein one of the at least two processors is further configured to perform: determining a gradient vector in the gradient corresponding to the at least two processors, wherein the gradient vector comprises component magnitudes respectively corresponding to the at least two processors; and modifying the performance configuration according to the component magnitudes corresponding to the at least two processors.
7. The computing system of claim 6, wherein the test result comprises individual test scores of the at least two processors and an overall test score, the overall test score is obtained by inputting the individual test scores of the at least two processors into the multivariable function, and one of the at least two processors is further configured to perform: determining a vector direction of the gradient vector of the at least two processors in a multidimensional vector space, wherein the multidimensional vector space is formed by a plurality of dimensions corresponding to the individual test scores of the at least two processors; and modifying the performance configuration according to where the vector direction points, wherein the proportion of the corresponding dimension is increased or decreased toward that direction.
8. A computing system, comprising: at least two processors, one of which is configured to perform: executing a performance test according to a performance configuration, wherein the performance test targets a load scenario, and the performance configuration relates to the proportions of performance contributed by the at least two processors under the load scenario; determining a gradient corresponding to a test result of the performance test, wherein the test result relates to a multivariable function that the performance test uses to determine performance, and the gradient is obtained by performing a gradient operation on the multivariable function based on the performance configuration and the test result; modifying the performance configuration according to the gradient; determining whether the gradient reaches a system limit, wherein the system limit relates to the performance configurations allowed by the computing system; and, for a gradient that reaches the system limit, disabling its use in adjusting the performance configuration.
TW109140913A 2020-11-23 2020-11-23 Computing system and performance adjustment method thereof TWI768554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109140913A TWI768554B (en) 2020-11-23 2020-11-23 Computing system and performance adjustment method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109140913A TWI768554B (en) 2020-11-23 2020-11-23 Computing system and performance adjustment method thereof

Publications (2)

Publication Number Publication Date
TW202221523A TW202221523A (en) 2022-06-01
TWI768554B TWI768554B (en) 2022-06-21

Family

ID=83062522

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109140913A TWI768554B (en) 2020-11-23 2020-11-23 Computing system and performance adjustment method thereof

Country Status (1)

Country Link
TW (1) TWI768554B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110313674A1 (en) * 2010-06-18 2011-12-22 Roche Diagnostics Operations, Inc. Insulin optimization systems and testing methods with adjusted exit criterion accounting for system noise associated with biomarkers
TWI465934B (en) * 2008-12-02 2014-12-21 Intel Corp Apparatus, system and method for controlling allocation of computing resources
CN106681453A (en) * 2016-11-24 2017-05-17 电子科技大学 Dynamic heat treatment method of high-performance multi-core microprocessor
CN106980623A (en) * 2016-01-18 2017-07-25 华为技术有限公司 A kind of determination method and device of data model
CN107665155A (en) * 2016-07-28 2018-02-06 华为技术有限公司 The method and apparatus of processing data
JP6357525B2 (en) * 2016-12-01 2018-07-11 ヴィア アライアンス セミコンダクター カンパニー リミテッド Neural network unit for efficient 3D convolution
US20190004920A1 (en) * 2017-06-30 2019-01-03 Intel Corporation Technologies for processor simulation modeling with machine learning
CN109885452A (en) * 2019-01-23 2019-06-14 平安科技(深圳)有限公司 Method for monitoring performance, device and terminal device
US20190391859A1 (en) * 2017-10-30 2019-12-26 SigOpt, Inc. Systems and methods for implementing an intelligent application program interface for an intelligent optimization platform

Similar Documents

Publication Publication Date Title
CA3069185C (en) Operation accelerator
US20220129752A1 (en) Memory bandwidth reduction techniques for low power convolutional neural network inference applications
JP6005895B1 (en) Intelligent multi-core control for optimal performance per watt
US20190243609A1 (en) Method and processing apparatus for performing arithmetic operation
US20170060588A1 (en) Computing system and method for processing operations thereof
TW202011264A (en) Method, device and device for detecting information
WO2023093623A1 (en) Computation graph optimization method, data processing method and related product
US20220012307A1 (en) Information processing device, information processing system, information processing method, and storage medium
US11231760B1 (en) Techniques for accurately determining the temperature at various locations of an operating integrated circuit
US20170031822A1 (en) Control method and electronic device
WO2024027039A1 (en) Data processing method and apparatus, and device and readable storage medium
Martin Multicore processors: challenges, opportunities, emerging trends
US20150332495A1 (en) Graphics processing method and graphics processing apparatus
JP2023059231A (en) Key point detection and model training method, apparatus, device, and storage medium
JP2021022373A (en) Method, apparatus and device for balancing loads, computer-readable storage medium, and computer program
US11635904B2 (en) Matrix storage method, matrix access method, apparatus and electronic device
WO2020062252A1 (en) Operational accelerator and compression method
US11003808B2 (en) Subtractive design for heat sink improvement
TW202138999A (en) Data dividing method and processor for convolution operation
WO2019095333A1 (en) Data processing method and device
TWI768554B (en) Computing system and performance adjustment method thereof
Voss et al. Convolutional neural networks on dataflow engines
Power et al. Implications of emerging 3D GPU architecture on the scan primitive
WO2021115039A1 (en) Fpga platform, performance evaluation and design optimization method therefor, and storage medium
Wang Analysis on the construction of computer data processing mode based on era of big data