CN111444025A - Resource allocation method, system and medium for improving energy efficiency of computing subsystem - Google Patents

Resource allocation method, system and medium for improving energy efficiency of computing subsystem

Info

Publication number
CN111444025A
Authority
CN
China
Prior art keywords
power consumption
processor
frequency
node
computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010290699.5A
Other languages
Chinese (zh)
Other versions
CN111444025B (en
Inventor
陈娟
齐新新
董勇
袁远
吴菲豪
孙晓乐
欧祉辛
张云放
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202010290699.5A priority Critical patent/CN111444025B/en
Publication of CN111444025A publication Critical patent/CN111444025A/en
Application granted granted Critical
Publication of CN111444025B publication Critical patent/CN111444025B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

The invention discloses a resource allocation method, system and medium for improving the energy efficiency of a computing subsystem. The method comprises: determining the optimal number of added nodes $\Delta N^*$, the processor frequency $f^*$ and the power consumption limit value $P_{target}$; setting the power consumption limit to be satisfied to $P_{target}$ and scheduling the parallel program to run on $N + \Delta N^*$ computing nodes ($\Delta N^* > 0$), with the initial processor frequency of each computing node set to $f^*$, where $N$ is the minimum number of computing nodes required to run the parallel program (one process allocated to each processor core). For memory-access-limited parallel programs running on the system, the invention can reduce both program execution time and energy consumption while satisfying the power consumption constraint, thereby improving the energy efficiency of the system.

Description

Resource allocation method, system and medium for improving energy efficiency of computing subsystem
Technical Field
The invention relates to a resource allocation technology of a high-performance computing cluster, in particular to a resource allocation method, a resource allocation system and a resource allocation medium for improving the energy efficiency of a computing subsystem.
Background
The computing capability of high-performance computing systems is increasingly constrained by power consumption. Although the energy consumption of high-performance computing centers is growing rapidly, users still demand higher performance in order to run more complex models on larger data sets. A method for improving the performance of high-performance computing programs while satisfying a power consumption constraint is therefore urgently needed. Current research offers several ways to improve the energy efficiency of high-performance computing systems, such as designing new computer architectures and performing software-based resource scheduling for high-performance computing programs. Software-based resource scheduling improves program performance under a power consumption constraint by carefully choosing the computing resource settings, such as the number of computing nodes and the processor frequency. One advantage of this approach is that it requires no hardware modification and can therefore be deployed easily on existing hardware. At present, the resource allocation strategy of most high-performance computing centers aims to maximize system utilization, that is, to allocate as few computing nodes as possible. This strategy ignores the relationship between the best achievable performance of a memory-access-limited parallel program and the number of computing nodes allocated to it: maximizing processor utilization can cause severe memory contention for such a program and thereby degrade its parallel performance.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the above problems in the prior art, the invention provides a resource allocation method, system and medium for improving the energy efficiency of a computing subsystem. For memory-access-limited parallel programs running on the system, the invention reduces program execution time, keeps total power consumption unchanged and reduces energy consumption while satisfying the power consumption constraint, thereby improving the energy efficiency of the system.
In order to solve the technical problems, the invention adopts the technical scheme that:
A resource allocation method for improving the energy efficiency of a computing subsystem comprises the following implementation steps:
1) determining an optimal number of added nodes $\Delta N^*$, a processor frequency $f^*$, and a power consumption limit value $P_{target}$;
2) setting the power consumption limit value to $P_{target}$ using a dynamic processor frequency adjustment tool, and scheduling the parallel program to run on $N + \Delta N^*$ computing nodes, wherein the initial value of the processor frequency of each computing node is $f^*$, and $N$ is the minimum number of computing nodes required to run the parallel program (each processor core runs one process under the default resource allocation).
Optionally, step 1) is preceded by a step of calculating the optimal number of added nodes $\Delta N^*$, the detailed steps of which include: calculating a first added-node interval $[0, \Delta N_{pref}]$ using the total memory bandwidth; calculating a second added-node interval $[0, \Delta N_{power}]$ using the power consumption constraint; and finding the intersection of the first interval $[0, \Delta N_{pref}]$ and the second interval $[0, \Delta N_{power}]$, then selecting the maximum value in the intersection as the optimal number of added nodes $\Delta N^*$.
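The selection rule above can be illustrated with a minimal Python sketch. Because both intervals start at 0, their intersection is simply the interval up to the smaller upper bound, so the maximum of the intersection is that smaller bound. The function name `optimal_added_nodes` and the integer inputs are illustrative assumptions, not part of the patent text.

```python
def optimal_added_nodes(dn_pref: int, dn_power: int) -> int:
    """Return the largest value in the intersection of [0, dn_pref] and [0, dn_power]."""
    if dn_pref < 0 or dn_power < 0:
        raise ValueError("interval upper bounds must be non-negative")
    # Both intervals share the lower bound 0, so the intersection is
    # [0, min(dn_pref, dn_power)] and its maximum is the smaller bound.
    return min(dn_pref, dn_power)


if __name__ == "__main__":
    # Hypothetical bounds: bandwidth allows up to 6 extra nodes, power up to 4.
    print(optimal_added_nodes(6, 4))
```

In this reading, the power consumption constraint caps the number of nodes that the bandwidth analysis would otherwise suggest.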
Optionally, the detailed steps of calculating the first added-node interval $[0, \Delta N_{pref}]$ using the total memory bandwidth comprise:
S1) acquiring the actual memory access bandwidth $b_1(t), b_2(t), ..., b_N(t)$ of each computing node at each recorded time $t$, calculating the average actual memory access bandwidth $B(t)$ of a single node during the run of the parallel program, and taking the maximum value of $B(t)$ as the actual memory access bandwidth $B_N$ of the parallel program;
S2) calculating the ratio $bound$ of the actual memory access bandwidth $B_N$ to the physical memory bandwidth $B$ of a single node, and judging whether the parallel program is memory-access-limited according to whether $bound$ reaches a threshold $\alpha$; if it is not limited, proceeding to step S3), and if it is limited, proceeding to step S4);
S3) determining that no node needs to be added and setting $\Delta N_{pref}$ to 0, so that the first added-node interval $[0, \Delta N_{pref}]$ is $[0, 0]$; ending and returning;
S4) according to the principle that the total memory bandwidth is unchanged, $N \cdot ((bound/\alpha) \cdot B_N) = (N + \Delta N_{pref}) \cdot \alpha \cdot B$, solving for the number of nodes $\Delta N_{pref}$ that need to be added, to obtain the first added-node interval $[0, \Delta N_{pref}]$; ending and returning.
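Steps S2) to S4) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: the function name is hypothetical, "reaches the threshold" is read as `bound >= alpha`, and the solved value is rounded down to a whole number of nodes, which the text does not specify.

```python
import math


def added_nodes_from_bandwidth(n_nodes: int, b_n: float, b_phys: float, alpha: float) -> int:
    """Upper bound dN_pref of the first added-node interval [0, dN_pref].

    n_nodes: N, minimum number of compute nodes
    b_n:     B_N, actual memory access bandwidth of the program
    b_phys:  B, physical memory bandwidth of a single node
    alpha:   threshold on bound = B_N / B
    """
    bound = b_n / b_phys                      # S2: degree of memory-access limitation
    if bound < alpha:                         # S3: not limited, add no nodes
        return 0
    # S4: solve N * (bound/alpha) * B_N = (N + dN) * alpha * B for dN.
    dn = n_nodes * (bound / alpha) * b_n / (alpha * b_phys) - n_nodes
    # Assumption: round down to a whole node and never return a negative count.
    return max(0, math.floor(dn))


if __name__ == "__main__":
    # Hypothetical figures: 4 nodes, 90 GB/s measured, 100 GB/s physical, alpha = 0.75.
    print(added_nodes_from_bandwidth(4, 90.0, 100.0, 0.75))
```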
Optionally, calculating the second added-node interval $[0, \Delta N_{power}]$ using the power consumption constraint specifically means solving for the maximum number of nodes $\Delta N$ that satisfies the following power consumption constraint function, and taking that number as the number of nodes to add, $\Delta N_{power}$, which gives the second added-node interval $[0, \Delta N_{power}]$:
[Equation image BDA0002450283890000022 (power consumption constraint function) is not reproduced in this text.]
In the above formula, $n$ is the number of processes of the parallel program (each processor core runs one process under the default resource allocation); $P_{cpu}(f_{max})$ is the power consumption of a single processor core at the maximum frequency $f_{max}$, i.e. its maximum power consumption; $P_{cpu}(f_{mid})$ is the power consumption of a single processor core running at $f_{mid}$; $c$ is the number of processor cores on each computing node; $P_{cpu}^{idle}$ is the processor power consumption of a single processor core in the idle state; $P_{mem}$ is the memory power consumption; and $P_{other}$ is the power consumption of a single computing node other than the processor and memory. Adding $\Delta N$ computing nodes increases the total power consumption accordingly, including the memory power consumption of the $\Delta N$ computing nodes and the power consumption of the idle processor cores created by the added nodes. To ensure that the total power consumption across the nodes does not increase, the frequency of all processor cores must be lowered from the maximum frequency $f_{max}$ to the intermediate frequency $f_{mid}$, where $f_{mid}$ is the average of the maximum frequency $f_{max}$ and the minimum frequency $f_{min}$.
Optionally, step 1) is preceded by a step of calculating the processor frequency $f^*$, the detailed steps of which include: substituting the optimal number of added nodes $\Delta N^*$ into the power consumption constraint function given by the following formula, that is, setting $\Delta N = \Delta N^*$ and $P_{cpu}(f_{mid}) = P_{cpu}(f_i)$, to obtain a power consumption value $P_{cpu}(f_i)$; then, according to the relationship between processor frequency levels and processor core power consumption values, taking the processor frequency $f_i$ that satisfies the condition as the processor frequency $f^*$ determined in step 1):
[Equation image BDA0002450283890000032 (power consumption constraint function) is not reproduced in this text.]
In the above formula, $n$ is the number of processes of the parallel program (each processor core runs one process under the default resource allocation); $P_{cpu}(f_{max})$ is the power consumption of a single processor core running at the maximum frequency $f_{max}$; $P_{cpu}(f_i)$ is the power consumption of a single processor core running at $f_i$; $c$ is the number of processor cores on each computing node; $P_{cpu}^{idle}$ is the processor power consumption of a single processor core in the idle state; $P_{mem}$ is the memory power consumption; and $P_{other}$ is the power consumption of a single computing node other than the processor and memory.
Optionally, before step 1), the method further comprises calculating the power consumption limit value $P_{target}$ according to the following function expression:
[Equation image BDA0002450283890000034 (expression for $P_{target}$) is not reproduced in this text.]
In the above formula, $P_{cpu}(f_{max})$ is the power consumption of a single processor core running at the maximum frequency $f_{max}$.
In addition, the invention also provides a resource allocation system for improving the energy efficiency of the computing subsystem, which comprises the following components:
a parameter initialization program unit for determining the optimal number of added nodes $\Delta N^*$, the processor frequency $f^*$ and the power consumption limit value $P_{target}$;
a resource allocation program unit for setting the power consumption limit value of the dynamic processor frequency adjustment tool to $P_{target}$ and scheduling the parallel program to run on $N + \Delta N^*$ computing nodes, wherein the initial value of the processor frequency of each computing node is $f^*$, $N$ is the minimum number of computing nodes required to run the parallel program (each processor core runs one process under the default resource allocation), and $\Delta N^*$ is the optimal number of added nodes.
In addition, the present invention also provides a resource allocation system for improving the energy efficiency of a computing subsystem, which includes a computer device that is either programmed or configured to perform the steps of the aforementioned resource allocation method, or whose memory stores a computer program programmed or configured to perform the aforementioned resource allocation method.
Furthermore, the present invention also provides a computer-readable storage medium having stored thereon a computer program programmed or configured to perform the aforementioned resource allocation method for improving energy efficiency of a computing subsystem.
Compared with the prior art, the invention has the following advantage: for memory-access-limited parallel programs running on the system, it reduces program execution time, keeps total power consumption unchanged and reduces energy consumption while satisfying the power consumption constraint, thereby improving the energy efficiency of the system.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
FIG. 2 is a detailed flow chart of the method according to the embodiment of the present invention.
Detailed Description
As shown in fig. 1, the implementation steps of the resource allocation method for improving the energy efficiency of the computing subsystem in this embodiment include:
1) determining an optimal number of added nodes $\Delta N^*$, a processor frequency $f^*$, and a power consumption limit value $P_{target}$;
2) setting the power consumption limit value of the dynamic processor frequency adjustment tool to $P_{target}$, and scheduling the parallel program to run on $N + \Delta N^*$ computing nodes, wherein the initial value of the processor frequency of each computing node is $f^*$, and $N$ is the minimum number of computing nodes required to run the parallel program (each processor core runs one process under the default resource allocation).
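To make step 2) concrete, the following Python sketch assembles the values that would be handed to a job scheduler and a power capping tool. Everything here is a hypothetical illustration: the `AllocationPlan` type, the `make_plan` function and the even spreading of processes over the enlarged node set are assumptions, since the text does not state how processes are distributed across the $N + \Delta N^*$ nodes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AllocationPlan:
    nodes: int                        # N + dN*, total compute nodes to request
    procs_per_node: Tuple[int, ...]   # processes placed on each node
    freq_ghz: float                   # initial processor frequency f*
    power_cap_w: float                # power consumption limit P_target


def make_plan(n_procs: int, n_min_nodes: int, dn_opt: int,
              f_opt_ghz: float, p_target_w: float) -> AllocationPlan:
    """Build a plan for running n_procs processes on N + dN* nodes."""
    total_nodes = n_min_nodes + dn_opt
    # Assumption: spread processes as evenly as possible over all nodes.
    base, extra = divmod(n_procs, total_nodes)
    per_node = tuple(base + 1 if i < extra else base for i in range(total_nodes))
    return AllocationPlan(total_nodes, per_node, f_opt_ghz, p_target_w)


if __name__ == "__main__":
    # Hypothetical case: 48 processes, N = 2 nodes of 24 cores, one added node.
    print(make_plan(48, 2, 1, 1.8, 400.0))
```

With one added node, each node in this hypothetical case runs fewer processes than it has cores, which is the source of the idle cores discussed in the power consumption constraint.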
As shown in FIG. 2, this embodiment requires measuring and establishing in advance the relationship between processor frequency levels and processor core power consumption; the function $P_{cpu}(\cdot)$ used above returns the processor core power consumption for a given processor frequency level. In this embodiment, frequency levels are spaced 0.1 GHz apart, so the adjustable frequency range $[f_{min}, f_{max}]$ of the processor can be divided into $M$ levels. The processor power consumption measured at each frequency level is used to build a lookup table relating processor frequency level to processor core power consumption; the table contains $M$ entries, each consisting of a processor frequency $f_i$ and the corresponding single-core power consumption $P_{cpu}(f_i)$. In this embodiment, the program is run under the default resource allocation policy; Intel VTune™ Amplifier is used to collect performance analysis data, and Intel RAPL is used to measure power consumption data. The performance analysis data include the actual memory access bandwidth of the program at each time $t$ and the $bound$ value reflecting the degree of memory-access limitation (the ratio of the actual memory access bandwidth $B_N$ to the physical memory bandwidth $B$ of a single node). The power consumption analysis data include: the processor power consumption $P_{cpu}^{idle}$ of a single processor core in the idle state, the memory power consumption $P_{mem}$, and the power consumption $P_{other}$ of a single computing node other than the processor and memory.
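The frequency-level table described above can be sketched in Python as follows. The helper names are hypothetical, and the `placeholder_measure` function is a synthetic stand-in used only so the example runs; in practice each table entry would come from an actual power measurement at that frequency level.

```python
from typing import Callable, Dict, List


def frequency_levels(f_min_ghz: float, f_max_ghz: float) -> List[float]:
    """Enumerate frequency levels from f_min to f_max in 0.1 GHz steps."""
    # Work in integer tenths of a GHz to avoid floating-point drift.
    lo, hi = round(f_min_ghz * 10), round(f_max_ghz * 10)
    return [k / 10 for k in range(lo, hi + 1)]


def build_power_table(levels: List[float],
                      measure: Callable[[float], float]) -> Dict[float, float]:
    """Map each frequency level f_i to a per-core power value P_cpu(f_i)."""
    return {f: measure(f) for f in levels}


def placeholder_measure(f_ghz: float) -> float:
    # Synthetic stand-in, NOT measured data: power grows with frequency.
    return 0.5 + 0.4 * f_ghz ** 3


if __name__ == "__main__":
    levels = frequency_levels(1.2, 2.4)   # hypothetical f_min and f_max
    table = build_power_table(levels, placeholder_measure)
    print(len(table), levels[0], levels[-1])
```

Here the number of levels $M$ is simply the length of the returned list.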
As shown in FIG. 2, before step 1), this embodiment further includes a step of calculating the optimal number of added nodes $\Delta N^*$, the detailed steps of which include: calculating a first added-node interval $[0, \Delta N_{pref}]$ using the total memory bandwidth; calculating a second added-node interval $[0, \Delta N_{power}]$ using the power consumption constraint; and finding the intersection of the first interval $[0, \Delta N_{pref}]$ and the second interval $[0, \Delta N_{power}]$, then selecting the maximum value in the intersection as the optimal number of added nodes $\Delta N^*$.
In this embodiment, the detailed steps of calculating the first added-node interval $[0, \Delta N_{pref}]$ using the total memory bandwidth comprise:
S1) acquiring the actual memory access bandwidth $b_1(t), b_2(t), ..., b_N(t)$ of each computing node at each recorded time $t$, calculating the average actual memory access bandwidth $B(t)$ of a single node during the run of the parallel program, and taking the maximum value of $B(t)$ as the actual memory access bandwidth $B_N$ of the parallel program;
S2) calculating the ratio $bound$ of the actual memory access bandwidth $B_N$ to the physical memory bandwidth $B$ of a single node, and judging whether the parallel program is memory-access-limited according to whether $bound$ exceeds a threshold $\alpha$; if it is not limited, proceeding to step S3), and if it is limited, proceeding to step S4);
S3) determining that no node needs to be added and setting $\Delta N_{pref}$ to 0, so that the first added-node interval $[0, \Delta N_{pref}]$ is $[0, 0]$; ending and returning;
S4) according to the principle that the total memory bandwidth is unchanged, $N \cdot ((bound/\alpha) \cdot B_N) = (N + \Delta N_{pref}) \cdot \alpha \cdot B$, solving for the number of nodes $\Delta N_{pref}$ that need to be added, to obtain the first added-node interval $[0, \Delta N_{pref}]$; ending and returning.
The average actual memory access bandwidth $B(t)$ of a single node is calculated as follows:
$$B(t) = \frac{1}{N} \sum_{i=1}^{N} b_i(t)$$
In the above formula, $N$ is the minimum number of computing nodes required to run the parallel program (each processor core runs one process under the default resource allocation), and $b_i(t)$ is the actual memory access bandwidth on the $i$-th computing node.
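The averaging in step S1) can be sketched in Python as follows. The input layout (one row of per-node samples for each recorded time $t$) and the function name are assumptions made for illustration; the sample values are hypothetical.

```python
from typing import Sequence


def actual_bandwidth(samples: Sequence[Sequence[float]]) -> float:
    """Compute B_N from per-node bandwidth samples.

    samples[t][i] holds b_i(t), the bandwidth of node i at recorded time t.
    B(t) is the mean over the N nodes, and B_N is the maximum of B(t) over time.
    """
    if not samples or any(len(row) == 0 for row in samples):
        raise ValueError("need at least one time step and one node per step")
    averages = [sum(row) / len(row) for row in samples]  # B(t) for each t
    return max(averages)                                 # B_N


if __name__ == "__main__":
    # Hypothetical samples (GB/s) for N = 2 nodes at three recorded times.
    print(actual_bandwidth([[60.0, 80.0], [90.0, 70.0], [50.0, 50.0]]))
```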
Because of the physical memory bandwidth limit, the actual memory access bandwidth $B_N$ is bounded during program execution. For a program that is not memory-access-limited, the total actual memory access bandwidth of all nodes can be regarded as constant when different numbers of computing nodes are used. When the program is memory-access-limited, adding computing nodes reduces the amount of computation the parallel program performs on each computing node and the number of memory accesses on each node, which relieves the memory-access limitation on a single node; the total actual memory access bandwidth across the nodes then increases with the number of nodes. For the term $(bound/\alpha) \cdot B_N$, the higher the degree of memory-access limitation, the more its value exceeds $B_N$, where the $bound$ value reflects the degree of memory-access limitation.
According to the principle that the total memory bandwidth is unchanged, the following expression holds:
$$N \cdot \left( \frac{bound}{\alpha} \cdot B_N \right) = (N + \Delta N_{pref}) \cdot \alpha \cdot B$$
In the above formula,
$$bound = \frac{B_N}{B}$$
Therefore, according to the principle that the total memory bandwidth is unchanged, solving $N \cdot ((bound/\alpha) \cdot B_N) = (N + \Delta N_{pref}) \cdot \alpha \cdot B$ for the number of nodes $\Delta N_{pref}$ that need to be added gives the first added-node interval $[0, \Delta N_{pref}]$.
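As a worked illustration, the total-bandwidth equation can be solved for $\Delta N_{pref}$ in closed form. The derivation below uses only the equation stated in the text together with the definition of $bound$ as the ratio of $B_N$ to $B$; the numerical values in the last line are hypothetical.

```latex
\begin{align*}
N \cdot \frac{bound}{\alpha} \cdot B_N &= (N + \Delta N_{pref}) \cdot \alpha \cdot B \\
\Delta N_{pref} &= N \cdot \frac{bound}{\alpha^{2}} \cdot \frac{B_N}{B} - N \\
                &= N \left( \frac{bound^{2}}{\alpha^{2}} - 1 \right)
                   \qquad \text{since } bound = B_N / B \\
\text{e.g. } N = 4,\ bound = 0.9,\ \alpha = 0.75:\quad
\Delta N_{pref} &= 4 \left( \frac{0.81}{0.5625} - 1 \right) = 4 \cdot 0.44 = 1.76
\end{align*}
```

In this form $\Delta N_{pref}$ is zero exactly when $bound = \alpha$ and grows as $bound$ rises above the threshold. How a fractional result such as 1.76 is rounded to a whole number of nodes is not stated in the text.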
In this embodiment, calculating the second added-node interval $[0, \Delta N_{power}]$ using the power consumption constraint specifically means solving for the maximum number of nodes $\Delta N$ that satisfies the following power consumption constraint function, formula (1), and taking that number as the number of nodes to add, $\Delta N_{power}$, which gives the second added-node interval $[0, \Delta N_{power}]$:
[Equation image BDA0002450283890000054 (formula (1), power consumption constraint function) is not reproduced in this text.]
In the above formula, $n$ is the number of processes; $P_{cpu}(f_{max})$ is the power consumption of a single processor core at the maximum frequency $f_{max}$, i.e. its maximum power consumption; $P_{cpu}(f_{mid})$ is the power consumption of a single processor core at the intermediate frequency $f_{mid}$; $c$ is the number of processor cores of a single computing node; $P_{cpu}^{idle}$ is the processor power consumption of a single processor core in the idle state; $P_{mem}$ is the memory power consumption; and $P_{other}$ is the power consumption of a single computing node other than the processor and memory. Adding $\Delta N$ computing nodes increases the total power consumption accordingly, including the memory power consumption of the $\Delta N$ computing nodes and the power consumption of the idle processor cores created by the added nodes. To ensure that the total power consumption across the nodes does not increase, the frequency of all processor cores must be lowered from the maximum frequency $f_{max}$ to the intermediate frequency $f_{mid}$, where $f_{mid}$ is the average of the maximum frequency $f_{max}$ and the minimum frequency $f_{min}$.
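The search for the largest $\Delta N$ satisfying a power consumption constraint can be sketched in Python as follows. The exact constraint formula is not available in this text, so `assumed_constraint` is a hypothetical stand-in built from the verbal description only: the power saved by lowering all $n$ busy cores from $f_{max}$ to $f_{mid}$ must cover the power added by $\Delta N$ extra nodes (idle cores, memory and other components). It should not be read as the patent's formula (1).

```python
from typing import Callable


def max_added_nodes(constraint_holds: Callable[[int], bool], upper: int) -> int:
    """Largest dN in [0, upper] for which the constraint holds (0 if none)."""
    best = 0
    for dn in range(upper + 1):
        if constraint_holds(dn):
            best = dn
    return best


def assumed_constraint(n: int, c: int, p_fmax: float, p_fmid: float,
                       p_idle: float, p_mem: float, p_other: float
                       ) -> Callable[[int], bool]:
    """HYPOTHETICAL power model, not the patent's formula (1)."""
    def holds(dn: int) -> bool:
        saved = n * (p_fmax - p_fmid)                 # busy cores slowed down
        added = dn * (c * p_idle + p_mem + p_other)   # cost of dn extra nodes
        return added <= saved
    return holds


if __name__ == "__main__":
    # Hypothetical numbers (watts): 48 processes, 24 cores per node.
    check = assumed_constraint(n=48, c=24, p_fmax=5.0, p_fmid=3.0,
                               p_idle=0.5, p_mem=10.0, p_other=20.0)
    print(max_added_nodes(check, upper=10))
```

Passing the constraint as a callable keeps the search independent of the power model, so the stand-in can be swapped for the real constraint function once it is known.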
In this embodiment, step 1) further includes a step of calculating the processor frequency $f^*$, the detailed steps of which include: substituting the optimal number of added nodes $\Delta N^*$ into formula (1), that is, setting $\Delta N = \Delta N^*$ and $P_{cpu}(f_{mid}) = P_{cpu}(f_i)$, to obtain a power consumption value $P_{cpu}(f_i)$; then, according to the relationship between processor frequency levels and processor core power consumption values, taking the processor frequency $f_i$ that satisfies the condition as the processor frequency $f^*$ determined in step 1).
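The final lookup can be sketched in Python as follows. The sketch assumes that a per-core power value has already been obtained from the constraint, and it reads "the processor frequency that satisfies the condition" as the highest frequency level whose measured per-core power does not exceed that value. Both this selection rule and the function name are assumptions; the table values are hypothetical.

```python
from typing import Dict


def select_frequency(power_table: Dict[float, float], per_core_budget_w: float) -> float:
    """Pick the highest frequency level whose per-core power fits the budget.

    power_table maps a frequency level f_i (GHz) to P_cpu(f_i) (watts).
    """
    feasible = [f for f, p in power_table.items() if p <= per_core_budget_w]
    if not feasible:
        raise ValueError("no frequency level fits the per-core power budget")
    return max(feasible)


if __name__ == "__main__":
    # Hypothetical three-level table and a budget of 3.4 W per core.
    table = {1.2: 2.0, 1.8: 3.0, 2.4: 5.0}
    print(select_frequency(table, 3.4))
```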
In this embodiment, before step 1), the method further includes calculating the power consumption limit value $P_{target}$ according to the following function expression:
[Equation image BDA0002450283890000061 (expression for $P_{target}$) is not reproduced in this text.]
In the above formula, $P_{cpu}(f_{max})$ is the power consumption of a single processor core running at the maximum frequency $f_{max}$.
In summary, the resource allocation method of this embodiment targets memory-access-limited programs running on a cluster system. While satisfying the power consumption constraint, it aims to reduce program execution time, keep total power consumption unchanged and reduce energy consumption, thereby improving the energy efficiency of the system.
In addition, this embodiment further provides a resource allocation system for improving energy efficiency of a computing subsystem, including:
a parameter initialization program unit for determining the optimal number of added nodes $\Delta N^*$, the processor frequency $f^*$ and the power consumption limit value $P_{target}$;
a resource allocation program unit for setting the power consumption limit value of the dynamic processor frequency adjustment tool to $P_{target}$ and scheduling the parallel program to run on $N + \Delta N^*$ computing nodes, wherein the initial value of the processor frequency of each computing node is $f^*$, $N$ is the minimum number of computing nodes required to run the parallel program (each processor core runs one process under the default resource allocation), and $\Delta N^*$ is the optimal number of added nodes.
In addition, this embodiment also provides a resource allocation system for improving the energy efficiency of a computing subsystem, which includes a computer device that is either programmed or configured to perform the steps of the foregoing resource allocation method, or whose memory stores a computer program programmed or configured to perform the foregoing resource allocation method.
Furthermore, the present embodiment also provides a computer-readable storage medium, on which a computer program is stored, the computer program being programmed or configured to execute the foregoing resource allocation method for improving energy efficiency of the computing subsystem.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions falling under the idea of the present invention belong to its protection scope. It should be noted that those skilled in the art may make improvements and modifications without departing from the principle of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (9)

1. A resource allocation method for improving energy efficiency of a computing subsystem is characterized by comprising the following implementation steps:
1) determining an optimal number of added nodes $\Delta N^*$, a processor frequency $f^*$, and a power consumption limit value $P_{target}$;
2) setting the power consumption limit value to $P_{target}$ using a dynamic processor frequency adjustment tool, and scheduling the parallel program to run on $N + \Delta N^*$ computing nodes, wherein the initial value of the processor frequency of each computing node is $f^*$, and $N$ is the minimum number of computing nodes required to run the parallel program (each processor core runs one process under the default resource allocation).
2. The resource allocation method for improving energy efficiency of a computing subsystem according to claim 1, wherein step 1) is preceded by a step of calculating the optimal number of added nodes $\Delta N^*$, the detailed steps of which include: calculating a first added-node interval $[0, \Delta N_{pref}]$ using the total memory bandwidth; calculating a second added-node interval $[0, \Delta N_{power}]$ using the power consumption constraint; and finding the intersection of the first interval $[0, \Delta N_{pref}]$ and the second interval $[0, \Delta N_{power}]$, then selecting the maximum value in the intersection as the optimal number of added nodes $\Delta N^*$.
3. The resource allocation method for improving energy efficiency of a computing subsystem according to claim 2, wherein the detailed steps of calculating the first added-node interval $[0, \Delta N_{pref}]$ using the total memory bandwidth comprise:
S1) acquiring the actual memory access bandwidth $b_1(t), b_2(t), ..., b_N(t)$ of each computing node at each recorded time $t$, calculating the average actual memory access bandwidth $B(t)$ of a single node during the run of the parallel program, and taking the maximum value of $B(t)$ as the actual memory access bandwidth $B_N$ of the parallel program, wherein $b_i(t)$ is the actual memory access bandwidth on the $i$-th computing node;
S2) calculating the ratio $bound$ of the actual memory access bandwidth $B_N$ to the physical memory bandwidth $B$ of a single node, and judging whether the parallel program is memory-access-limited according to whether $bound$ reaches a threshold $\alpha$; if it is not limited, proceeding to step S3), and if it is limited, proceeding to step S4);
S3) determining that no node needs to be added and setting $\Delta N_{pref}$ to 0, so that the first added-node interval $[0, \Delta N_{pref}]$ is $[0, 0]$; ending and returning;
S4) according to the principle that the total memory bandwidth is unchanged, $N \cdot ((bound/\alpha) \cdot B_N) = (N + \Delta N_{pref}) \cdot \alpha \cdot B$, solving for the number of nodes $\Delta N_{pref}$ that need to be added, to obtain the first added-node interval $[0, \Delta N_{pref}]$; ending and returning.
4. The resource allocation method for improving energy efficiency of a computing subsystem according to claim 2, wherein calculating the second added-node interval $[0, \Delta N_{power}]$ using the power consumption constraint specifically means solving for the maximum number of nodes $\Delta N$ that satisfies the following power consumption constraint function, and taking that number as the number of nodes to add, $\Delta N_{power}$, which gives the second added-node interval $[0, \Delta N_{power}]$:
[Equation image FDA0002450283880000011 (power consumption constraint function) is not reproduced in this text.]
In the above formula, $n$ is the number of processes of the parallel program (each processor core runs one process under the default resource allocation); $P_{cpu}(f_{max})$ is the power consumption of a single processor core at the maximum frequency $f_{max}$, i.e. its maximum power consumption; $P_{cpu}(f_{mid})$ is the power consumption of a single processor core running at $f_{mid}$; $c$ is the number of processor cores on each computing node; $P_{cpu}^{idle}$ is the processor power consumption of a single processor core in the idle state; $P_{mem}$ is the memory power consumption; and $P_{other}$ is the power consumption of a single computing node other than the processor and memory. Adding $\Delta N$ computing nodes increases the total power consumption accordingly, including the memory power consumption of the $\Delta N$ computing nodes and the power consumption of the idle processor cores created by the added nodes. To ensure that the total power consumption across the nodes does not increase, the frequency of all processor cores must be lowered from the maximum frequency $f_{max}$ to the intermediate frequency $f_{mid}$, where $f_{mid}$ is the average of the maximum frequency $f_{max}$ and the minimum frequency $f_{min}$.
5. The resource allocation method for improving energy efficiency of the computing subsystem according to claim 1, wherein step 1) is preceded by a step of calculating the processor frequency f, the detailed steps comprising: substituting the optimal number of added nodes ΔN into the power consumption constraint function expressed by the following formula, so that the ΔN in the function equals the optimal number of added nodes, and setting Pcpu(fmid)=Pcpu(fi), to obtain a power consumption value Pcpu(fi); then, according to the relation between the different processor frequency levels and the power consumption values of a processor core, taking the processor frequency fi that satisfies the condition as the processor frequency f determined in step 1);
Figure FDA0002450283880000021
In the above formula, n is the number of processes of the parallel program, with each processor core running one process under the default resource allocation; Pcpu(fmax) is the processor power consumption of a single processor core running at the maximum frequency fmax; Pcpu(fi) is the processor power consumption of a single processor core running at fi; c is the number of processor cores on each computing node;
Figure FDA0002450283880000022
is the processor power consumption when a single processor core is idle; Pmem is the memory power consumption; and Pother is the other power consumption of a single computing node apart from the processor and the memory.
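The frequency selection in claim 5 can be sketched in Python as follows. Because the constraint formula is only available as an image, the per-core power bound below is an assumption derived from the verbal description (extra idle-core, memory and other power of the ΔN added nodes must be offset by lowering busy-core power). The frequency-to-power table, the function names, and the rule of picking the highest admissible frequency level are all hypothetical.

```python
def allowed_core_power(n, c, delta_n, p_cpu_fmax, p_idle, p_mem, p_other):
    """Assumed upper bound on P_cpu(f_i) once delta_n nodes are added."""
    extra = delta_n * (c * p_idle + p_mem + p_other)  # power added by new nodes
    return p_cpu_fmax - extra / n                     # spread over n busy cores


def pick_frequency(power_by_freq, p_allowed):
    """Highest frequency level whose per-core power fits the bound.

    power_by_freq maps a frequency level (GHz) to busy-core power (W).
    Returns None when no level satisfies the bound.
    """
    candidates = [f for f, p in power_by_freq.items() if p <= p_allowed]
    return max(candidates) if candidates else None


# Hypothetical measurements of one core at each frequency level.
table = {1.2: 4.0, 1.8: 6.0, 2.2: 8.0, 2.6: 10.0}
bound = allowed_core_power(64, 16, 4, 10.0, 1.0, 8.0, 8.0)
f = pick_frequency(table, bound)
print("per-core power bound:", bound, "selected frequency:", f)
```

Choosing the highest admissible level keeps as much performance as the assumed power bound permits; the claim itself only requires a frequency level that satisfies the condition.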
6. The resource allocation method for improving energy efficiency of the computing subsystem according to claim 5, wherein step 1) is preceded by calculating the power consumption limit value Ptarget, with the following function expression:
Figure FDA0002450283880000023
in the above formula, Pcpu(fmax) Operating at a maximum frequency f for a single processor coremaxThe corresponding processor power consumption.
7. A resource allocation system for improving energy efficiency of a computing subsystem, comprising:
a parameter initialization program unit for determining the optimal number of added nodes ΔN, the processor frequency f and the power consumption limit value Ptarget;
a resource allocation program unit for setting the power consumption limit value to Ptarget using the dynamic processor frequency adjustment tool, and scheduling the parallel program to run on N+ΔN computing nodes, wherein the initial value of the processor frequency of each computing node is the processor frequency f, N is the minimum number of computing nodes required to run the parallel program, and each processor core runs one process under the default resource allocation.
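The two program units of claim 7 can be pictured with a small Python sketch that only assembles an allocation plan; it does not call any real frequency-adjustment tool or job scheduler. The `AllocationPlan` type, the `build_plan` function, and the even spreading of processes over the N+ΔN nodes are hypothetical choices made for illustration and are not stated in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationPlan:
    nodes: int                    # N + delta_n computing nodes
    initial_frequency_ghz: float  # processor frequency f
    power_cap_watts: float        # power consumption limit P_target
    processes_per_node: tuple     # assumed even spread of the n processes


def build_plan(n_processes, cores_per_node, delta_n, frequency_ghz,
               p_target_watts):
    """Assemble the parameters a resource allocation unit would apply."""
    # N: minimum node count with one process per core (ceiling division).
    base_nodes = -(-n_processes // cores_per_node)
    nodes = base_nodes + delta_n
    # Assumption: spread processes as evenly as possible over all nodes.
    share, extra = divmod(n_processes, nodes)
    per_node = tuple(share + 1 if i < extra else share for i in range(nodes))
    return AllocationPlan(nodes, frequency_ghz, p_target_watts, per_node)


# Hypothetical inputs: 64 processes, 16 cores per node, 4 added nodes.
plan = build_plan(64, 16, 4, 2.2, 700.0)
print(plan)
```

In practice the plan's fields would be handed to whatever power-capping tool and scheduler the installation provides.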
8. A resource allocation system for improving energy efficiency of a computing subsystem, comprising a computer device, wherein the computer device is programmed or configured to perform the steps of the resource allocation method for improving energy efficiency of a computing subsystem according to any one of claims 1 to 6, or wherein a memory of the computer device has stored thereon a computer program programmed or configured to perform the resource allocation method for improving energy efficiency of a computing subsystem according to any one of claims 1 to 6.
9. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the resource allocation method for improving energy efficiency of the computing subsystem according to any one of claims 1 to 6.
CN202010290699.5A 2020-04-14 2020-04-14 Resource allocation method, system and medium for improving energy efficiency of computing subsystem Active CN111444025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010290699.5A CN111444025B (en) 2020-04-14 2020-04-14 Resource allocation method, system and medium for improving energy efficiency of computing subsystem

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010290699.5A CN111444025B (en) 2020-04-14 2020-04-14 Resource allocation method, system and medium for improving energy efficiency of computing subsystem

Publications (2)

Publication Number Publication Date
CN111444025A CN111444025A (en) 2020-07-24
CN111444025B CN111444025B (en) 2022-11-25

Family

ID=71651687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010290699.5A Active CN111444025B (en) 2020-04-14 2020-04-14 Resource allocation method, system and medium for improving energy efficiency of computing subsystem

Country Status (1)

Country Link
CN (1) CN111444025B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104375899A (en) * 2014-11-21 2015-02-25 北京应用物理与计算数学研究所 Thread for high-performance computer NUMA perception and memory resource optimizing method and system
US20160196121A1 (en) * 2015-01-02 2016-07-07 Reservoir Labs, Inc. Systems and methods for energy proportional scheduling
US20170031416A1 (en) * 2015-07-31 2017-02-02 International Business Machines Corporation Controlling power consumption
CN109298918A (en) * 2018-07-10 2019-02-01 东南大学 A kind of parallel task energy-saving scheduling method based on linear programming

Also Published As

Publication number Publication date
CN111444025B (en) 2022-11-25

Similar Documents

Publication Publication Date Title
Sarood et al. Maximizing throughput of overprovisioned hpc data centers under a strict power budget
JP5564564B2 (en) Method and apparatus for non-uniformly changing the performance of a computing unit according to performance sensitivity
US7818594B2 (en) Power efficient resource allocation in data centers
CN107515663B (en) Method and device for adjusting running frequency of central processing unit kernel
US20090248922A1 (en) Memory buffer allocation device and computer readable medium having stored thereon memory buffer allocation program
CN105574153A (en) Transcript placement method based on file heat analysis and K-means
Heirman et al. Undersubscribed threading on clustered cache architectures
Zhang et al. Toward qos-awareness and improved utilization of spatial multitasking gpus
US20200137581A1 (en) Resource utilization of heterogeneous compute units in electronic design automation
US20130080809A1 (en) Server system and power managing method thereof
TW202145010A (en) Methods of storing data, electronic devices and storage media
Hanson et al. What computer architects need to know about memory throttling
US20120290789A1 (en) Preferentially accelerating applications in a multi-tenant storage system via utility driven data caching
Padoin et al. Saving energy by exploiting residual imbalances on iterative applications
EP3295276B1 (en) Reducing power by vacating subsets of cpus and memory
CN115269118A (en) Scheduling method, device and equipment of virtual machine
US9563532B1 (en) Allocation of tasks in large scale computing systems
CN112114650B (en) Power consumption regulation and control method, device, equipment and readable storage medium
CN111444025B (en) Resource allocation method, system and medium for improving energy efficiency of computing subsystem
CN110308991B (en) Data center energy-saving optimization method and system based on random tasks
CN114356588B (en) Data preloading method and device
Jahre et al. A high performance adaptive miss handling architecture for chip multiprocessors
Fu et al. Optimizing data locality by executor allocation in spark computing environment
US9389919B2 (en) Managing workload distribution among computer systems based on intersection of throughput and latency models
Watanabe et al. Power reduction of chip multi-processors using shared resource control cooperating with DVFS

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant