CN103914121B - Multicomputer system and method and device for optimizing power consumption of same - Google Patents

Multicomputer system and method and device for optimizing power consumption of same


Publication number
CN103914121B
Authority
CN
China
Prior art keywords
computer system
testing site
power consumption
data processing
processing equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310001368.5A
Other languages
Chinese (zh)
Other versions
CN103914121A (en
Inventor
张帅
宋风龙
王达
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Institute of Computing Technology of CAS
Original Assignee
Huawei Technologies Co Ltd
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd and Institute of Computing Technology of CAS
Priority to CN201310001368.5A
Publication of CN103914121A
Application granted
Publication of CN103914121B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Power Sources (AREA)

Abstract

The invention provides a multi-computer system and a method and device for optimizing its power consumption. Within the determined quantity range of data processing devices used to adjust the power consumption of the multi-computer system, a first test point and a second test point are determined for each search. After each search, the interval on the far side of the test point with the larger power value is discarded, and that test point serves as a boundary of the core-count range for the next search. The core-count search range is thus narrowed effectively, and the efficiency of power optimization for the multi-computer system is improved.

Description

Multi-computer system, and method and device for optimizing the power consumption of a multi-computer system
Technical field
The present invention relates to computer power-saving technology, and in particular to a multi-computer system and to a method and device for optimizing the power consumption of a multi-computer system.
Background art
Processor power management has been an important topic in processor design in recent years. With advances in deep-submicron process technology, leakage power has become a significant part of processor power consumption. As a result, a series of techniques for reducing processor leakage power (static power) has emerged.
The first widely used approach to reducing processor leakage power is dynamic power management (DPM). DPM first shuts down idle processors or processor cores to cut unnecessary power overhead, and then reduces power further by migrating tasks and shutting down lightly loaded processors or processor cores.
Second, with the widespread adoption of dynamic voltage and frequency scaling (DVFS), DVFS is combined with DPM: while lightly loaded processors or processor cores are shut down, the voltage and frequency of the remaining working cores are raised, so that power is saved without loss of performance.
However, when performance must be maintained, it is not true that a higher frequency with fewer cores always yields lower power consumption.
On the one hand, raising the frequency causes a superlinear increase in power consumption. When the increase in dynamic power caused by the higher frequency exceeds the reduction in static power obtained by shutting down cores, the total processor power increases. On the other hand, for a highly parallel program, performance can be maintained by increasing the number of executing cores and lowering the frequency. But when the increase in static power caused by the additional cores exceeds the reduction in dynamic power obtained from the lower frequency, the total processor power also increases. Therefore, with performance held constant, when the number of executing cores is adjusted together with DVFS, power consumption first decreases and then increases as the core count grows.
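The decrease-then-increase trend can be illustrated with a toy model. The sketch below is not taken from the patent. It assumes a perfectly parallel workload whose performance is proportional to cores × frequency, a dynamic power per core proportional to the cube of the frequency, and a constant static power per core. All function names and constants are hypothetical.

```python
def total_power(cores, work=8.0, static_per_core=1.0, dyn_coeff=1.0):
    """Toy total power at constant performance (every parameter is an assumption)."""
    # Constant performance: cores * frequency == work, so frequency falls as cores rise.
    frequency = work / cores
    static = cores * static_per_core              # grows linearly with the core count
    dynamic = cores * dyn_coeff * frequency ** 3  # shrinks as the frequency is lowered
    return static + dynamic


def best_core_count(max_cores):
    """Exhaustively pick the core count with the lowest modelled power."""
    return min(range(1, max_cores + 1), key=total_power)


if __name__ == "__main__":
    for n in (1, 4, 8, 10, 16, 32):
        print(n, round(total_power(n), 2))
```

Under these assumptions the modelled power falls steeply at first, bottoms out at an intermediate core count, and then rises again as static power dominates.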
Given this behavior, when large-scale multithreaded programs run on multi-core or even many-core processors, finding the optimal number of executing cores and frequency for each program under a given performance constraint becomes the ultimate goal of power-optimization management.
At present, Turbo Boost is a power management method used in mainstream Intel processors. The technique adjusts processor core frequencies in the underlying hardware: a designated single core can be run at an increased frequency while the remaining idle cores enter deep sleep, so as to balance power and performance.
However, Turbo Boost is mainly used on processors with 8 or fewer cores, and the mainstream processors it targets have few cores. When the core count exceeds that scale, using Turbo Boost to shut down lightly loaded cores and raise the frequency of heavily loaded cores is very likely to run into the situation where power first decreases and then increases as the core count grows. Moreover, the technique targets applications with limited parallelism, where the thread count is usually smaller than the core count. Shutting down idle cores then reduces static power, but when the number of program threads exceeds the number of cores, shutting down some cores may increase the load on the others, so that the target performance cannot be guaranteed or power consumption increases.
Another power management method searches the power-versus-core-count space with a hill-climbing method for the core count that gives the lowest power. The method takes some core count a as a test point and measures the power, then executes on a+1 cores. If the power is greater than that measured on a cores, the next test point is a-1 cores; if it is less, the next test point is a+2 cores. Execution proceeds in this way on either side of a, the corresponding power is measured each time, and the core count with the lowest power is found.
The drawback of this method is that the test point approaches the lowest-power core count slowly, because each test point differs from the previous one by only one core. As the processor scale grows, the number of tests required by the hill-climbing traversal increases greatly, and the optimal solution is obtained more slowly. The method therefore scales poorly and cannot approach the power optimum quickly.
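To make the cost of unit steps concrete, here is a minimal sketch of a simplified hill-climbing search. It is an illustration rather than the exact prior-art procedure. `measure_power` is a hypothetical callback that returns the measured power for a given core count, and the quadratic curve in the demo is invented.

```python
def hill_climb(measure_power, start, lo, hi):
    """Unit-step hill climbing; returns (best core count, number of measurements)."""
    current = start
    current_power = measure_power(current)
    measurements = 1
    for step in (+1, -1):              # try growing the core count first, then shrinking
        moved = False
        while lo <= current + step <= hi:
            candidate_power = measure_power(current + step)
            measurements += 1
            if candidate_power >= current_power:
                break                  # power stopped falling in this direction
            current += step
            current_power = candidate_power
            moved = True
        if moved:
            break                      # a downhill direction was found and followed
    return current, measurements


if __name__ == "__main__":
    curve = lambda n: (n - 20) ** 2 + 5    # invented power curve with its minimum at 20
    print(hill_climb(curve, 1, 1, 64))
```

Starting far from the minimum, the number of measurements grows roughly linearly with the distance to it, which is the scaling problem described above.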
Summary of the invention
Embodiments of the present invention provide a multi-computer system and a method and device for optimizing the power consumption of a multi-computer system, so as to improve the efficiency of power optimization for the multi-computer system.
In a first aspect, an embodiment of the present invention provides a method for optimizing the power consumption of a multi-computer system, including:
determining a quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound;
searching within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound;
shutting down all non-executing data processing devices according to the first test point and the second test point, and gradually lowering the frequency of the remaining data processing devices so as to meet the target performance.
In a second aspect, an embodiment of the present invention provides a device for optimizing the power consumption of a multi-computer system, including:
a range determining unit, configured to determine a quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound;
a test point determining unit, configured to search within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound;
a performance adjusting unit, configured to shut down all non-executing data processing devices according to the first test point and the second test point, and to gradually lower the frequency of the remaining data processing devices so as to meet the target performance.
In a third aspect, an embodiment of the present invention provides a multi-computer system, including a multi-computer system body and the above device for optimizing the power consumption of a multi-computer system, where the device is used to optimize the power consumption of the multi-computer system body.
In the multi-computer system and the method and device for optimizing the power consumption of a multi-computer system provided by the embodiments of the present invention, two test points, a first test point and a second test point, are determined for each search within the determined quantity range of data processing devices used to adjust the power consumption of the multi-computer system. After each search, the interval on the far side of the test point with the larger power value can be discarded, and that test point is used as a boundary of the core-count range for the next search. This effectively narrows the core-count search range and improves the efficiency of power optimization for the multi-computer system.
Brief description of the drawings
Fig. 1 is a flowchart of a method for optimizing the power consumption of a multi-computer system according to an embodiment of the present invention;
Fig. 2 is a flowchart of another method for optimizing the power consumption of a multi-computer system according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a device for optimizing the power consumption of a multi-computer system according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a multi-computer system according to an embodiment of the present invention.
Detailed description
Fig. 1 is a flowchart of a method for optimizing the power consumption of a multi-computer system according to an embodiment of the present invention. The method provided by this embodiment may be implemented by adding a device to the multi-computer system, for example by providing a functional module in the multi-computer system, or by providing a separate device. As shown in Fig. 1, the method includes:
Step 11: determine a quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound.
The multi-computer system may be a multi-core processor, a system with multiple processors, or a system with multiple devices whose frequencies can be adjusted independently, and so on.
Step 12: search within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound.
Step 13: shut down all non-executing data processing devices according to the first test point and the second test point, and gradually lower the frequency of the remaining data processing devices so as to meet the target performance.
In this embodiment, two test points, a first test point and a second test point, are determined for each search within the determined quantity range of data processing devices used to adjust the power consumption of the multi-computer system. After each search, the interval on the far side of the test point with the larger power value can be discarded, and that test point is used as a boundary of the core-count range for the next search. This effectively narrows the core-count search range and improves the efficiency of power optimization for the multi-computer system.
In step 11, determining the quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system may include: determining an initial quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system.
Specifically, determining the initial quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system may include:
if the number of threads running in the multi-computer system is greater than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of all data processing devices in the multi-computer system;
if the total number of threads, or the number of threads running, in the multi-computer system is less than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of threads running in the multi-computer system.
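The two rules for the initial range can be sketched as a small helper. The function and parameter names are hypothetical. The text does not spell out the case where the thread count equals the device count, but both rules give the same upper bound there.

```python
def initial_range(thread_count, device_count):
    """Return (lower bound Y, upper bound X) of the initial quantity range."""
    lower = 0
    # Upper bound: the device count when threads outnumber devices,
    # otherwise the thread count. min() covers both rules and the equal case.
    upper = min(thread_count, device_count)
    return lower, upper


if __name__ == "__main__":
    print(initial_range(100, 64))  # more threads than devices
    print(initial_range(16, 64))   # fewer threads than devices
```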
In step 12, searching within the quantity range to determine the first test point and the second test point may include:
calculating a = (X + Y) × m and b = (X + Y) × (1 - m), where a is the first test point, b is the second test point, 0 < m < 1, X is the upper bound, Y is the lower bound, and a, b, X and Y are variables;
executing all threads in the multi-computer system on a data processing devices, and measuring the power consumption A of the multi-computer system;
executing all threads in the multi-computer system on b data processing devices, and measuring the power consumption B of the multi-computer system;
comparing A and B;
if |A - B| < w, performing the shutting down of all non-executing data processing devices according to the first test point and the second test point, where w is a first predetermined value;
if |A - B| >= w, determining whether |a - b| < e;
if |a - b| < e, performing the shutting down of all non-executing data processing devices according to the first test point and the second test point, where e is a second predetermined value; if |a - b| >= e, determining whether I >= d;
if I >= d, performing the shutting down of all non-executing data processing devices according to the first test point and the second test point, where I is the loop count with an initial value of 0, and d is a third predetermined value;
if I < d, A > B and a < b, setting I = I + 1, Y = a, a = b, b = X + Y - a and A = B, then executing all threads in the multi-computer system on b data processing devices and measuring the power consumption B of the multi-computer system;
if I < d, A > B and a > b, setting I = I + 1, X = a, a = b, b = X + Y - a and A = B, then executing all threads in the multi-computer system on b data processing devices and measuring the power consumption B of the multi-computer system;
if I < d, A < B and a < b, setting I = I + 1, X = b, a = a and b = X + Y - a, then executing all threads in the multi-computer system on b data processing devices and measuring the power consumption B of the multi-computer system;
if I < d, A < B and a > b, setting I = I + 1, Y = b, a = a and b = X + Y - a, then executing all threads in the multi-computer system on b data processing devices and measuring the power consumption B of the multi-computer system.
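The search rules above can be sketched as a single loop. This is a minimal sketch under stated assumptions, not a definitive implementation. `measure_power` is a hypothetical callback standing in for "execute all threads on that many devices and measure power". The defaults for m, w, e and d are arbitrary illustrative values. Test points are kept as real numbers, and any rounding to whole devices is left to the callback.

```python
def search_test_points(measure_power, upper, lower=0.0, m=0.382,
                       w=0.01, e=1.0, d=50):
    """Narrow [lower, upper] with two test points; return (a, b) when a stop rule fires."""
    X, Y = float(upper), float(lower)
    a = (X + Y) * m
    b = (X + Y) * (1 - m)
    A = measure_power(a)
    B = measure_power(b)
    I = 0                                   # loop count
    # Stop rules, checked in the order given in the text.
    while not (abs(A - B) < w or abs(a - b) < e or I >= d):
        I += 1
        if A > B:
            # a has the larger power: a becomes a bound, old b is reused as a.
            if a < b:
                Y = a
            else:
                X = a
            a, A = b, B                     # A = B, so no new measurement for a
        else:
            # b has the larger power: b becomes a bound, a and A are kept.
            if a < b:
                X = b
            else:
                Y = b
        b = X + Y - a                       # keeps a + b == X + Y
        B = measure_power(b)                # the only new measurement per iteration
    return a, b


if __name__ == "__main__":
    curve = lambda n: (n - 20) ** 2 + 5     # invented power curve with its minimum at 20
    a, b = search_test_points(curve, upper=64)
    print(a, b, (a + b) / 2)
```

Each pass through the loop performs exactly one new measurement, because the test point that was not turned into a bound carries its measured power over to the next pass.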
In step 13, shutting down all non-executing data processing devices according to the first test point and the second test point may include:
calculating Z - (a + b)/2, where Z is the total number of data processing devices in the multi-computer system;
shutting down Z - (a + b)/2 data processing devices in the multi-computer system.
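As a small worked illustration of the calculation above, the helper below computes how many devices to shut down. The function name is hypothetical. Rounding (a + b)/2 to a whole number of devices is an assumption of this sketch, since the text does not state how fractional values are handled.

```python
def devices_to_shut_down(total_devices, a, b):
    """Z - (a + b)/2, with the midpoint rounded to a whole device (an assumption)."""
    executing = round((a + b) / 2)   # devices kept as executing devices
    return total_devices - executing


if __name__ == "__main__":
    # Z = 64 devices, final test points a = 20.864 and b = 20.096
    print(devices_to_shut_down(64, 20.864, 20.096))
```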
The method for optimizing the power consumption of a multi-computer system is described in further detail below, taking a multi-core processor with N cores as an example.
Each time a test-point core count is executed, if that core count is greater than the core count of the previous test point, some of the waiting threads need to be migrated to the newly started, idle processor cores. If it is smaller than the core count of the previous test point, the threads running on the cores about to be shut down need to be migrated to the other active cores, keeping the load on the active cores as evenly distributed as possible.
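One simple way to approximate an even load is to spread threads round-robin over the active cores. The sketch below is an illustration only. It reassigns every thread from scratch instead of migrating only the affected ones, it treats all threads as equally heavy, and its names are hypothetical.

```python
def distribute_threads(thread_ids, active_cores):
    """Assign threads to active cores round-robin; returns {core: [thread ids]}."""
    assignment = {core: [] for core in active_cores}
    for index, thread_id in enumerate(thread_ids):
        core = active_cores[index % len(active_cores)]
        assignment[core].append(thread_id)
    return assignment


if __name__ == "__main__":
    # Ten threads spread over four active cores.
    print(distribute_threads(list(range(10)), [0, 1, 2, 3]))
```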
Referring to Fig. 2, the method for optimizing the power consumption of a multi-core processor includes the following steps:
Step 21: fully parallelize the program running on the multi-core processor, and determine the initial quantity range [Y, X] of cores used to adjust the power consumption of the multi-core processor, i.e. determine the initial values of X and Y. Also set I = 0, where I is the loop count.
Here X is the upper bound of the core count and Y is the lower bound of the core count.
For example, if the total number of threads of the running program is greater than the total number N of cores in the multi-core processor, the initial quantity range of cores is 0 to the total core count, i.e. [0, N]; that is, X = N and Y = 0.
If the total number of threads of the running program is less than the total number N of cores in the multi-core processor, the initial quantity range of cores is 0 to the total number of threads, i.e. [0, total number of threads]; that is, X = total number of threads and Y = 0.
If the total number of threads is less than the total number N of cores in the multi-core processor, a method similar to conventional Turbo Boost may be used to shut down the idle cores in the multi-core processor, and step 22 and the subsequent steps are then performed on the remaining cores to optimize the power consumption of the multi-core processor.
Step 22: calculate the first test point a = (X + Y) × m and the second test point b = (X + Y) × (1 - m), where m is chosen from the interval (0, 1).
Here (Y, X) is taken as the search space for the optimal number of executing cores, and the first test point a and the second test point b are selected within it as the core counts to test in this search.
Step 23: take a cores of the multi-core processor as executing cores, and migrate the threads on the other X - a cores to these a cores; the X - a cores become non-executing cores and are shut down. Then lower the frequency of the executing cores step by step until the multi-core processor just meets the target performance, and measure the power consumption A of the multi-core processor at that point. It is assumed that all cores must run at the same frequency, so when the frequency is lowered, the frequency of all cores is lowered together.
The target performance is the performance to be achieved after the power management mechanism is started, and it can be preset. For example, a fixed target performance may be set, a fixed lower limit on target performance may be set, an upper limit on the permitted performance loss may be set, or the performance measured directly when the program runs normally on all available cores of the multi-core processor may be used as the target performance.
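The step-by-step frequency reduction in step 23 can be sketched as follows. This is an illustration under assumptions: the list of available frequencies, the `performance` callback, and the demo's performance model (cores × frequency) are all hypothetical and not taken from the patent.

```python
def lowest_frequency_meeting_target(frequencies, cores, target, performance):
    """Step down through frequencies and return the lowest one that still meets target.

    Returns None if even the highest frequency misses the target.
    """
    chosen = None
    for freq in sorted(frequencies, reverse=True):
        if performance(cores, freq) >= target:
            chosen = freq          # target still met, try the next lower step
        else:
            break                  # one step too low, keep the previous frequency
    return chosen


if __name__ == "__main__":
    perf = lambda n, f: n * f      # invented performance model
    print(lowest_frequency_meeting_target([3.0, 2.5, 2.0, 1.5, 1.0], 4, 8.0, perf))
```

The power measured at the returned frequency would then play the role of A (or B in step 24) for that core count.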
Step 24: take b cores of the multi-core processor as executing cores, and migrate the threads on the other X - b cores to these b cores; the X - b cores become non-executing cores and are shut down. Then lower the frequency of the executing cores step by step until the multi-core processor just meets the target performance, and measure the power consumption B of the multi-core processor at that point.
Step 25: determine whether the difference between power A and power B is less than a predetermined value, e.g. whether |A - B| < w. If so, perform step 28; otherwise, perform step 26. Here w is the predetermined value for the difference between A and B.
Step 26: determine whether the difference between a and b is less than a predetermined value, e.g. whether |a - b| < ε. If so, perform step 28; otherwise, perform step 27. Here ε is the predetermined value for the difference.
Step 27: determine whether the search has reached a limit on the number of loops, e.g. whether I >= d. If so, perform step 28; otherwise, perform step 29. Here d is the predetermined loop-count limit.
Step 28: the core count required for the lowest power consumption of the multi-core processor is (a + b)/2, i.e. the optimal number of executing cores in the multi-core processor is determined to be (a + b)/2. Therefore, only (a + b)/2 cores are kept as executing cores, all remaining non-executing cores are shut down, and the frequency of the executing cores is lowered step by step until the target performance of the multi-core processor is just met, which completes the power optimization of the multi-core processor. The frequency at this point is the processor frequency required for the lowest power consumption. The flow ends.
Step 29: calculate I = I + 1.
Step 210: compare power A with power B and determine whether A > B. If so, perform step 211; otherwise, perform step 214.
Step 211: determine whether a < b. If so, perform step 212; otherwise, perform step 213.
Step 212: narrow the core-count search interval to (a, X), i.e. set the core-count lower bound Y to a (Y = a). Set the first test point a in the new search interval to the current b (a = b), and set b = X + Y - a. Step 23 then need not be performed again, because A = B is obtained directly. Step 24 is then performed to obtain a new value of B.
Step 213: narrow the core-count search interval to (Y, a), i.e. set the core-count upper bound X to a (X = a). Set the first test point a in the new search interval to the current b (a = b), and set b = X + Y - a. Step 23 then need not be performed again, because A = B is obtained directly. Step 24 is then performed to obtain a new value of B.
Step 214: determine whether a < b. If so, perform step 215; otherwise, perform step 216.
Step 215: narrow the core-count search interval to (Y, b), i.e. set the core-count upper bound X to b (X = b). The first test point a in the new search interval remains the current a (a = a), and b is set to X + Y - a. The value of A is therefore unchanged and step 23 need not be performed again. Step 24 is then performed to obtain a new value of B.
Step 216: narrow the core-count search interval to (b, X), i.e. set the core-count lower bound Y to b (Y = b). The first test point a in the new search interval remains the current a (a = a), and b is set to X + Y - a. The value of A is therefore unchanged and step 23 need not be performed again. Step 24 is then performed to obtain a new value of B.
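Steps 210 to 216 amount to one interval update, which can be sketched as a pure function. The function name and the sample numbers are hypothetical. The returned values are those with which step 24 would be performed next to obtain a new B.

```python
def narrow(X, Y, a, b, A, B):
    """One pass through steps 210-216; returns the updated (X, Y, a, b, A)."""
    if A > B:                      # step 210
        if a < b:                  # step 211
            Y = a                  # step 212: new interval (a, X)
        else:
            X = a                  # step 213: new interval (Y, a)
        a, A = b, B                # old b becomes a, so A = B without re-measuring
    else:
        if a < b:                  # step 214
            X = b                  # step 215: new interval (Y, b)
        else:
            Y = b                  # step 216: new interval (b, X)
    b = X + Y - a                  # new second test point, keeps a + b == X + Y
    return X, Y, a, b, A


if __name__ == "__main__":
    # Interval (0, 100), test points 38.2 and 61.8, power larger at a.
    print(narrow(100.0, 0.0, 38.2, 61.8, A=10.0, B=5.0))
```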
After step 28 has been performed and the power consumption of the multi-computer system has reached its minimum, steps 21 to 216 may be performed again to optimize the power consumption of the multi-computer system if any of the following occurs:
a. the number of threads run by the multi-core processor changes;
b. the load of the multithreaded program changes;
c. the power management mechanism is started;
d. the configured target performance is changed.
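The four triggers can be sketched as a comparison between two snapshots of system state. The `Snapshot` type and its field names are hypothetical, and comparing the load for exact equality is a simplification made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    thread_count: int
    load: float
    power_management_started: bool
    target_performance: float


def needs_reoptimization(previous, current):
    """True if any of conditions a to d holds between two snapshots."""
    return (
        current.thread_count != previous.thread_count                  # a
        or current.load != previous.load                               # b
        or (current.power_management_started
            and not previous.power_management_started)                 # c
        or current.target_performance != previous.target_performance   # d
    )


if __name__ == "__main__":
    before = Snapshot(32, 0.7, True, 100.0)
    after = Snapshot(48, 0.7, True, 100.0)
    print(needs_reoptimization(before, after))
```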
In the above embodiment, two test points a and b are determined for each search, so that after each search the interval on the far side of the test point with the larger power value can be discarded, and that test point is used as a boundary of the core-count range for the next search, which effectively narrows the core-count search range. The search is performed in a loop to keep narrowing the core-count search range until a stop condition is reached. In this way, for the fraction m used in each search, the test point that was not taken as a boundary can still serve as a test point in the next loop, and its power consumption can be taken directly from the previous measurement, which saves one execution and measurement in each loop.
The technical solution provided by the above embodiment has the following advantages. First, the interval of the search space to be discarded can be determined quickly in the power-versus-core-count space, and each comparison of test-point power can reuse one previously measured power value, which reduces the measurement overhead of the search and rapidly shrinks the space to be searched. Second, once the core count and frequency required for the lowest power consumption are found, the processor cores not needed to execute the program can be shut down to save static power, or other programs can be run on the processor cores not needed for this program, which greatly improves processor utilization and efficiency. Finally, when designing a special-purpose processor for a class of applications, the core count and frequency required for the lowest power consumption can be found from the characteristics of the program to guide chip design, which effectively controls chip scale and significantly saves on-chip hardware overhead.
Fig. 3 is a schematic structural diagram of a device for optimizing the power consumption of a multi-computer system according to an embodiment of the present invention. The device provided by this embodiment is used to implement the method shown in Fig. 1. As shown in Fig. 3, the device includes a range determining unit 31, a test point determining unit 32 and a performance adjusting unit 33.
The range determining unit 31 is configured to determine a quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound.
The test point determining unit 32 is configured to search within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound.
The performance adjusting unit 33 is configured to shut down all non-executing data processing devices according to the first test point and the second test point, and to gradually lower the frequency of the remaining data processing devices so as to meet the target performance.
The range determining unit 31 may be specifically configured to determine an initial quantity range of the data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system.
Further, the range determining unit 31 may be specifically configured as follows:
if the number of threads running in the multi-computer system is greater than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of all data processing devices in the multi-computer system;
if the total number of threads, or the number of threads running, in the multi-computer system is less than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of threads running in the multi-computer system.
Optionally, the test point determining unit 32 may be specifically configured to:
Calculate a = (X+Y) × m and b = (X+Y) × (1-m), where a is the first test point, b is the second test point, 0 < m < 1, X is the upper bound, Y is the lower bound, and a, b, X, and Y are variables;
Execute all threads in the multi-computer system with a data processing devices, and measure the power consumption A of the multi-computer system;
Execute all threads in the multi-computer system with b data processing devices, and measure the power consumption B of the multi-computer system;
Compare A and B;
If |A-B| < w, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where w is a first predetermined value;
If |A-B| >= w, determine whether |a-b| < e;
If |a-b| < e, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where e is a second predetermined value; if |a-b| >= e, determine whether I >= d;
If I >= d, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where I is the loop count with an initial value of 0, and d is a third predetermined value;
If I < d, A > B, and a < b, calculate I = I+1, Y = a, a = b, b = X+Y-a, and A = B; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A > B, and a > b, calculate I = I+1, X = a, a = b, b = X+Y-a, and A = B; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A < B, and a < b, calculate I = I+1, X = b, a = a, and b = X+Y-a; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A < B, and a > b, calculate I = I+1, Y = b, a = a, and b = X+Y-a; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system.
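The search described above can be sketched roughly as follows. This is only an illustrative reading of the steps, not the patented implementation: the function name search_test_points, the measure_power callback, the default values of m, w, e and d, and the rounding of a fractional test point to a whole number of devices are all assumptions that do not appear in the text. The variable names X, Y, a, b, A, B and I follow the description.

```python
from typing import Callable, Tuple


def search_test_points(
    upper: float,
    lower: float,
    measure_power: Callable[[int], float],  # hypothetical: power with n active devices
    m: float = 0.382,  # illustrative value, any 0 < m < 1 is allowed by the text
    w: float = 0.5,    # first predetermined value (assumed positive)
    e: float = 1.0,    # second predetermined value (assumed positive)
    d: int = 20,       # third predetermined value (loop limit)
) -> Tuple[float, float]:
    """Narrow [lower, upper] and return the final test points (a, b)."""
    X, Y = upper, lower
    a = (X + Y) * m
    b = (X + Y) * (1 - m)
    # Assumption: fractional test points are rounded to a device count.
    A = measure_power(round(a))
    B = measure_power(round(b))
    I = 0
    # Stop when the powers are close, the points are close, or the loop limit is hit.
    while abs(A - B) >= w and abs(a - b) >= e and I < d:
        I += 1
        if A > B:
            # b is the better point: discard the side of the range beyond a.
            if a < b:
                Y = a
            else:
                X = a
            a, A = b, B
        else:
            # a is the better point: discard the side of the range beyond b.
            # (A == B cannot reach here because w is assumed positive.)
            if a < b:
                X = b
            else:
                Y = b
        b = X + Y - a
        B = measure_power(round(b))
    return a, b
```

Only one new power measurement is needed per iteration, because the better of the two previous test points is carried over as a.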
Optionally, the performance adjusting unit 33 may include a quantity calculating subunit 331 and a device closing subunit 332.
The quantity calculating subunit 331 is configured to calculate Z-(a+b)/2, where Z is the total number of data processing devices in the multi-computer system;
The device closing subunit 332 is configured to close Z-(a+b)/2 data processing devices in the multi-computer system.
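As a small illustration of the quantity Z-(a+b)/2, the following sketch computes how many devices would be closed. The function name devices_to_close, the rounding of (a+b)/2 to the nearest integer, and the clamping to the range 0..Z are assumptions, since the text does not say how a fractional result is handled.

```python
def devices_to_close(total: int, a: float, b: float) -> int:
    """Return Z - (a+b)/2, the number of data processing devices to close."""
    # Assumption: the midpoint of the two test points is rounded to a whole device count.
    keep = round((a + b) / 2)
    # Assumption: never keep fewer than 0 or more than the total number of devices.
    keep = max(0, min(total, keep))
    return total - keep
```

For example, with Z = 8, a = 3 and b = 5, four devices stay active and four are closed.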
Fig. 4 is a schematic structural diagram of a multi-computer system according to an embodiment of the present invention. As shown in Fig. 4, the multi-computer system adds a power consumption optimization function to an existing multi-core processor, multi-processor system, multi-device system, or the like; that is, it includes a multi-computer system body 41 and an optimizing device 42. The multi-computer system body 41 may be an existing multi-core processor, multi-processor system, multi-device system, or the like. The optimizing device 42 may be any of the devices for optimizing multi-computer system power consumption shown in Fig. 3, and is used to optimize the power consumption of the multi-computer system body 41.
An embodiment of the present invention provides a computer program product. The computer program product includes a computer-readable medium, and the computer-readable medium includes a first group of program code for performing the following steps of the method shown in Fig. 1:
Determine the quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound;
Search within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound;
Close all non-executing data processing devices according to the first test point and the second test point, and gradually reduce the frequency of the remaining data processing devices so as to meet the target performance.
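The last step, gradually reducing the frequency of the remaining devices while still meeting the target performance, might look like the sketch below. The text does not specify the frequency steps or how performance is evaluated, so the function name lower_frequency, the list of available frequencies, and the performance_at callback are all hypothetical.

```python
from typing import Callable, Sequence


def lower_frequency(
    freqs_mhz: Sequence[int],                # hypothetical available frequencies (non-empty)
    performance_at: Callable[[int], float],  # hypothetical performance measurement
    target: float,
) -> int:
    """Step down from the highest frequency while the target performance still holds."""
    ordered = sorted(freqs_mhz, reverse=True)
    chosen = ordered[0]
    for f in ordered[1:]:
        if performance_at(f) < target:
            # The next step down would miss the target, so stop here.
            break
        chosen = f
    return chosen
```

Stepping down one level at a time mirrors the "gradually" wording; a real system would re-measure performance after each change rather than query a model.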
Optionally, determining the quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system includes:
Determining the initial quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system.
Optionally, determining the initial quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system includes:
If the number of threads running in the multi-computer system is greater than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of all data processing devices in the multi-computer system;
If the total number of threads, or the number of running threads, in the multi-computer system is less than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of threads running in the multi-computer system.
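The two cases for the initial quantity range can be written as a short function. This is a sketch under the assumption that the case where the thread count equals the device count, which the text does not mention, can use either rule because both give the same maximum; the function and parameter names are illustrative.

```python
from typing import Tuple


def initial_range(running_threads: int, device_count: int) -> Tuple[int, int]:
    """Return (lower bound Y, upper bound X) of the initial quantity range."""
    if running_threads > device_count:
        # More threads than devices: every device may be useful.
        upper = device_count
    else:
        # Fewer threads than devices (or, by assumption, equally many):
        # at most one device per running thread is needed.
        upper = running_threads
    return 0, upper
```

In effect the upper bound is the smaller of the running thread count and the device count.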
Optionally, searching within the quantity range to determine the first test point and the second test point includes:
Calculate a = (X+Y) × m and b = (X+Y) × (1-m), where a is the first test point, b is the second test point, 0 < m < 1, X is the upper bound, Y is the lower bound, and a, b, X, and Y are variables;
Execute all threads in the multi-computer system with a data processing devices, and measure the power consumption A of the multi-computer system;
Execute all threads in the multi-computer system with b data processing devices, and measure the power consumption B of the multi-computer system;
Compare A and B;
If |A-B| < w, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where w is a first predetermined value;
If |A-B| >= w, determine whether |a-b| < e;
If |a-b| < e, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where e is a second predetermined value; if |a-b| >= e, determine whether I >= d;
If I >= d, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where I is the loop count with an initial value of 0, and d is a third predetermined value;
If I < d, A > B, and a < b, calculate I = I+1, Y = a, a = b, b = X+Y-a, and A = B; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A > B, and a > b, calculate I = I+1, X = a, a = b, b = X+Y-a, and A = B; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A < B, and a < b, calculate I = I+1, X = b, a = a, and b = X+Y-a; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A < B, and a > b, calculate I = I+1, Y = b, a = a, and b = X+Y-a; then perform the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system.
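A short derivation may help show why the two test points always sum to the upper bound plus the lower bound, as the search step requires. It follows directly from the formulas above; the primes, which mark the values after one narrowing step, are notation introduced here for illustration.

```latex
\begin{aligned}
a + b &= (X+Y)\,m + (X+Y)(1-m) = X + Y, \\
b' &= X' + Y' - a' \;\Longrightarrow\; a' + b' = X' + Y', \\
\frac{a+b}{2} &= \frac{X+Y}{2}.
\end{aligned}
```

The last line indicates that (a+b)/2 is the midpoint of the current quantity range, so the two test points sit symmetrically about that midpoint at every iteration.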
Optionally, closing all non-executing data processing devices according to the first test point and the second test point includes:
Calculate Z-(a+b)/2, where Z is the total number of data processing devices in the multi-computer system;
Close Z-(a+b)/2 data processing devices in the multi-computer system.
A person of ordinary skill in the art will understand that all or part of the steps of the foregoing method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. A method for optimizing power consumption of a multi-computer system, characterized by comprising:
Determining the quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound;
Searching within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound;
Closing all non-executing data processing devices according to the first test point and the second test point, and gradually reducing the frequency of the remaining data processing devices so as to meet the target performance;
Wherein searching within the quantity range to determine the first test point and the second test point comprises:
Calculating a = (X+Y) × m and b = (X+Y) × (1-m), where a is the first test point, b is the second test point, 0 < m < 1, X is the upper bound, Y is the lower bound, and a, b, X, and Y are variables;
Executing all threads in the multi-computer system with a data processing devices, and measuring the power consumption A of the multi-computer system;
Executing all threads in the multi-computer system with b data processing devices, and measuring the power consumption B of the multi-computer system;
Comparing A and B;
If |A-B| < w, performing the step of closing all non-executing data processing devices according to the first test point and the second test point, where w is a first predetermined value;
If |A-B| >= w, determining whether |a-b| < e;
If |a-b| < e, performing the step of closing all non-executing data processing devices according to the first test point and the second test point, where e is a second predetermined value;
If |a-b| >= e, determining whether I >= d, where I is the loop count with an initial value of 0, and d is a third predetermined value;
If I >= d, performing the step of closing all non-executing data processing devices according to the first test point and the second test point; if I < d, narrowing the quantity range according to the magnitude relationship between A and B and the magnitude relationship between a and b.
2. The method according to claim 1, characterized in that determining the quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system comprises:
Determining the initial quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system.
3. The method according to claim 2, characterized in that determining the initial quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system comprises:
If the number of threads running in the multi-computer system is greater than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of all data processing devices in the multi-computer system;
If the total number of threads, or the number of running threads, in the multi-computer system is less than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of threads running in the multi-computer system.
4. The method according to any one of claims 1 to 3, characterized in that, if I < d, narrowing the quantity range according to the magnitude relationship between A and B and the magnitude relationship between a and b comprises:
If I < d, A > B, and a < b, calculating I = I+1, Y = a, a = b, b = X+Y-a, and A = B; then performing the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A > B, and a > b, calculating I = I+1, X = a, a = b, b = X+Y-a, and A = B; then performing the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A < B, and a < b, calculating I = I+1, X = b, a = a, and b = X+Y-a; then performing the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system;
If I < d, A < B, and a > b, calculating I = I+1, Y = b, a = a, and b = X+Y-a; then performing the step of executing all threads in the multi-computer system with b data processing devices and measuring the power consumption B of the multi-computer system.
5. The method according to claim 4, characterized in that closing all non-executing data processing devices according to the first test point and the second test point comprises:
Calculating Z-(a+b)/2, where Z is the total number of data processing devices in the multi-computer system;
Closing Z-(a+b)/2 data processing devices in the multi-computer system.
6. A device for optimizing power consumption of a multi-computer system, characterized by comprising:
A range determining unit, configured to determine the quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system, where the minimum of the quantity range is the lower bound and the maximum is the upper bound;
A test point determining unit, configured to search within the quantity range to determine a first test point and a second test point, where the first test point and the second test point are both quantities of data processing devices, and the sum of the first test point and the second test point equals the sum of the upper bound and the lower bound;
A performance adjusting unit, configured to close all non-executing data processing devices according to the first test point and the second test point, and to gradually reduce the frequency of the remaining data processing devices so as to meet the target performance;
The test point determining unit being specifically configured to:
Calculate a = (X+Y) × m and b = (X+Y) × (1-m), where a is the first test point, b is the second test point, 0 < m < 1, X is the upper bound, Y is the lower bound, and a, b, X, and Y are variables;
Execute all threads in the multi-computer system with a data processing devices, and measure the power consumption A of the multi-computer system;
Execute all threads in the multi-computer system with b data processing devices, and measure the power consumption B of the multi-computer system;
Compare A and B;
If |A-B| < w, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where w is a first predetermined value;
If |A-B| >= w, determine whether |a-b| < e;
If |a-b| < e, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where e is a second predetermined value; if |a-b| >= e, determine whether I >= d;
If I >= d, perform the step of closing all non-executing data processing devices according to the first test point and the second test point, where I is the loop count with an initial value of 0, and d is a third predetermined value; if I < d, narrow the quantity range according to the magnitude relationship between A and B and the magnitude relationship between a and b.
7. The device according to claim 6, characterized in that the range determining unit is specifically configured to determine the initial quantity range of data processing devices in the multi-computer system that are used to adjust the power consumption of the multi-computer system.
8. The device according to claim 7, characterized in that the range determining unit is specifically configured such that:
If the number of threads running in the multi-computer system is greater than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of all data processing devices in the multi-computer system;
If the total number of threads, or the number of running threads, in the multi-computer system is less than the number of all data processing devices in the multi-computer system, the minimum of the initial quantity range is 0 and the maximum is the number of threads running in the multi-computer system.
9. The device according to any one of claims 6 to 8, characterized in that the test point determining unit being configured to, if I < d, narrow the quantity range according to the magnitude relationship between A and B and the magnitude relationship between a and b comprises:
If I < d, A > B, and a < b, calculating I = I+1, Y = a, a = b, b = X+Y-a, and A = B; then performing the step of executing all threads in the multi-computer system with b data processing devices;
If I < d, A > B, and a > b, calculating I = I+1, X = a, a = b, b = X+Y-a, and A = B; then performing the step of executing all threads in the multi-computer system with b data processing devices;
If I < d, A < B, and a < b, calculating I = I+1, X = b, a = a, and b = X+Y-a; then performing the step of executing all threads in the multi-computer system with b data processing devices;
If I < d, A < B, and a > b, calculating I = I+1, Y = b, a = a, and b = X+Y-a; then performing the step of executing all threads in the multi-computer system with b data processing devices.
10. The device according to claim 9, characterized in that the performance adjusting unit comprises:
A quantity calculating subunit, configured to calculate Z-(a+b)/2, where Z is the total number of data processing devices in the multi-computer system;
A device closing subunit, configured to close Z-(a+b)/2 data processing devices in the multi-computer system.
11. A multi-computer system, characterized by comprising a multi-computer system body and the device for optimizing power consumption of a multi-computer system according to any one of claims 6 to 10, wherein the device is used to optimize the power consumption of the multi-computer system body.
CN201310001368.5A 2013-01-04 2013-01-04 Multicomputer system and method and device for optimizing power consumption of same Expired - Fee Related CN103914121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310001368.5A CN103914121B (en) 2013-01-04 2013-01-04 Multicomputer system and method and device for optimizing power consumption of same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310001368.5A CN103914121B (en) 2013-01-04 2013-01-04 Multicomputer system and method and device for optimizing power consumption of same

Publications (2)

Publication Number Publication Date
CN103914121A CN103914121A (en) 2014-07-09
CN103914121B true CN103914121B (en) 2017-04-19

Family

ID=51039875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310001368.5A Expired - Fee Related CN103914121B (en) 2013-01-04 2013-01-04 Multicomputer system and method and device for optimizing power consumption of same

Country Status (1)

Country Link
CN (1) CN103914121B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10296067B2 (en) * 2016-04-08 2019-05-21 Qualcomm Incorporated Enhanced dynamic clock and voltage scaling (DCVS) scheme
CN109471716A (en) * 2018-09-26 2019-03-15 努比亚技术有限公司 A kind of application thread processing method, terminal and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1641534A (en) * 2004-01-13 2005-07-20 Lg电子株式会社 Apparatus for controlling power of processor having a plurality of cores and control method of the same
CN101010655A (en) * 2004-09-03 2007-08-01 英特尔公司 Coordinating idle state transitions in multi-core processors
CN101790709A (en) * 2007-08-27 2010-07-28 马维尔国际贸易有限公司 Dynamic core switches

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7568115B2 (en) * 2005-09-28 2009-07-28 Intel Corporation Power delivery and power management of many-core processors
US7617403B2 (en) * 2006-07-26 2009-11-10 International Business Machines Corporation Method and apparatus for controlling heat generation in a multi-core processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions;范东睿等;《JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY》;20091130;第24卷(第6期);1061-1073页 *

Also Published As

Publication number Publication date
CN103914121A (en) 2014-07-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170419

Termination date: 20210104