CN101881996A - Parallel memory system check-point power consumption optimization method - Google Patents
Parallel memory system check-point power consumption optimization method
- Publication number
- CN101881996A
- Authority
- CN
- China
- Prior art keywords
- storage server
- object storage
- power consumption
- state
- request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Retry When Errors Occur (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention discloses a checkpoint power consumption optimization method for a parallel memory system. The technical problem it addresses is how to optimize the power consumption of a parallel memory system based on the characteristics of checkpoint operations. The technical scheme is as follows: a server working-state set is built for each object storage server to indicate the working state of that server; each element of the set represents a process that the object storage server is serving, so the more elements the set contains, the more processes the server is providing checkpoint service to. After an object storage server receives a power consumption state setting request, it decides whether to execute the request according to the state of its server working-state set. With this method, different power consumption states can be set adaptively according to the working state of each object storage server that responds to checkpoint operations, so the power consumption of idle servers is reduced and conflicts among multiple power consumption state setting commands sent to the same object storage server are eliminated.
Description
Technical field
The present invention relates to a power consumption optimization method for parallel memory systems, and in particular to a method that optimizes the power consumption of checkpoint operations in a parallel memory system by providing the storage servers with multiple power consumption states.
Background technology
The parallel memory system is an important component of a massively parallel computer system, and the power consumed by its large volume of file read and write operations accounts for a large share of the power consumption of the whole computer system. Checkpointing is an important means of improving the availability of high-performance computing systems. In such systems, large-scale scientific computing applications often run for a long time and, because of their scale, occupy a large amount of computing resources, which significantly increases the probability of a hardware fault occurring in the system. To ensure that a program can run normally and to make program execution more effective, checkpoint operations are usually performed while the application is running, saving the running state of each process in the application together with the shared memory. If a fault occurs during operation, the application can be restored from the most recently saved image files, which improves system availability. A checkpoint operation creates a separate image file for each computing process, and each image file involves reading and writing a large amount of data; this heavy read/write load causes the power consumption of the object storage servers to rise sharply. Given these characteristics of checkpoint operations, optimizing the power consumption of the storage servers is both necessary and effective.
Checkpoint operations are performed at intervals: the user checkpoints the whole application once every so often. Because of this, the storage servers that hold the checkpoint image files are not busy all the time; for certain periods they serve no requests at all. In the present invention, the checkpoint image files are stored in a separate partition of the parallel memory system (called the checkpoint image partition), so checkpoint operations are kept apart from other file reads and writes, and file I/O that is not part of a checkpoint operation cannot use the object storage servers belonging to the checkpoint image partition. Therefore, when an object storage server has no checkpoint operation to handle, it is idle and wastes power; during this period it can be set to a low power consumption state to save power.
Using the checkpoint image partition and the interval nature of checkpoint operations, the power consumption of idle object storage servers can be reduced by means such as lowering the processor frequency and putting storage devices into a low power consumption state. Reducing the energy consumed while the storage system is running in this way is one of the important means of optimizing storage system power consumption.
At present, power consumption optimization methods for storage systems work mainly at the storage device level, including putting devices to sleep, adjusting device rotation speed, and reducing the number of disk seeks. There is also power consumption control for data backup servers, which reduces server power consumption during data backup when a backup is required. Very little work optimizes power consumption on the basis of checkpoint operation characteristics; checkpoint-oriented optimization mainly targets performance. Many storage systems are now equipped with object storage servers dedicated to checkpoint operations, and ignoring their power consumption optimization would be a significant waste of storage resources.
Summary of the invention
The technical problem to be solved by the present invention is how to optimize the power consumption of a parallel memory system based on the characteristics of checkpoint operations. Specifically, it covers how to insert power consumption state setting commands and how to resolve conflicts among multiple power consumption state setting commands that arise from multiple checkpoint operations. When the application is large or there are many jobs, the number of checkpoint image files exceeds the number of object storage servers, so several image files are stored on the same object storage server; different computing nodes then send power consumption state setting commands to that server multiple times, which produces conflicts.
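The conflict described above can be illustrated with a minimal sketch. It assumes a hypothetical naive server that simply obeys every power consumption state setting command as it arrives; the class name `NaiveServer` and the process labels are illustrative and do not come from the patent.

```python
class NaiveServer:
    """Hypothetical server that executes every power command unconditionally."""

    def __init__(self):
        self.power_state = "low"

    def handle(self, request):
        # No bookkeeping: the last command received always wins.
        self.power_state = "normal" if request == "normal" else "low"


server = NaiveServer()

# Two processes, A and B, write their checkpoint images to the same server.
# A finishes first and asks for low power while B is still writing.
trace = [("A", "normal"), ("B", "normal"), ("A", "down")]
for _process, request in trace:
    server.handle(request)

still_writing = {"B"}
# The server has been put into the low power state although B still needs it.
conflict = server.power_state == "low" and bool(still_writing)
print(conflict)
```

In this sketch the final command from process A overrides the earlier request from process B, which is the kind of conflict the working-state set is meant to prevent.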
The technical scheme of the present invention is as follows. A server working-state set is constructed for each object storage server to represent the working state of that server. Each element of the set represents a process that the server is serving; the more elements the set contains, the more processes the object storage server is providing checkpoint service to. When an object storage server receives a power consumption state setting request, it decides from the state of its server working-state set whether the request needs to be executed, which avoids repeated and conflicting power consumption state settings. The concrete technical scheme is:
Step 1: Define two power consumption states for the object storage server: the normal power consumption state and the low power consumption state. Before a checkpoint operation is performed, the object storage server is set to the normal power consumption state. After the checkpoint operation has completed, the computing node sends a low power consumption state setting command to the object storage server, and the object storage server is set to the low power consumption state.
Step 2: For a parallel memory system with $N$ object storage servers ($N$ is a positive integer), construct a server working-state set $G_j$ for each object storage server $j$ ($1 \le j \le N$). $G_j$ reflects the current working state of object storage server $j$. Each element of $G_j$ is a process identifier $I$ obtained by concatenating a job number and a process number, and represents a process to which object storage server $j$ is providing checkpoint service. Initially $G_j$ is empty.
Each of the following steps is carried out on every object storage server $j$.
Step 3: Object storage server $j$ waits for an incoming power consumption state setting request $R$, $R \in \{R_{normal}, R_{down}\}$, where $R_{normal}$ is a request to set the object storage server to the normal power consumption state and $R_{down}$ is a request to set it to the low power consumption state.
Step 4: After receiving a power consumption state setting request $R$, the object storage server obtains the job number and process number associated with the request and concatenates them to form a process identifier $I$. For example, if the job number is 1000 and the process number is 500, then $I$ is 1000500.
Step 5: If $R = R_{normal}$, go to step 6; otherwise, go to step 10.
Step 6: The incoming request $R$ asks that the object storage server be set to the normal power consumption state, which means that object storage server $j$ needs to serve process $I$. Add $I$ to the server working-state set $G_j$ of this object storage server, i.e. $G_j = G_j \cup \{I\}$.
Step 7: Query the current power consumption state of object storage server $j$. If it is in the low power consumption state, go to step 8; otherwise go to step 9.
Step 8: Object storage server $j$ executes request $R$: the server is set to the normal power consumption state, and its recorded power consumption state is updated to normal. Go to step 14.
Step 9: Ignore request $R$ and go to step 14.
Step 10: The incoming request $R$ asks that the object storage server be set to the low power consumption state, which means that object storage server $j$ has finished serving process $I$. Remove $I$ from the server working-state set $G_j$ of object storage server $j$, i.e. $G_j = G_j - \{I\}$.
Step 11: Check whether $G_j$ is now empty. If so, go to step 12; otherwise go to step 13.
Step 12: Object storage server $j$ executes request $R$: the server is set to the low power consumption state, and its recorded power consumption state is updated to low. Go to step 14.
Step 13: Checkpoint operations of other processes still need service, so ignore request $R$.
Step 14: Check whether a new checkpoint service request is about to arrive. If so, go to step 3; otherwise go to step 15.
Step 15: End.
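The decision logic of steps 4 to 13 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the names `PowerState`, `Request`, `make_process_id` and `ObjectStorageServer` are hypothetical, the actual hardware action (for example lowering the processor frequency) is reduced to a field update, and the initial power state is assumed to be low because the text only specifies that $G_j$ starts empty.

```python
from enum import Enum


class PowerState(Enum):
    NORMAL = "normal"
    LOW = "low"


class Request(Enum):
    NORMAL = "R_normal"  # set the server to the normal power state
    DOWN = "R_down"      # set the server to the low power state


def make_process_id(job_number: int, process_number: int) -> str:
    """Step 4: concatenate job number and process number (1000, 500 -> '1000500')."""
    return f"{job_number}{process_number}"


class ObjectStorageServer:
    def __init__(self, initial_state: PowerState = PowerState.LOW):
        self.working_set = set()          # G_j, initially empty (step 2)
        self.power_state = initial_state  # assumption: servers start in low power

    def handle(self, request: Request, job_number: int, process_number: int) -> bool:
        """Return True if the request was executed, False if it was ignored."""
        pid = make_process_id(job_number, process_number)  # step 4
        if request is Request.NORMAL:                      # step 5
            self.working_set.add(pid)                      # step 6: G_j = G_j U {I}
            if self.power_state is PowerState.LOW:         # step 7
                self.power_state = PowerState.NORMAL       # step 8
                return True
            return False                                   # step 9
        self.working_set.discard(pid)                      # step 10: G_j = G_j - {I}
        if not self.working_set:                           # step 11
            self.power_state = PowerState.LOW              # step 12
            return True
        return False                                       # step 13


server = ObjectStorageServer()
print(server.handle(Request.NORMAL, 1000, 500))  # executed: server wakes up
print(server.handle(Request.NORMAL, 1000, 501))  # ignored: already normal
print(server.handle(Request.DOWN, 1000, 500))    # ignored: 1000501 still active
print(server.handle(Request.DOWN, 1000, 501))    # executed: set is empty
```

One caveat about this sketch: plain concatenation can map different pairs to the same identifier (job 12 with process 34 and job 1 with process 234 both give `1234`), so an implementation might prefer a tuple or a separator. That is an observation about the sketch, not something the patent text addresses.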
This method achieves the following effects:
1) Different power consumption states can be set adaptively according to whether each object storage server that responds to checkpoint operations is busy or idle, which reduces the power consumption of idle servers.
2) For large jobs and for many jobs running at once, the present invention defines a working-state set for each object storage server; by querying the state of this set, conflicts among multiple power consumption state setting commands sent to the same object storage server are eliminated.
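To make the first effect concrete, the sketch below replays a timestamped request trace and reports the periods during which a server would sit in the low power consumption state. The function name `low_power_intervals`, the event format, the timestamps, and the assumption that the server starts in the low power state are all hypothetical and chosen only for illustration.

```python
def low_power_intervals(events, start_time=0.0):
    """Return the closed (start, end) periods spent in the low power state.

    events: list of (time, kind, process_id) sorted by time, where kind is
    "normal" (R_normal) or "down" (R_down).
    """
    active = set()          # working-state set G_j
    low_since = start_time  # assumption: the server starts in low power
    closed = []
    for time, kind, process_id in events:
        if kind == "normal":
            if low_since is not None:      # server is in low power: wake it
                closed.append((low_since, time))
                low_since = None
            active.add(process_id)
        else:
            active.discard(process_id)
            if not active and low_since is None:
                low_since = time           # last process done: go to low power
    return closed


# Hypothetical trace: two processes checkpoint together, then one checkpoints again.
events = [
    (10, "normal", "1000500"),
    (12, "normal", "1000501"),
    (20, "down", "1000500"),
    (25, "down", "1000501"),
    (60, "normal", "1000500"),
    (70, "down", "1000500"),
]
intervals = low_power_intervals(events)
total_low = sum(end - start for start, end in intervals)
print(intervals, total_low)
```

In this trace the server stays in the normal state from time 10 to 25 even though the first process finishes at 20, and it returns to low power only once the set is empty.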
Description of drawings
Fig. 1 shows the structure of an object storage system with a checkpoint image partition. Partition 1 stores program data; partition 2 is the checkpoint image partition, used to store checkpoint image files.
Fig. 2 is the overall flow chart of the present invention.
Embodiment
Step 1) Define two power consumption states for the object storage server: the normal power consumption state and the low power consumption state.
Step 2) Construct a server working-state set $G_j$ for object storage server $j$; initially $G_j$ is empty.
Step 3) Object storage server $j$ continuously waits for an incoming power consumption state setting request $R$, $R \in \{R_{normal}, R_{down}\}$.
Step 4) After the object storage server receives a power consumption state setting request $R$, it concatenates the job number and the process number to form a process identifier $I$.
Step 5) If $R = R_{normal}$, go to step 6); otherwise, go to step 10).
Step 6) $G_j = G_j \cup \{I\}$.
Step 7) If object storage server $j$ is currently in the low power consumption state, go to step 8); otherwise go to step 9).
Step 8) Set object storage server $j$ to the normal power consumption state and update its recorded power consumption state to normal. Go to step 14).
Step 9) Ignore request $R$ and go to step 14).
Step 10) $G_j = G_j - \{I\}$.
Step 11) Check whether $G_j$ is now empty. If so, go to step 12); otherwise go to step 13).
Step 12) Set object storage server $j$ to the low power consumption state and update its recorded power consumption state to low. Go to step 14).
Step 13) Ignore request $R$.
Step 14) Check whether a new checkpoint service request is about to arrive. If so, go to step 3); otherwise, go to step 15).
Step 15) End.
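The waiting loop of steps 3), 14) and 15) can be sketched with a request queue. This is an illustrative assumption about how requests might be delivered: the function name `serve`, the use of `queue.Queue`, and the `None` sentinel that signals "no further checkpoint service" are hypothetical and not part of the patent.

```python
import queue


def serve(requests):
    """Process requests until a None sentinel arrives; return a log of decisions."""
    working_set = set()  # G_j, initially empty
    low_power = True     # assumption: the server starts in the low power state
    log = []
    while True:
        item = requests.get()        # step 3): wait for the next request
        if item is None:             # steps 14) and 15): nothing more to serve
            break
        kind, job_number, process_number = item
        pid = f"{job_number}{process_number}"   # step 4)
        if kind == "normal":                    # step 5)
            working_set.add(pid)                # step 6)
            if low_power:                       # step 7)
                low_power = False               # step 8)
                log.append(("set_normal", pid))
            else:
                log.append(("ignored", pid))    # step 9)
        else:
            working_set.discard(pid)            # step 10)
            if not working_set:                 # step 11)
                low_power = True                # step 12)
                log.append(("set_low", pid))
            else:
                log.append(("ignored", pid))    # step 13)
    return log


q = queue.Queue()
for item in [("normal", 1000, 500), ("normal", 1000, 501),
             ("down", 1000, 500), ("down", 1000, 501), None]:
    q.put(item)
log = serve(q)
print(log)
```

A blocking `get()` matches the "continuously waits" wording of step 3); a real server would more likely receive these requests over its network protocol.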
Claims (1)
1. A parallel memory system check-point power consumption optimization method, characterized in that it comprises the following steps:
Step 1: define two power consumption states for the object storage server: the normal power consumption state and the low power consumption state; before a checkpoint operation is performed, the object storage server is set to the normal power consumption state; after the checkpoint operation has completed, the computing node sends a low power consumption state setting command to the object storage server, and the object storage server is set to the low power consumption state;
Step 2: for a parallel memory system with $N$ object storage servers, construct a server working-state set $G_j$ for object storage server $j$; each element of $G_j$ is a process identifier $I$ obtained by concatenating a job number and a process number, and represents a process to which object storage server $j$ is providing checkpoint service; initially $G_j$ is empty; $N$ is a positive integer, $1 \le j \le N$;
Each object storage server $j$ then performs the following:
Step 3: object storage server $j$ waits for an incoming power consumption state setting request $R$, $R \in \{R_{normal}, R_{down}\}$, where $R_{normal}$ is a request to set the object storage server to the normal power consumption state and $R_{down}$ is a request to set it to the low power consumption state;
Step 4: after receiving a power consumption state setting request $R$, the object storage server obtains the job number and process number associated with the request and concatenates them to form a process identifier $I$;
Step 5: if $R = R_{normal}$, go to step 6; otherwise, go to step 10;
Step 6: the incoming request $R$ asks that the object storage server be set to the normal power consumption state, which means that object storage server $j$ needs to serve process $I$; add $I$ to the server working-state set $G_j$ of this object storage server, i.e. $G_j = G_j \cup \{I\}$;
Step 7: query the current power consumption state of object storage server $j$; if it is in the low power consumption state, go to step 8; otherwise go to step 9;
Step 8: object storage server $j$ executes request $R$: the server is set to the normal power consumption state, and its recorded power consumption state is updated to normal; go to step 14;
Step 9: ignore request $R$ and go to step 14;
Step 10: the incoming request $R$ asks that the object storage server be set to the low power consumption state, which means that object storage server $j$ has finished serving process $I$; remove $I$ from the server working-state set $G_j$ of object storage server $j$, i.e. $G_j = G_j - \{I\}$;
Step 11: check whether $G_j$ is now empty; if so, go to step 12; otherwise go to step 13;
Step 12: object storage server $j$ executes request $R$: the server is set to the low power consumption state, and its recorded power consumption state is updated to low; go to step 14;
Step 13: checkpoint operations of other processes still need service, so ignore request $R$;
Step 14: check whether a new checkpoint service request is about to arrive; if so, go to step 3; otherwise, go to step 15;
Step 15: end.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010102295358A CN101881996B (en) | 2010-07-19 | 2010-07-19 | Parallel memory system check-point power consumption optimization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010102295358A CN101881996B (en) | 2010-07-19 | 2010-07-19 | Parallel memory system check-point power consumption optimization method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101881996A (en) | 2010-11-10 |
CN101881996B (en) | 2011-07-27 |
Family ID: 43054027
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010102295358A Expired - Fee Related CN101881996B (en) | 2010-07-19 | 2010-07-19 | Parallel memory system check-point power consumption optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101881996B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915257A (en) * | 2012-09-28 | 2013-02-06 | 曙光信息产业(北京)有限公司 | TORQUE(tera-scale open-source resource and queue manager)-based parallel checkpoint execution method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1584787A (en) * | 2003-08-19 | 2005-02-23 | 英特尔公司 | Power conservation in the absence of AC power |
WO2008016162A1 (en) * | 2006-08-02 | 2008-02-07 | Kabushiki Kaisha Toshiba | Memory system and memory chip |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915257A (en) * | 2012-09-28 | 2013-02-06 | 曙光信息产业(北京)有限公司 | TORQUE(tera-scale open-source resource and queue manager)-based parallel checkpoint execution method |
CN102915257B (en) * | 2012-09-28 | 2017-02-08 | 曙光信息产业(北京)有限公司 | TORQUE(tera-scale open-source resource and queue manager)-based parallel checkpoint execution method |
Also Published As
Publication number | Publication date |
---|---|
CN101881996B (en) | 2011-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11345020B2 (en) | Robot cluster scheduling system | |
US8943353B2 (en) | Assigning nodes to jobs based on reliability factors | |
CN102111337B (en) | Method and system for task scheduling | |
Ananthanarayanan et al. | Why let resources idle? Aggressive cloning of jobs with Dolly | |
CN109240825B (en) | Elastic task scheduling method, device, equipment and computer readable storage medium | |
CN103067425A (en) | Creation method of virtual machine, management system of virtual machine and related equipment thereof | |
CN102958166A (en) | Resource allocation method and resource management platform | |
US20110107344A1 (en) | Multi-core apparatus and load balancing method thereof | |
CN108351783A (en) | The method and apparatus that task is handled in multinuclear digital information processing system | |
CN103593242A (en) | Resource sharing control system based on Yarn frame | |
CN102713854A (en) | Method and apparatus for saving and restoring container state | |
WO2014168913A1 (en) | Database management system with database hibernation and bursting | |
CN101713970A (en) | Method and systems for restarting a flight control system | |
US20200183703A1 (en) | Systems and methods for selecting a target host for migration of a virtual machine | |
US11169844B2 (en) | Virtual machine migration to multiple destination nodes | |
US20230325082A1 (en) | Method for setting up and expanding storage capacity of cloud without disruption of cloud services and electronic device employing method | |
CN105808346A (en) | Task scheduling method and device | |
CN105095112A (en) | Method and device for controlling caches to write and readable storage medium of non-volatile computer | |
CN101881996B (en) | Parallel memory system check-point power consumption optimization method | |
CN109783304B (en) | Energy-saving scheduling method and corresponding device for data center | |
CN103957229A (en) | Active updating method, device and server for physical machines in IaaS cloud system | |
US20190243673A1 (en) | System and method for timing out guest operating system requests from hypervisor level | |
JP2007328413A (en) | Method for distributing load | |
CN112328402A (en) | High-efficiency self-adaptive space-based computing platform architecture and implementation method thereof | |
WO2015111067A1 (en) | Dynamically patching kernels using storage data structures |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20110727 Termination date: 20160719 |