CN103645956A - Intelligent cluster load management method - Google Patents

Publication number
CN103645956A
Authority
CN
China
Prior art keywords
node
power
idle
state
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310695452.1A
Other languages
Chinese (zh)
Inventor
焦芬芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201310695452.1A
Publication of CN103645956A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Power Sources (AREA)

Abstract

The invention provides an intelligent cluster load management method. Intelligent load management in a cluster system is performed by running a shell script or a C program, which automatically powers some of the cluster's nodes on or off as the job load changes: when the job load is light, some nodes are powered off through the power management module; when the job load is heavy, some nodes are powered on through the power management module, helping the cluster administrator save energy. Compared with the prior art, the method is practical and easy to adopt; a large number of idle nodes can be powered off, saving the administrator of the cluster system a substantial amount in electricity costs.

Description

An intelligent cluster load management method
Technical field
The present invention relates to the field of computer application technology, and specifically to an intelligent cluster load management method.
Background
In a large cluster system, electricity accounts for a large share of the management cost. In a traditional cluster system every node is kept powered on regardless of whether the nodes are fully utilized, which wastes a great deal of electricity.
Most nodes in a cluster system have a "power management module". If node power could be switched off and on automatically according to the job load, the cluster administrator would save considerable labor and money. When the job load is light, some idle nodes are powered off automatically to save energy; when the job load is heavy, some powered-off nodes are powered on automatically to meet job demand. On this basis, an intelligent cluster load management method is provided.
Summary of the invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing an intelligent cluster load management method.
The technical solution of the present invention is implemented as follows. The intelligent cluster load management method comprises the following steps:
Step 1: set the configuration items in a configuration file. The configuration items are: the maximum number of powered-on idle nodes, the node idle duration, the maximum number of nodes per shutdown operation, and the polling interval;
Step 2: create a file that maps each node in the cluster system to the IP of its power management module, in the format:
<node name> <IP of the node's power management module>. When node state is to be obtained, the IP address of the power management module is obtained by reading this file;
Step 3: start the node-state polling daemon. At regular intervals the daemon reads the node-to-power-management-module IP file to obtain the IP of a power management module, sends a power-status query command to that IP, and stores the returned node power state in a file;
Step 4: start the power on/off decision daemon. At a fixed time interval it queries whether there are queued jobs, the idle/busy state of the nodes, the reserved nodes, and the number of powered-on (PowerOn) and powered-off (PowerOff) nodes. From these query results and the configuration item values in the configuration file it makes the power on/off decision. The decision is based on meeting the job demand in the cluster system: beyond what current jobs require, a certain number of redundant idle nodes are kept powered on, so that idle nodes can serve subsequently submitted jobs while earlier jobs have not yet released their resources. When the number of idle nodes exceeds the configured maximum number of powered-on idle nodes, some nodes are powered off to save energy.
Step 3 in detail: start the node-state polling daemon, which at a fixed interval performs the following:
A. obtain the node's power state through a power management module command, which returns the node's PowerOn/PowerOff state;
B. save the node power state to the file PowerState.txt.
Further, the node-state polling daemon is implemented as follows:
Step 1: read node-map.txt to obtain the IP of one node's power management module;
Step 2: if the read has reached the end of the file, wait for the polling interval and return to step 1;
Step 3: if the end of the file has not been reached, send the node power-status query command to the power management module;
Step 4: write the state returned by the command to PowerState.txt, then return to step 1.
Step 4 in detail: start the power on/off decision daemon, which at a fixed interval performs the following:
A. Power on nodes. Nodes need to be powered on in the following two cases:
(1) there are queued jobs: according to the resources the queued jobs require, find nodes among the PowerOff nodes that satisfy the job requirements and power them on;
(2) the actual number of idle nodes is less than the maximum number of powered-on idle nodes: number of nodes to power on = min((maximum number of powered-on idle nodes - actual number of idle nodes), (number of PowerOff nodes));
B. Power off nodes. When the actual number of idle nodes is greater than the maximum number of powered-on idle nodes, nodes are powered off subject to two conditions: first, a node to be powered off must have been idle for longer than the configured node idle duration; second, the number of idle nodes powered off in one operation must be less than or equal to the configured maximum number of nodes per shutdown operation. Here, actual number of idle nodes = number of idle nodes - number of reserved nodes.
Compared with the prior art, the present invention has the following beneficial effects:
The intelligent cluster load management method of the present invention performs intelligent load management in a cluster system by running a shell script or a C program. It automatically powers some of the cluster's nodes on or off as the job load changes: when the job load is light, some nodes are powered off through the power management module; when the job load is heavy, some nodes are powered on through the power management module, helping the cluster administrator save energy. Node power is switched automatically according to the jobs and resource trends of the cluster system, so that a large number of idle nodes are powered off under light load and powered-off nodes are brought back under heavy load. Because electricity accounts for a large proportion of the management cost of a cluster system, the method can save the cluster administrator a substantial amount in electricity costs. It is practical and easy to adopt.
Brief description of the drawings
Figure 1 is a flowchart of the node-state polling daemon of the present invention.
Figure 2 is a flowchart of the power on/off decision daemon of the present invention.
Detailed description
The intelligent cluster load management method of the present invention is described in detail below with reference to the accompanying drawings.
To solve the above problems, an intelligent cluster load management method based on the Linux shell or the C language is provided, comprising the following steps:
Step 1: set the configuration items in a configuration file. The configuration items are: the maximum number of powered-on idle nodes, the node idle duration, the maximum number of nodes per shutdown operation, and the polling interval.
The specific procedure is as follows:
Write the configuration file InteliLoad.cfg with the following content:
# Maximum number of powered-on idle nodes; set according to cluster size (empirical value); unit: nodes.
MaxIdleNodeNum 10
# Node idle duration (empirical value), in seconds.
NodeIdleDuration 60
# Maximum number of nodes per shutdown operation (empirical value); unit: nodes.
MaxOperatingNum 10
# Polling interval: how often the job load and node state are checked (empirical value), in seconds.
PollIterval 120
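The patent says the method runs as a shell script or a C program and does not show how the configuration file is read. As an illustration only, the following Python sketch parses a file in the format above, assuming each setting is a key followed by whitespace and an integer value, and that lines starting with `#` are comments. The function name `parse_config` and the `SAMPLE` text are hypothetical; the key names, including the spelling `PollIterval`, are copied from the listing.

```python
from typing import Dict


def parse_config(text: str) -> Dict[str, int]:
    """Parse 'Key Value' lines; blank lines and '#' comment lines are skipped."""
    settings: Dict[str, int] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected 'Key Value', got: {line!r}")
        key, value = parts
        settings[key] = int(value)  # all four items in the listing are integers
    return settings


# Hypothetical sample mirroring the InteliLoad.cfg listing.
SAMPLE = """\
# Maximum number of powered-on idle nodes
MaxIdleNodeNum 10
# Node idle duration, in seconds
NodeIdleDuration 60
# Maximum number of nodes per shutdown operation
MaxOperatingNum 10
# Polling interval, in seconds (key spelled as in the listing)
PollIterval 120
"""

config = parse_config(SAMPLE)
```

Keeping every value an integer matches the units given in the comments (node counts and seconds); a real deployment might also want range checks, which are omitted here.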
Step 2: create a file that maps each node in the cluster system to the IP of its power management module, in the format:
<node name> <IP of the node's power management module>. When node state is to be obtained, the IP address of the power management module is obtained by reading this file.
The specific procedure is:
Write the file node_map.txt, which maps each node in the cluster system to the IP of that node's power management module, in the following format:
# node name    IP of the node's power management module
Node1 10.156.3.5
Node2 10.156.3.6
Node3 10.156.3.7
……
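To make the mapping file concrete, here is a minimal Python sketch that reads the two-column format shown above into a dictionary. It assumes whitespace-separated columns and `#` comment lines; the function name `parse_node_map` and the validation step are illustrative additions, not part of the patent.

```python
import ipaddress
from typing import Dict


def parse_node_map(text: str) -> Dict[str, str]:
    """Map node name -> power management module IP from 'name ip' lines."""
    mapping: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected 'name ip', got: {line!r}")
        name, ip = parts
        ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
        mapping[name] = ip
    return mapping


# The three entries listed in the example file.
SAMPLE_MAP = """\
# node name    IP of the node's power management module
Node1 10.156.3.5
Node2 10.156.3.6
Node3 10.156.3.7
"""

node_map = parse_node_map(SAMPLE_MAP)
```

Validating each address at load time means a typo in the file is reported once, at startup, rather than as a failed query on every polling round.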
Step 3: start the node-state polling daemon. At regular intervals the daemon reads the node-to-power-management-module IP file to obtain the IP of a power management module, sends a power-status query command to that IP, and stores the returned node power state in a file.
Step 4: start the power on/off decision daemon. At a fixed time interval it queries whether there are queued jobs, the idle/busy state of the nodes, the reserved nodes, and the number of powered-on (PowerOn) and powered-off (PowerOff) nodes. From these query results and the configuration item values in the configuration file it makes the power on/off decision. The decision is based on meeting the job demand in the cluster system: beyond what current jobs require, a certain number of redundant idle nodes are kept powered on, so that idle nodes can serve subsequently submitted jobs while earlier jobs have not yet released their resources. When the number of idle nodes exceeds the configured maximum number of powered-on idle nodes, some nodes are powered off to save energy.
As shown in Figure 1, step 3 in detail is: start the node-state polling daemon, which at a fixed interval performs the following:
A. obtain the node's power state through a power management module command, which returns the node's PowerOn/PowerOff state.
B. save the node power state to the file PowerState.txt.
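The patent does not name the protocol or command used to talk to the power management module. One common choice for server management controllers is IPMI, so the sketch below assumes, purely for illustration, that the module is reachable with the `ipmitool` command-line client and that `ipmitool ... chassis power status` prints a line such as "Chassis Power is on". The function names, the credentials parameters, and the `Unknown` fallback state are hypothetical.

```python
import subprocess
from typing import List


def build_status_command(bmc_ip: str, user: str, password: str) -> List[str]:
    """Build an ipmitool invocation (assumed tool) that asks for chassis power status."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_ip,
            "-U", user, "-P", password, "chassis", "power", "status"]


def parse_power_state(output: str) -> str:
    """Translate the tool's reply into the PowerOn/PowerOff labels used in the text."""
    text = output.strip().lower()
    if text.endswith("is on"):
        return "PowerOn"
    if text.endswith("is off"):
        return "PowerOff"
    return "Unknown"  # hypothetical label for unreachable or unparsable replies


def query_power_state(bmc_ip: str, user: str, password: str,
                      timeout: float = 10.0) -> str:
    """Run the query; any failure (tool missing, timeout, non-zero exit) is 'Unknown'."""
    try:
        result = subprocess.run(build_status_command(bmc_ip, user, password),
                                capture_output=True, text=True,
                                timeout=timeout, check=True)
    except (OSError, subprocess.SubprocessError):
        return "Unknown"
    return parse_power_state(result.stdout)
```

Separating command construction and reply parsing from the actual process call keeps the two pure functions easy to check without hardware; a site using a different management interface would replace only `build_status_command` and `parse_power_state`.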
The node-state polling daemon is implemented as follows:
Step 1: read node-map.txt to obtain the IP of one node's power management module.
Step 2: if the read has reached the end of the file, wait for the polling interval and return to step 1.
Step 3: if the end of the file has not been reached, send the node power-status query command to the power management module.
Step 4: write the state returned by the command to PowerState.txt, then return to step 1.
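The four polling steps can be sketched roughly as follows. This is an illustration under stated assumptions rather than the patented implementation: the query is passed in as a function so the loop can be exercised without hardware, each round queries all nodes and then rewrites the state file in one go (the steps above write per node), and the one-line-per-node `name state` layout of PowerState.txt is a guess, since the patent does not specify the file format. All function and parameter names are hypothetical.

```python
import time
from typing import Callable, Dict, Optional


def poll_once(node_map: Dict[str, str],
              query: Callable[[str], str]) -> Dict[str, str]:
    """Query every node's power management module once (steps 1, 3 and 4)."""
    return {name: query(ip) for name, ip in node_map.items()}


def format_power_state(states: Dict[str, str]) -> str:
    """Render states as 'name state' lines (assumed PowerState.txt layout)."""
    return "".join(f"{name} {state}\n" for name, state in states.items())


def run_poller(node_map: Dict[str, str],
               query: Callable[[str], str],
               state_path: str,
               poll_interval: float,
               max_rounds: Optional[int] = None) -> None:
    """Repeat: poll all nodes, rewrite the state file, wait the polling interval.

    max_rounds is a testing convenience; None means run until stopped.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        states = poll_once(node_map, query)
        with open(state_path, "w", encoding="utf-8") as handle:
            handle.write(format_power_state(states))
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            time.sleep(poll_interval)  # step 2: wait before the next pass
```

A call such as `run_poller(node_map, query, "PowerState.txt", 120)` would correspond to the 120-second interval in the sample configuration. Rewriting the whole file each round keeps stale entries from lingering if a node is removed from the map.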
As shown in Figure 2, step 4 in detail is: start the power on/off decision daemon, which at a fixed interval performs the following:
A. Power on nodes. Nodes need to be powered on in the following two cases:
(1) there are queued jobs: according to the resources the queued jobs require, find nodes among the PowerOff nodes that satisfy the job requirements and power them on.
(2) the actual number of idle nodes is less than the maximum number of powered-on idle nodes: number of nodes to power on = min((maximum number of powered-on idle nodes - actual number of idle nodes), (number of PowerOff nodes)).
B. Power off nodes. When the actual number of idle nodes is greater than the maximum number of powered-on idle nodes, nodes are powered off subject to two conditions: first, a node to be powered off must have been idle for longer than the configured node idle duration; second, the number of idle nodes powered off in one operation must be less than or equal to the configured maximum number of nodes per shutdown operation. Here, actual number of idle nodes = number of idle nodes - number of reserved nodes.
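The counting rules in A(2) and B can be sketched in Python as below. Case A(1), matching queued jobs to powered-off nodes, depends on the job scheduler and is not covered. Two choices in the sketch are assumptions that the text does not state: shutdown stops once the idle count is back down to the configured maximum (only the surplus is powered off), and the longest-idle nodes are chosen first. The idle list passed to `nodes_to_power_off` is assumed to already exclude reserved nodes. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IdleNode:
    name: str
    idle_seconds: int  # how long the node has been idle


def actual_idle_count(idle_nodes: int, reserved_nodes: int) -> int:
    """Actual number of idle nodes = idle nodes - reserved nodes."""
    return idle_nodes - reserved_nodes


def nodes_to_power_on(max_idle: int, actual_idle: int, powered_off: int) -> int:
    """Case A(2): min(max idle - actual idle, number of PowerOff nodes)."""
    if actual_idle >= max_idle:
        return 0
    return min(max_idle - actual_idle, powered_off)


def nodes_to_power_off(idle: List[IdleNode], max_idle: int,
                       node_idle_duration: int, max_operating: int) -> List[str]:
    """Case B: pick nodes idle longer than the threshold, capped per operation."""
    surplus = len(idle) - max_idle
    if surplus <= 0:
        return []
    eligible = sorted((n for n in idle if n.idle_seconds > node_idle_duration),
                      key=lambda n: n.idle_seconds, reverse=True)
    limit = min(surplus, max_operating)  # surplus cap is an assumption
    return [n.name for n in eligible[:limit]]
```

For example, with a maximum of 10 powered-on idle nodes, 4 actual idle nodes and 3 powered-off nodes, `nodes_to_power_on(10, 4, 3)` gives 3: the shortfall is 6, but only 3 nodes are available to power on.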
The foregoing is merely an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. An intelligent cluster load management method, characterized by comprising the following steps:
Step 1: set the configuration items in a configuration file, the configuration items being: the maximum number of powered-on idle nodes, the node idle duration, the maximum number of nodes per shutdown operation, and the polling interval;
Step 2: create a file that maps each node in the cluster system to the IP of its power management module, in the format:
<node name> <IP of the node's power management module>, wherein, when node state is to be obtained, the IP address of the power management module is obtained by reading this file;
Step 3: start the node-state polling daemon, wherein at regular intervals the daemon reads the node-to-power-management-module IP file to obtain the IP of a power management module, sends a power-status query command to that IP, and stores the returned node power state in a file;
Step 4: start the power on/off decision daemon, which at a fixed time interval queries whether there are queued jobs, the idle/busy state of the nodes, the reserved nodes, and the number of powered-on (PowerOn) and powered-off (PowerOff) nodes, and makes the power on/off decision from these query results and the configuration item values in the configuration file; the decision is based on meeting the job demand in the cluster system: beyond what current jobs require, a certain number of redundant idle nodes are kept powered on, so that idle nodes can serve subsequently submitted jobs while earlier jobs have not yet released their resources; when the number of idle nodes exceeds the configured maximum number of powered-on idle nodes, some nodes are powered off to save energy.
2. The intelligent cluster load management method according to claim 1, characterized in that step 3 specifically comprises: starting the node-state polling daemon, which at a fixed interval performs the following:
A. obtain the node's power state through a power management module command, which returns the node's PowerOn/PowerOff state;
B. save the node power state to the file PowerState.txt.
3. The intelligent cluster load management method according to claim 2, characterized in that the node-state polling daemon is implemented as follows:
Step 1: read node-map.txt to obtain the IP of one node's power management module;
Step 2: if the read has reached the end of the file, wait for the polling interval and return to step 1;
Step 3: if the end of the file has not been reached, send the node power-status query command to the power management module;
Step 4: write the state returned by the command to PowerState.txt, then return to step 1.
4. The intelligent cluster load management method according to claim 1, characterized in that step 4 specifically comprises: starting the power on/off decision daemon, which at a fixed interval performs the following:
A. Power on nodes. Nodes need to be powered on in the following two cases:
(1) there are queued jobs: according to the resources the queued jobs require, find nodes among the PowerOff nodes that satisfy the job requirements and power them on;
(2) the actual number of idle nodes is less than the maximum number of powered-on idle nodes: number of nodes to power on = min((maximum number of powered-on idle nodes - actual number of idle nodes), (number of PowerOff nodes));
B. Power off nodes. When the actual number of idle nodes is greater than the maximum number of powered-on idle nodes, nodes are powered off subject to two conditions: first, a node to be powered off must have been idle for longer than the configured node idle duration; second, the number of idle nodes powered off in one operation must be less than or equal to the configured maximum number of nodes per shutdown operation. Here, actual number of idle nodes = number of idle nodes - number of reserved nodes.
CN201310695452.1A 2013-12-18 2013-12-18 Intelligent cluster load management method Pending CN103645956A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310695452.1A CN103645956A (en) 2013-12-18 2013-12-18 Intelligent cluster load management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310695452.1A CN103645956A (en) 2013-12-18 2013-12-18 Intelligent cluster load management method

Publications (1)

Publication Number Publication Date
CN103645956A true CN103645956A (en) 2014-03-19

Family

ID=50251177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310695452.1A Pending CN103645956A (en) 2013-12-18 2013-12-18 Intelligent cluster load management method

Country Status (1)

Country Link
CN (1) CN103645956A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100083015A1 (en) * 2008-10-01 2010-04-01 Hitachi, Ltd. Virtual pc management method, virtual pc management system, and virtual pc management program
CN101661324A (en) * 2009-07-21 2010-03-03 浪潮电子信息产业股份有限公司 Energy-saving method of multipath server
US20120233474A1 (en) * 2011-03-10 2012-09-13 Sanken Electric Co., Ltd. Power supply and control method thereof
CN102520783A (en) * 2011-11-04 2012-06-27 浪潮电子信息产业股份有限公司 Method capable of realizing energy saving of smart rack and rack system
CN102622273A (en) * 2012-02-23 2012-08-01 中国人民解放军国防科学技术大学 Self-learning load prediction based cluster on-demand starting method
JP2013206162A (en) * 2012-03-28 2013-10-07 Nec Corp Power consumption control device, information processing device, power consumption control method, and program

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897133A (en) * 2017-02-27 2017-06-27 郑州云海信息技术有限公司 A kind of implementation method based on the management cluster load of PBS job schedulings
CN106897133B (en) * 2017-02-27 2020-09-29 苏州浪潮智能科技有限公司 Implementation method for managing cluster load based on PBS job scheduling
CN107480013A (en) * 2017-08-11 2017-12-15 网宿科技股份有限公司 A kind of method and apparatus of calculation procedure redundancy
CN108279982A (en) * 2018-02-27 2018-07-13 郑州云海信息技术有限公司 Pbs resources and hadoop method for managing resource, system and equipment
CN108279982B (en) * 2018-02-27 2021-11-09 郑州云海信息技术有限公司 Method, system and equipment for managing pbs resources and hadoop resources
CN110022246A (en) * 2019-04-15 2019-07-16 苏州浪潮智能科技有限公司 Distributed type assemblies equipment power dissipation monitoring method, device, system and associated component
CN110764872A (en) * 2019-10-21 2020-02-07 深圳金蝶账无忧网络科技有限公司 Automatic tax declaring method and system based on cloud service architecture and related equipment
CN111741130A (en) * 2020-07-31 2020-10-02 苏州交驰人工智能研究院有限公司 Server management method, device, equipment and storage medium
CN111930502A (en) * 2020-07-31 2020-11-13 苏州交驰人工智能研究院有限公司 Server management method, device, equipment and storage medium
CN116449935A (en) * 2023-06-02 2023-07-18 工业富联(佛山)创新中心有限公司 Cluster energy-saving management method, electronic equipment and computer storage medium
CN116449935B (en) * 2023-06-02 2023-11-21 工业富联(佛山)创新中心有限公司 Cluster energy-saving management method, electronic equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140319

WD01 Invention patent application deemed withdrawn after publication