CN103645956A

CN103645956A - Intelligent cluster load management method

Info

Publication number: CN103645956A
Application number: CN201310695452.1A
Authority: CN
Inventors: 焦芬芳
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2013-12-18
Filing date: 2013-12-18
Publication date: 2014-03-19

Abstract

The invention provides an intelligent cluster load management method. Intelligent load management in a cluster system is accomplished by running a shell script or a C program that automatically powers some of the cluster's nodes on or off as the job load changes. When the job load is light, the power supplies of some nodes are switched off through their power management modules; when the job load is heavy, the power supplies of some nodes are switched on through their power management modules, helping the cluster administrator save energy. Compared with the prior art, the method is highly practical and easy to adopt: a large number of idle nodes can be powered off, saving the administrator of the cluster system a large amount in electricity costs.

Description

A method for intelligent cluster load management

Technical field

The present invention relates to the field of computer application technology, and specifically to a method for intelligent cluster load management.

Background art

In a large-scale cluster system, electricity accounts for a large share of the management cost. In a traditional cluster system every node is kept powered on regardless of whether the nodes are fully utilized, so a great deal of electricity is wasted.

Most nodes in a cluster system have a "power management module". If node power could be switched off and on automatically according to the job load, the cluster administrator would save considerable labor and money. When the job load is light, the power of some idle nodes is switched off automatically to save energy; when the job load is heavy, some powered-off nodes are switched on automatically to meet job demand. On this basis, a method for intelligent cluster load management is provided.

Summary of the invention

The technical task of the present invention is to remedy the deficiencies of the prior art by providing a method for intelligent cluster load management.

The technical solution of the present invention is implemented as follows. The method for intelligent cluster load management comprises the following steps:

Step 1: Set the configuration items in the configuration file. The configuration items are: the maximum number of powered-on idle nodes, the node idle duration, the maximum number of nodes per shutdown operation, and the polling interval;

Step 2: Create the mapping file between the nodes in the cluster system and the IPs of their power management modules, in the format:

node name    power management module IP of the node

The IP address of the power management module is obtained by reading this file when the node state is queried;

Step 3: Start the node-state polling daemon. At regular intervals this daemon sends a node power status query command to each node's power management module: it reads the mapping file of nodes and power management module IPs to obtain the module's IP, sends the power status query command to that IP, obtains the node's power state, and saves it to a file;

Step 4: Start the power on/off decision daemon. At a fixed time interval it queries whether there are queued jobs, the idle/busy state of the nodes, the reserved nodes, and the numbers of nodes that are powered on (PowerOn) and powered off (PowerOff). The power on/off decision is made from these query results and the configuration item values in the configuration file. The decision is based on meeting the demand of the jobs in the cluster system: while job demand is met, a certain number of redundant idle nodes are kept powered on, so that the idle nodes can satisfy subsequently submitted jobs before earlier jobs have released their resources. When the number of idle nodes exceeds the configured maximum number of powered-on idle nodes, some nodes are powered off to save energy.

The detailed procedure of step 3 is as follows: start the node-state polling daemon, which at a fixed interval performs:

A. Obtain the node's power state by issuing a command to the node's power management module, which returns the node's PowerOn/PowerOff state;

B. Save the node's power state to the file PowerState.txt.

Further, the node-state polling daemon is implemented as follows:

Step 1: Read node-map.txt to obtain the IP of one node's power management module;

Step 2: If the read has reached the end of the file, wait for the polling interval and return to step 1;

Step 3: If the read has not reached the end of the file, send the node power status query command to the power management module;

Step 4: Write the state returned by the command to PowerState.txt, then return to step 1.

The detailed procedure of step 4 is as follows: start the power on/off decision daemon, which at a fixed interval performs:

A. Power on nodes. Nodes need to be powered on in the following two cases:

(1) There are queued jobs: according to the resources the queued jobs require, find nodes among the PowerOff nodes that meet the job requirements and power them on;

(2) The actual number of idle nodes is less than the maximum number of powered-on idle nodes: number of nodes to power on = min((maximum number of powered-on idle nodes - actual number of idle nodes), (number of PowerOff nodes));

B. Power off nodes. When the actual number of idle nodes is greater than the maximum number of powered-on idle nodes, nodes are powered off subject to two conditions: first, the idle time of a node to be powered off must be greater than the configured node idle duration; second, the number of idle nodes powered off in one operation must be less than or equal to the configured maximum number of nodes per shutdown operation. Here, actual number of idle nodes = number of idle nodes - number of reserved nodes.
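The counting rules in A(2) and B can be sketched in Python as follows. This is a minimal illustration and not the patented shell or C implementation. The function names, the (name, idle seconds) representation, the choice to shut down the longest-idle nodes first, and the cap that stops shutdowns once the surplus is gone are all assumptions; the source only states the two shutdown conditions and the min() formula.

```python
from typing import List, Tuple


def nodes_to_power_on(max_idle: int, idle: int, reserved: int,
                      powered_off: int) -> int:
    """Case A(2): number of nodes to power on.

    actual idle = idle - reserved, as defined in the text.
    """
    actual_idle = idle - reserved
    if actual_idle >= max_idle:
        return 0
    return min(max_idle - actual_idle, powered_off)


def nodes_to_power_off(actual_idle_nodes: List[Tuple[str, float]],
                       max_idle: int,
                       min_idle_seconds: float,
                       max_per_operation: int) -> List[str]:
    """Case B: pick idle nodes to power off.

    actual_idle_nodes holds (node name, seconds idle) for idle nodes that
    are not reserved (hypothetical representation).
    """
    surplus = len(actual_idle_nodes) - max_idle
    if surplus <= 0:
        return []
    # Condition 1: idle time must exceed the configured idle duration.
    eligible = [(name, secs) for name, secs in actual_idle_nodes
                if secs > min_idle_seconds]
    # Assumption: prefer the nodes that have been idle the longest.
    eligible.sort(key=lambda item: item[1], reverse=True)
    # Condition 2: at most max_per_operation nodes per shutdown operation.
    # Assumption: never shut down more than the surplus.
    limit = min(surplus, max_per_operation)
    return [name for name, _ in eligible[:limit]]


# 6 idle nodes, 2 of them reserved, limit of 10, 20 nodes powered off.
print(nodes_to_power_on(max_idle=10, idle=6, reserved=2, powered_off=20))
```

With the sample configuration values (limit of 10 idle nodes), 4 actual idle nodes and 20 powered-off nodes give min(10 - 4, 20) = 6 nodes to power on.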

Compared with the prior art, the present invention has the following beneficial effects:

The method for intelligent cluster load management of the present invention performs intelligent load management in a cluster system by running a shell script or a C program, automatically powering some of the cluster's nodes on or off as the job load changes. When the job load is light, the power of some nodes is switched off through the power management module; when the job load is heavy, the power of some nodes is switched on through the power management module, helping the cluster administrator save energy. Node power in the cluster can be switched on and off automatically according to the trend of jobs and resources in the cluster system: when the job load is light, a large number of idle nodes are powered off; when the job load is heavy, powered-off nodes are powered on again. Because electricity accounts for a large proportion of the management cost of a cluster system, the method saves cluster administrators a large amount in electricity costs. The method is practical and easy to adopt.

Brief description of the drawings

Figure 1 is the implementation flowchart of the node-state polling daemon of the present invention.

Figure 2 is the implementation flowchart of the power on/off decision daemon of the present invention.

Detailed description of the embodiments

The method for intelligent cluster load management of the present invention is described in detail below with reference to the accompanying drawings.

To solve the above problem, a method for intelligent cluster load management based on Linux shell or the C language is provided, comprising the following steps:

Step 1: Set the configuration items in the configuration file. The configuration items are: the maximum number of powered-on idle nodes, the node idle duration, the maximum number of nodes per shutdown operation, and the polling interval.

The specific procedure is as follows:

Write the configuration file InteliLoad.cfg with the following content:

# Maximum number of powered-on idle nodes; an empirical value set according to cluster size. Unit: nodes.

MaxIdleNodeNum 10

# Node idle duration; an empirical value. Unit: seconds.

NodeIdleDuration 60

# Maximum number of nodes per shutdown operation; an empirical value. Unit: nodes.

MaxOperatingNum 10

# Polling interval: how often the job load and node state are checked; an empirical value. Unit: seconds.

PollIterval 120
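A configuration file of this shape could be read as sketched below in Python. The patent itself names shell or C and gives no parsing code, so the function name parse_config, the fallback to the sample values as defaults, and the error handling are assumptions made for illustration. The key spellings, including PollIterval, are copied from the sample file.

```python
from typing import Dict

# Defaults copied from the sample InteliLoad.cfg (assumption: the sample
# values double as fallbacks when an item is missing).
DEFAULTS: Dict[str, int] = {
    "MaxIdleNodeNum": 10,
    "NodeIdleDuration": 60,
    "MaxOperatingNum": 10,
    "PollIterval": 120,  # spelling as in the sample file
}


def parse_config(text: str) -> Dict[str, int]:
    """Parse 'Key value' lines; '#' starts a comment; blank lines are skipped."""
    config = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"malformed line: {raw!r}")
        key, value = parts
        if key not in DEFAULTS:
            raise ValueError(f"unknown configuration item: {key}")
        config[key] = int(value)
    return config


sample = """
# Maximum number of powered-on idle nodes
MaxIdleNodeNum 10
NodeIdleDuration 60
MaxOperatingNum 10
PollIterval 120
"""
config = parse_config(sample)
print(config)
```

Rejecting unknown keys is a design choice of this sketch: a misspelled item then fails loudly instead of silently falling back to a default.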

Step 2: Create the mapping file between the nodes in the cluster system and the IPs of their power management modules. Each line gives a node name followed by the IP of that node's power management module; the IP address of the power management module is obtained by reading this file when the node state is queried.

The specific procedure is as follows:

Write the mapping file node_map.txt between the nodes in the cluster system and the IPs of their power management modules, in the following format:

# node name    power management module IP of the node

Node1 10.156.3.5

Node2 10.156.3.6

Node3 10.156.3.7

……
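Reading the mapping file could look like the following Python sketch. The function name parse_node_map and the validation with the standard ipaddress module are assumptions; the source specifies only the two-column layout.

```python
import ipaddress
from typing import Dict


def parse_node_map(text: str) -> Dict[str, str]:
    """Map node name -> power management module IP.

    Lines are 'name ip'; '#' starts a comment; blank lines are skipped.
    """
    node_to_ip: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"malformed line: {raw!r}")
        name, ip = parts
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        node_to_ip[name] = ip
    return node_to_ip


sample = """# node name    power management module IP
Node1 10.156.3.5
Node2 10.156.3.6
Node3 10.156.3.7
"""
node_map = parse_node_map(sample)
print(node_map["Node2"])
```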

Step 3: Start the node-state polling daemon. At regular intervals this daemon sends a node power status query command to each node's power management module: it reads the mapping file of nodes and power management module IPs to obtain the module's IP, sends the power status query command to that IP, obtains the node's power state, and saves it to a file.

As shown in Figure 1, the detailed procedure of step 3 is as follows: start the node-state polling daemon, which at a fixed interval performs:

A. Obtain the node's power state by issuing a command to the node's power management module, which returns the node's PowerOn/PowerOff state.

B. Save the node's power state to the file PowerState.txt.
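The source does not say which protocol or command the power management module accepts. Purely as an assumption, the Python sketch below treats the module as an IPMI baseboard management controller queried with the ipmitool command `chassis power status`, whose output reads "Chassis Power is on" or "Chassis Power is off". The function names and the credential parameters are hypothetical.

```python
import subprocess
from typing import List


def build_status_command(bmc_ip: str, user: str, password: str) -> List[str]:
    """Assumption: the power management module speaks IPMI over LAN."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_ip,
            "-U", user, "-P", password, "chassis", "power", "status"]


def parse_power_state(output: str) -> str:
    """Translate ipmitool output into the PowerOn/PowerOff labels of the text."""
    text = output.strip().lower()
    if text.endswith("is on"):
        return "PowerOn"
    if text.endswith("is off"):
        return "PowerOff"
    raise ValueError(f"unrecognized power status output: {output!r}")


def query_power_state(bmc_ip: str, user: str, password: str,
                      timeout: float = 10.0) -> str:
    """Run the query; needs ipmitool installed and a reachable module."""
    result = subprocess.run(build_status_command(bmc_ip, user, password),
                            capture_output=True, text=True,
                            timeout=timeout, check=True)
    return parse_power_state(result.stdout)


# Parsing can be exercised without any hardware:
print(parse_power_state("Chassis Power is on\n"))
```

Keeping the parsing separate from the subprocess call lets the state mapping be checked offline; a different power management interface would only need its own command builder and parser.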

The node-state polling daemon is implemented as follows:

Step 1: Read node-map.txt to obtain the IP of one node's power management module.

Step 2: If the read has reached the end of the file, wait for the polling interval and return to step 1.

Step 3: If the read has not reached the end of the file, send the node power status query command to the power management module.

Step 4: Write the state returned by the command to PowerState.txt, then return to step 1.
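The four steps above can be sketched as a loop in Python. Several points are assumptions: the query is passed in as a callable so that no particular power management command is implied, PowerState.txt is written as one "name state" line per node (the source does not give its layout), and the file is rewritten once per pass instead of after every node. The max_rounds and sleep parameters exist only to make the loop testable.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def poll_once(node_map_lines: Iterable[str],
              query: Callable[[str], str]) -> Dict[str, str]:
    """One pass over the mapping file: steps 1 and 3 for every entry."""
    states: Dict[str, str] = {}
    for raw in node_map_lines:               # step 1: read the next entry
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, ip = line.split()
        states[name] = query(ip)             # step 3: query that module
    return states                            # loop exit = end of file (step 2)


def write_states(path: str, states: Dict[str, str]) -> None:
    """Step 4: persist the states (hypothetical 'name state' layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        for name, state in states.items():
            handle.write(f"{name} {state}\n")


def run_daemon(map_path: str, state_path: str,
               query: Callable[[str], str], poll_interval: float,
               max_rounds: Optional[int] = None,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Repeat: poll every node, save the states, wait for the polling interval."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        with open(map_path, encoding="utf-8") as handle:
            states = poll_once(handle, query)
        write_states(state_path, states)
        sleep(poll_interval)                 # step 2: wait, then start over
        rounds += 1


# In-memory demonstration with a stand-in query function.
demo = poll_once(["Node1 10.156.3.5", "Node2 10.156.3.6"],
                 lambda ip: "PowerOn" if ip.endswith(".5") else "PowerOff")
print(demo)
```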

As shown in Figure 2, the detailed procedure of step 4 is as follows: start the power on/off decision daemon, which at a fixed interval performs:

A. Power on nodes. Nodes need to be powered on in the following two cases:

(1) There are queued jobs: according to the resources the queued jobs require, find nodes among the PowerOff nodes that meet the job requirements and power them on.

(2) The actual number of idle nodes is less than the maximum number of powered-on idle nodes: number of nodes to power on = min((maximum number of powered-on idle nodes - actual number of idle nodes), (number of PowerOff nodes)).
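Case (1) does not specify how job requirements are matched against powered-off nodes. The Python sketch below assumes a deliberately simple resource model in which each queued job asks for a number of nodes with a minimum core count per node. The QueuedJob type, its fields, the node names, and the first-fit selection are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueuedJob:
    name: str
    nodes_needed: int
    cores_per_node: int  # hypothetical resource model


def select_nodes_for_jobs(jobs: List[QueuedJob],
                          powered_off: Dict[str, int]) -> List[str]:
    """Pick PowerOff nodes (name -> core count) that satisfy queued jobs.

    First-fit in file order; a job that cannot be fully satisfied is skipped.
    """
    available = dict(powered_off)
    selected: List[str] = []
    for job in jobs:
        matches = [name for name, cores in available.items()
                   if cores >= job.cores_per_node]
        chosen = matches[:job.nodes_needed]
        if len(chosen) < job.nodes_needed:
            continue  # not enough suitable PowerOff nodes for this job
        for name in chosen:
            del available[name]
        selected.extend(chosen)
    return selected


off_nodes = {"Node4": 16, "Node5": 32, "Node6": 32}
queue = [QueuedJob("a", nodes_needed=2, cores_per_node=32),
         QueuedJob("b", nodes_needed=1, cores_per_node=16)]
print(select_nodes_for_jobs(queue, off_nodes))
```

A real scheduler would supply richer requirements such as memory or queue membership; only the shape of the search over PowerOff nodes is meant to carry over.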

The foregoing is merely an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for intelligent cluster load management, characterized by comprising the following steps:

2. The method for intelligent cluster load management according to claim 1, characterized in that the detailed procedure of step 3 is: start the node-state polling daemon, which at a fixed interval performs:

B. Save the node's power state to the file PowerState.txt.

3. The method for intelligent cluster load management according to claim 2, characterized in that the node-state polling daemon is implemented as follows:

Step 4: Write the state returned by the command to PowerState.txt, then return to step 1.

4. The method for intelligent cluster load management according to claim 1, characterized in that the detailed procedure of step 4 is: start the power on/off decision daemon, which at a fixed interval performs:

A. Power on nodes. Nodes need to be powered on in the following two cases: