Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for independent, autonomous, distributed off-peak power-on of servers, which is decentralized, reduces cost, improves reliability, and lowers the failure rate. On this basis, a system adopting the method for independent, autonomous, distributed off-peak power-on of servers is further provided.
To this end, the invention provides a method for independent, autonomous, distributed off-peak power-on of servers, which comprises the following steps:
step S1, acquiring a true random number through a random number generator, and mapping the range of the true random number through data processing to obtain a variable factor in the closed interval [0,1];
step S2, calculating the off-peak power-on delay time of each server in the clustered data center according to the variable factor;
step S3, executing a delay action according to the off-peak power-on delay time while starting a timing count, and, once the current server reaches its off-peak power-on delay time, calling the embedded system to trigger the mainboard of the server through the GPIO interface to power it on.
A further development of the invention is that said step S1 comprises the following substeps:
step S101, initializing floating-point variable factors;
step S102, performing a hash calculation on the random asynchronous event;
step S103, assigning the value of the hash calculation to the variable factor;
step S104, performing a normalization operation on the variable factor to obtain a variable factor $\text{normalized}_{\text{rand}}$ in the closed interval [0,1].
In a further improvement of the present invention, in the step S102, the random asynchronous event includes the combined information of an asynchronous interrupt, an asynchronous driver event, and the kernel clock.
The invention is further improved in that the step S2 calculates the off-peak power-on delay time $T_i$ of each server in the clustered data center by the formula $T_i = \mathrm{OP}_{\mathrm{round}}(\text{normalized}_{\text{rand}} \times A)$, wherein $i$ represents the serial number of a server in the clustered data center, $\mathrm{OP}_{\mathrm{round}}$ represents a rounding operation function, and $A$ represents a preconfigured maximum delay number.
In a further development of the invention, the preconfigured maximum delay number A is an integer between 0 and $2^{23}$, used to represent the overall sampling delay time of all servers of the clustered data center.
The invention is further improved in that a maximum delay number configuration table is preset, the maximum delay number configuration table records the preconfigured maximum delay number A against the number of servers, and in the user configuration stage the preconfigured maximum delay number A is obtained automatically by entering the current number of servers and querying the maximum delay number configuration table.
The invention is further improved in that a convergence coefficient B is added in the maximum delay number configuration table, and the convergence coefficient B is set as 1 by default; when the convergence coefficient B increases, controlling the preset maximum delay number A to decrease according to the increasing proportion of the convergence coefficient B; and when the convergence coefficient B is reduced, controlling the preconfigured maximum delay number A to be increased according to the reduction proportion of the convergence coefficient B.
In a further improvement of the present invention, in the step S3, the timing control of the complete power-on process includes the following sub-steps:
step S301, sequentially carrying out AC power-on, uboot power-on and embedded operating system power-on within fixed starting time to complete BMC power-on starting;
step S302, starting peak-shifting control, executing the delay action according to the off-peak power-on delay time, and calling the embedded system to power on and start the main system through a server start instruction.
The invention also provides a system for independent, autonomous, distributed off-peak power-on of servers, which adopts the method for independent, autonomous, distributed off-peak power-on of servers described above and comprises:
the variable factor acquisition module is used for acquiring a true random number through a random number generator and processing the range of the true random number through data processing to acquire a variable factor of a [0,1] closed interval;
the peak shifting power-on delay time calculation module is used for calculating peak shifting power-on delay time of each server in the cluster data center according to the variable factor;
and the off-peak power-on execution module executes delay action according to the off-peak power-on delay time, starts timing counting at the same time, and calls the embedded system to trigger the mainboard of the server through the GPIO interface to realize power-on after the current server reaches the off-peak power-on delay time.
Compared with the prior art, the invention has the beneficial effects that: the resources of the existing cluster data center can be directly utilized, a set of controller system is not required to be configured independently, the installation and maintenance strategies can be effectively simplified, and the cost is reduced. The invention adopts a decentralized design, effectively avoids the problem that a large number of servers are powered on simultaneously and instantly in a short time through an optimized off-peak power-on method on the basis of improving the reliability and reducing the failure rate, further effectively reduces the labor cost and the system maintenance cost, and is convenient for the operation and popularization of the system.
Detailed Description
Preferred embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
Terms used in this embodiment are explained as follows. The HASH algorithm is an irreversible (one-way) hashing algorithm, such as MD5, SHA-256, or HMAC (keyed hash computation), and can be used as a generation algorithm with uniformly distributed output. The BMC (Baseboard Management Controller) comprises a basic control chip and an external management SoC chip; it is a chip dedicated to remote operation and maintenance management of a server, on which an IPMI protocol stack and interface are deployed. The IDC, referred to here as an Integrated Data Center, is a centralized site where servers are placed together and deployed at scale, and which provides the necessary power and heat dissipation management.
Existing clustered data centers have included the following available resources and environments:
1. A high-performance ARM32 embedded processor system, powered by the Standby/Aux (auxiliary) power supply. That is, the chip is already in a working state before the x86 host system of the mainboard is powered on, and its power consumption is below 2 W, which is negligible.
2. A true random number generator, that is, a random number generator used for generating true random numbers, where a true random number is one that depends on a physical random number generator. An embedded Linux system configured and trimmed for the ARM32 system is adopted, which provides a chaotic environment close to hardware simulation and exposes asynchronous events at the hardware level. From these asynchronous events, which are not limited to any particular set, the true random number generator synthesizes by algorithm a random number that is closer to a true random number than the pseudo-random number generated by a software system; this resource is called the random number generator for short.
3. The BMC Power On capability: the embedded system is connected to the power-on control circuit of the server mainboard on which it resides, and the BMC system can output a control signal through an internal GPIO pin to control the power-on time of the mainboard; this resource is called the embedded system for short.
Therefore, based on the existing resources and environment of the clustered data center, the BMC platform on each server mainboard deployed in the IDC can, through a common delay decision algorithm embedded in it, independently and autonomously execute a distributed off-peak power-on method, that is, a method for independent and distributed off-peak power-on of servers.
More specifically, as shown in fig. 1, this embodiment provides a method for independent, autonomous, distributed off-peak power-on of servers, including the following steps:
step S1, obtaining a true random number through a random number generator, and mapping the range of the true random number through data processing to obtain a variable factor in the closed interval [0,1];
step S2, calculating the off-peak power-on delay time of each server in the clustered data center according to the variable factor;
step S3, executing a delay action according to the off-peak power-on delay time while starting a timing count, and, once the current server reaches its off-peak power-on delay time, calling the embedded system to trigger the mainboard of the server through the GPIO interface to power it on.
In this embodiment, the step S1 includes the following sub-steps:
step S101, initializing the floating-point variable factor, wherein the pseudo code is in the form of float normalized_rand = 0.0;
step S102, performing a hash calculation on the random asynchronous event;
step S103, assigning the value of the hash calculation to the variable factor;
step S104, normalizing the variable factor to obtain a variable factor normalized_rand in the closed interval [0,1], wherein, in the pseudo code, unnormalized_rand denotes the variable factor assigned in step S103.
In step S102 in this embodiment, the random asynchronous event includes the combined information of an asynchronous interrupt, an asynchronous driver event, and the kernel clock. The asynchronous interrupt comes from a random interrupt signal and is used to avoid a pseudo-random sequence; the asynchronous driver event comes from a random system call signal and is likewise used to avoid a pseudo-random sequence; the kernel clock provides timestamp information. Therefore, step S102 in this example can collect interrupts of the embedded hardware environment or hardware asynchronous signals and events of other devices, combine them with information such as the timestamp, and then perform a hash calculation on the combined information; the result is an integer from 0 to $2^{23}$.
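Sub-steps S101 to S104 can be sketched as follows. This is only an illustration under stated assumptions, not the embodiment's firmware: `os.urandom()` and two clock readings stand in for the asynchronous interrupt, driver event, and kernel clock information; SHA-256 is chosen from the hash algorithms named above; and the normalization (keeping 23 bits and dividing by the largest 23-bit value) is an assumed way to reach the closed interval [0,1]. The function names and `HASH_BITS` are hypothetical.

```python
import hashlib
import os
import time

HASH_BITS = 23  # assumption: keep 23 bits, so the value lies in 0 .. 2**23 - 1
MAX_HASH_VALUE = (1 << HASH_BITS) - 1


def collect_event_bytes() -> bytes:
    """Stand-in for the combined event information of step S102.

    os.urandom() substitutes for asynchronous interrupt and driver events;
    the clock readings substitute for the kernel-clock timestamp.
    """
    return b"".join((
        os.urandom(16),
        time.monotonic_ns().to_bytes(8, "big"),
        time.time_ns().to_bytes(16, "big"),
    ))


def variable_factor(event_bytes: bytes) -> float:
    """Hash the event data and normalize the result into [0, 1]."""
    normalized_rand = 0.0                                   # S101: initialize
    digest = hashlib.sha256(event_bytes).digest()           # S102: hash calculation
    unnormalized_rand = int.from_bytes(digest[:4], "big") & MAX_HASH_VALUE  # S103
    normalized_rand = unnormalized_rand / MAX_HASH_VALUE    # S104: normalize
    return normalized_rand


if __name__ == "__main__":
    print(variable_factor(collect_event_bytes()))
```

Dividing by the largest representable value rather than by `2**23` lets the factor actually reach 1.0, which matches the closed interval stated in step S1.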
In this embodiment, the step S2 calculates the off-peak power-on delay time $T_i$ of each server in the clustered data center by the formula $T_i = \mathrm{OP}_{\mathrm{round}}(\text{normalized}_{\text{rand}} \times A)$, wherein $i$ represents the serial number of a server in the clustered data center, $\mathrm{OP}_{\mathrm{round}}$ represents a rounding operation function, and $A$ represents a preconfigured maximum delay number. Furthermore, the variable factor $\text{normalized}_{\text{rand}}$ obtained in step S1 lets each server determine its own delay independently, while the delays of all servers in the IDC are, taken as a whole, uniformly and randomly distributed.
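The formula of step S2 can be illustrated with a short sketch. The text does not fix the rounding mode of the rounding operation function, so round-half-up is assumed here purely for illustration, and the delay is assumed to be counted in whole seconds; the function name `off_peak_delay` is hypothetical.

```python
import math


def off_peak_delay(normalized_rand: float, max_delay: int) -> int:
    """Return T_i = OP_round(normalized_rand * A) as a whole number of seconds.

    Assumption: OP_round is round-half-up; the source only says "rounding".
    """
    if not 0.0 <= normalized_rand <= 1.0:
        raise ValueError("normalized_rand must lie in the closed interval [0, 1]")
    if max_delay < 0:
        raise ValueError("max_delay (A) must be non-negative")
    return math.floor(normalized_rand * max_delay + 0.5)


if __name__ == "__main__":
    for factor in (0.0, 0.25, 0.5, 1.0):
        print(factor, off_peak_delay(factor, max_delay=10))
```

Because every server evaluates this locally from its own variable factor, no central controller has to assign delays.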
The preconfigured maximum delay number A in this embodiment may be configured uniformly through the IPMI management system, either in a single batch operation by operation and maintenance personnel or as a fixed default set by the manufacturer before delivery, and may also be customized according to actual conditions and requirements. Preferably, the preconfigured maximum delay number A is an integer between 0 and $2^{23}$, used to represent the overall sampling delay time of all servers of the clustered data center. The smaller the value set for the maximum delay number A, the more the independently determined delay seconds converge and the more devices start at the same moment (e.g., within the same second); that is, the independently sampled delay seconds are more concentrated, as shown in fig. 2. For example, if the maximum delay number A is set to 10 seconds and the entire IDC contains 100 servers, each of the 100 servers independently samples a delay within 10 seconds, and approximately 10 servers are powered on within the same second.
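The example of 100 servers and A = 10 can be checked with a small simulation. This is a sketch only: Python's seeded `random.Random` stands in for each server's true random number generator, round-half-up is again an assumed rounding mode, and `start_histogram` is a hypothetical helper name.

```python
import math
import random
from collections import Counter


def start_histogram(num_servers: int, max_delay: int, seed: int = 0) -> Counter:
    """Count how many simulated servers power on in each second."""
    rng = random.Random(seed)  # stand-in for the per-server random number generator
    delays = [
        math.floor(rng.random() * max_delay + 0.5)  # assumed round-half-up
        for _ in range(num_servers)
    ]
    return Counter(delays)


if __name__ == "__main__":
    histogram = start_histogram(num_servers=100, max_delay=10)
    for second in sorted(histogram):
        print(f"t = {second:2d} s: {histogram[second]} server(s)")
```

With this rounding choice the first and last seconds (0 and A) each cover only half an interval, so they tend to receive about half as many servers as the seconds in between.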
More preferably, the embodiment may further preset a maximum delay number configuration table, which records the preconfigured maximum delay number A against the number of servers; in the user configuration stage, the preconfigured maximum delay number A is obtained automatically by entering the current number of servers and querying the maximum delay number configuration table.
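A lookup against such a configuration table might look like the sketch below. The table rows are invented for illustration and are not taken from the embodiment, and both the bracket semantics (each row covers server counts up to its limit) and the fallback to the last row are assumptions; `MAX_DELAY_TABLE` and `lookup_max_delay` are hypothetical names.

```python
from bisect import bisect_left

# Hypothetical rows: (largest server count covered, maximum delay number A in seconds)
MAX_DELAY_TABLE = [(100, 10), (500, 50), (1000, 100), (5000, 500)]


def lookup_max_delay(server_count: int, table=MAX_DELAY_TABLE) -> int:
    """Return the preconfigured A for the first row that covers server_count."""
    if server_count < 1:
        raise ValueError("server_count must be at least 1")
    limits = [limit for limit, _ in table]
    index = bisect_left(limits, server_count)
    if index == len(table):
        index = len(table) - 1  # assumption: larger clusters reuse the last row
    return table[index][1]


if __name__ == "__main__":
    print(lookup_max_delay(100), lookup_max_delay(101), lookup_max_delay(6000))
```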
More preferably, to suit different use environments, a convergence coefficient B may be further added to the maximum delay number configuration table, where the convergence coefficient B is a user-defined coefficient for controlling the delay convergence speed and is set to 1 by default. When the convergence coefficient B increases, the preconfigured maximum delay number A is decreased in proportion to the increase of B; for example, if the convergence coefficient B is doubled, the maximum delay number A is divided by 2, that is, reduced to one half of its original value, so as to achieve faster convergence. When the convergence coefficient B decreases, the preconfigured maximum delay number A is increased in proportion to the decrease of B; for example, if the convergence coefficient B is reduced to one half of its original value, the maximum delay number A is divided by one half, that is, increased to twice its original value, so as to obtain a larger maximum delay number A. In this way the requirements of different environments or adaptive changes can be met.
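The inverse relation between the convergence coefficient B and the maximum delay number A amounts to dividing A by B. A minimal sketch follows, assuming the result is rounded back to a whole number; the function name `effective_max_delay` is hypothetical.

```python
def effective_max_delay(max_delay: int, convergence: float = 1.0) -> int:
    """Scale the preconfigured A by the convergence coefficient B (default 1).

    Doubling B halves A; halving B doubles A. Rounding the quotient to the
    nearest whole number is an assumption made for this sketch.
    """
    if convergence <= 0:
        raise ValueError("convergence coefficient B must be positive")
    return int(max_delay / convergence + 0.5)


if __name__ == "__main__":
    print(effective_max_delay(10))        # B = 1, A unchanged
    print(effective_max_delay(10, 2.0))   # B doubled, A halved
    print(effective_max_delay(10, 0.5))   # B halved, A doubled
```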
As shown in fig. 3, in step S3 of this embodiment, the time sequence control of the complete power-on process includes the following sub-steps:
step S301, sequentially carrying out AC power-on, uboot power-on and embedded operating system power-on within a fixed starting time to complete the BMC power-on start; the fixed starting time refers to the BMC power-on start that precedes peak-shifting control, whose duration is essentially constant and therefore needs no peak-shifting control;
step S302, starting peak-shifting control, executing the delay action according to the off-peak power-on delay time, and calling the embedded system to power on and start the main system through a server start instruction.
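The delay-then-trigger behaviour of step S302 can be sketched as a small timing loop. This is an illustration only: the GPIO action is passed in as a callback because the actual BMC GPIO interface is not specified here, the one-second polling step is an assumption, and the name `staggered_power_on` and its parameters are hypothetical.

```python
import time
from typing import Callable


def staggered_power_on(
    delay_seconds: float,
    trigger_power_on: Callable[[], None],
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Wait for the off-peak delay while counting time, then trigger power-on.

    trigger_power_on stands in for the GPIO action that starts the mainboard.
    Returns the elapsed time measured by the timing count.
    """
    start = clock()
    deadline = start + delay_seconds
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            break
        sleep(min(remaining, 1.0))  # wake at least once per second to recheck
    trigger_power_on()
    return clock() - start


if __name__ == "__main__":
    waited = staggered_power_on(2, lambda: print("power-on signal sent"))
    print(f"waited {waited:.1f} s")
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; on a BMC the defaults would simply be used.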
The embodiment also provides a system for independent, autonomous, distributed off-peak power-on of servers, which adopts the method for independent, autonomous, distributed off-peak power-on of servers described above and comprises:
the variable factor acquisition module is used for acquiring a true random number through a random number generator and processing the range of the true random number through data processing to acquire a variable factor of a [0,1] closed interval;
the peak staggering power-on delay time calculation module is used for calculating the peak staggering power-on delay time of each server in the cluster data center according to the variable factors;
and the off-peak power-on execution module executes delay action according to the off-peak power-on delay time, starts timing counting at the same time, and calls the embedded system to trigger the mainboard of the server through the GPIO interface to realize power-on after the current server reaches the off-peak power-on delay time.
In summary, the present embodiment can directly utilize the resources of the existing clustered data center, and a set of controller system does not need to be configured separately, so that the installation and maintenance strategies can be effectively simplified, and the cost is reduced. The invention adopts a decentralized design, effectively avoids the problem that a large number of servers are powered on simultaneously and instantly in a short time through an optimized off-peak power-on method on the basis of improving the reliability and reducing the failure rate, further effectively reduces the labor cost and the system maintenance cost, and is convenient for the operation and popularization of the system.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.