WO2020220436A1 - Method for updating virtual machine work queues and redundant queues for different aging scenarios - Google Patents

Method for updating virtual machine work queues and redundant queues for different aging scenarios

Info

Publication number
WO2020220436A1
WO2020220436A1 PCT/CN2019/090870 CN2019090870W WO2020220436A1 WO 2020220436 A1 WO2020220436 A1 WO 2020220436A1 CN 2019090870 W CN2019090870 W CN 2019090870W WO 2020220436 A1 WO2020220436 A1 WO 2020220436A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
redundant
queue
scenario
cpu
Prior art date
Application number
PCT/CN2019/090870
Inventor
郭军
王馨悦
张斌
刘晨
侯帅
侯凯
李薇
柳波
王嘉怡
刘文凤
张瀚铎
张娅杰
Original Assignee
东北大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学
Publication of WO2020220436A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Definitions

  • The invention relates to the technical field of cloud computing, and in particular to a method for updating virtual machine work queues and redundant queues for different aging scenarios.
  • On the one hand, cloud service providers must do their best to guarantee the system's quality of service and reduce the number of service-agreement violations; on the other hand, they need to improve resource utilization and reduce service cost.
  • The most effective approach is to monitor changes in the cloud environment in real time and adjust cloud resources dynamically.
  • In doing so, the aging of virtual machine software and concurrent business access are two factors that cannot be ignored.
  • Software aging in a cloud service system seriously affects the performance and reliability of the service.
  • Under continuous, highly concurrent business access 24 hours a day, 7 days a week, the aging factors of a virtual machine keep accumulating, so the resources available to the virtual machine gradually shrink, the software runs slower, and both the number of failed requests and the request response time increase.
  • Previously proposed cloud resource adjustment methods do not take software aging into account, which may lead to poor adjustment results and an inability to guarantee service quality.
  • The technical problem to be solved by the present invention is to address the above shortcomings of the prior art by providing a method for updating virtual machine work queues and redundant queues for different aging scenarios.
  • The technical solution adopted by the present invention is a method for updating virtual machine work queues and redundant queues for different aging scenarios, comprising the following steps:
  • Step 1 Divide software aging scenarios according to the survival time of the virtual machines and the fluctuation of the load. The specific method is:
  • Step 1.1 A situation in which all virtual machines in the cloud service system remain in a healthy state for a period of time is classified as the scenario with short virtual machine survival time, also called scenario one;
  • Step 1.2 A situation in which the virtual machines run uninterrupted for a long time and software aging factors accumulate with business accesses, so that some virtual machines are in an unhealthy state, but the Augmented Dickey-Fuller (ADF) test judges that the total concurrency of the cloud service system changes smoothly and does not cause working virtual machines to fail, is classified as the scenario with long virtual machine survival time and stable business concurrency, also called scenario two;
  • Step 1.3 A situation in which the external load fluctuates greatly, virtual resources are adjusted frequently and the cloud service system is overloaded during adjustment, that is, the ADF test judges that the total concurrency of the cloud service system is non-stationary, and some virtual machines are already in an unhealthy state, is classified as the scenario with long virtual machine survival time and unstable business concurrency, also called scenario three;
  • Step 2 Using the ridge-regression-based dynamic update method for the virtual machine work queue, dynamically adjust the number and order of working virtual machine copies;
  • Step 2.1 Ignoring software aging factors, take the business concurrency of the virtual machines as the independent variables and the CPU, memory, disk IO and network IO resources as the dependent variables, and establish a ridge regression model of the cloud service system, so that the amount of resources required by the cloud service system can be calculated from the business concurrency;
  • Step 2.1.1 Determine the software aging scenario of the virtual machine
  • Step 2.1.2 Collect data from newly started working virtual machines, and substitute the business concurrency together with the CPU and memory data into the ridge regression model;
  • x j represents the concurrency of the j-th type of business in the cloud service system
  • j = 1, ..., k
  • k is the number of business types supported by the virtual machine
  • y 1 , y 2 , y 3 , and y 4 respectively represent the performance expectations for CPU, memory, disk IO and network IO
  • z represents the amount of CPU, memory, disk IO or network IO resources required by the cloud service system
  • ⁇ j is the influence weight of the concurrency of the j-th type of business in the resource calculation
  • ⁇ 1 , ⁇ 2 , ⁇ 3 , and ⁇ 4 respectively represent the weights of the CPU, memory, disk IO and network IO performance expectations in the resource calculation
  • is the error constant
  • Step 2.1.3 Use the least squares method to solve the ridge regression model iteratively so as to minimize its loss function Loss, as shown in the following formula:
  • n is the number of business-concurrency samples collected on the working virtual machines
  • Z i represents the actual resource demand
  • represents the regular term coefficient
  • Step 2.1.4 To minimize the loss function Loss of the ridge regression model, determine the concurrency weights ⁇ 1 , ..., ⁇ k , the performance-expectation weights ⁇ 1 , ⁇ 2 and the error constant ⁇ by finding the point at which the partial derivative of Loss with respect to each parameter is zero, as shown in the following formula:
  • Step 2.1.5 Solve the system of equations in all the parameters given by formulas 3 and 4, substituting the collected business concurrency, resource utilization, and CPU, memory, disk IO and network IO data, to obtain the 2k+6 parameters of the ridge regression model and thereby determine the relationship between each type of business and CPU, memory, disk IO and network IO;
  • Step 2.1.6 Substitute the business concurrency of the cloud service system into formula 1 to obtain the amount of each resource required by the cloud service system;
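The formula images for steps 2.1.2 to 2.1.4 are not reproduced in this text. As a minimal sketch only, a model consistent with the definitions above could take the following form for a single resource z. The symbols \alpha_j, \beta_i, \varepsilon and \lambda are assumed names (the original symbols are not legible here) for the concurrency weights, the performance-expectation weights, the error constant and the regularization coefficient, and z_i denotes the model output for the i-th sample. The exact penalty term used by the inventors may differ.

```latex
% Assumed linear model for one resource (CPU, memory, disk IO or network IO)
z = \sum_{j=1}^{k} \alpha_j x_j + \sum_{i=1}^{4} \beta_i y_i + \varepsilon

% Assumed ridge loss over n collected samples, with Z_i the actual resource demand
\mathrm{Loss} = \sum_{i=1}^{n} \left( Z_i - z_i \right)^2
              + \lambda \left( \sum_{j=1}^{k} \alpha_j^2 + \sum_{i=1}^{4} \beta_i^2 \right)

% Stationarity conditions used to determine the parameters
\frac{\partial\,\mathrm{Loss}}{\partial \alpha_j} = 0, \qquad
\frac{\partial\,\mathrm{Loss}}{\partial \beta_i} = 0, \qquad
\frac{\partial\,\mathrm{Loss}}{\partial \varepsilon} = 0
```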
  • Step 2.2 Determine the required number of working virtual machines from the resources required by the cloud service system. The specific method is:
  • Step 2.2.1 Determine the aging loss of the virtual machines according to the scenario
  • Step 2.2.1.1 In scenario two and scenario three, working virtual machines with different degrees of software aging lose different amounts of memory. When the existing cloud resources are calculated, the memory of each virtual machine is discounted according to its software aging degree, and virtual machines whose service has failed are no longer counted as available resources;
  • Step 2.2.1.2 In scenario one all working virtual machines are in a healthy state, so the aging loss is ignored;
  • Step 2.2.2 Given f existing working virtual machines, the number of working virtual machines required in the next period, Num work , is calculated by the following formula, with a minimum value of one:
  • Res cpu and Res mem respectively represent the CPU and memory resources currently available in the cloud service system
  • z cpu_h and z cpu_l are the upper and lower bounds of the CPU resources obtained according to the expected range of virtual machine performance
  • z mem_h and z mem_l are the upper and lower bounds of the memory resources obtained according to the expected range of virtual machine performance
  • vm cpu and vm mem represent the number of CPU cores and the memory size of one virtual machine copy
  • s is the software aging degree of the virtual machine
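The formula for Num work is not reproduced in this text, so the following Python sketch is only one plausible reading of steps 2.2.1 and 2.2.2, not the inventors' formula. It assumes that memory is discounted linearly by the aging degree s (taken to lie in [0, 1]), that virtual machines are added when available resources fall below the lower bound, and released when both resources exceed the upper bound. The function names are hypothetical.

```python
import math


def effective_memory(vm_mem, aging_degrees, service_failed):
    """Memory counted as available (assumed linear discount by aging degree s)."""
    total = 0.0
    for s, failed in zip(aging_degrees, service_failed):
        if failed:
            continue  # VMs whose service has failed are not counted (step 2.2.1.1)
        total += vm_mem * (1.0 - s)
    return total


def required_workers(f, res_cpu, res_mem, z_cpu_l, z_cpu_h,
                     z_mem_l, z_mem_h, vm_cpu, vm_mem):
    """Assumed rule for Num_work; never returns fewer than one VM."""
    cpu_short = max(0.0, z_cpu_l - res_cpu)
    mem_short = max(0.0, z_mem_l - res_mem)
    to_add = max(math.ceil(cpu_short / vm_cpu), math.ceil(mem_short / vm_mem))
    if to_add > 0:
        return max(1, f + to_add)
    cpu_spare = max(0.0, res_cpu - z_cpu_h)
    mem_spare = max(0.0, res_mem - z_mem_h)
    # Release only what both CPU and memory can spare (assumption).
    to_release = min(math.floor(cpu_spare / vm_cpu), math.floor(mem_spare / vm_mem))
    return max(1, f - to_release)
```

In scenario one the aging degrees would simply be treated as zero, matching step 2.2.1.2.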
  • Step 2.3 Handle working virtual machines that have gone down or whose service has failed. The specific method is:
  • Step 2.3.1 Replace the virtual machine that has gone down
  • Step 2.3.2 Replace the virtual machine whose service has failed
  • Step 2.3.2.1 If the virtual machine redundant queue is not empty, immediately take a virtual machine from the tail of the redundant queue as the replacement, then restart the virtual machine being replaced and put it at the tail of the redundant queue;
  • Step 2.3.2.2 If the virtual machine redundant queue is empty, restart the virtual machine directly and put it at the tail of the work queue after the restart;
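The replacement rule of steps 2.3.2.1 and 2.3.2.2 can be sketched with two double-ended queues. This is an illustrative sketch: the function name, the `restart` callback and the choice to append the replacement at the tail of the work queue are assumptions, since the text does not say where the replacement is placed.

```python
from collections import deque


def replace_failed(work, redundant, failed_vm, restart):
    """Swap a failed working VM for a redundant one, or restart it in place."""
    work.remove(failed_vm)
    if redundant:
        replacement = redundant.pop()   # take from the tail of the redundant queue
        work.append(replacement)        # assumed position: tail of the work queue
        restart(failed_vm)
        redundant.append(failed_vm)     # restarted VM goes to the redundant tail
        return replacement
    restart(failed_vm)                  # no spare available: restart directly
    work.append(failed_vm)              # and rejoin at the tail of the work queue
    return failed_vm
```

For example, with work queue `deque(["vm1", "vm2", "vm3"])` and redundant queue `deque(["r1", "r2"])`, a failure of `vm2` moves `r2` into the work queue and leaves `vm2` at the tail of the redundant queue.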
  • Step 2.4 Add or remove working virtual machines according to the calculated number Num work and update the virtual machine work queue. The specific method is:
  • Step 2.4.1 Add a working virtual machine
  • Step 2.4.1.1 Take a virtual machine from the tail of the redundant queue and add it to the work queue. If there are not enough redundant virtual machines, create a new virtual machine, start it, and add it to the tail of the work queue;
  • Step 2.4.1.2 Sort all virtual machines in the work queue by software aging degree in descending order;
  • Step 2.4.2 To release a working virtual machine, remove the virtual machine at the head of the work queue and put it into the redundant queue;
  • Step 3 Dynamically update the virtual machine redundant queue based on a binary decision diagram. The specific method is:
  • Step 3.1 Determine how redundant virtual machines are used according to the current software aging scenario and the aging status of the cloud service system;
  • Step 3.2 Use a binary decision diagram (BDD) to dynamically update the virtual machine redundant queue in scenario three.
  • Step 3.2.1 Initialize the decision diagram BDD: initialize the '0' leaf node and the '1' leaf node, then initialize all other nodes in the BDD with the character '#';
  • Step 3.2.2 Calculate the service failure probability of a virtual machine: fit the service-failure-time samples of the working virtual machines with a Weibull distribution, whose cumulative distribution function F(t) is shown in the following formula:
  • F(t) represents the probability that the service of the virtual machine fails within the working time from 0 to t.
  • A redundant virtual machine is dormant and processes no business requests, so its service failure rate is approximately 0; ⁇ >0 is the scale parameter and ⁇ >0 is the shape parameter;
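For reference, the standard two-parameter Weibull cumulative distribution function is F(t) = 1 - exp(-(t/scale)^shape) for t >= 0. The sketch below evaluates it in Python. The parameter names `scale` and `shape` are assumptions standing in for the scale and shape parameters named in step 3.2.2, and the patent's own parameterization may differ.

```python
import math


def weibull_cdf(t, scale, shape):
    """Probability of service failure within working time [0, t] (standard Weibull CDF)."""
    if scale <= 0 or shape <= 0:
        raise ValueError("scale and shape must be positive")
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))
```

With shape greater than 1 the failure rate grows over time, which matches the intuition that an aging virtual machine becomes more likely to fail the longer it works.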
  • Step 3.2.3 Calculate the number of redundant virtual machines
  • Step 3.2.3.1 Suppose that the demand for working virtual machines calculated in step 2 is n′;
  • Step 3.2.3.2 Each circle in the binary decision diagram represents a virtual machine node, the '1' edge and the '0' edge respectively represent the normal state and the service-failure state of the virtual machine, and a rectangle represents the state of the entire cloud service system. Every path that reaches the '1' rectangle means that k′ working virtual machines on the path are already in the normal state, so the system works normally regardless of whether the other working virtual machines are normal; every path that reaches the '0' rectangle means that n′-k′+1 working virtual machines on the path have already failed, so the system cannot guarantee service performance to users regardless of whether the other virtual machines are normal;
  • Step 3.2.3.3 The binary decision diagram is stored in a global two-dimensional matrix when it is generated; the virtual machine v x+y+1 has index (x, y), and the root node v 1 has index (0, 0). The reliability of the cloud service system is the sum of the probabilities of the paths from the root to all '1' rectangles. The probability of the decision diagram rooted at virtual machine v x+y+1 is calculated by the following formula:
  • R x+y+1 represents the probability of service failure of the virtual machine v x+y+1
  • BDD[x+1][y] and BDD[x][y+1] respectively represent the sub-decision-diagrams connected to the '1' edge and the '0' edge of virtual machine v x+y+1 ;
  • Step 3.2.3.5 Set the initial value of the number m of redundant virtual machines according to the average software aging degree of all working virtual machines, calculate k′, and obtain m;
  • Step 3.2.4 Adjust the virtual machine redundant queue according to the number m of redundant virtual machines
  • The beneficial effects of the above technical solution are as follows. Software aging affects virtual machine performance and reliability differently in different working scenarios, so dividing aging scenarios and adjusting cloud resources in a targeted way both reduces the impact of software aging effectively and saves resource cost.
  • The ridge-regression-based dynamic update algorithm for the virtual machine work queue dynamically adjusts the number and order of working virtual machine copies to guarantee the system's quality of service;
  • With the binary-decision-diagram-based dynamic update algorithm for the virtual machine redundant queue, even if a working virtual machine suffers a service failure, a redundant virtual machine can switch state within a short time and completely take over from the failed virtual machine.
  • Figure 1 is an example topology diagram of the airline ticket online ordering system provided by an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for updating virtual machine work queues and redundant queues for different aging scenarios according to an embodiment of the present invention
  • Figure 3 is a schematic structural diagram of a binary decision diagram provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of the number of failed requests under different adjustment methods provided by embodiments of the present invention.
  • FIG. 6 is a schematic diagram of average memory utilization under different adjustment methods provided by embodiments of the present invention.
  • FIG. 7 is a schematic diagram of average CPU utilization under different adjustment methods provided by embodiments of the present invention.
  • In this embodiment, an airline ticket online ordering system is used to simulate a PC-side user application, and the service system is built on Sugon servers.
  • The method of the present invention for updating virtual machine work queues and redundant queues for different aging scenarios is used to update the work queue and the redundant queue of the virtual machines.
  • The experiment uses three Sugon servers in total. One is responsible for load balancing and is also used to collect and analyze virtual machine data and to formulate adjustment plans; the others are used to create the virtual machines. Each virtual machine is allocated 4 CPUs, 4 GB of memory and a 20 GB disk, and runs an online ticket ordering application with aging defects.
  • The adjustment method in the experiment is implemented in Python and shell.
  • The example topology is shown in Figure 1.
  • The method for updating virtual machine work queues and redundant queues for different aging scenarios includes the following steps:
  • Step 1 Divide software aging scenarios according to the survival time of the virtual machines and the fluctuation of the load. The specific method is:
  • Step 1.1 A situation in which all virtual machines in the cloud service system remain in a healthy state for a period of time is classified as the scenario with short virtual machine survival time, also called scenario one;
  • In this scenario all virtual machines of the cloud service system were created recently and have worked continuously for only a short time, so for a period of time they are all in a healthy state, that is, their software aging degree is between 0 and 0.2. These virtual machines may also be released within a relatively short time, so software aging has little impact on them. To save cost, software aging can be temporarily ignored when adjusting cloud resources.
  • Step 1.2 A situation in which the virtual machines run uninterrupted for a long time and software aging factors accumulate with business accesses, so that some virtual machines are in an unhealthy state, but the Augmented Dickey-Fuller (ADF) test judges that the total concurrency of the cloud service system changes smoothly and does not cause working virtual machines to fail, is classified as the scenario with long virtual machine survival time and stable business concurrency, also called scenario two;
  • In this scenario the virtual machines run uninterrupted for a long time and software aging factors keep accumulating with business accesses, so some virtual machines are in an unhealthy state, that is, their software aging degree is greater than 0.2. The business concurrency, however, changes relatively smoothly and generally does not cause working virtual machines to fail.
  • The ADF test is used to judge the stationarity of the total concurrency of the cloud service system: if no unit root exists, the business concurrency changes smoothly.
  • Step 1.3 A situation in which the external load fluctuates greatly, virtual resources are adjusted frequently and the cloud service system is overloaded during adjustment, that is, the ADF test judges that the total concurrency of the cloud service system is non-stationary, and some virtual machines are already in an unhealthy state, is classified as the scenario with long virtual machine survival time and unstable business concurrency, also called scenario three;
  • In this scenario the external load of the cloud service system fluctuates greatly, virtual resources are adjusted frequently, and the system may be overloaded during adjustment, which accelerates aging; moreover, some virtual machines in the system are already unhealthy. The system therefore places high demands on the reliability of each virtual machine, and redundant virtual machines must be added to guarantee its quality of service.
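The three-way division of step 1 can be sketched as a small Python function. The 0.2 health threshold comes from the description above; the function name is hypothetical, and the stationarity verdict is passed in as a boolean because the ADF test itself is assumed to be run elsewhere (for example with a statistics library that rejects the unit-root hypothesis at a chosen significance level).

```python
def classify_scenario(aging_degrees, concurrency_is_stationary, healthy_threshold=0.2):
    """Return 1, 2 or 3 for the software aging scenario of the cloud service system.

    aging_degrees: software aging degree of each virtual machine.
    concurrency_is_stationary: ADF verdict on total concurrency (computed elsewhere).
    """
    all_healthy = all(s <= healthy_threshold for s in aging_degrees)
    if all_healthy:
        return 1  # short survival time: aging can be ignored
    if concurrency_is_stationary:
        return 2  # long survival time, stable business concurrency
    return 3      # long survival time, unstable business concurrency
```

Treating an all-healthy system as scenario one even when concurrency is unstable is an assumption of this sketch; the text defines scenario one only by the health of the virtual machines.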
  • Step 2 Using the ridge-regression-based dynamic update method for the virtual machine work queue, dynamically adjust the number and order of working virtual machine copies;
  • Step 2.1 Ignoring software aging factors, take the business concurrency of the virtual machines as the independent variables and the CPU, memory, disk IO and network IO resources as the dependent variables, and establish a ridge regression model of the cloud service system, so that the amount of resources required by the cloud service system can be calculated from the business concurrency;
  • Step 2.1.1 Determine the software aging scenario of the virtual machine
  • Step 2.1.2 Collect data from newly started working virtual machines, and substitute the business concurrency together with the CPU and memory data into the ridge regression model;
  • x j represents the concurrency of the j-th type of business in the cloud service system
  • j = 1, ..., k
  • k is the number of business types supported by the virtual machine
  • y 1 , y 2 , y 3 , and y 4 respectively represent the performance expectations for CPU, memory, disk IO and network IO
  • z represents the amount of CPU, memory, disk IO or network IO resources required by the cloud service system
  • ⁇ j is the influence weight of the concurrency of the j-th type of business in the resource calculation
  • ⁇ 1 , ⁇ 2 , ⁇ 3 , and ⁇ 4 respectively represent the weights of the CPU, memory, disk IO and network IO performance expectations in the resource calculation
  • is the error constant
  • Step 2.1.3 Use the least squares method to solve the ridge regression model iteratively so as to minimize its loss function Loss, as shown in the following formula:
  • n is the number of business-concurrency samples collected on the working virtual machines
  • Z i represents the actual resource demand
  • represents the regular term coefficient
  • Step 2.1.4 To minimize the loss function Loss of the ridge regression model, determine the concurrency weights ⁇ 1 , ..., ⁇ k , the performance-expectation weights ⁇ 1 , ⁇ 2 and the error constant ⁇ by finding the point at which the partial derivative of Loss with respect to each parameter is zero, as shown in the following formula:
  • Step 2.1.5 Solve the system of equations in all the parameters given by formulas 3 and 4, substituting the collected business concurrency, resource utilization, and CPU, memory, disk IO and network IO data, to obtain the 2k+6 parameters of the ridge regression model and thereby determine the relationship between each type of business and CPU, memory, disk IO and network IO;
  • Step 2.1.6 Substitute the business concurrency of the cloud service system into formula 1 to obtain the amount of each resource required by the cloud service system;
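Steps 2.1.3 to 2.1.6 amount to fitting a regularized linear model and then evaluating it on new concurrency values. The pure-Python sketch below does this by solving the zero-partial-derivative equations directly (the regularized normal equations) rather than iterating. It treats every input column as a penalized feature and leaves the constant term unpenalized; both choices, and all function names, are assumptions rather than the inventors' exact formulation.

```python
def solve_linear(a, b):
    """Solve a*w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def fit_ridge(samples, targets, lam):
    """Return (weights, constant) minimizing squared error + lam * sum(weights^2)."""
    rows = [list(s) + [1.0] for s in samples]   # extra column for the constant term
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xtz = [sum(r[i] * z for r, z in zip(rows, targets)) for i in range(p)]
    for i in range(p - 1):                       # penalize weights, not the constant
        xtx[i][i] += lam
    w = solve_linear(xtx, xtz)
    return w[:-1], w[-1]


def predict(weights, constant, features):
    """Resource demand for one vector of business concurrency values."""
    return sum(w * x for w, x in zip(weights, features)) + constant
```

One model of this kind would be fitted per resource (for example one for CPU and one for memory), and `predict` then plays the role of substituting the current business concurrency into formula 1.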
  • Step 2.2 Determine the required number of working virtual machines from the resources required by the cloud service system. The specific method is:
  • Step 2.2.1 Determine the aging loss of the virtual machines according to the scenario
  • Step 2.2.1.1 In scenario two and scenario three, working virtual machines with different degrees of software aging lose different amounts of memory. When the existing cloud resources are calculated, the memory of each virtual machine is discounted according to its software aging degree, and virtual machines whose service has failed are no longer counted as available resources;
  • Step 2.2.1.2 In scenario one all working virtual machines are in a healthy state, so the aging loss is ignored;
  • Step 2.2.2 Given f existing working virtual machines, the number of working virtual machines required in the next period, Num work , is calculated by the following formula, with a minimum value of one:
  • Res cpu and Res mem respectively represent the CPU and memory resources currently available in the cloud service system
  • z cpu_h and z cpu_l are the upper and lower bounds of the CPU resources obtained according to the expected range of virtual machine performance
  • z mem_h and z mem_l are the upper and lower bounds of the memory resources obtained according to the expected range of virtual machine performance
  • vm cpu and vm mem represent the number of CPU cores and the memory size of one virtual machine copy
  • s is the software aging degree of the virtual machine
  • Step 2.3 Handle working virtual machines that have gone down or whose service has failed. The specific method is:
  • Step 2.3.1 Replace the virtual machine that has gone down
  • Step 2.3.2 Replace the virtual machine whose service has failed
  • Step 2.3.2.1 If the virtual machine redundant queue is not empty, immediately take a virtual machine from the tail of the redundant queue as the replacement, then restart the virtual machine being replaced and put it at the tail of the redundant queue;
  • Step 2.3.2.2 If the virtual machine redundant queue is empty, restart the virtual machine directly and put it at the tail of the work queue after the restart;
  • Step 2.4 Add or remove working virtual machines according to the calculated number Num work and update the virtual machine work queue. The specific method is:
  • Step 2.4.1 Add a working virtual machine
  • Step 2.4.1.1 Take a virtual machine from the tail of the redundant queue and add it to the work queue. If there are not enough redundant virtual machines, create a new virtual machine, start it, and add it to the tail of the work queue;
  • Step 2.4.1.2 Sort all virtual machines in the work queue by software aging degree in descending order;
  • Step 2.4.2 To release a working virtual machine, remove the virtual machine at the head of the work queue and put it into the redundant queue;
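Step 2.4 can be sketched as one update pass over the two queues. In this illustrative Python sketch the function name, the `create_vm` callback, the `aging` dictionary and the order of operations (add, then sort, then release) are assumptions; the text only fixes that additions come from the tail of the redundant queue, that the work queue is sorted by aging degree in descending order, and that releases come from the head of the work queue.

```python
from collections import deque


def update_work_queue(work, redundant, num_work, aging, create_vm):
    """Grow or shrink the work queue to num_work VMs (never below one)."""
    target = max(1, num_work)
    # Step 2.4.1.1: add from the tail of the redundant queue, else create a new VM.
    while len(work) < target:
        vm = redundant.pop() if redundant else create_vm()
        work.append(vm)
    # Step 2.4.1.2: most aged VMs first (unknown VMs are treated as not aged).
    ordered = sorted(work, key=lambda vm: aging.get(vm, 0.0), reverse=True)
    work.clear()
    work.extend(ordered)
    # Step 2.4.2: release from the head, i.e. the most aged working VM.
    while len(work) > target:
        redundant.append(work.popleft())
    return work
```

Because the queue is sorted in descending order of aging, releasing from the head retires the most aged virtual machines first, which gives them a chance to be restarted while they sit in the redundant queue.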
  • Step 3 Dynamically update the virtual machine redundant queue based on a binary decision diagram. The specific method is:
  • Step 3.1 Determine how redundant virtual machines are used according to the current software aging scenario and the aging status of the cloud service system;
  • Step 3.2 Use the binary decision diagram (BDD) shown in Figure 3 to dynamically update the virtual machine redundant queue in scenario three.
  • Step 3.2.1 Initialize the decision diagram BDD: initialize the '0' leaf node and the '1' leaf node, then initialize all other nodes in the BDD with the character '#';
  • Step 3.2.2 Calculate the service failure probability of a virtual machine: fit the service-failure-time samples of the working virtual machines with a Weibull distribution, whose cumulative distribution function F(t) is shown in the following formula:
  • F(t) represents the probability that the service of the virtual machine fails within the working time from 0 to t.
  • A redundant virtual machine is dormant and processes no business requests, so its service failure rate is approximately 0; ⁇ >0 is the scale parameter and ⁇ >0 is the shape parameter;
  • Step 3.2.3 Calculate the number of redundant virtual machines
  • Step 3.2.3.1 Suppose that the demand for working virtual machines calculated in step 2 is n′;
  • Step 3.2.3.2 Each circle in the binary decision diagram represents a virtual machine node, the '1' edge and the '0' edge respectively represent the normal state and the service-failure state of the virtual machine, and a rectangle represents the state of the entire cloud service system. Every path that reaches the '1' rectangle means that k′ working virtual machines on the path are already in the normal state, so the system works normally regardless of whether the other working virtual machines are normal; every path that reaches the '0' rectangle means that n′-k′+1 working virtual machines on the path have already failed, so the system cannot guarantee service performance to users regardless of whether the other virtual machines are normal;
  • Step 3.2.3.3 The binary decision diagram is stored in a global two-dimensional matrix when it is generated; the virtual machine v x+y+1 has index (x, y), and the root node v 1 has index (0, 0). The reliability of the cloud service system is the sum of the probabilities of the paths from the root to all '1' rectangles. The probability of the decision diagram rooted at virtual machine v x+y+1 is calculated by the following formula:
  • R x+y+1 represents the probability of service failure of the virtual machine v x+y+1
  • BDD[x+1][y] and BDD[x][y+1] respectively represent the sub-decision-diagrams connected to the '1' edge and the '0' edge of virtual machine v x+y+1 ;
  • Step 3.2.3.5 Set the initial value of the number m of redundant virtual machines according to the average software aging degree of all working virtual machines, calculate k′, and obtain m;
  • Step 3.2.4 Adjust the virtual machine redundant queue according to the number m of redundant virtual machines
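The matrix form of the decision diagram in step 3.2.3 corresponds to a k-out-of-n reliability calculation. The Python sketch below fills the matrix from the leaves back to the root using the recurrence suggested by the definitions above: the value at (x, y) is (1 - R) times the '1'-edge sub-diagram plus R times the '0'-edge sub-diagram, where R is the failure probability of virtual machine v x+y+1. The helper `min_redundant`, its reliability target and its spare failure probability are assumptions added for illustration; the text does not state the rule by which m is obtained.

```python
def system_reliability(fail_probs, k_required):
    """Probability that at least k_required of the VMs are in the normal state.

    fail_probs[i] is the service failure probability of VM v_(i+1).
    """
    n = len(fail_probs)
    if not 1 <= k_required <= n:
        raise ValueError("k_required must be between 1 and the number of VMs")
    max_fail = n - k_required + 1           # this many failures reaches the '0' leaf
    bdd = [['#'] * (max_fail + 1) for _ in range(k_required + 1)]
    for y in range(max_fail):
        bdd[k_required][y] = 1.0            # '1' leaf: k_required VMs are normal
    for x in range(k_required):
        bdd[x][max_fail] = 0.0              # '0' leaf: too many VMs have failed
    for x in range(k_required - 1, -1, -1):
        for y in range(max_fail - 1, -1, -1):
            r = fail_probs[x + y]           # node (x, y) is VM v_(x+y+1)
            bdd[x][y] = (1.0 - r) * bdd[x + 1][y] + r * bdd[x][y + 1]
    return bdd[0][0]


def min_redundant(fail_probs, k_required, target, spare_fail_prob, max_spares=10):
    """Smallest number of spares reaching the target reliability (assumed criterion)."""
    for m in range(max_spares + 1):
        pool = list(fail_probs) + [spare_fail_prob] * m
        if system_reliability(pool, k_required) >= target:
            return m
    return None
```

For three virtual machines that each fail with probability 0.1 and k′ = 2, the root value is 0.972, which equals the probability that at least two of the three stay normal.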
  • This embodiment compares the method of the present invention with two resource adjustment methods that do not consider virtual machine software aging: a passive adjustment method based on monitoring (comparison method 1) and an adjustment method based on ARIMA prediction (comparison method 2). The number of failed requests per hour, the average response time, and the average resource utilization are used as indicators to analyze the performance of each adjustment method.
  • Comparison method 1 adjusts the number of virtual machines by monitoring system performance: when the system's average CPU or memory utilization stays above 80% for 5 minutes, two working virtual machines are added, and when it stays below 30% for 10 minutes, two working virtual machines are removed.
  • Comparison method 2 uses ARIMA to predict the demand for CPU and memory resources and adjusts the virtual machines accordingly.
  • LoadRunner is used to simulate the three types of aging scenarios in the present invention in turn, and three experiments are carried out in each scenario to test each adjustment method: the method of the present invention is used for the first time, and the control method is tested for the second time.
  • the third test compares method two, and finally compares the performance of each method from the number of failed requests, average response time, and average resource utilization.
  • the number of failed requests refers to the number of requests for which the server does not return a response.
  • Table 2 records the service quality under the three resource adjustment methods. It can be seen from the table that the two service indicators are the highest when the virtual machine is adjusted using the comparison method 1. This is because the virtual machine is statically adjusted by monitoring the performance. The adjustment action is delayed; although the number of failed requests after the comparison method 2 is reduced compared with the comparison method 1, it still has a longer request response time; and when the method of the present invention is used to adjust the virtual machine, the service quality is the best, every time The average number of failed requests per hour is 24, and the average response time is 0.361s. This is because the method of the present invention can ensure the normal operation of the working virtual machine through redundant virtual machines in various aging scenarios.
  • Adjustment method Number of failed requests/hour Average response time (s) Method of the invention 16 0.361 Control method one 105 0.617 Comparison method two 42 0.539
  • this embodiment compares the hourly average resource utilization of the system under each adjustment method, as shown in Figure 6 and Figure 7. It can be seen from the figure that compared to the two comparisons Method, the average resource utilization rate of the system is the lowest when the method of the present invention is applied. This is because some redundant resources are set during the adjustment process, but overall, the reduction in resource utilization rate is within an acceptable range, at 36
  • the average resource utilization rate of the virtual machine under the adjustment method of the present invention is between 50% and 70% within hours, which is relatively stable; the average resource utilization rate of the comparison method fluctuates greatly, which is caused by the delay of passive adjustment. The situation of resource idleness and resource shortage occurs; while in the second scenario, the resource utilization rate is too low and too high. This is because load fluctuations cause frequent resource adjustments, and the performance of some seriously aging working virtual machines drops sharply.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Hardware Redundancy (AREA)

Abstract

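The present invention provides a method for updating virtual machine working queues and redundant queues for different aging scenarios, and relates to the technical field of cloud computing. The method first divides software aging into different scenarios according to the lifetime of the virtual machines and the fluctuation of the load; it then uses a ridge-regression-based dynamic update of the virtual machine working queue to adjust the number and order of working virtual machine replicas dynamically; finally, it dynamically updates the virtual machine redundant queue based on a binary decision diagram. Through selection and switching strategies, the method balances the service quality and resource cost of the virtual machines and guarantees the service quality of the system; even if a working virtual machine suffers service failure, a redundant virtual machine can switch state within a short time and completely replace the failed virtual machine.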

Description

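Method for updating virtual machine working queues and redundant queues for different aging scenarios
Technical Field
The present invention relates to the technical field of cloud computing, and in particular to a method for updating virtual machine working queues and redundant queues for different aging scenarios.
Background Art
With the widespread application of cloud computing, cloud environments have become more complex and harder to control. Cloud service providers must on the one hand do their utmost to guarantee the service quality of the system and reduce the number of service agreement violations, and on the other hand improve resource utilization and reduce service cost. To achieve these goals, monitoring changes in the cloud environment in real time and adjusting cloud resources dynamically is the most effective approach. In cloud resource adjustment, the software aging of virtual machines and the volume of concurrent business access are two factors that cannot be ignored. Software aging in cloud service systems seriously affects the performance and reliability of services: under continuous, highly concurrent business access 24 hours a day, 7 days a week, the aging factors of a virtual machine keep accumulating, so that the resources available to the virtual machine gradually decrease, its internal software runs more slowly, and the number of failed requests and the request response time increase.
Early cloud resource adjustment methods mainly used mechanisms based on real-time monitoring of the cloud environment and triggering by predefined rules; this class of methods is also the most mature in application at present. In recent years, many studies have used machine learning and other popular techniques to predict the business concurrency of the system, then calculated the number of working virtual machines from the predicted concurrency and adjusted the virtual machines in advance. These methods still have shortcomings. When evaluating cloud service performance, previous methods often assume that the running state of the working virtual machines does not change and give insufficient consideration to virtual machine software aging; such evaluation is rather coarse and may produce large deviations, especially in cloud service systems that run for a long time. In addition, previous methods generally deal with software aging by setting a static threshold and take preventive measures only for virtual machines above the aging threshold; once any other working virtual machine suffers service failure, the cloud service system cannot adjust immediately, which affects normal user access and makes it impossible to guarantee the reliability of the cloud service continuously. Moreover, previous methods do not consider software aging when selecting the virtual machines to adjust and cannot ensure that virtual machines with a high degree of software aging are restarted in time, which greatly reduces the performance and reliability of the system and increases its operating cost.
In summary, previously proposed cloud resource adjustment methods do not consider software aging, which may lead to poor adjustment results and service quality that cannot be guaranteed.
Summary of the Invention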
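The technical problem to be solved by the present invention is to address the above deficiencies of the prior art by providing a method for updating virtual machine working queues and redundant queues for different aging scenarios, so as to update the working queue and the redundant queue of the virtual machines.
To solve the above technical problem, the technical solution adopted by the present invention is a method for updating virtual machine working queues and redundant queues for different aging scenarios, comprising the following steps:
Step 1: Divide software aging into different scenarios according to the lifetime of the virtual machines and the fluctuation of the load, specifically:
Step 1.1: A scenario in which all virtual machines in the cloud service system are in a healthy state over a period of time is classified as the scenario of short virtual machine lifetime, also called scenario one;
Step 1.2: A scenario in which the virtual machines run continuously for a long time and software aging factors accumulate with business access, so that some virtual machines are already in an unhealthy state, but the Augmented Dickey-Fuller (ADF) test determines that the total business concurrency of the cloud service system changes in a stationary manner and will not cause working virtual machine faults, is classified as the scenario of long virtual machine lifetime with stationary business concurrency, also called scenario two;
Step 1.3: A scenario in which the external load fluctuates strongly, causing frequent adjustment of virtual resources, and the cloud service system is overloaded during adjustment, that is, the ADF test determines that the total business concurrency of the cloud service system is non-stationary, and some virtual machines are already in an unhealthy state, is classified as the scenario of long virtual machine lifetime with non-stationary business concurrency, also called scenario three;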
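Step 2: Use a ridge-regression-based dynamic update method for the virtual machine working queue to adjust the number and order of working virtual machine replicas dynamically;
Step 2.1: Ignoring software aging factors, treat the business concurrency of the virtual machines as the independent variables and CPU, memory, disk I/O, and network I/O as the dependent variables, and build a ridge regression model of the cloud service system, so that the amount of resources required by the cloud service system can be calculated from the business concurrency;
Step 2.1.1: Determine the software aging scenario of the virtual machines;
Step 2.1.2: Collect data of each type from newly started working virtual machines, and substitute the concurrent business access volume and the CPU and memory data into the ridge regression model;
The amount of CPU, memory, disk I/O, or network I/O resources required by the cloud service system is calculated by the following formula:
$z = \alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_k x_k + \beta_1 y_1 + \beta_2 y_2 + \beta_3 y_3 + \beta_4 y_4 + \varepsilon$   (1)
where $x_j$ denotes the concurrency of the j-th type of business in the cloud service system, j = 1, ..., k, and k is the number of business types supported by the virtual machine; $y_1$, $y_2$, $y_3$, $y_4$ denote the expected utilization of CPU, memory, disk I/O, and network I/O, respectively; z denotes the amount of CPU, memory, disk I/O, or network I/O resources required by the cloud service system; $\alpha_j$ is the weight of the concurrency of the j-th type of business in the resource calculation; $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$ are the weights of the CPU, memory, disk I/O, and network I/O performance expectations in the resource calculation; and $\varepsilon$ is an error constant;
Step 2.1.3: Iteratively solve the loss function of the ridge regression model by least squares so that the loss function Loss of the ridge regression model is minimized, as shown in the following formula:
Figure PCTCN2019090870-appb-000001
where n denotes the number of business concurrency records of each type collected on the working virtual machines, $Z_i$ denotes the actual resource demand,
Figure PCTCN2019090870-appb-000002
denotes the resource demand obtained from the model, and $\lambda$ denotes the regularization coefficient;
Step 2.1.4: Minimize the loss function Loss of the ridge regression model to determine the parameters $\alpha_1, \dots, \alpha_k$, $\beta_1$, $\beta_2$, and $\varepsilon$; the minimum of Loss is obtained where the partial derivatives with respect to the parameters are zero, as shown in the following formulas:
Figure PCTCN2019090870-appb-000003
Figure PCTCN2019090870-appb-000004
Step 2.1.5: Solve the system of equations formed by all the parameters according to formulas 3 and 4, substituting the collected business concurrency, resource utilization, and amounts of CPU, memory, disk I/O, and network I/O resources, to obtain the 2k+6 parameters of the ridge regression model and thereby determine the relationship between each type of business and CPU, memory, disk I/O, and network I/O;
Step 2.1.6: Substitute the business concurrency of the cloud service system into formula 1 to obtain the amount of each type of resource required by the cloud service system;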
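The loss function and its stationarity conditions are available in this text only as figure images. For orientation, a textbook ridge regression objective written with the symbols defined in step 2.1 would read as follows. This is an assumed standard form, not a transcription of formulas (2) to (4); in particular the absence of a 1/n or 1/2 factor and the choice of which parameters are penalized are assumptions.

```latex
\mathrm{Loss} \;=\; \sum_{i=1}^{n} \bigl( Z_i - \hat{Z}_i \bigr)^2
\;+\; \lambda \Bigl( \sum_{j=1}^{k} \alpha_j^{2} + \sum_{l=1}^{4} \beta_l^{2} \Bigr),
\qquad
\frac{\partial\,\mathrm{Loss}}{\partial \alpha_j} = 0 \;\; (j = 1,\dots,k),
\qquad
\frac{\partial\,\mathrm{Loss}}{\partial \beta_l} = 0 \;\; (l = 1,\dots,4).
```

Here $\hat{Z}_i$ is the value of formula (1) evaluated on the i-th collected sample. Setting the partial derivatives to zero yields a linear system in the parameters, which is why the model can be solved directly once the samples are substituted.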
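Step 2.2: Determine the number of working virtual machines required from the amount of each type of resource required by the cloud service system, specifically:
Step 2.2.1: Determine the loss of the virtual machines according to the scenario;
Step 2.2.1.1: In scenario two and scenario three, working virtual machines with different degrees of software aging have different memory resource losses; when counting the existing cloud resources, the memory resources of each virtual machine are discounted according to its software aging degree, and virtual machines whose service has already failed are no longer counted as available resources;
Step 2.2.1.2: In scenario one, all working virtual machines are in a healthy state, and the loss due to aging is ignored;
Step 2.2.2: Given f existing working virtual machines, the number of working virtual machines required in the next period, $Num_{work}$, is calculated by the following formulas; the minimum value of $Num_{work}$ is one:
Figure PCTCN2019090870-appb-000005
$Res_{cpu} = f \cdot vm_{cpu}$   (6)
Figure PCTCN2019090870-appb-000006
where $Res_{cpu}$ and $Res_{mem}$ denote the amounts of CPU and memory resources available to the cloud service system; $z_{cpu\_h}$ and $z_{cpu\_l}$ are the upper and lower bounds of the CPU resources obtained from the expected range of virtual machine performance; $z_{mem\_h}$ and $z_{mem\_l}$ are the upper and lower bounds of the memory resources obtained from the expected range of virtual machine performance; $vm_{cpu}$ and $vm_{mem}$ denote the number of CPU cores and the memory size of one virtual machine replica; s is the software aging degree of a virtual machine; and $\rho$ is the weight of the software aging degree s in the resource evaluation, with $0 < \rho \le 1$ in scenario two and scenario three and $\rho = 0$ in scenario one;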
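Step 2.3: Handle working virtual machines that have crashed or whose service has failed, specifically:
Step 2.3.1: Replace crashed virtual machines;
If the virtual machine redundant queue is not empty, immediately select a virtual machine from the tail of the redundant queue as the replacement, restart the crashed virtual machine, and move it to the tail of the redundant queue;
If the virtual machine redundant queue is empty, restart the crashed virtual machine directly and put it at the tail of the working queue after the restart;
Step 2.3.2: Replace virtual machines whose service has failed;
Step 2.3.2.1: If the virtual machine redundant queue is not empty, immediately select a virtual machine from the tail of the redundant queue as the replacement, restart the failed virtual machine, and move it to the tail of the redundant queue;
Step 2.3.2.2: If the virtual machine redundant queue is empty, restart the failed virtual machine directly and put it at the tail of the working queue after the restart;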
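Step 2.4: Add or remove working virtual machines according to the calculated required number $Num_{work}$ and update the virtual machine working queue, specifically:
Step 2.4.1: Add working virtual machines;
Step 2.4.1.1: Select virtual machines from the tail of the virtual machine redundant queue and add them to the virtual machine working queue; if there are not enough redundant virtual machines, create a virtual machine, start it, and add it to the tail of the working queue;
Step 2.4.1.2: Sort all virtual machines in the working queue by software aging degree in descending order;
Step 2.4.2: Release working virtual machines: remove virtual machines from the head of the virtual machine working queue and put them into the virtual machine redundant queue;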
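Step 3: Dynamically update the virtual machine redundant queue based on a binary decision diagram, specifically:
Step 3.1: Decide how redundant virtual machines are used according to the current software aging scenario and the aging condition of the cloud service system;
If the cloud service system is currently in scenario one, redundant virtual machines are not considered;
If the cloud service system is currently in scenario two, redundancy is provided for working virtual machines with severe software aging, with at least one redundant virtual machine;
If the cloud service system is currently in scenario three, the binary decision diagram is used to update the virtual machine redundant queue dynamically and to calculate the number of redundant virtual machines;
Step 3.2: Use a binary decision diagram (BDD) to update the virtual machine redundant queue dynamically in scenario three, specifically:
Step 3.2.1: Initialize the decision diagram BDD with the character '#', initialize the '0' leaf node, initialize the '1' leaf node, and then initialize the other nodes in the BDD with the character '#';
Step 3.2.2: Calculate the service failure probability of the virtual machines: a Weibull distribution is chosen to fit the service failure time samples of the working virtual machines, and the cumulative Weibull distribution function F(t) is given by the following formula:
Figure PCTCN2019090870-appb-000007
where F(t) denotes the probability that a virtual machine suffers service failure within the working time from 0 to t; a redundant virtual machine in the dormant state does not process any business requests, so its service failure rate is approximately 0; $\lambda > 0$ is the scale parameter and $\beta > 0$ is the shape parameter;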
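Step 3.2.3: Calculate the number of redundant virtual machines;
Step 3.2.3.1: From step 2, the required number of working virtual machines is calculated to be n′;
Step 3.2.3.2: In the binary decision diagram, each circle represents a virtual machine node, the '1' edge and the '0' edge represent the normal state and the service failure state of the virtual machine, respectively, and the rectangles represent the state of the whole cloud service system. Every path that reaches the '1' rectangle means that k′ working virtual machines on that path are already in the normal state, so the system works normally regardless of whether the other working virtual machines are normal; every path that reaches the '0' rectangle means that n′-k′+1 working virtual machines on that path have already suffered service failure, so the system cannot guarantee service performance for users regardless of whether the other virtual machines are normal;
Step 3.2.3.3: When generating the binary decision diagram, a global two-dimensional matrix is used for storage; virtual machine $v_{x+y+1}$ has index (x, y), and the root node $v_1$ has index (0, 0). The reliability of the cloud service system is expressed as the sum of the path probabilities from the root to all '1' rectangles, and the probability of the decision diagram rooted at virtual machine $v_{x+y+1}$ is calculated by the following formula:
$P(BDD[x][y]) = (1 - R_{x+y+1})\,P(BDD[x+1][y]) + R_{x+y+1}\,P(BDD[x][y+1])$   (9)
where $R_{x+y+1}$ denotes the service failure probability of virtual machine $v_{x+y+1}$, and BDD[x+1][y] and BDD[x][y+1] denote the sub-decision diagrams connected to the '1' edge and the '0' edge of virtual machine $v_{x+y+1}$, respectively;
Since the number of redundant virtual machines is unknown, the value of k′ is uncertain; with the conventional binary decision diagram calculation, k′ takes each value from 1 to n in turn and the probability is calculated until the number m of redundant virtual machines achieves the required probability;
Step 3.2.3.5: Set the initial value of the number m of redundant virtual machines according to the average software aging degree of all working virtual machines, calculate k′, and obtain m;
Step 3.2.4: Adjust the virtual machine redundant queue according to the number m of redundant virtual machines;
When adding redundant virtual machines, create and start a virtual machine and put it at the tail of the virtual machine redundant queue;
When releasing redundant virtual machines, remove virtual machines from the head of the virtual machine redundant queue.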
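The beneficial effects of the above technical solution are as follows. Software aging affects the performance and reliability of virtual machines differently in different working scenarios. By distinguishing aging scenarios and adjusting cloud resources in a targeted way, the method for updating virtual machine working queues and redundant queues for different aging scenarios provided by the present invention both reduces the impact of software aging effectively and saves a certain amount of resource cost, and it balances the service quality and resource cost of the virtual machines through selection and switching strategies. The ridge-regression-based dynamic update algorithm for the virtual machine working queue adjusts the number and order of working virtual machine replicas dynamically and guarantees the service quality of the system. The binary-decision-diagram-based dynamic update algorithm for the virtual machine redundant queue ensures that, even if a working virtual machine suffers service failure, a redundant virtual machine can switch state within a short time and completely replace the failed virtual machine.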
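Brief Description of the Drawings
Figure 1 is the example topology of the online air ticket ordering system provided by an embodiment of the present invention;
Figure 2 is a flowchart of the method for updating virtual machine working queues and redundant queues for different aging scenarios provided by an embodiment of the present invention;
Figure 3 is a schematic diagram of the binary decision diagram structure provided by an embodiment of the present invention;
Figure 4 is a schematic diagram of the number of failed requests under the different adjustment methods provided by an embodiment of the present invention;
Figure 5 is a schematic diagram of the average response time under the different adjustment methods provided by an embodiment of the present invention;
Figure 6 is a schematic diagram of the average memory utilization under the different adjustment methods provided by an embodiment of the present invention;
Figure 7 is a schematic diagram of the average CPU utilization under the different adjustment methods provided by an embodiment of the present invention.
In the figures: 1, client; 2, load balancer; 3, switch; 4, business database.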
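Detailed Description of the Embodiments
The specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the present invention and do not limit its scope.
In this embodiment, an online air ticket ordering system is used to simulate a PC-side user application. The service system is built on Sugon servers, real business concurrency scenarios are simulated by applying load to the ordering system, and data for different business concurrency levels are collected. Taking this as an example, the method of the present invention is used to update the working queue and the redundant queue of the virtual machines. The experiment uses three Sugon servers in total: one is responsible for load balancing and is also used to collect and analyze virtual machine data and to formulate adjustment plans; the others are used to create multiple virtual machines, each allocated 4 CPUs, 4 GB of memory, and 20 GB of disk and installed with an air ticket ordering application that contains aging defects. The adjustment methods in the experiment are implemented in Python and Shell. The example topology is shown in Figure 1.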
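As shown in Figure 2, the method for updating virtual machine working queues and redundant queues for different aging scenarios comprises the following steps:
Step 1: Divide software aging into different scenarios according to the lifetime of the virtual machines and the fluctuation of the load, specifically:
Step 1.1: A scenario in which all virtual machines in the cloud service system are in a healthy state over a period of time is classified as the scenario of short virtual machine lifetime, also called scenario one;
In this scenario, all virtual machines of the cloud service system were created recently and have been working for a short time, so all of them are in a healthy state over a period of time, that is, their software aging degree is between 0 and 0.2. In addition, these virtual machines may be released within a short time, so software aging has little effect on them in this scenario; to save cost, software aging factors can be ignored temporarily when adjusting cloud resources.
Step 1.2: A scenario in which the virtual machines run continuously for a long time and software aging factors accumulate with business access, so that some virtual machines are already in an unhealthy state, but the Augmented Dickey-Fuller (ADF) test determines that the total business concurrency of the cloud service system changes in a stationary manner and will not cause working virtual machine faults, is classified as the scenario of long virtual machine lifetime with stationary business concurrency, also called scenario two;
In this scenario, the virtual machines in the cloud service system run continuously for a long time and software aging factors accumulate with business access, so that some virtual machines are already in an unhealthy state, that is, their software aging degree is greater than 0.2; however, because business concurrency changes fairly smoothly, working virtual machine faults are generally not caused. The stationarity of the total business concurrency of the cloud service system is determined with the ADF test: if no unit root exists, the business concurrency is stationary.
Step 1.3: A scenario in which the external load fluctuates strongly, causing frequent adjustment of virtual resources, and the cloud service system is overloaded during adjustment, that is, the ADF test determines that the total business concurrency of the cloud service system is non-stationary, and some virtual machines are already in an unhealthy state, is classified as the scenario of long virtual machine lifetime with non-stationary business concurrency, also called scenario three;
In this scenario, the external load of the cloud service system fluctuates strongly, causing frequent adjustment of virtual resources, and the system may be overloaded during adjustment, which accelerates the aging process; moreover, some virtual machines in the system are already in an unhealthy state. The system then places high reliability requirements on each virtual machine, so it is necessary to add redundant virtual machines to ensure the service quality of the system.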
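The three-way split in step 1 can be sketched as a small decision function. This is a minimal illustration, not the inventors' code: the function name, the 0.05 significance level, and the idea of passing in an already computed ADF p-value are assumptions; only the 0.2 health threshold comes from the text.

```python
from typing import Sequence

HEALTHY_MAX_AGING = 0.2   # aging degree up to 0.2 counts as healthy (from the text)
ADF_ALPHA = 0.05          # ASSUMED significance level for the ADF test


def classify_scenario(aging_degrees: Sequence[float], adf_p_value: float) -> int:
    """Return 1, 2, or 3 for scenario one, two, or three.

    aging_degrees: software aging degree of every working VM.
    adf_p_value:   p-value of an ADF test on the total business concurrency
                   series, computed elsewhere (hypothetical input).
    """
    if all(s <= HEALTHY_MAX_AGING for s in aging_degrees):
        return 1  # every VM is healthy: short-lifetime scenario
    # Rejecting the unit-root hypothesis means the series has no unit root,
    # i.e. the business concurrency is stationary.
    stationary = adf_p_value < ADF_ALPHA
    return 2 if stationary else 3
```

One way to obtain the p-value would be `statsmodels.tsa.stattools.adfuller`, which returns it as the second element of its result tuple; that library choice is an assumption, since the text only says the methods were implemented in Python and Shell.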
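Step 2: Use a ridge-regression-based dynamic update method for the virtual machine working queue to adjust the number and order of working virtual machine replicas dynamically;
Step 2.1: Ignoring software aging factors, treat the business concurrency of the virtual machines as the independent variables and CPU, memory, disk I/O, and network I/O as the dependent variables, and build a ridge regression model of the cloud service system, so that the amount of resources required by the cloud service system can be calculated from the business concurrency;
Step 2.1.1: Determine the software aging scenario of the virtual machines;
Step 2.1.2: Collect data of each type from newly started working virtual machines, and substitute the concurrent business access volume and the CPU and memory data into the ridge regression model;
The amount of CPU, memory, disk I/O, or network I/O resources required by the cloud service system is calculated by the following formula:
$z = \alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_k x_k + \beta_1 y_1 + \beta_2 y_2 + \beta_3 y_3 + \beta_4 y_4 + \varepsilon$   (1)
where $x_j$ denotes the concurrency of the j-th type of business in the cloud service system, j = 1, ..., k, and k is the number of business types supported by the virtual machine; $y_1$, $y_2$, $y_3$, $y_4$ denote the expected utilization of CPU, memory, disk I/O, and network I/O, respectively; z denotes the amount of CPU, memory, disk I/O, or network I/O resources required by the cloud service system; $\alpha_j$ is the weight of the concurrency of the j-th type of business in the resource calculation; $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$ are the weights of the CPU, memory, disk I/O, and network I/O performance expectations in the resource calculation; and $\varepsilon$ is an error constant;
Step 2.1.3: Iteratively solve the loss function of the ridge regression model by least squares so that the loss function Loss of the ridge regression model is minimized, as shown in the following formula:
Figure PCTCN2019090870-appb-000008
where n denotes the number of business concurrency records of each type collected on the working virtual machines, $Z_i$ denotes the actual resource demand,
Figure PCTCN2019090870-appb-000009
denotes the resource demand obtained from the model, and $\lambda$ denotes the regularization coefficient;
Step 2.1.4: Minimize the loss function Loss of the ridge regression model to determine the parameters $\alpha_1, \dots, \alpha_k$, $\beta_1$, $\beta_2$, and $\varepsilon$; the minimum of Loss is obtained where the partial derivatives with respect to the parameters are zero, as shown in the following formulas:
Figure PCTCN2019090870-appb-000010
Figure PCTCN2019090870-appb-000011
Step 2.1.5: Solve the system of equations formed by all the parameters according to formulas 3 and 4, substituting the collected business concurrency, resource utilization, and amounts of CPU, memory, disk I/O, and network I/O resources, to obtain the 2k+6 parameters of the ridge regression model and thereby determine the relationship between each type of business and CPU, memory, disk I/O, and network I/O;
Step 2.1.6: Substitute the business concurrency of the cloud service system into formula 1 to obtain the amount of each type of resource required by the cloud service system;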
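Steps 2.1.2 to 2.1.6 amount to fitting a linear model with an L2 penalty and then evaluating formula (1). The sketch below solves the penalized normal equations directly in pure Python. It is an illustration under assumptions: the helper names are hypothetical, the intercept (standing in for ε) is left unpenalized, and the sample data in the usage note are made up.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(features: Sequence[Sequence[float]],
              targets: Sequence[float], lam: float) -> List[float]:
    """Fit weights for [x_1..x_k, y_1..y_4] plus a trailing intercept.

    Minimises squared error + lam * (sum of squared weights); the intercept
    is not penalised (an assumption of this sketch).
    """
    rows = [list(f) + [1.0] for f in features]        # append intercept column
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    for i in range(d - 1):                            # skip the intercept entry
        gram[i][i] += lam
    rhs = [sum(r[i] * z for r, z in zip(rows, targets)) for i in range(d)]
    return solve_linear(gram, rhs)


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Evaluate formula (1): weighted sum of the features plus the intercept."""
    return sum(w * v for w, v in zip(weights, feature_row)) + weights[-1]
```

For example, `fit_ridge([[1, 0], [0, 1], [1, 1], [2, 1]], [3, 4, 6, 8], 0.0)` recovers the generating weights 2 and 3 and intercept 1 up to rounding, and a positive `lam` shrinks the two weights toward zero. In practice one model would be fitted per resource type (CPU, memory, disk I/O, network I/O).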
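Step 2.2: Determine the number of working virtual machines required from the amount of each type of resource required by the cloud service system, specifically:
Step 2.2.1: Determine the loss of the virtual machines according to the scenario;
Step 2.2.1.1: In scenario two and scenario three, working virtual machines with different degrees of software aging have different memory resource losses; when counting the existing cloud resources, the memory resources of each virtual machine are discounted according to its software aging degree, and virtual machines whose service has already failed are no longer counted as available resources;
Step 2.2.1.2: In scenario one, all working virtual machines are in a healthy state, and the loss due to aging is ignored;
Step 2.2.2: Given f existing working virtual machines, the number of working virtual machines required in the next period, $Num_{work}$, is calculated by the following formulas; the minimum value of $Num_{work}$ is one:
Figure PCTCN2019090870-appb-000012
$Res_{cpu} = f \cdot vm_{cpu}$   (6)
Figure PCTCN2019090870-appb-000013
where $Res_{cpu}$ and $Res_{mem}$ denote the amounts of CPU and memory resources available to the cloud service system; $z_{cpu\_h}$ and $z_{cpu\_l}$ are the upper and lower bounds of the CPU resources obtained from the expected range of virtual machine performance; $z_{mem\_h}$ and $z_{mem\_l}$ are the upper and lower bounds of the memory resources obtained from the expected range of virtual machine performance; $vm_{cpu}$ and $vm_{mem}$ denote the number of CPU cores and the memory size of one virtual machine replica; s is the software aging degree of a virtual machine; and $\rho$ is the weight of the software aging degree s in the resource evaluation, with $0 < \rho \le 1$ in scenario two and scenario three and $\rho = 0$ in scenario one;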
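To make the resource accounting in step 2.2 concrete, the sketch below computes the available CPU from formula (6) and an aging-discounted memory total. The memory discount `vm_mem * (1 - rho * s)` is an assumption chosen to match the description (memory discounted by aging degree, ρ = 0 in scenario one); the actual memory formula and the formula for the required number of working virtual machines are available here only as images and are not reproduced.

```python
from typing import Sequence


def available_cpu(f: int, vm_cpu: int) -> int:
    """Formula (6): Res_cpu = f * vm_cpu."""
    return f * vm_cpu


def available_memory(aging_degrees: Sequence[float],
                     vm_mem: float, rho: float) -> float:
    """ASSUMED discount: each non-failed VM contributes vm_mem * (1 - rho * s).

    aging_degrees lists s for working VMs whose service has not failed;
    failed VMs are simply left out, as step 2.2.1.1 requires.
    """
    return sum(vm_mem * (1.0 - rho * s) for s in aging_degrees)
```

With ρ = 0 the second function reduces to the plain sum of memory sizes, which is the scenario-one behavior described in step 2.2.1.2.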
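Step 2.3: Handle working virtual machines that have crashed or whose service has failed, specifically:
Step 2.3.1: Replace crashed virtual machines;
If the virtual machine redundant queue is not empty, immediately select a virtual machine from the tail of the redundant queue as the replacement, restart the crashed virtual machine, and move it to the tail of the redundant queue;
If the virtual machine redundant queue is empty, restart the crashed virtual machine directly and put it at the tail of the working queue after the restart;
Step 2.3.2: Replace virtual machines whose service has failed;
Step 2.3.2.1: If the virtual machine redundant queue is not empty, immediately select a virtual machine from the tail of the redundant queue as the replacement, restart the failed virtual machine, and move it to the tail of the redundant queue;
Step 2.3.2.2: If the virtual machine redundant queue is empty, restart the failed virtual machine directly and put it at the tail of the working queue after the restart;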
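The replacement rule in step 2.3 is the same for crashed and service-failed virtual machines, so one function can illustrate both branches. This is a sketch with hypothetical names: virtual machines are represented by string identifiers, the restart itself is not modelled, and placing the replacement at the tail of the working queue is an assumption because the text does not say where it goes.

```python
from collections import deque


def replace_failed_vm(work_queue: deque, redundant_queue: deque,
                      failed_vm: str) -> None:
    """Apply step 2.3 to one crashed or service-failed working VM."""
    work_queue.remove(failed_vm)
    if redundant_queue:
        # Take the replacement from the tail of the redundant queue.
        replacement = redundant_queue.pop()
        work_queue.append(replacement)       # ASSUMED position: working tail
        # After its restart the failed VM joins the redundant tail.
        redundant_queue.append(failed_vm)
    else:
        # No spare available: the restarted VM rejoins the working tail.
        work_queue.append(failed_vm)
```

Using `deque` keeps both tail operations constant-time; removal of the failed virtual machine from the middle of the working queue is linear, which is acceptable for the small queue sizes in the experiment.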
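Step 2.4: Add or remove working virtual machines according to the calculated required number $Num_{work}$ and update the virtual machine working queue, specifically:
Step 2.4.1: Add working virtual machines;
Step 2.4.1.1: Select virtual machines from the tail of the virtual machine redundant queue and add them to the virtual machine working queue; if there are not enough redundant virtual machines, create a virtual machine, start it, and add it to the tail of the working queue;
Step 2.4.1.2: Sort all virtual machines in the working queue by software aging degree in descending order;
Step 2.4.2: Release working virtual machines: remove virtual machines from the head of the virtual machine working queue and put them into the virtual machine redundant queue;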
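Step 2.4 can be sketched as a single resize operation on the two queues. The names below are hypothetical, `create_vm` stands in for whatever provisioning call is actually used, new virtual machines are assumed to start with aging degree 0, and released virtual machines are assumed to join the tail of the redundant queue.

```python
from collections import deque
from typing import Callable, Dict


def resize_work_queue(work_queue: deque, redundant_queue: deque,
                      aging: Dict[str, float], num_work: int,
                      create_vm: Callable[[], str]) -> None:
    """Grow or shrink the working queue to num_work entries (step 2.4)."""
    if len(work_queue) < num_work:
        while len(work_queue) < num_work:
            if redundant_queue:
                vm = redundant_queue.pop()       # tail of the redundant queue
            else:
                vm = create_vm()                 # hypothetical provisioning hook
                aging.setdefault(vm, 0.0)        # ASSUMED: new VMs are not aged
            work_queue.append(vm)
        # Step 2.4.1.2: most aged VM first, so it is the first to be released.
        ordered = sorted(work_queue, key=lambda v: aging[v], reverse=True)
        work_queue.clear()
        work_queue.extend(ordered)
    while len(work_queue) > num_work:
        # Step 2.4.2: release from the head of the working queue.
        redundant_queue.append(work_queue.popleft())
```

Because the queue is kept in descending order of aging degree, releasing from the head retires the most aged working virtual machines first, which matches the stated goal of getting heavily aged virtual machines out of service promptly.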
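Step 3: Dynamically update the virtual machine redundant queue based on a binary decision diagram, specifically:
Step 3.1: Decide how redundant virtual machines are used according to the current software aging scenario and the aging condition of the cloud service system;
If the cloud service system is currently in scenario one, redundant virtual machines are not considered;
If the cloud service system is currently in scenario two, redundancy is provided for working virtual machines with severe software aging, with at least one redundant virtual machine;
If the cloud service system is currently in scenario three, the binary decision diagram is used to update the virtual machine redundant queue dynamically and to calculate the number of redundant virtual machines;
Step 3.2: Use the binary decision diagram (BDD) shown in Figure 3 to update the virtual machine redundant queue dynamically in scenario three, specifically:
Step 3.2.1: Initialize the decision diagram BDD with the character '#', initialize the '0' leaf node, initialize the '1' leaf node, and then initialize the other nodes in the BDD with the character '#';
Step 3.2.2: Calculate the service failure probability of the virtual machines: a Weibull distribution is chosen to fit the service failure time samples of the working virtual machines, and the cumulative Weibull distribution function F(t) is given by the following formula:
Figure PCTCN2019090870-appb-000014
where F(t) denotes the probability that a virtual machine suffers service failure within the working time from 0 to t; a redundant virtual machine in the dormant state does not process any business requests, so its service failure rate is approximately 0; $\lambda > 0$ is the scale parameter and $\beta > 0$ is the shape parameter;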
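The failure probability used in step 3.2.2 can be evaluated with the standard two-parameter Weibull cumulative distribution function. The sketch assumes the common parameterization F(t) = 1 - exp(-(t/λ)^β); the formula in the source is an image, so whether the scale parameter enters exactly this way is an assumption.

```python
import math


def weibull_cdf(t: float, scale: float, shape: float) -> float:
    """Probability of service failure within working time 0..t.

    scale is the lambda > 0 of the text, shape is beta > 0.
    ASSUMED form: F(t) = 1 - exp(-(t / scale) ** shape).
    """
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))
```

At t equal to the scale parameter the function returns 1 - 1/e (about 0.632) for any shape, which is a quick sanity check when fitting the parameters to observed failure times.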
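Step 3.2.3: Calculate the number of redundant virtual machines;
Step 3.2.3.1: Suppose that, from step 2, the required number of working virtual machines is calculated to be n′;
Step 3.2.3.2: In the binary decision diagram, each circle represents a virtual machine node, the '1' edge and the '0' edge represent the normal state and the service failure state of the virtual machine, respectively, and the rectangles represent the state of the whole cloud service system. Every path that reaches the '1' rectangle means that k′ working virtual machines on that path are already in the normal state, so the system works normally regardless of whether the other working virtual machines are normal; every path that reaches the '0' rectangle means that n′-k′+1 working virtual machines on that path have already suffered service failure, so the system cannot guarantee service performance for users regardless of whether the other virtual machines are normal;
Step 3.2.3.3: When generating the binary decision diagram, a global two-dimensional matrix is used for storage; virtual machine $v_{x+y+1}$ has index (x, y), and the root node $v_1$ has index (0, 0). The reliability of the cloud service system is expressed as the sum of the path probabilities from the root to all '1' rectangles, and the probability of the decision diagram rooted at virtual machine $v_{x+y+1}$ is calculated by the following formula:
$P(BDD[x][y]) = (1 - R_{x+y+1})\,P(BDD[x+1][y]) + R_{x+y+1}\,P(BDD[x][y+1])$   (9)
where $R_{x+y+1}$ denotes the service failure probability of virtual machine $v_{x+y+1}$, and BDD[x+1][y] and BDD[x][y+1] denote the sub-decision diagrams connected to the '1' edge and the '0' edge of virtual machine $v_{x+y+1}$, respectively;
Since the number of redundant virtual machines is unknown, the value of k′ is uncertain; with the conventional binary decision diagram calculation, k′ takes each value from 1 to n in turn and the probability is calculated until the number m of redundant virtual machines achieves the required probability;
Step 3.2.3.5: Set the initial value of the number m of redundant virtual machines according to the average software aging degree of all working virtual machines, calculate k′, and obtain m;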
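Formula (9) is a recursion over the index pair (x, y), where x counts virtual machines found normal so far and y counts those found failed. The sketch below evaluates it with memoization, which plays the role of the global two-dimensional matrix. The function name is hypothetical, and the leaves are taken directly from step 3.2.3.2: x = k reaches the '1' rectangle and y = n - k + 1 reaches the '0' rectangle.

```python
from functools import lru_cache
from typing import Sequence


def system_reliability(fail_probs: Sequence[float], k: int) -> float:
    """Probability that at least k of the listed VMs are normal.

    fail_probs[i] is the service failure probability R of VM v_{i+1};
    k must satisfy 1 <= k <= len(fail_probs).
    """
    n = len(fail_probs)
    max_failures = n - k + 1          # this many failures reaches the '0' leaf

    @lru_cache(maxsize=None)
    def p(x: int, y: int) -> float:
        if x == k:
            return 1.0                # '1' rectangle: enough normal VMs
        if y == max_failures:
            return 0.0                # '0' rectangle: too many failed VMs
        r = fail_probs[x + y]         # VM v_{x+y+1}, zero-based index x + y
        # Formula (9): follow the '1' edge with 1 - R and the '0' edge with R.
        return (1.0 - r) * p(x + 1, y) + r * p(x, y + 1)

    return p(0, 0)
```

For three virtual machines that each fail with probability 0.1 and k = 2, the recursion gives 0.972, which equals the direct count 0.9^3 + 3 * 0.9^2 * 0.1. Only (x, y) pairs with x < k and y < n - k + 1 are ever expanded, so the work is proportional to k * (n - k + 1) rather than to the number of paths.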
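Step 3.2.4: Adjust the virtual machine redundant queue according to the number m of redundant virtual machines;
When adding redundant virtual machines, create and start a virtual machine and put it at the tail of the virtual machine redundant queue;
When releasing redundant virtual machines, remove virtual machines from the head of the virtual machine redundant queue.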
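This embodiment compares the method of the present invention with two resource adjustment methods that do not consider virtual machine software aging: a passive adjustment method based on monitoring (comparison method 1) and an adjustment method based on ARIMA prediction (comparison method 2). The number of failed requests per hour, the average response time, and the average resource utilization are used as the indicators for analyzing each adjustment method.
Comparison method 1 adjusts the number of virtual machines by monitoring system performance: two working virtual machines are added when the system's average CPU or memory utilization stays above 80% for 5 minutes, and two working virtual machines are removed when it stays below 30% for 10 minutes. Comparison method 2 uses ARIMA to predict the demand for CPU and memory resources and adjusts the virtual machines accordingly. Following the parameters in Table 1, this embodiment uses LoadRunner to simulate the three types of aging scenarios of the present invention in turn, and three experiments are carried out in each scenario, one per adjustment method: the first uses the method of the present invention, the second tests comparison method 1, and the third tests comparison method 2. The methods are then compared by number of failed requests, average response time, and average resource utilization, where the number of failed requests is the number of requests for which the server returns no response.
Table 1 Parameters
Parameter | Setting
Total duration of one experiment | 36 hours
Average software aging time per VM | 10 hours
Maximum number of virtual machines per server | 8
Method execution interval | 5 minutes
Simulation period of scenario one | first 12 hours
Simulation period of scenario two | hour 12 to hour 24
Simulation period of scenario three | hour 24 to hour 36
System business concurrency range in scenario one | 0 to 3000 concurrent requests per second
System business concurrency range in scenario two | 3000 to 4000 concurrent requests per second
System business concurrency range in scenario three | 2000 to 6000 concurrent requests per second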
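The threshold rule of comparison method 1 is simple enough to state as code. The sketch below assumes one utilization sample per minute and a single utilization series chosen by the caller (for example CPU or memory); both choices and the function name are assumptions, while the 80%/5-minute and 30%/10-minute thresholds and the step of two virtual machines come from the description.

```python
from typing import Sequence


def threshold_adjustment(samples: Sequence[float]) -> int:
    """Return the change in working VM count for comparison method 1.

    samples: average utilisation per minute in [0, 1], oldest first
             (ASSUMED sampling rate of one sample per minute).
    """
    if len(samples) >= 5 and all(u > 0.80 for u in samples[-5:]):
        return 2      # above 80% for 5 minutes: add two working VMs
    if len(samples) >= 10 and all(u < 0.30 for u in samples[-10:]):
        return -2     # below 30% for 10 minutes: remove two working VMs
    return 0
```

The rule reacts only after the condition has persisted for the full window, which is the adjustment delay that the evaluation attributes to this baseline.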
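Table 2 records the service quality under the three resource adjustment methods. Both service indicators are highest when comparison method 1 is used to adjust the virtual machines, because adjusting virtual machines statically by monitoring performance delays the adjustment action. With comparison method 2 the number of failed requests is lower than with comparison method 1, but the request response time is still long. Service quality is best when the method of the present invention is used: the average number of failed requests per hour is 24 and the average response time is 0.361 s, because the method can keep the working virtual machines running normally in every aging scenario by means of redundant virtual machines.
Table 2 Overall service quality under each adjustment method
Adjustment method | Failed requests/hour | Average response time (s)
Method of the invention | 16 | 0.361
Comparison method 1 | 105 | 0.617
Comparison method 2 | 42 | 0.539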
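The results of the three methods in the three aging scenarios are shown in Figure 4 and Figure 5. Over the 36 hours, the two service indicators obtained by all three methods show a roughly increasing trend, indicating that the virtual machines in scenario three are affected more by software aging than those in the other scenarios, so scenario three needs more redundancy to safeguard the performance and reliability of the working virtual machines. In addition, comparison method 2 performs close to the method of the present invention in scenario one and scenario two, but its number of failed requests and response time rise sharply in scenario three, indicating that in scenarios with strongly fluctuating concurrency and severe accumulated aging, conventional adjustment methods based on time-series prediction cannot guarantee service quality adequately.
To study the utilization of virtual machine resources further, this embodiment compares the hourly average resource utilization of the system under each adjustment method, as shown in Figure 6 and Figure 7. Compared with the two comparison methods, the average resource utilization of the system is lowest when the method of the present invention is applied, because some redundant resources are provisioned during adjustment; overall, however, the reduction in resource utilization is within an acceptable range. Over the 36 hours, the average resource utilization of the virtual machines under the method of the present invention stays between 50% and 70% and is relatively stable. Under comparison method 1 the average resource utilization fluctuates widely, because the delay of passive adjustment leads to periods of idle resources and of resource shortage. Under comparison method 2, resource utilization is at times too low and at times too high in scenario three, because load fluctuation causes frequent resource adjustment and the performance of some severely aged working virtual machines drops sharply.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope defined by the claims of the present invention.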

Claims (6)

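  1. A method for updating virtual machine working queues and redundant queues for different aging scenarios, characterized by comprising the following steps:
    Step 1: Divide software aging into different scenarios according to the lifetime of the virtual machines and the fluctuation of the load, specifically:
    Step 1.1: A scenario in which all virtual machines in the cloud service system are in a healthy state over a period of time is classified as the scenario of short virtual machine lifetime, also called scenario one;
    Step 1.2: A scenario in which the virtual machines run continuously for a long time and software aging factors accumulate with business access, so that some virtual machines are already in an unhealthy state, but the Augmented Dickey-Fuller test determines that the total business concurrency of the cloud service system changes in a stationary manner and will not cause working virtual machine faults, is classified as the scenario of long virtual machine lifetime with stationary business concurrency, also called scenario two;
    Step 1.3: A scenario in which the external load fluctuates strongly, causing frequent adjustment of virtual resources, and the cloud service system is overloaded during adjustment, that is, the ADF test determines that the total business concurrency of the cloud service system is non-stationary, and some virtual machines are already in an unhealthy state, is classified as the scenario of long virtual machine lifetime with non-stationary business concurrency, also called scenario three;
    Step 2: Use a ridge-regression-based dynamic update method for the virtual machine working queue to adjust the number and order of working virtual machine replicas dynamically;
    Step 2.1: Ignoring software aging factors, treat the business concurrency of the virtual machines as the independent variables and CPU, memory, disk I/O, and network I/O as the dependent variables, and build a ridge regression model of the cloud service system, so that the amount of resources required by the cloud service system can be calculated from the business concurrency;
    Step 2.2: Determine the number of working virtual machines required from the amount of each type of resource required by the cloud service system;
    Step 2.3: Handle working virtual machines that have crashed or whose service has failed;
    Step 2.4: Add or remove working virtual machines according to the calculated required number $Num_{work}$ and update the virtual machine working queue;
    Step 3: Dynamically update the virtual machine redundant queue based on a binary decision diagram, specifically:
    Step 3.1: Decide how redundant virtual machines are used according to the current software aging scenario and the aging condition of the cloud service system;
    If the cloud service system is currently in scenario one, redundant virtual machines are not considered;
    If the cloud service system is currently in scenario two, redundancy is provided for working virtual machines with severe software aging, with at least one redundant virtual machine;
    If the cloud service system is currently in scenario three, the binary decision diagram is used to update the virtual machine redundant queue dynamically and to calculate the number of redundant virtual machines;
    Step 3.2: Use the binary decision diagram to update the virtual machine redundant queue dynamically in scenario three.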
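  2. The method for updating virtual machine working queues and redundant queues for different aging scenarios according to claim 1, characterized in that step 2.1 is specifically:
    Step 2.1.1: Determine the software aging scenario of the virtual machines;
    Step 2.1.2: Collect data of each type from newly started working virtual machines, and substitute the concurrent business access volume and the CPU and memory data into the ridge regression model;
    The amount of CPU, memory, disk I/O, or network I/O resources required by the cloud service system is calculated by the following formula:
    $z = \alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_k x_k + \beta_1 y_1 + \beta_2 y_2 + \beta_3 y_3 + \beta_4 y_4 + \varepsilon$   (1)
    where $x_j$ denotes the concurrency of the j-th type of business in the cloud service system, j = 1, ..., k, and k is the number of business types supported by the virtual machine; $y_1$, $y_2$, $y_3$, $y_4$ denote the expected utilization of CPU, memory, disk I/O, and network I/O, respectively; z denotes the amount of CPU, memory, disk I/O, or network I/O resources required by the cloud service system; $\alpha_j$ is the weight of the concurrency of the j-th type of business in the resource calculation; $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$ are the weights of the CPU, memory, disk I/O, and network I/O performance expectations in the resource calculation; and $\varepsilon$ is an error constant;
    Step 2.1.3: Iteratively solve the loss function of the ridge regression model by least squares so that the loss function Loss of the ridge regression model is minimized, as shown in the following formula:
    Figure PCTCN2019090870-appb-100001
    where n denotes the number of business concurrency records of each type collected on the working virtual machines, $Z_i$ denotes the actual resource demand,
    Figure PCTCN2019090870-appb-100002
    denotes the resource demand obtained from the model, and $\lambda$ denotes the regularization coefficient;
    Step 2.1.4: Minimize the loss function Loss of the ridge regression model to determine the parameters $\alpha_1, \dots, \alpha_k$, $\beta_1$, $\beta_2$, and $\varepsilon$; the minimum of Loss is obtained where the partial derivatives with respect to the parameters are zero, as shown in the following formulas:
    Figure PCTCN2019090870-appb-100003
    Figure PCTCN2019090870-appb-100004
    Step 2.1.5: Solve the system of equations formed by all the parameters according to formulas 3 and 4, substituting the collected business concurrency, resource utilization, and amounts of CPU, memory, disk I/O, and network I/O resources, to obtain the 2k+6 parameters of the ridge regression model and thereby determine the relationship between each type of business and CPU, memory, disk I/O, and network I/O;
    Step 2.1.6: Substitute the business concurrency of the cloud service system into formula 1 to obtain the amount of each type of resource required by the cloud service system.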
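  3. The method for updating virtual machine working queues and redundant queues for different aging scenarios according to claim 2, characterized in that step 2.2 is specifically:
    Step 2.2.1: Determine the loss of the virtual machines according to the scenario;
    Step 2.2.1.1: In scenario two and scenario three, working virtual machines with different degrees of software aging have different memory resource losses; when counting the existing cloud resources, the memory resources of each virtual machine are discounted according to its software aging degree, and virtual machines whose service has already failed are no longer counted as available resources;
    Step 2.2.1.2: In scenario one, all working virtual machines are in a healthy state, and the loss due to aging is ignored;
    Step 2.2.2: Given f existing working virtual machines, the number of working virtual machines required in the next period, $Num_{work}$, is calculated by the following formulas; the minimum value of $Num_{work}$ is one:
    Figure PCTCN2019090870-appb-100005
    $Res_{cpu} = f \cdot vm_{cpu}$   (6)
    Figure PCTCN2019090870-appb-100006
    where $Res_{cpu}$ and $Res_{mem}$ denote the amounts of CPU and memory resources available to the cloud service system; $z_{cpu\_h}$ and $z_{cpu\_l}$ are the upper and lower bounds of the CPU resources obtained from the expected range of virtual machine performance; $z_{mem\_h}$ and $z_{mem\_l}$ are the upper and lower bounds of the memory resources obtained from the expected range of virtual machine performance; $vm_{cpu}$ and $vm_{mem}$ denote the number of CPU cores and the memory size of one virtual machine replica; s is the software aging degree of a virtual machine; and $\rho$ is the weight of the software aging degree s in the resource evaluation, with $0 < \rho \le 1$ in scenario two and scenario three and $\rho = 0$ in scenario one.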
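  4. The method for updating virtual machine working queues and redundant queues for different aging scenarios according to claim 3, characterized in that step 2.3 is specifically:
    Step 2.3.1: Replace crashed virtual machines;
    If the virtual machine redundant queue is not empty, immediately select a virtual machine from the tail of the redundant queue as the replacement, restart the crashed virtual machine, and move it to the tail of the redundant queue;
    If the virtual machine redundant queue is empty, restart the crashed virtual machine directly and put it at the tail of the working queue after the restart;
    Step 2.3.2: Replace virtual machines whose service has failed;
    Step 2.3.2.1: If the virtual machine redundant queue is not empty, immediately select a virtual machine from the tail of the redundant queue as the replacement, restart the failed virtual machine, and move it to the tail of the redundant queue;
    Step 2.3.2.2: If the virtual machine redundant queue is empty, restart the failed virtual machine directly and put it at the tail of the working queue after the restart.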
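  5. The method for updating virtual machine working queues and redundant queues for different aging scenarios according to claim 4, characterized in that step 2.4 is specifically:
    Step 2.4.1: Add working virtual machines;
    Step 2.4.1.1: Select virtual machines from the tail of the virtual machine redundant queue and add them to the virtual machine working queue; if there are not enough redundant virtual machines, create a virtual machine, start it, and add it to the tail of the working queue;
    Step 2.4.1.2: Sort all virtual machines in the working queue by software aging degree in descending order;
    Step 2.4.2: Release working virtual machines: remove virtual machines from the head of the virtual machine working queue and put them into the virtual machine redundant queue.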
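  6. The method for updating virtual machine working queues and redundant queues for different aging scenarios according to claim 5, characterized in that step 3.2 is specifically:
    Step 3.2.1: Initialize the decision diagram BDD with the character '#', initialize the '0' leaf node, initialize the '1' leaf node, and then initialize the other nodes in the BDD with the character '#';
    Step 3.2.2: Calculate the service failure probability of the virtual machines: a Weibull distribution is chosen to fit the service failure time samples of the working virtual machines, and the cumulative Weibull distribution function F(t) is given by the following formula:
    Figure PCTCN2019090870-appb-100007
    where F(t) denotes the probability that a virtual machine suffers service failure within the working time from 0 to t; a redundant virtual machine in the dormant state does not process any business requests, so its service failure rate is approximately 0; $\lambda > 0$ is the scale parameter and $\beta > 0$ is the shape parameter;
    Step 3.2.3: Calculate the number of redundant virtual machines;
    Step 3.2.3.1: From step 2, the required number of working virtual machines is calculated to be n′;
    Step 3.2.3.2: In the binary decision diagram, each circle represents a virtual machine node, the '1' edge and the '0' edge represent the normal state and the service failure state of the virtual machine, respectively, and the rectangles represent the state of the whole cloud service system. Every path that reaches the '1' rectangle means that k′ working virtual machines on that path are already in the normal state, so the system works normally regardless of whether the other working virtual machines are normal; every path that reaches the '0' rectangle means that n′-k′+1 working virtual machines on that path have already suffered service failure, so the system cannot guarantee service performance for users regardless of whether the other virtual machines are normal;
    Step 3.2.3.3: When generating the binary decision diagram, a global two-dimensional matrix is used for storage; virtual machine $v_{x+y+1}$ has index (x, y), and the root node $v_1$ has index (0, 0). The reliability of the cloud service system is expressed as the sum of the path probabilities from the root to all '1' rectangles, and the probability of the decision diagram rooted at virtual machine $v_{x+y+1}$ is calculated by the following formula:
    $P(BDD[x][y]) = (1 - R_{x+y+1})\,P(BDD[x+1][y]) + R_{x+y+1}\,P(BDD[x][y+1])$   (9)
    where $R_{x+y+1}$ denotes the service failure probability of virtual machine $v_{x+y+1}$, and BDD[x+1][y] and BDD[x][y+1] denote the sub-decision diagrams connected to the '1' edge and the '0' edge of virtual machine $v_{x+y+1}$, respectively;
    Since the number of redundant virtual machines is unknown, the value of k′ is uncertain; with the conventional binary decision diagram calculation, k′ takes each value from 1 to n in turn and the probability is calculated until the number m of redundant virtual machines achieves the required probability;
    Step 3.2.3.5: Set the initial value of the number m of redundant virtual machines according to the average software aging degree of all working virtual machines, calculate k′, and obtain m;
    Step 3.2.4: Adjust the virtual machine redundant queue according to the number m of redundant virtual machines;
    When adding redundant virtual machines, create and start a virtual machine and put it at the tail of the virtual machine redundant queue;
    When releasing redundant virtual machines, remove virtual machines from the head of the virtual machine redundant queue.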
PCT/CN2019/090870 2019-04-29 2019-06-12 Method for updating virtual machine working queues and redundant queues for different aging scenarios WO2020220436A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910354679.7A CN110109733B (zh) 2019-04-29 2019-04-29 Method for updating virtual machine working queues and redundant queues for different aging scenarios
CN201910354679.7 2019-04-29

Publications (1)

Publication Number Publication Date
WO2020220436A1 true WO2020220436A1 (zh) 2020-11-05

Family

ID=67487401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090870 WO2020220436A1 (zh) 2019-04-29 2019-06-12 Method for updating virtual machine working queues and redundant queues for different aging scenarios

Country Status (2)

Country Link
CN (1) CN110109733B (zh)
WO (1) WO2020220436A1 (zh)

Also Published As

Publication number Publication date
CN110109733A (zh) 2019-08-09
CN110109733B (zh) 2022-06-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19926814

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19926814

Country of ref document: EP

Kind code of ref document: A1