WO2020237729A1 - Virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer - Google Patents

Virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer

Info

Publication number
WO2020237729A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
state
probability
time
formula
Prior art date
Application number
PCT/CN2019/090868
Other languages
English (en)
French (fr)
Inventor
郭军
刘文凤
张斌
刘晨
侯帅
侯凯
李薇
柳波
王嘉怡
王馨悦
张瀚铎
张娅杰
Original Assignee
东北大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学
Publication of WO2020237729A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/008 Reliability or availability analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects

Definitions

  • The invention belongs to the field of cloud computing and specifically relates to a virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer.
  • The present invention divides virtual machines into three modes, namely operation mode, cold backup mode, and hot backup mode, and uses mode transfer so that a standby virtual machine replaces a failed working virtual machine when needed to keep the system running; a multi-value decision diagram is used for reliability evaluation.
  • The present invention proposes a virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer, which specifically includes the following steps:
  • Step 1 Collect resource and performance data, and perform feature selection and standardization. This comprises steps 1.1 to 1.3:
  • Step 1.1 Collect historical resource and performance data to form a data matrix that covers, in order, computing resources, storage resources, disk I/O resources, and network resources. Computing resources include CPU idle percentage, CPU running time, and CPU usage rate. Storage resources include memory usage, maximum memory occupied, memory size, and maximum memory usage. Disk I/O resources include virtual block device I/O and throughput. Network resources include network load rate, virtual network received data volume, virtual network sent data volume, virtual network received data ratio, and virtual network sent data ratio;
  • Step 1.2 Use principal component analysis (PCA) to select features from the collected data, including steps 1.2.1 to 1.2.7:
  • Step 1.2.1 For the collected data matrix, calculate the covariance matrix R of the correlation coefficients between the parameters;
  • Step 1.2.2 Calculate the eigenvalues $\lambda_i$ of the matrix R and the corresponding eigenvectors;
  • Step 1.2.3 From the eigenvalues $\lambda_i$ and the eigenvectors, calculate the contribution rate and the cumulative contribution rate of each parameter. The formula is as follows:
  • r is the number of data indicators, and t is the maximum value of i;
  • Step 1.2.4 Select as principal components the data whose cumulative contribution rate is greater than the set threshold q;
  • Step 1.2.5 Calculate the principal component loadings; the loading $l_{ij}$ represents the correlation between a principal component and an original variable, as shown in the following formula:
  • Step 1.2.6 Calculate the principal component score Z, that is, a weighted sum of the principal components in which the weights are the loadings of each principal component;
  • Step 1.2.7 Principal component analysis shows that the parameters with a greater impact on the virtual machine failure rate are CPU utilization, memory utilization, network utilization, and disk I/O speed; these are the features selected from the collected data by the PCA method.
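The contribution-rate formula of step 1.2.3 is not reproduced in this text. As a minimal sketch, assuming the usual PCA definition (each eigenvalue divided by the sum of all r eigenvalues, accumulated over the first t components), the selection of step 1.2.4 could look like the following. The function name `select_principal_components` and the eigenvalues are hypothetical, and q = 0.8 is only an example threshold.

```python
from itertools import accumulate


def select_principal_components(eigenvalues, q=0.8):
    """Return (rates, cumulative, t) for a list of PCA eigenvalues.

    rates[i]      : assumed contribution rate lambda_i / sum_k lambda_k
    cumulative[i] : running sum of the rates of the first i + 1 components
    t             : smallest number of components whose cumulative rate exceeds q
    """
    ordered = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    total = sum(ordered)
    rates = [lam / total for lam in ordered]
    cumulative = list(accumulate(rates))
    # First position where the cumulative rate is greater than q;
    # fall back to all components if the threshold is never exceeded.
    t = next((i + 1 for i, c in enumerate(cumulative) if c > q), len(ordered))
    return rates, cumulative, t


# Hypothetical eigenvalues of a 5-parameter matrix R.
rates, cumulative, t = select_principal_components([2.6, 1.6, 0.4, 0.25, 0.15], q=0.8)
print(t, [round(c, 2) for c in cumulative])
```

Sorting the eigenvalues first keeps the cumulative rate monotone in explained variance, so the first index that exceeds q gives the smallest component count.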
  • Step 1.3 Using preprocessing methods, standardize the parameters that have a greater impact on the virtual machine failure rate, for both historical data and real-time data, to obtain standardized historical data and real-time data. This includes steps 1.3.1 to 1.3.2:
  • Step 1.3.1 For historical data, standardize the CPU utilization, memory utilization, network utilization, and disk I/O speed data so that the results fall in the interval [0, 1], giving the standardized historical data. The formula is as follows:
  • $x_i$ is the i-th parameter that has a greater impact on the virtual machine failure rate;
  • $x_j$ is the j-th parameter that has a greater impact on the virtual machine failure rate;
  • Step 1.3.2 For real-time data, obtain the time series $\{p_1, p_2, \ldots, p_n\}$ of virtual machine state changes from the real-time running status of the virtual machine, and standardize it with the Z-score method. The formula is as follows:
  • $p'_k$ is the standardized real-time data;
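The two standardization formulas of step 1.3 are not reproduced in this text. The sketch below assumes the conventional forms: min-max scaling, (x - min) / (max - min), for the historical data, and the Z-score, (p - mean) / standard deviation, for the real-time series. The function names and sample values are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Scale historical values into [0, 1] (assumed min-max form)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(series):
    """Standardize a real-time series to zero mean and unit variance."""
    mu = mean(series)
    sigma = pstdev(series)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(p - mu) / sigma for p in series]


# Hypothetical CPU utilization history (percent) and real-time samples.
print(min_max_normalize([20.0, 35.0, 50.0, 80.0]))  # [0.0, 0.25, 0.5, 1.0]
print(z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```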
  • Step 2 Using the standardized data, predict the virtual machine failure probability with a hidden semi-Markov model (HSMM), which specifically includes steps 2.1 to 2.4:
  • Step 2.1 Build an HSMM of the virtual machine failure trend, which specifically includes steps 2.1.1 to 2.1.2;
  • Step 2.1.2 Discretize the feature vectors, that is, the principal components, dividing each feature vector into 3 states: high for a cumulative contribution rate above 50%, medium for 20%-50%, and low for 0-20%. This gives the output state probability matrix B of the HSMM. The formula is as follows:
  • N and M are the numbers of hidden states and observable states, respectively, and $b_{ik}$ is the probability that the hidden state of the system produces an observed state at time t;
  • Step 2.2 Construct a state space from the virtual machine running state data, which specifically includes steps 2.2.1 to 2.2.2;
  • Step 2.2.1 Let the function f(O) denote the number of states of a state vector O in the observable state. The sequence number of the observation state $\{C_1', M_1', N_1', D_1'\}$ in the dimensional space is then given by:
  • $N(\{C_1', M_1', N_1', D_1'\}) = C_1' f(M') f(N') f(D') + M_1' f(N') f(D') + N_1' f(D') + D_1'$ (10)
  • N is the number of hidden states;
  • $a_{ij}$ is the probability that the system, in hidden state $s_i$ at time t, transitions to state $s_j$ at time t+1;
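Formula (10) is a mixed-radix encoding of the four discretized features (CPU, memory, network, disk) into one observation index. A minimal sketch follows, assuming each feature level is an integer starting at 0 (for example 0 = low, 1 = medium, 2 = high); that numbering and the function name `observation_index` are assumptions, not part of the original text.

```python
def observation_index(levels, sizes):
    """Encode discretized feature levels (C', M', N', D') as one index.

    levels: the level of each feature, each in range(size)
    sizes : f(.) for each feature, i.e. how many levels it can take
    Horner evaluation yields
    C*f(M)*f(N)*f(D) + M*f(N)*f(D) + N*f(D) + D, as in formula (10).
    """
    index = 0
    for level, size in zip(levels, sizes):
        if not 0 <= level < size:
            raise ValueError("level out of range for its feature")
        index = index * size + level
    return index


# Four features with three levels each give 3**4 = 81 observation states.
sizes = (3, 3, 3, 3)
print(observation_index((2, 0, 1, 2), sizes))  # 2*27 + 0*9 + 1*3 + 2 = 59
```

Because the encoding is positional, every combination of levels maps to a distinct index in 0..80, which makes it usable as a column index of the output matrix B.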
  • Step 2.3 Train the model parameters so that they correctly reflect the operating characteristics of the virtual machine, and obtain the hidden Markov model parameters, which specifically includes steps 2.3.1 to 2.3.6;
  • Step 2.3.1 Randomly initialize the 4 HSMM parameters A, B, $\pi$, and D, where $\pi$ is the initial probability, A is the state transition probability matrix, B is the output probability matrix, and D is the state residence parameter;
  • Step 2.3.2 Calculate the forward variable $\alpha_t(i)$, the probability of the observation sequence $(O_1, O_2, \ldots, O_t)$ up to time t with the system in hidden state $S_i$ at time t. The formula is as follows:
  • $a_{ij}$ is the probability of transitioning from hidden state $S_i$ at time t to state $S_j$ at time t+1;
  • $p_j(d)$ is the probability of staying in hidden state $S_j$ for a duration d;
  • $b_j(O_s)$ is the probability that hidden state $S_j$ produces the observed state $O_s$;
  • Step 2.3.3 Calculate the backward variable $\beta_t(i)$, the probability of the observation sequence $(O_{t+1}, O_{t+2}, \ldots, O_T)$ given hidden state $S_i$ at time t, with $a_{ij}$, $p_j(d)$, and $b_j(O_s)$ defined as in step 2.3.2;
  • Step 2.3.4 Calculate the probability $\xi_t(i,j)$ that the virtual machine is in hidden state i at time t and in hidden state j at time t+1, and from it the probability $\gamma_t(i)$ of being in hidden state i at time t. The formulas are as follows:
  • Step 2.3.5 From formulas 12-15, obtain the expected value of hidden state i at the initial moment, the expected value of the hidden state transition matrix, and the expected value of the transition state matrix. The formulas are as follows:
  • Step 2.3.6 Substitute the expected values obtained from formulas 16-19 into $\alpha_t(i)$ and $\beta_t(i)$ and iterate until the difference between successive expected values of the parameters does not exceed 1%; the iteration then ends, giving the hidden Markov model parameters A, B, $\pi$, and D trained on the historical data;
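The forward formula of step 2.3.2 is not reproduced in this text, so the following is only a sketch of one common explicit-duration formulation, which may differ in detail from the patent's formulas 12-15. It assumes that alpha[t][j] is the probability of observing O_1..O_t with a stay in state j ending exactly at time t, that self-transitions are excluded, and that the last stay ends at T. The function name `hsmm_forward` and all numbers are hypothetical.

```python
def hsmm_forward(obs, pi, A, B, P):
    """Forward variables of an explicit-duration HSMM (assumed formulation).

    obs : list of observation indices O_1..O_T
    pi  : pi[j], initial state probability
    A   : A[i][j], transition probability between different states
    B   : B[j][o], probability that state j emits observation o
    P   : P[j][d - 1], probability that a stay in state j lasts d steps
    Returns alpha with alpha[t][j] for t = 1..T (alpha[0] is unused).
    """
    n = len(pi)
    T = len(obs)
    alpha = [[0.0] * n for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(n):
            total = 0.0
            emit = 1.0
            for d in range(1, min(len(P[j]), t) + 1):
                emit *= B[j][obs[t - d]]  # extends the product back to O_{t-d+1}
                if d == t:
                    enter = pi[j]  # the stay started at time 1
                else:
                    enter = sum(alpha[t - d][i] * A[i][j]
                                for i in range(n) if i != j)
                total += enter * P[j][d - 1] * emit
            alpha[t][j] = total
    return alpha


# Hypothetical two-state model: state 0 = normal, state 1 = degraded.
pi = [0.6, 0.4]
A = [[0.0, 1.0], [1.0, 0.0]]
B = [[0.9, 0.1], [0.2, 0.8]]
P = [[0.5, 0.5], [1.0]]
alpha = hsmm_forward([0, 0, 1], pi, A, B, P)
print(sum(alpha[-1]))  # likelihood of the sequence under the stated assumptions
```

With every duration fixed to one step, the recursion reduces to the ordinary HMM forward algorithm, which is a convenient sanity check.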
  • Step 2.4 Using the standardized time series of real-time virtual machine state changes, build the failure rate prediction model and calculate the failure probability of the virtual machine at the next moment, which specifically includes steps 2.4.1 to 2.4.4;
  • Step 2.4.1 Calculate the probability f(i,1) of the initial state from the observed value. The formula is as follows:
  • $p_i$ is the probability of the failure state;
  • $B_i$ is the transition state matrix;
  • $y_1$ is the observation sequence of the initial state;
  • Step 2.4.2 Calculate the probability f(j,t) of each state at time t according to the following formula:
  • Step 2.4.3 Find the transition probability $C_{ij}$ of each state at time T+1. The formula is as follows:
  • $p_j(d)$ is the probability of staying in hidden state $S_j$ for a duration d;
  • Step 2.4.4 Obtain the failure probability $P_i$ for state $S_i$ at time T+1, according to the following formula:
  • Step 3 Evaluate the reliability of the cold and hot backup cloud system using a multi-value decision diagram (MDD). Specifically, use the virtual machine failure rate prediction method of step 2 to obtain the failure probability of each operation mode and hot backup mode virtual machine, construct the MDD from the predicted results, and evaluate the reliability of the cold and hot backup cloud system on that MDD, including steps 3.1 to 3.3:
  • Step 3.1 Build the MDD for an operation mode virtual machine, which specifically includes steps 3.1.1 to 3.1.3;
  • Step 3.1.1 Discretize the system prediction time interval into m segments to obtain a multi-value decision diagram with m+1 branches;
  • Step 3.1.2 The terminal of each branch is a tuple whose elements can take the value 0 or 1 and indicate whether virtual machine $A_c$ is running in time segment d;
  • Step 3.1.3 Substitute the operating parameters of the operation mode virtual machine into the failure rate prediction model above to obtain the probability $P_o(t)$ that the virtual machine fails at time t. The formula is as follows:
  • Step 3.2 Build the MDD for a hot backup mode virtual machine, which specifically includes steps 3.2.1 to 3.2.4;
  • Step 3.2.1 Assume that an operation mode virtual machine fails at $T_{d-1}$ ($1 \le d \le m$). The hot backup virtual machine $A_c$ is then activated in time segment d, and its MDD has (m-d+2) branches;
  • Step 3.2.2 After d-1 time segments on standby, the virtual machine switches from the standby state to the operating state in segment d, and fails after operating for d'-d time segments;
  • Step 3.2.3 A virtual machine in hot backup mode is in the backup state at the start of the prediction; a probability term denotes that it has run for d-1 segments and fails in segment d;
  • Step 3.2.4 Substitute the operating parameters of the hot backup mode virtual machine, namely CPU utilization, memory utilization, network utilization, and disk read/write speed, into the failure rate prediction model to obtain the probability $P_h(t)$ that the virtual machine fails at time t. The formula is as follows:
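The branch counts in steps 3.1.1 and 3.2.1 (m+1 for an operation mode virtual machine, m-d+2 for a hot backup activated in segment d) can be illustrated with a small sketch. The formulas for $P_o(t)$ and $P_h(t)$ are not reproduced in this text, so the code assumes a hypothetical per-segment conditional failure probability (`hazards`) in their place, and it assumes that a hot backup is healthy at the moment it is activated. Both are simplifications for illustration only.

```python
def branch_probabilities(hazards, start=1):
    """Branch probabilities of one virtual machine's MDD node.

    hazards[s - 1]: assumed probability that a working VM fails in
                    segment s, given it was working when s began (s = 1..m)
    start         : first segment in which the VM operates
                    (1 for operation mode, d for a hot backup activated in d)
    Returns a dict: segment number -> probability of failing in that
    segment, plus 'ok' -> probability of surviving to the end.
    """
    m = len(hazards)
    branches = {}
    alive = 1.0
    for seg in range(start, m + 1):
        h = hazards[seg - 1]
        branches[seg] = alive * h  # survived so far, then fails in seg
        alive *= 1.0 - h
    branches["ok"] = alive
    return branches


hazards = [0.1, 0.1, 0.2, 0.2]  # hypothetical values for m = 4 segments
operating = branch_probabilities(hazards)          # m + 1 = 5 branches
standby = branch_probabilities(hazards, start=3)   # m - d + 2 = 3 branches
print(len(operating), len(standby))  # 5 3
```

Each node's branch probabilities sum to 1, which is what lets the system reliability later be read off as a sum over paths.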
  • Step 3.3 Evaluate the MDD model of the system to obtain the system reliability, including steps 3.3.1 to 3.3.5;
  • Step 3.3.1 Combine the MDD models of the operation mode virtual machines and of the hot backup mode virtual machines to obtain the system MDD model;
  • Step 3.3.2 Create node $X_1$ in the first layer of the system MDD model, and set the terminal value to
  • Step 3.3.3 Add the operation mode virtual machine MDD to the system MDD model and calculate the terminal value $E^*$. The formula is as follows:
  • Step 3.3.3.1 If $E^* < k$, add the MDD of the operation mode virtual machine $A_c$ ($2 \le c \le k$) to the path of the current system MDD model and update the terminal value of the path. The formula is:
  • Step 3.3.3.2 If $E^* \ge k$, do not add;
  • Step 3.3.4 Add the hot backup mode virtual machine MDD to the system MDD model and calculate the terminal value $E^*$. The formula is as follows:
  • Step 3.3.4.1 If $E^* < k$, find $d^*$. The formula is as follows:
  • $d^*$ is the point at which the number of virtual machines working normally in the operation mode stage is less than k;
  • Step 3.3.4.3 If $E^* \ge k$, do not add;
  • Step 3.3.5 Merge the paths according to the optimization rules and calculate the reliability of the system. The optimization rules are: 1) if the branches of two non-terminal nodes point to the same node (terminal or non-terminal), the two identical nodes are merged, removing one duplicate node; 2) if all possible state branches of a non-terminal node point to the same next node (terminal or non-terminal), that non-terminal node is eliminated from the MDD and the incoming node points directly to the next node;
  • Step 3.3.5.2 Calculate the reliability of the system by summing the path probabilities of all paths whose terminal value is 1.
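The final summation of step 3.3.5.2 can be sketched on a toy MDD. The node layout (a dict with a list of `(probability, child)` branches, and integers 0 and 1 as terminals), the function name `reliability`, and every probability below are hypothetical; the sketch does not build the system MDD of steps 3.3.1 to 3.3.4, it only shows how path probabilities ending in terminal 1 add up.

```python
def reliability(node, cache=None):
    """Sum the probabilities of all root-to-terminal-1 paths of an MDD.

    A terminal is the integer 0 (system failed) or 1 (system working).
    A non-terminal is {"var": name, "branches": [(prob, child), ...]}.
    Shared (merged) nodes are evaluated once thanks to the cache.
    """
    if cache is None:
        cache = {}
    if isinstance(node, int):
        return float(node)
    key = id(node)
    if key not in cache:
        cache[key] = sum(p * reliability(child, cache)
                         for p, child in node["branches"])
    return cache[key]


# Hypothetical system: one operating VM backed by one hot backup, m = 2.
backup_from_1 = {"var": "A2@1", "branches": [(0.19, 0), (0.81, 1)]}
backup_from_2 = {"var": "A2@2", "branches": [(0.10, 0), (0.90, 1)]}
root = {"var": "A1", "branches": [
    (0.10, backup_from_1),  # A1 fails in segment 1, backup takes over
    (0.09, backup_from_2),  # A1 fails in segment 2, backup takes over
    (0.81, 1),              # A1 survives both segments
]}
print(round(reliability(root), 3))  # 0.972
```

Evaluating recursively with a cache visits each merged node once, which is the practical benefit of applying the two reduction rules before summing.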
  • The present invention adopts a virtual machine mode transfer hybrid backup optimization method based on reliability evaluation.
  • The present invention obtains the HSMM parameters by training on historical data, uses the HSMM to predict the failure rate of virtual machines, constructs the MDD from the predicted results, simplifies the terminal values of the MDD, and takes the total probability of all paths from the root node to terminal 1 as the reliability of the system.
  • Virtual machines are divided into three modes, namely operation mode, cold backup mode, and hot backup mode, and mode transfer lets a standby virtual machine replace a failed working virtual machine when needed to keep the system running. The multi-value decision diagram is used for reliability evaluation.
  • FIG. 1 is a flowchart of the virtual machine hybrid standby dynamic optimization method based on mode transfer according to an embodiment of the present invention;
  • FIG. 2 is a flowchart of predicting the virtual machine failure probability based on HSMM according to an embodiment of the present invention;
  • FIG. 3 is a flowchart of the reliability evaluation of the cold and hot backup cloud system based on a multi-value decision diagram according to an embodiment of the present invention;
  • FIG. 4 is the average response time under scenario 1 of an embodiment of the present invention;
  • FIG. 5 is the predicted state under scenario 1 of an embodiment of the present invention;
  • FIG. 6 is the average response time under scenario 2 of an embodiment of the present invention;
  • FIG. 7 is the predicted state under scenario 2 of an embodiment of the present invention;
  • FIG. 8 is the average response time under scenario 3 of an embodiment of the present invention;
  • FIG. 9 is the predicted state under scenario 3 of an embodiment of the present invention;
  • FIG. 10 is the average request failure rate under each adjustment method according to an embodiment of the present invention;
  • FIG. 11 is the average response time under each adjustment method according to an embodiment of the present invention.
  • The experiment uses an HP Z820 workstation to build the cloud service system. The workstation runs Ubuntu Kylin Linux 18.10 x86_64 with KVM virtualization software, the Nginx load balancer, Tomcat web middleware, and a MySQL database, and loads the cloud service performance adaptive optimization software package. An electrical operating condition monitor is added to each server to record system power consumption.
  • The experiment uses the Collectd + InfluxDB + Grafana tools to monitor virtual machine running status information.
  • A virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer specifically includes the following steps:
  • Step 1 Collect resource and performance data, and perform feature selection and standardization. This comprises steps 1.1 to 1.3:
  • Step 1.1 Collect historical resource and performance data to form a data matrix that covers, in order, computing resources, storage resources, disk I/O resources, and network resources. Computing resources include CPU idle percentage, CPU running time, and CPU usage rate. Storage resources include memory usage, maximum memory occupied, memory size, and maximum memory usage. Disk I/O resources include virtual block device I/O and throughput. Network resources include network load rate, virtual network received data volume, virtual network sent data volume, virtual network received data ratio, and virtual network sent data ratio;
  • Step 1.2 Use principal component analysis (PCA) to select features from the collected data, including steps 1.2.1 to 1.2.7:
  • Step 1.2.1 For the collected data matrix, calculate the covariance matrix R of the correlation coefficients between the parameters;
  • Step 1.2.2 Calculate the eigenvalues $\lambda_i$ of the matrix R and the corresponding eigenvectors;
  • Step 1.2.3 From the eigenvalues $\lambda_i$ and the eigenvectors, calculate the contribution rate and the cumulative contribution rate of each parameter. The formula is as follows:
  • r is the number of data indicators, and t is the maximum value of i;
  • Step 1.2.4 Select as principal components the data whose cumulative contribution rate is greater than the set threshold q. In the experiment of this embodiment, q is 80%;
  • Step 1.2.5 Calculate the principal component loadings; the loading $l_{ij}$ represents the correlation between a principal component and an original variable, as shown in the following formula:
  • Step 1.2.6 Calculate the principal component score Z, that is, a weighted sum of the principal components in which the weights are the loadings of each principal component;
  • Step 1.2.7 Principal component analysis shows that the parameters with a greater impact on the virtual machine failure rate are CPU utilization, memory utilization, network utilization, and disk I/O speed; these are the features selected from the collected data by the PCA method.
  • Step 1.3 Using preprocessing methods, standardize the parameters that have a greater impact on the virtual machine failure rate, for both historical data and real-time data, to obtain standardized historical data and real-time data. This includes steps 1.3.1 to 1.3.2:
  • Step 1.3.1 For historical data, standardize the CPU utilization, memory utilization, network utilization, and disk I/O speed data so that the results fall in the interval [0, 1], giving the standardized historical data. The formula is as follows:
  • $x_i$ is the i-th parameter that has a greater impact on the virtual machine failure rate;
  • $x_j$ is the j-th parameter that has a greater impact on the virtual machine failure rate;
  • Step 1.3.2 For real-time data, obtain the time series $\{p_1, p_2, \ldots, p_n\}$ of virtual machine state changes from the real-time running status of the virtual machine, and standardize it with the Z-score method. The formula is as follows:
  • $p'_k$ is the standardized real-time data;
  • Step 2 Using the standardized data, predict the virtual machine failure probability with a hidden semi-Markov model (HSMM), as shown in FIG. 2, which specifically includes steps 2.1 to 2.4:
  • Step 2.1 Build an HSMM of the virtual machine failure trend, which specifically includes steps 2.1.1 to 2.1.2;
  • Step 2.1.2 Discretize the feature vectors, that is, the principal components, dividing each feature vector into 3 states: high for a cumulative contribution rate above 50%, medium for 20%-50%, and low for 0-20%. This gives the output state probability matrix B of the HSMM. The formula is as follows:
  • N and M are the numbers of hidden states and observable states, respectively, and $b_{ik}$ is the probability that the hidden state of the system produces an observed state at time t;
  • Step 2.2 Construct a state space from the virtual machine running state data, which specifically includes steps 2.2.1 to 2.2.2;
  • Step 2.2.1 Let the function f(O) denote the number of states of a state vector O in the observable state. The sequence number of the observation state $\{C_1', M_1', N_1', D_1'\}$ in the dimensional space is then given by:
  • $N(\{C_1', M_1', N_1', D_1'\}) = C_1' f(M') f(N') f(D') + M_1' f(N') f(D') + N_1' f(D') + D_1'$ (10)
  • N is the number of hidden states;
  • $a_{ij}$ is the probability that the system, in hidden state $s_i$ at time t, transitions to state $s_j$ at time t+1;
  • Step 2.3 Train the model parameters so that they correctly reflect the operating characteristics of the virtual machine, and obtain the hidden Markov model parameters, which specifically includes steps 2.3.1 to 2.3.6;
  • Step 2.3.1 Randomly initialize the 4 HSMM parameters A, B, $\pi$, and D, where $\pi$ is the initial probability, A is the state transition probability matrix, B is the output probability matrix, and D is the state residence parameter;
  • Step 2.3.2 Calculate the forward variable $\alpha_t(i)$, the probability of the observation sequence $(O_1, O_2, \ldots, O_t)$ up to time t with the system in hidden state $S_i$ at time t. The formula is as follows:
  • $a_{ij}$ is the probability of transitioning from hidden state $S_i$ at time t to state $S_j$ at time t+1;
  • $p_j(d)$ is the probability of staying in hidden state $S_j$ for a duration d;
  • $b_j(O_s)$ is the probability that hidden state $S_j$ produces the observed state $O_s$;
  • Step 2.3.3 Calculate the backward variable $\beta_t(i)$, the probability of the observation sequence $(O_{t+1}, O_{t+2}, \ldots, O_T)$ given hidden state $S_i$ at time t, with $a_{ij}$, $p_j(d)$, and $b_j(O_s)$ defined as in step 2.3.2;
  • Step 2.3.4 Calculate the probability $\xi_t(i,j)$ that the virtual machine is in hidden state i at time t and in hidden state j at time t+1, and from it the probability $\gamma_t(i)$ of being in hidden state i at time t. The formulas are as follows:
  • Step 2.3.6 Substitute the expected values obtained from formulas 16-19 into $\alpha_t(i)$ and $\beta_t(i)$ and iterate until the difference between successive expected values of the parameters does not exceed 1%; the iteration then ends, giving the hidden Markov model parameters A, B, $\pi$, and D trained on the historical data;
  • Step 2.4 Using the standardized time series of real-time virtual machine state changes, build the failure rate prediction model and calculate the failure probability of the virtual machine at the next moment, which specifically includes steps 2.4.1 to 2.4.4;
  • Step 2.4.1 Calculate the probability f(i,1) of the initial state from the observed value. The formula is as follows:
  • $p_i$ is the probability of the failure state;
  • $B_i$ is the transition state matrix;
  • $y_1$ is the observation sequence of the initial state;
  • Step 2.4.2 Calculate the probability f(j,t) of each state at time t according to the following formula:
  • Step 2.4.3 Find the transition probability $C_{ij}$ of each state at time T+1. The formula is as follows:
  • $p_j(d)$ is the probability of staying in hidden state $S_j$ for a duration d;
  • Step 2.4.4 Obtain the failure probability $P_i$ for state $S_i$ at time T+1, according to the following formula:
  • Step 3 Evaluate the reliability of the cold and hot backup cloud system using a multi-value decision diagram (MDD), as shown in FIG. 3. Specifically, use the virtual machine failure rate prediction method of step 2 to obtain the failure probability of each operation mode and hot backup mode virtual machine, construct the MDD from the predicted results, and evaluate the reliability of the cold and hot backup cloud system on that MDD, including steps 3.1 to 3.3:
  • Step 3.1 Build the MDD for an operation mode virtual machine, which specifically includes steps 3.1.1 to 3.1.3;
  • Step 3.1.1 Discretize the system prediction time interval into m segments to obtain a multi-value decision diagram with m+1 branches;
  • Step 3.1.2 The terminal of each branch is a tuple whose elements can take the value 0 or 1 and indicate whether virtual machine $A_c$ is running in time segment d;
  • Step 3.1.3 Substitute the operating parameters of the operation mode virtual machine into the failure rate prediction model above to obtain the probability $P_o(t)$ that the virtual machine fails at time t. The formula is as follows:
  • Step 3.2 Build the MDD for a hot backup mode virtual machine, which specifically includes steps 3.2.1 to 3.2.4;
  • Step 3.2.1 Assume that an operation mode virtual machine fails at $T_{d-1}$ ($1 \le d \le m$). The hot backup virtual machine $A_c$ is then activated in time segment d, and its MDD has (m-d+2) branches;
  • Step 3.2.2 After d-1 time segments on standby, the virtual machine switches from the standby state to the operating state in segment d, and fails after operating for d'-d time segments;
  • Step 3.2.3 A virtual machine in hot backup mode is in the backup state at the start of the prediction; a probability term denotes that it has run for d-1 segments and fails in segment d;
  • Step 3.2.4 Substitute the operating parameters of the hot backup mode virtual machine, namely CPU utilization, memory utilization, network utilization, and disk read/write speed, into the failure rate prediction model to obtain the probability $P_h(t)$ that the virtual machine fails at time t. The formula is as follows:
  • Step 3.3 Evaluate the MDD model of the system to obtain the system reliability, including steps 3.3.1 to 3.3.5;
  • Step 3.3.1 Combine the MDD models of the operation mode virtual machines and of the hot backup mode virtual machines to obtain the system MDD model;
  • Step 3.3.2 Create node $X_1$ in the first layer of the system MDD model, and set the terminal value to
  • Step 3.3.3 Add the operation mode virtual machine MDD to the system MDD model and calculate the terminal value $E^*$. The formula is as follows:
  • Step 3.3.3.1 If $E^* < k$, add the MDD of the operation mode virtual machine $A_c$ ($2 \le c \le k$) to the path of the current system MDD model and update the terminal value of the path. The formula is:
  • Step 3.3.3.2 If $E^* \ge k$, do not add;
  • Step 3.3.4 Add the hot backup mode virtual machine MDD to the system MDD model and calculate the terminal value $E^*$. The formula is as follows:
  • Step 3.3.4.1 If $E^* < k$, find $d^*$. The formula is as follows:
  • $d^*$ is the point at which the number of virtual machines working normally in the operation mode stage is less than k;
  • Step 3.3.4.3 If $E^* \ge k$, do not add;
  • Step 3.3.5 Merge the paths according to the optimization rules and calculate the reliability of the system. The optimization rules are: 1) if the branches of two non-terminal nodes point to the same node (terminal or non-terminal), the two identical nodes are merged, removing one duplicate node; 2) if all possible state branches of a non-terminal node point to the same next node (terminal or non-terminal), that non-terminal node is eliminated from the MDD and the incoming node points directly to the next node;
  • Step 3.3.5.1 Calculate the path probability that the system is normal, that is, has not failed, before time t. The formula is:
  • Step 3.3.5.2 Calculate the reliability of the system by summing the path probabilities of all paths whose terminal value is 1.
  • The present invention evaluates reliability under three scenarios: scenario one, scenario two, and scenario three.
  • Scenario one is a single business operation, user login, with the load increased gradually;
  • Scenario two is a single business operation, ticket query and booking, with the load increased suddenly;
  • Scenario three is a mixed business operation that tests the login and query operations of the system at the same time;
  • The prediction result of scenario one is shown in FIG. 4. The average transaction response time is 0.41 s, which is in the abnormal range of response times for the business system, so the system is in an abnormal state. FIG. 5 shows that the output state of the hidden semi-Markov model changes in advance, so the model can predict the running state of the virtual machine well.
  • The prediction results of scenario two are shown in FIG. 6 and FIG. 7. The model predicts the running state of the virtual machine well. Comparing the average response time and predicted state diagrams of scenario one and scenario two shows that a sudden increase in load accelerates the degradation of the virtual machine state, and that the virtual machine returns to the normal state when the load decreases. These changes are well reflected in the state prediction of the hidden semi-Markov model.
  • The prediction results of scenario three are shown in FIG. 8 and FIG. 9. Comparing the average response time and predicted state diagrams of scenario one and scenario three shows that mixed services accelerate the degradation of the virtual machine state, and the model also reflects this kind of state change well.
  • The virtual machine state prediction results of the hidden semi-Markov model (HSMM) and the hidden Markov model (HMM) are compared: the HSMM recognition accuracy is 0.96 and the HMM recognition accuracy is 0.707;
  • The invention designs three sets of comparative experiments, with results shown in Figures 10 and 11. The present method is compared with: all virtual machines in operating mode (control method one), the traditional backup method (control method two), and a passive adjustment method under the three-mode transfer mechanism (control method three);
  • LoadRunner is used to load the backup system in 3 periods: 0~4 h at 0~5000 requests per second, 5~8 h at 5000~10000 requests per second, and 8~12 h at 1000~10000 requests per second. The adjustment methods are compared on average failed request rate and average response time, with results shown in Figures 10 and 11. The method of the invention has a lower average failed request rate and a lower average response time than comparison methods one and two. Over the 12 hours the indicators of the three comparison methods roughly increase, indicating that system reliability and performance decline as running time grows. The reliability and performance of the proposed method change little, which indirectly verifies the correctness of the method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

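The invention proposes a virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer, comprising: collecting resource and performance data and performing feature selection and standardization; predicting the HSMM-based virtual machine failure probability from the standardized data; and evaluating the reliability of the cold/hot-backup cloud system based on a multi-valued decision diagram (MDD). To evaluate system reliability quantitatively and accurately, the invention simplifies the terminal values of the MDD and takes the sum of the occurrence probabilities of all paths from the root node to terminal 1 as the system reliability. Virtual machines are divided into three modes, namely operating mode, cold-backup mode and hot-backup mode; through mode transfer, a standby virtual machine replaces a failed working virtual machine when needed to keep the system running, and reliability is evaluated with the MDD. Three sets of comparative experiments show that the invention achieves a low average response time, a low failure rate and high reliability, which indirectly verifies the correctness of the method.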

Description

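Virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer
Technical Field
The invention belongs to the field of cloud computing, and specifically relates to a virtual machine (VM) hybrid standby dynamic reliability evaluation method based on mode transfer.
Background
With today's rapid development of network technology, cloud systems are becoming structurally more complex, fault-tolerance techniques are diversifying, and the demands on system reliability keep rising. Producing a comprehensive and accurate reliability evaluation of a cloud system has become an important research topic. As running time grows, a VM is inevitably affected by factors such as its own aging and highly concurrent service requests; its failure rate tends to increase and its reliability gradually deteriorates, which raises the operational risk of the system. To evaluate system reliability quantitatively and accurately, the variation of the VM's reliability parameters, in particular its failure rate, must first be predicted. The invention therefore divides VMs into three modes, namely operating mode, cold-backup mode and hot-backup mode; through mode transfer, a standby VM replaces a failed working VM when needed to keep the system running, and reliability is evaluated with a multi-valued decision diagram.
Summary of the Invention
In view of the above technical problem, the invention proposes a VM hybrid standby dynamic reliability evaluation method based on mode transfer, comprising the following steps: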
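Step 1: Collect resource and performance data and perform feature selection and standardization, comprising steps 1.1 to 1.3:
Step 1.1: Collect historical resource and performance data to form a data matrix covering, in order, computing resources, storage resources, disk I/O resources and network resources. Computing resources include: CPU idle percentage, CPU running time and CPU usage. Storage resources include: memory usage, maximum memory occupied, memory size and maximum memory usage. Disk I/O resources include: virtual block device I/O and throughput. Network resources include: network load rate, virtual network data received, virtual network data sent, proportion of virtual network data received and proportion of virtual network data sent;
Step 1.2: Use PCA (principal component analysis) to select features from the collected data, comprising steps 1.2.1 to 1.2.7:
Step 1.2.1: For the collected data matrix, compute the covariance matrix R of the correlation coefficients among the parameters;
Figure PCTCN2019090868-appb-000001
Step 1.2.2: Compute the eigenvalues λ_i and eigenvectors α_i of the matrix;
Step 1.2.3: From the eigenvalues λ_i and eigenvectors α_i, compute the contribution rate κ of each parameter and the cumulative contribution rate κ_sum, as follows:
Figure PCTCN2019090868-appb-000002
Figure PCTCN2019090868-appb-000003
where r is the number of data indicators and t is the maximum value of i;
Step 1.2.4: Select as principal components the data whose cumulative contribution rate exceeds the set threshold q;
Step 1.2.5: Compute the principal component loadings; the loading l_ij expresses the degree of association between a principal component and an original variable, as follows:
Figure PCTCN2019090868-appb-000004
Step 1.2.6: Compute the principal component score Z, i.e. a weighted sum of the principal components in which the weights are the loadings of each principal component;
Step 1.2.7: The principal component analysis identifies the parameters with the greatest influence on the VM failure rate as CPU utilization, memory utilization, network utilization and disk I/O speed; these are the features selected from the collected data by the PCA method.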
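Step 1.3: Using a preprocessing method, standardize the parameters with the greatest influence on the VM failure rate, separately for historical data and real-time data, to obtain standardized historical data and standardized real-time data, comprising steps 1.3.1 to 1.3.2:
Step 1.3.1: For historical data, standardize the CPU utilization, memory utilization, network utilization and disk I/O speed data so that the results fall in the interval [0,1], giving the standardized historical data, as follows:
Figure PCTCN2019090868-appb-000005
where x_i is the i-th parameter with a large influence on the VM failure rate and x_j is the j-th such parameter;
Step 1.3.2: For real-time data, obtain from the VM's real-time running state information the time series {p_1, p_2, …, p_n} of VM state changes and standardize it with the Z-score method, giving the standardized real-time data, as follows:
Figure PCTCN2019090868-appb-000006
Figure PCTCN2019090868-appb-000007
Figure PCTCN2019090868-appb-000008
where p'_k is the standardized real-time data;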
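Step 2: Predict the HSMM-based VM failure probability from the standardized data, comprising steps 2.1 to 2.4:
Step 2.1: Build an HSMM of the VM failure trend, comprising steps 2.1.1 to 2.1.2;
Step 2.1.1: For the standardized historical data, form the feature vector array V = {C′, M′, N′, D′} from CPU utilization, memory utilization, network utilization and disk read/write speed; it represents the observable states of the system;
Step 2.1.2: Discretize the feature vectors, i.e. the principal components, dividing each feature vector into 3 states: high for a cumulative contribution rate above 50%, medium for 20% to 50%, and low for 0 to 20%. This gives the output-state probability matrix B of the HSMM, as follows:
Figure PCTCN2019090868-appb-000009
where N and M are the numbers of hidden states and observable states respectively, and b_ik is the probability that the hidden state of the system at time t produces the observed state;
Step 2.2: Build the state space from the VM running state data, comprising steps 2.2.1 to 2.2.2;
Step 2.2.1: Let the function f(O) denote the number of states of a state vector O among the observable states; the sequence number of the observed state {C_1′, M_1′, N_1′, D_1′} in one-dimensional space is then:
N({C_1′, M_1′, N_1′, D_1′}) = C_1′ f(M′) f(N′) f(D′) + M_1′ f(N′) f(D′) + N_1′ f(D′) + D_1′    (10)
Step 2.2.2: The condition of the system is denoted H = {h_1, h_2, h_3, h_4}, representing the normal, abnormal, attention and failure states respectively; this gives the 4 hidden states of the system and the probability transition matrix A between them, as follows:
Figure PCTCN2019090868-appb-000010
where N is the number of hidden states and a_ij is the probability that the system, in hidden state s_i at time t, transfers to state s_j at time t+1;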
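Step 2.3: Train the model parameters so that they correctly reflect the running characteristics of the VM, obtaining the hidden Markov model parameters, comprising steps 2.3.1 to 2.3.6;
Step 2.3.1: Randomly initialize the 4 HSMM parameters A, B, π, D, where π is the initial probability, A is the state probability matrix between the states, B is the output probability matrix and D is the state dwell parameter;
Step 2.3.2: Compute the forward variable α_t(i), the probability that the observation sequence at time t is (O_{t+1}, O_{t+2}, ..., O_T) and the hidden state is S_i, as follows:
Figure PCTCN2019090868-appb-000011
where a_ij is the probability that the system, in hidden state S_i at time t, transfers to state S_j at time t+1; p_j(d) is the probability of remaining in hidden state S_j at time d; and b_j(O_s) is the probability that the hidden state of the system at time t produces the observed state O_s;
Step 2.3.3: Compute the backward variable β_t(i), the probability that the hidden state at time t is S_i and the observation sequence (O_{t+1}, O_{t+2}, ..., O_T) is satisfied, as follows:
Figure PCTCN2019090868-appb-000012
where a_ij is the probability that the system, in hidden state S_i at time t, transfers to state S_j at time t+1; p_j(d) is the probability of remaining in hidden state S_j at time d; and b_j(O_s) is the probability that the hidden state of the system at time t produces the observed state O_s;
Step 2.3.4: Compute the probability ξ_t(i,j) that the VM is in hidden state i at time t and in hidden state j at time t+1, as given below; the probability γ_t(i) that the hidden state at time t is i can also be obtained, as given below:
Figure PCTCN2019090868-appb-000013
Figure PCTCN2019090868-appb-000014
Step 2.3.5: From formulas 12 to 15, obtain the expected value of hidden state i at the initial time
Figure PCTCN2019090868-appb-000015
the expected value of the hidden state transition matrix
Figure PCTCN2019090868-appb-000016
and the expected value of the transition state matrix
Figure PCTCN2019090868-appb-000017
as given by the following formulas respectively:
Figure PCTCN2019090868-appb-000018
Figure PCTCN2019090868-appb-000019
Figure PCTCN2019090868-appb-000020
Figure PCTCN2019090868-appb-000021
Step 2.3.6: Substitute the expected values obtained from formulas 16 to 19 into α_t(i) and β_t(i) and iterate until the expected parameter values of two successive iterations differ by no more than 1%; then stop iterating, which gives the hidden Markov model parameters A, B, π, D trained from the historical data;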
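Step 2.4: From the time series obtained from the standardized real-time VM state changes, build the failure rate prediction model and compute the VM failure probability at the next time step, comprising steps 2.4.1 to 2.4.4:
Step 2.4.1: From the observations, compute the probability f(i,1) of the initial state:
f(i,1) = p_i * B_i y_1            (20)
where p_i is the probability of the failure state, B_i is the transition state matrix and y_1 is the observation sequence of the initial state;
Step 2.4.2: Compute the probability f(j,t) of each state at time t by the following formula:
f(j,t) = f(j,t) + f(i,t-d) * a_ij * p_j(d) * b_j(O_t)          (21)
where d = 1, 2, ..., t.
Step 2.4.3: Compute the transition probability C_ij of each state at time T+1:
C_ij = C_ij + f(i,T-d) * a_ij * p_j(d)                       (22)
where p_j(d) is the probability of remaining in hidden state S_j at time d;
Step 2.4.4: Compute the failure probability P_i that the state at time T+1 is S_i:
P_i = P_i + C_i S_i                              (23)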
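Step 3: Evaluate the reliability of the cold/hot-backup cloud system based on a multi-valued decision diagram (MDD). Specifically: using the VM failure rate prediction method of step 2, obtain the failure probability of every operating-mode and hot-mode VM, build the MDD from the prediction results, i.e. the MDD-based cold/hot-backup cloud system, and evaluate reliability, comprising steps 3.1 to 3.3:
Step 3.1: Build the MDD for the operating-mode VMs, comprising steps 3.1.1 to 3.1.3;
Step 3.1.1: Discretize the system prediction time interval into m segments, giving an MDD with m+1 branches;
Step 3.1.2: The terminal of each branch is a tuple
Figure PCTCN2019090868-appb-000022
where
Figure PCTCN2019090868-appb-000023
takes the value 1 or 0, indicating whether the A_c-th VM has run up to the d-th time segment.
Step 3.1.3: Substitute the running parameters of the operating-mode VM into the failure rate prediction model above to obtain the probability P_o(t) that the VM fails at time t:
Figure PCTCN2019090868-appb-000024
where
Figure PCTCN2019090868-appb-000025
is the probability that operating-mode VM A_c, in the running state from the start of the prediction, executes for d-1 time segments and fails in segment d;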
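Step 3.2: Build the MDD for the hot-standby-mode VMs, comprising steps 3.2.1 to 3.2.4;
Step 3.2.1: Assume the operating-mode VM fails at T_{d-1} (1≤d≤m); this gives
Figure PCTCN2019090868-appb-000026
i.e. hot-backup VM A_c is activated in the d-th time segment and has (m-d+2) branches;
Step 3.2.2: After d-1 time segments of standby operation, the VM switches from the standby state to the running state at time d and fails after running for d'-d time segments;
Step 3.2.3: The hot-backup-mode VM
Figure PCTCN2019090868-appb-000027
is in the backup state from the start of the prediction; the probability that it runs for d-1 time segments and fails in segment d is denoted
Figure PCTCN2019090868-appb-000028
Step 3.2.4: Substitute the running parameters of the hot-backup-mode VM, i.e. CPU utilization, memory utilization, network utilization and disk read/write speed, into the failure rate prediction model to obtain the probability P_h(t) that the VM fails at time t:
Figure PCTCN2019090868-appb-000029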
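Step 3.3: Evaluate the system MDD model to obtain the system reliability, comprising steps 3.3.1 to 3.3.5;
Step 3.3.1: Combine the MDD model of each operating-mode VM with the MDD models of the hot-mode VMs to obtain the system MDD model;
Step 3.3.2: Create node X_1 in the first layer of the system MDD model, with the terminal value set to
Figure PCTCN2019090868-appb-000030
Step 3.3.3: Add the operating-mode VM MDDs to the system MDD model and compute the terminal value E^*:
E^* = min{E_d + (n-c+1), 1≤d≤m}                     (26)
where 2≤c≤k;
Step 3.3.3.1: If E^* ≥ k, add the MDD of operating-mode VM A_c (2≤c≤k) to the paths of the current system MDD model and update the terminal values of the paths:
Figure PCTCN2019090868-appb-000031
Step 3.3.3.2: If E^* < k, do not add it;
Step 3.3.4: Add the hot-backup-mode VM MDDs to the system MDD model and compute the terminal value E^*:
E^* = min{E_d + (n-c+1), 1≤d≤m}          (28)
where k+1≤c≤m;
Step 3.3.4.1: If E^* ≥ k, find d^*:
d^* = min(E_d | E_d < k)            (29)
where d^* indicates that fewer than k VMs are working normally in the operating-mode stage;
Step 3.3.4.2: If d^* exists, add the MDD representation of hot-backup-mode VM A_c to the paths of the system MDD model:
Figure PCTCN2019090868-appb-000032
Step 3.3.4.3: If E^* < k, do not add it;
Step 3.3.5: Merge the paths according to the optimization rules and compute the system reliability. The optimization rules are: 1) if branches of two non-terminal nodes point to the same node (terminal or non-terminal), both branches point to that node and the two identical nodes are merged, removing one duplicate node; 2) if all possible state branches of a non-terminal node point to the same next node (terminal or non-terminal), that non-terminal node is eliminated from the MDD and the incoming node points directly to the next node.
Figure PCTCN2019090868-appb-000033
Step 3.3.5.2: Sum the path probabilities of all paths whose terminal value is 1 to obtain the system reliability.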
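Beneficial technical effects:
The invention adopts a VM mode-transfer hybrid standby optimization method based on reliability evaluation. To evaluate system reliability quantitatively and accurately, the invention obtains the HSMM parameters by training on historical data, uses the HSMM to predict the VM failure rate, builds the MDD from the prediction results, then simplifies the terminal values of the MDD and takes the sum of the occurrence probabilities of all paths from the root node to terminal 1 as the system reliability. VMs are divided into three modes, namely operating mode, cold-backup mode and hot-backup mode; through mode transfer, a standby VM replaces a failed working VM when needed to keep the system running, and reliability is evaluated with the MDD. Three sets of comparative experiments, namely all VMs in operating mode, the traditional backup mode, and the VM mode-transfer hybrid backup method of the invention, show that the invention achieves a low average response time, a low failure rate and high reliability, which indirectly verifies the correctness of the method.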
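Brief Description of the Drawings
Figure 1 is a flowchart of the VM hybrid standby dynamic optimization method based on mode transfer according to an embodiment of the invention (drafting note in the original: "I will add it once the text has been finalized");
Figure 2 is a flowchart of predicting the HSMM-based VM failure probability according to an embodiment of the invention;
Figure 3 is a flowchart of the MDD-based reliability evaluation of the cold/hot-backup cloud system according to an embodiment of the invention;
Figure 4 shows the average response time under scenario one of an embodiment of the invention;
Figure 5 shows the predicted state under scenario one of an embodiment of the invention;
Figure 6 shows the average response time under scenario two of an embodiment of the invention;
Figure 7 shows the predicted state under scenario two of an embodiment of the invention;
Figure 8 shows the average response time under scenario three of an embodiment of the invention;
Figure 9 shows the predicted state under scenario three of an embodiment of the invention;
Figure 10 shows the average failed request rate under each adjustment method of an embodiment of the invention;
Figure 11 shows the average response time under each adjustment method of an embodiment of the invention;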
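Detailed Description of the Embodiments
The invention is further described below with reference to the drawings and a specific embodiment. All experimental data come from an online airline ticket booking cloud service system, which mainly provides cloud services such as registration, login, ticket purchase, query and refund. The experiment uses an HP Z820 workstation to build the cloud service system; Ubuntu Kylin Linux 18.10 x86_64, the KVM virtualization software, the Nginx load balancer, the Tomcat web middleware and the MySQL database are installed on the workstation, together with a cloud service performance adaptive optimization software package. An electrical operating-condition monitor is added to each server to monitor system power consumption. The experiment monitors VM running state information with the Collectd + InfluxDB + Grafana tools.
A VM hybrid standby dynamic reliability evaluation method based on mode transfer, as shown in Figure 1, comprises the following steps: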
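Step 1: Collect resource and performance data and perform feature selection and standardization, comprising steps 1.1 to 1.3:
Step 1.1: Collect historical resource and performance data to form a data matrix covering, in order, computing resources, storage resources, disk I/O resources and network resources. Computing resources include: CPU idle percentage, CPU running time and CPU usage. Storage resources include: memory usage, maximum memory occupied, memory size and maximum memory usage. Disk I/O resources include: virtual block device I/O and throughput. Network resources include: network load rate, virtual network data received, virtual network data sent, proportion of virtual network data received and proportion of virtual network data sent;
Step 1.2: Use PCA (principal component analysis) to select features from the collected data, comprising steps 1.2.1 to 1.2.7:
Step 1.2.1: For the collected data matrix, compute the covariance matrix R of the correlation coefficients among the parameters;
Figure PCTCN2019090868-appb-000034
Step 1.2.2: Compute the eigenvalues λ_i and eigenvectors α_i of the matrix;
Step 1.2.3: From the eigenvalues λ_i and eigenvectors α_i, compute the contribution rate κ of each parameter and the cumulative contribution rate κ_sum, as follows:
Figure PCTCN2019090868-appb-000035
Figure PCTCN2019090868-appb-000036
where r is the number of data indicators and t is the maximum value of i;
Step 1.2.4: Select as principal components the data whose cumulative contribution rate exceeds the set threshold q; in the experiments of this embodiment q is set to 80%.
Step 1.2.5: Compute the principal component loadings; the loading l_ij expresses the degree of association between a principal component and an original variable, as follows:
Figure PCTCN2019090868-appb-000037
Step 1.2.6: Compute the principal component score Z, i.e. a weighted sum of the principal components in which the weights are the loadings of each principal component;
Step 1.2.7: The principal component analysis identifies the parameters with the greatest influence on the VM failure rate as CPU utilization, memory utilization, network utilization and disk I/O speed; these are the features selected from the collected data by the PCA method.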
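Steps 1.2.1 to 1.2.6 can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the formula images are not reproduced in this text, so the sketch assumes the standard PCA definitions (contribution rate κ_i = λ_i / Σλ, loading l_ij = sqrt(λ_i)·α_ij, scores as projections onto the retained eigenvectors). The function name `pca_select` and the data layout (rows are samples, columns are monitored metrics) are illustrative; the default q = 0.80 follows the embodiment.

```python
import numpy as np

def pca_select(X, q=0.80):
    """Sketch of steps 1.2.1-1.2.6 using standard PCA definitions (assumed)."""
    X = np.asarray(X, dtype=float)
    # Standardize each metric (column); assumes no column is constant.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 1.2.1: correlation matrix R of the metrics.
    R = np.corrcoef(Xc, rowvar=False)
    # Step 1.2.2: eigenvalues and eigenvectors (R is symmetric, so eigh applies).
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # sort in descending order
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    eigvecs = eigvecs[:, order]
    # Step 1.2.3: contribution rate and cumulative contribution rate.
    kappa = eigvals / eigvals.sum()
    kappa_sum = np.cumsum(kappa)
    # Step 1.2.4: smallest number of components whose cumulative rate reaches q.
    n_keep = min(int(np.searchsorted(kappa_sum, q)) + 1, len(kappa))
    # Step 1.2.5: loadings l_ij = sqrt(lambda_i) * alpha_ij (assumed definition).
    loadings = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])
    # Step 1.2.6: component scores as projections onto the kept eigenvectors.
    scores = Xc @ eigvecs[:, :n_keep]
    return kappa, kappa_sum, n_keep, loadings, scores
```

Metrics with large absolute loadings on the retained components are the candidates for the selected features of step 1.2.7.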
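Step 1.3: Using a preprocessing method, standardize the parameters with the greatest influence on the VM failure rate, separately for historical data and real-time data, to obtain standardized historical data and standardized real-time data, comprising steps 1.3.1 to 1.3.2:
Step 1.3.1: For historical data, standardize the CPU utilization, memory utilization, network utilization and disk I/O speed data so that the results fall in the interval [0,1], giving the standardized historical data, as follows:
Figure PCTCN2019090868-appb-000038
where x_i is the i-th parameter with a large influence on the VM failure rate and x_j is the j-th such parameter;
Step 1.3.2: For real-time data, obtain from the VM's real-time running state information the time series {p_1, p_2, …, p_n} of VM state changes and standardize it with the Z-score method, giving the standardized real-time data, as follows:
Figure PCTCN2019090868-appb-000039
Figure PCTCN2019090868-appb-000040
Figure PCTCN2019090868-appb-000041
where p′_k is the standardized real-time data;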
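The two standardizations of step 1.3 can be illustrated with a short Python sketch. The formula images are not reproduced in this text, so the sketch assumes the usual definitions: min-max scaling (x - min) / (max - min) for the historical data, and the Z-score (p_k - mean) / standard deviation for the real-time series. The function names and the use of the population standard deviation are assumptions made for illustration.

```python
import numpy as np

def min_max(x):
    """Step 1.3.1 (assumed form): scale a historical metric into [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:                      # constant series: map everything to 0
        return np.zeros_like(x)
    return (x - x.min()) / span

def z_score(p):
    """Step 1.3.2 (assumed form): Z-score of a real-time series p_1..p_n."""
    p = np.asarray(p, dtype=float)
    sigma = p.std()                    # population standard deviation
    if sigma == 0:                     # constant series: no deviation to scale
        return np.zeros_like(p)
    return (p - p.mean()) / sigma
```

For example, `min_max([2, 4, 6])` maps the CPU-utilization-like values to 0, 0.5 and 1, while `z_score` returns a series with zero mean and unit standard deviation.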
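Step 2: Predict the HSMM-based VM failure probability from the standardized data, as shown in Figure 2, comprising steps 2.1 to 2.4:
Step 2.1: Build an HSMM of the VM failure trend, comprising steps 2.1.1 to 2.1.2;
Step 2.1.1: For the standardized historical data, form the feature vector array V = {C′, M′, N′, D′} from CPU utilization, memory utilization, network utilization and disk read/write speed; it represents the observable states of the system;
Step 2.1.2: Discretize the feature vectors, i.e. the principal components, dividing each feature vector into 3 states: high for a cumulative contribution rate above 50%, medium for 20% to 50%, and low for 0 to 20%. This gives the output-state probability matrix B of the HSMM, as follows:
Figure PCTCN2019090868-appb-000042
where N and M are the numbers of hidden states and observable states respectively, and b_ik is the probability that the hidden state of the system at time t produces the observed state;
Step 2.2: Build the state space from the VM running state data, comprising steps 2.2.1 to 2.2.2;
Step 2.2.1: Let the function f(O) denote the number of states of a state vector O among the observable states; the sequence number of the observed state {C_1′, M_1′, N_1′, D_1′} in one-dimensional space is then:
N({C_1′, M_1′, N_1′, D_1′}) = C_1′ f(M′) f(N′) f(D′) + M_1′ f(N′) f(D′) + N_1′ f(D′) + D_1′    (10)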
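Formula (10) is a mixed-radix encoding: the four discretized levels are treated as digits whose bases are the numbers of levels f(·) of each feature. The Python sketch below illustrates it. The helper names are hypothetical, and applying the 20% and 50% cut points of step 2.1.2 directly to normalized values in [0,1] is an assumption made for illustration.

```python
def discretize(value, low=0.2, high=0.5):
    """Map a normalized value to level 0 (low), 1 (medium) or 2 (high).

    The cut points mirror the 20% / 50% bounds of step 2.1.2; applying them
    to normalized metric values is an assumption of this sketch.
    """
    if value < low:
        return 0
    if value < high:
        return 1
    return 2

def observation_index(levels, sizes):
    """Formula (10): index of (C, M, N, D) levels in a one-dimensional space.

    levels: discretized levels (C1', M1', N1', D1').
    sizes:  number of levels per feature, (f(C'), f(M'), f(N'), f(D')).
    """
    index = 0
    for level, size in zip(levels, sizes):
        if not 0 <= level < size:
            raise ValueError("level out of range for its feature")
        index = index * size + level   # Horner form of formula (10)
    return index
```

With three levels per feature, the observation (1, 0, 2, 1) maps to 1·27 + 0·9 + 2·3 + 1 = 34, and the 81 possible observations receive the indices 0 to 80.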
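Step 2.2.2: The condition of the system is denoted H = {h_1, h_2, h_3, h_4}, representing the normal, abnormal, attention and failure states respectively; this gives the 4 hidden states of the system and the probability transition matrix A between them, as follows:
Figure PCTCN2019090868-appb-000043
where N is the number of hidden states and a_ij is the probability that the system, in hidden state s_i at time t, transfers to state s_j at time t+1;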
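Step 2.3: Train the model parameters so that they correctly reflect the running characteristics of the VM, obtaining the hidden Markov model parameters, comprising steps 2.3.1 to 2.3.6;
Step 2.3.1: Randomly initialize the 4 HSMM parameters A, B, π, D, where π is the initial probability, A is the state probability matrix between the states, B is the output probability matrix and D is the state dwell parameter;
Step 2.3.2: Compute the forward variable α_t(i), the probability that the observation sequence at time t is (O_{t+1}, O_{t+2}, ..., O_T) and the hidden state is S_i, as follows:
Figure PCTCN2019090868-appb-000044
where a_ij is the probability that the system, in hidden state S_i at time t, transfers to state S_j at time t+1; p_j(d) is the probability of remaining in hidden state S_j at time d; and b_j(O_s) is the probability that the hidden state of the system at time t produces the observed state O_s;
Step 2.3.3: Compute the backward variable β_t(i), the probability that the hidden state at time t is S_i and the observation sequence (O_{t+1}, O_{t+2}, ..., O_T) is satisfied, as follows:
Figure PCTCN2019090868-appb-000045
where a_ij is the probability that the system, in hidden state S_i at time t, transfers to state S_j at time t+1; p_j(d) is the probability of remaining in hidden state S_j at time d; and b_j(O_s) is the probability that the hidden state of the system at time t produces the observed state O_s;
Step 2.3.4: Compute the probability ξ_t(i,j) that the VM is in hidden state i at time t and in hidden state j at time t+1, as given below; the probability γ_t(i) that the hidden state at time t is i can also be obtained, as given below: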
Figure PCTCN2019090868-appb-000046
Figure PCTCN2019090868-appb-000047
Figure PCTCN2019090868-appb-000048
Figure PCTCN2019090868-appb-000049
Figure PCTCN2019090868-appb-000050
Figure PCTCN2019090868-appb-000051
Figure PCTCN2019090868-appb-000052
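Step 2.3.6: Substitute the expected values obtained from formulas 16 to 19 into α_t(i) and β_t(i) and iterate until the expected parameter values of two successive iterations differ by no more than 1%; then stop iterating, which gives the hidden Markov model parameters A, B, π, D trained from the historical data;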
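The stopping rule of step 2.3.6 can be sketched as a small Python helper. The text only states that successive parameter estimates must differ by no more than 1%; reading this as an element-wise relative change, and the function name `converged`, are assumptions of this sketch.

```python
import numpy as np

def converged(old_params, new_params, tol=0.01):
    """Return True when every parameter changed by at most `tol` (relative).

    old_params / new_params: sequences of arrays such as (A, B, pi, D).
    The element-wise relative-change criterion is an assumed reading of
    the 1% rule in step 2.3.6.
    """
    for old, new in zip(old_params, new_params):
        old = np.asarray(old, dtype=float)
        new = np.asarray(new, dtype=float)
        denom = np.maximum(np.abs(old), 1e-12)   # avoid division by zero
        if np.max(np.abs(new - old) / denom) > tol:
            return False
    return True
```

In a training loop, the re-estimation of formulas 16 to 19 would be repeated until `converged(previous, current)` returns True.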
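Step 2.4: From the time series obtained from the standardized real-time VM state changes, build the failure rate prediction model and compute the VM failure probability at the next time step, comprising steps 2.4.1 to 2.4.4:
Step 2.4.1: From the observations, compute the probability f(i,1) of the initial state:
f(i,1) = p_i * B_i y_1            (20)
where p_i is the probability of the failure state, B_i is the transition state matrix and y_1 is the observation sequence of the initial state;
Step 2.4.2: Compute the probability f(j,t) of each state at time t by the following formula:
f(j,t) = f(j,t) + f(i,t-d) * a_ij * p_j(d) * b_j(O_t)          (21)
where d = 1, 2, ..., t.
Step 2.4.3: Compute the transition probability C_ij of each state at time T+1:
C_ij = C_ij + f(i,T-d) * a_ij * p_j(d)           (22)
where p_j(d) is the probability of remaining in hidden state S_j at time d;
Step 2.4.4: Compute the failure probability P_i that the state at time T+1 is S_i:
P_i = P_i + C_i S_i              (23)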
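The accumulation in formulas (20) to (23) can be illustrated with the following Python sketch. It is one possible reading of the formulas, not the patented implementation, and it rests on several assumptions: `B[i, y]` is taken as the probability that state i emits symbol y, `D[j, d-1]` stands for p_j(d), terms with a non-positive time index are treated as zero, the one-step-ahead term indexes f at T+1-d (formula (22) as printed indexes T-d), and the result is normalized over states. It follows the recursion as written in the text and is not a full explicit-duration HSMM forward algorithm.

```python
import numpy as np

def predict_next_state(pi, A, B, D, obs):
    """Sketch of steps 2.4.1-2.4.4 under the assumptions listed above.

    pi: (N,) initial probabilities;  A: (N, N) transition matrix;
    B:  (N, M) emission matrix;      D: (N, d_max) with D[j, d-1] = p_j(d);
    obs: observation symbol indices O_1..O_T.
    Returns the normalized state distribution for time T+1.
    """
    pi, A, B, D = (np.asarray(x, dtype=float) for x in (pi, A, B, D))
    n_states, T, d_max = len(pi), len(obs), D.shape[1]
    f = np.zeros((n_states, T + 1))          # column t holds f(., t); column 0 unused
    f[:, 1] = pi * B[:, obs[0]]              # formula (20)
    for t in range(2, T + 1):
        for d in range(1, min(t - 1, d_max) + 1):
            # formula (21): sum over previous state i for this duration d
            f[:, t] += (f[:, t - d] @ A) * D[:, d - 1] * B[:, obs[t - 1]]
    C = np.zeros((n_states, n_states))
    for d in range(1, min(T, d_max) + 1):
        # formula (22), applied one step ahead without an emission term
        C += np.outer(f[:, T + 1 - d], D[:, d - 1]) * A
    score = C.sum(axis=0)                    # accumulate over source states
    total = score.sum()
    if total == 0:
        raise ValueError("observation sequence has zero probability under the model")
    return score / total
```

With the four hidden states of step 2.2.2 ordered as normal, abnormal, attention and failure, the last entry of the returned vector would be read as the predicted failure probability. For long sequences the unnormalized values of `f` can underflow, so a production version would work in log space.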
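Step 3: Evaluate the reliability of the cold/hot-backup cloud system based on a multi-valued decision diagram (MDD), as shown in Figure 3. Specifically: using the VM failure rate prediction method of step 2, obtain the failure probability of every operating-mode and hot-mode VM, build the MDD from the prediction results, i.e. the MDD-based cold/hot-backup cloud system, and evaluate reliability, comprising steps 3.1 to 3.3:
Step 3.1: Build the MDD for the operating-mode VMs, comprising steps 3.1.1 to 3.1.3;
Step 3.1.1: Discretize the system prediction time interval into m segments, giving an MDD with m+1 branches;
Step 3.1.2: The terminal of each branch is a tuple
Figure PCTCN2019090868-appb-000053
where
Figure PCTCN2019090868-appb-000054
takes the value 1 or 0, indicating whether the A_c-th VM has run up to the d-th time segment.
Step 3.1.3: Substitute the running parameters of the operating-mode VM into the failure rate prediction model above to obtain the probability P_o(t) that the VM fails at time t:
Figure PCTCN2019090868-appb-000055
where
Figure PCTCN2019090868-appb-000056
is the probability that operating-mode VM A_c, in the running state from the start of the prediction, executes for d-1 time segments and fails in segment d;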
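Step 3.2: Build the MDD for the hot-standby-mode VMs, comprising steps 3.2.1 to 3.2.4;
Step 3.2.1: Assume the operating-mode VM fails at T_{d-1} (1≤d≤m); this gives
Figure PCTCN2019090868-appb-000057
i.e. hot-backup VM A_c is activated in the d-th time segment and has (m-d+2) branches;
Step 3.2.2: After d-1 time segments of standby operation, the VM switches from the standby state to the running state at time d and fails after running for d'-d time segments;
Step 3.2.3: The hot-backup-mode VM
Figure PCTCN2019090868-appb-000058
is in the backup state from the start of the prediction; the probability that it runs for d-1 time segments and fails in segment d is denoted
Figure PCTCN2019090868-appb-000059
Step 3.2.4: Substitute the running parameters of the hot-backup-mode VM, i.e. CPU utilization, memory utilization, network utilization and disk read/write speed, into the failure rate prediction model to obtain the probability P_h(t) that the VM fails at time t:
Figure PCTCN2019090868-appb-000060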
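Step 3.3: Evaluate the system MDD model to obtain the system reliability, comprising steps 3.3.1 to 3.3.5;
Step 3.3.1: Combine the MDD model of each operating-mode VM with the MDD models of the hot-mode VMs to obtain the system MDD model;
Step 3.3.2: Create node X_1 in the first layer of the system MDD model, with the terminal value set to
Figure PCTCN2019090868-appb-000061
Step 3.3.3: Add the operating-mode VM MDDs to the system MDD model and compute the terminal value E^*:
E^* = min{E_d + (n-c+1), 1≤d≤m}           (26)
where 2≤c≤k;
Step 3.3.3.1: If E^* ≥ k, add the MDD of operating-mode VM A_c (2≤c≤k) to the paths of the current system MDD model and update the terminal values of the paths:
Figure PCTCN2019090868-appb-000062
Step 3.3.3.2: If E^* < k, do not add it;
Step 3.3.4: Add the hot-backup-mode VM MDDs to the system MDD model and compute the terminal value E^*:
E^* = min{E_d + (n-c+1), 1≤d≤m}         (28)
where k+1≤c≤m;
Step 3.3.4.1: If E^* ≥ k, find d^*:
d^* = min(E_d | E_d < k)            (29)
where d^* indicates that fewer than k VMs are working normally in the operating-mode stage;
Step 3.3.4.2: If d^* exists, add the MDD representation of hot-backup-mode VM A_c to the paths of the system MDD model:
Figure PCTCN2019090868-appb-000063
Step 3.3.4.3: If E^* < k, do not add it;
Step 3.3.5: Merge the paths according to the optimization rules and compute the system reliability. The optimization rules are: 1) if branches of two non-terminal nodes point to the same node (terminal or non-terminal), both branches point to that node and the two identical nodes are merged, removing one duplicate node; 2) if all possible state branches of a non-terminal node point to the same next node (terminal or non-terminal), that non-terminal node is eliminated from the MDD and the incoming node points directly to the next node.
Step 3.3.5.1: The probability of a path on which the system remains normal, i.e. does not fail, before time t is computed as:
Figure PCTCN2019090868-appb-000064
Step 3.3.5.2: Sum the path probabilities of all paths whose terminal value is 1 to obtain the system reliability.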
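The final summation of step 3.3.5.2 can be illustrated with a deliberately simplified Python sketch. It assumes that every VM has m+1 branches (failure in one of the m time segments, or survival of the whole horizon), that VMs fail independently, and that a path has terminal value 1 when at least k VMs survive. Standby activation and the path merging of step 3.3.5 are not modeled; the function name and data layout are hypothetical.

```python
from itertools import product
from math import prod

def system_reliability(branch_probs, k):
    """Sum the probabilities of all paths whose terminal value is 1.

    branch_probs[c][d] for d < m: probability that VM c fails in segment d+1;
    branch_probs[c][m]: probability that VM c survives all m segments.
    A path is one branch choice per VM; its terminal value is 1 when at
    least k VMs survive (a simplifying assumption of this sketch).
    """
    n_branches = len(branch_probs[0])
    survive = n_branches - 1                 # index of the "no failure" branch
    reliability = 0.0
    for path in product(range(n_branches), repeat=len(branch_probs)):
        alive = sum(1 for branch in path if branch == survive)
        if alive >= k:                       # terminal value 1
            reliability += prod(p[b] for p, b in zip(branch_probs, path))
    return reliability
```

For two VMs that each fail in segment 1 with probability 0.1, fail in segment 2 with probability 0.2 and survive with probability 0.7, the sketch gives 1 - 0.3² = 0.91 for k = 1 and 0.7² = 0.49 for k = 2. The enumeration grows exponentially with the number of VMs, which is the cost that merging equivalent MDD paths is meant to avoid.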
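Simulation experiments:
The invention evaluates reliability under scenario one, scenario two and scenario three. Scenario one is a single-service operation: user login, with the load increased gradually. Scenario two is a single-service operation: query and ticket booking, with the load increased suddenly. Scenario three is a mixed-service operation: the login and query operations of the system are tested at the same time;
The prediction result for scenario one is shown in Figure 4. At around time point 126 the average transaction response time is 0.41 s, which lies in the abnormal range for the business system's response time; the system is in an abnormal state at that point. Figure 5 shows that the output state of the hidden semi-Markov model has already changed beforehand, so the model predicts the running state of the VM well.
The prediction results for scenario two are shown in Figures 6 and 7. The model predicts the running state of the VM well. Comparing the average response time and predicted state diagrams of scenario one and scenario two shows that a sudden increase in load accelerates the degradation of the VM state, and that the state returns to normal when the load decreases. The state prediction of the hidden semi-Markov model reflects these changes well.
The prediction results for scenario three are shown in Figures 8 and 9. Comparing the average response time and predicted state diagrams of scenario one and scenario three shows that mixed services accelerate the degradation of the VM state; the model also reflects this kind of state change well.
The VM state prediction results of the hidden semi-Markov model and the hidden Markov model are compared. The HSMM recognition accuracy is 0.96 and the HMM recognition accuracy is 0.707;
The invention designs three sets of comparative experiments, with results shown in Figures 10 and 11. The present method is compared with: all VMs in operating mode (control method one), the traditional backup method (control method two), and a passive adjustment method under the three-mode transfer mechanism (control method three);
LoadRunner is used to load the backup system in 3 periods: 0~4 h at 0~5000 requests per second, 5~8 h at 5000~10000 requests per second, and 8~12 h at 1000~10000 requests per second. The adjustment methods are compared on average failed request rate and average response time, with results shown in Figures 10 and 11. The method of the invention has a lower average failed request rate and a lower average response time than comparison methods one and two. Over the 12 hours the indicators of the three methods roughly increase, indicating that system reliability and performance decline as running time grows. The reliability and performance of the proposed method change little, which indirectly verifies the correctness of the method.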

Claims (3)

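  1. A virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer, characterized by comprising the following steps:
    Step 1: Collect resource and performance data and perform feature selection and standardization, comprising steps 1.1 to 1.3:
    Step 1.1: Collect historical resource and performance data to form a data matrix covering, in order, computing resources, storage resources, disk I/O resources and network resources, wherein computing resources include: CPU idle percentage, CPU running time and CPU usage; storage resources include: memory usage, maximum memory occupied, memory size and maximum memory usage; disk I/O resources include: virtual block device I/O and throughput; network resources include: network load rate, virtual network data received, virtual network data sent, proportion of virtual network data received and proportion of virtual network data sent;
    Step 1.2: Use the PCA method to select features from the collected data, comprising steps 1.2.1 to 1.2.7:
    Step 1.2.1: For the collected data matrix, compute the covariance matrix R of the correlation coefficients among the parameters;
    Figure PCTCN2019090868-appb-100001
    Step 1.2.2: Compute the eigenvalues λ_i and eigenvectors α_i of the matrix;
    Step 1.2.3: From the eigenvalues λ_i and eigenvectors α_i, compute the contribution rate κ of each parameter and the cumulative contribution rate κ_sum, as follows:
    Figure PCTCN2019090868-appb-100002
    Figure PCTCN2019090868-appb-100003
    where r is the number of data indicators and t is the maximum value of i;
    Step 1.2.4: Select as principal components the data whose cumulative contribution rate exceeds the set threshold q;
    Step 1.2.5: Compute the principal component loadings; the loading l_ij expresses the degree of association between a principal component and an original variable, as follows:
    Figure PCTCN2019090868-appb-100004
    Step 1.2.6: Compute the principal component score Z, i.e. a weighted sum of the principal components in which the weights are the loadings of each principal component;
    Step 1.2.7: The principal component analysis identifies the parameters with the greatest influence on the virtual machine failure rate as CPU utilization, memory utilization, network utilization and disk I/O speed; these are the features selected from the collected data by the PCA method;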
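    Step 1.3: Using a preprocessing method, standardize the parameters with the greatest influence on the virtual machine failure rate, separately for historical data and real-time data, to obtain standardized historical data and standardized real-time data, comprising steps 1.3.1 to 1.3.2:
    Step 1.3.1: For historical data, standardize the CPU utilization, memory utilization, network utilization and disk I/O speed data so that the results fall in the interval [0,1], giving the standardized historical data, as follows:
    Figure PCTCN2019090868-appb-100005
    where x_i is the i-th parameter with a large influence on the virtual machine failure rate and x_j is the j-th such parameter;
    Step 1.3.2: For real-time data, obtain from the virtual machine's real-time running state information the time series {p_1, p_2, ..., p_n} of virtual machine state changes and standardize it with the Z-score method, giving the standardized real-time data, as follows:
    Figure PCTCN2019090868-appb-100006
    Figure PCTCN2019090868-appb-100007
    Figure PCTCN2019090868-appb-100008
    where p′_k is the standardized real-time data;
    Step 2: Predict the HSMM-based virtual machine failure probability from the standardized data;
    Step 3: Evaluate the reliability of the cold/hot-backup cloud system based on a multi-valued decision diagram, specifically: using the virtual machine failure rate prediction method of step 2, obtain the failure probability of every operating-mode and hot-mode virtual machine, build the MDD from the prediction results, i.e. the cold/hot-backup cloud system based on the multi-valued decision diagram, and evaluate reliability.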
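  2. The virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer according to claim 1, characterized in that step 2 comprises steps 2.1 to 2.4:
    Step 2.1: Build an HSMM of the virtual machine failure trend, comprising steps 2.1.1 to 2.1.2;
    Step 2.1.1: For the standardized historical data, form the feature vector array V = {C′, M′, N′, D′} from CPU utilization, memory utilization, network utilization and disk read/write speed; it represents the observable states of the system;
    Step 2.1.2: Discretize the feature vectors, i.e. the principal components, dividing each feature vector into 3 states: high for a cumulative contribution rate above 50%, medium for 20% to 50%, and low for 0 to 20%; this gives the output-state probability matrix B of the HSMM, as follows:
    Figure PCTCN2019090868-appb-100009
    where N and M are the numbers of hidden states and observable states respectively, and b_ik is the probability that the hidden state of the system at time t produces the observed state;
    Step 2.2: Build the state space from the virtual machine running state data, comprising steps 2.2.1 to 2.2.2;
    Step 2.2.1: Let the function f(O) denote the number of states of a state vector O among the observable states; the sequence number of the observed state {C_1′, M_1′, N_1′, D_1′} in one-dimensional space is then:
    N({C_1′, M_1′, N_1′, D_1′}) = C_1′ f(M′) f(N′) f(D′) + M_1′ f(N′) f(D′) + N_1′ f(D′) + D_1′   (10)
    Step 2.2.2: The condition of the system is denoted H = {h_1, h_2, h_3, h_4}, representing the normal, abnormal, attention and failure states respectively; this gives the 4 hidden states of the system and the probability transition matrix A between them, as follows:
    Figure PCTCN2019090868-appb-100010
    where N is the number of hidden states and a_ij is the probability that the system, in hidden state s_i at time t, transfers to state s_j at time t+1;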
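    Step 2.3: Train the model parameters so that they correctly reflect the running characteristics of the virtual machine, obtaining the hidden Markov model parameters, comprising steps 2.3.1 to 2.3.6;
    Step 2.3.1: Randomly initialize the 4 HSMM parameters A, B, π, D, where π is the initial probability, A is the state probability matrix between the states, B is the output probability matrix and D is the state dwell parameter;
    Step 2.3.2: Compute the forward variable α_t(i), the probability that the observation sequence at time t is (O_{t+1}, O_{t+2}, ..., O_T) and the hidden state is S_i, as follows:
    Figure PCTCN2019090868-appb-100011
    where a_ij is the probability that the system, in hidden state S_i at time t, transfers to state S_j at time t+1; p_j(d) is the probability of remaining in hidden state S_j at time d; and b_j(O_s) is the probability that the hidden state of the system at time t produces the observed state O_s;
    Step 2.3.3: Compute the backward variable β_t(i), the probability that the hidden state at time t is S_i and the observation sequence (O_{t+1}, O_{t+2}, ..., O_T) is satisfied, as follows:
    Figure PCTCN2019090868-appb-100012
    where a_ij is the probability that the system, in hidden state S_i at time t, transfers to state S_j at time t+1; p_j(d) is the probability of remaining in hidden state S_j at time d; and b_j(O_s) is the probability that the hidden state of the system at time t produces the observed state O_s;
    Step 2.3.4: Compute the probability ξ_t(i,j) that the virtual machine is in hidden state i at time t and in hidden state j at time t+1, as given below; the probability γ_t(i) that the hidden state at time t is i can also be obtained, as given below:
    Figure PCTCN2019090868-appb-100013
    Figure PCTCN2019090868-appb-100014
    Step 2.3.5: From formulas 12 to 15, obtain the expected value of hidden state i at the initial time
    Figure PCTCN2019090868-appb-100015
    the expected value of the hidden state transition matrix
    Figure PCTCN2019090868-appb-100016
    and the expected value of the transition state matrix
    Figure PCTCN2019090868-appb-100017
    as given by the following formulas respectively:
    Figure PCTCN2019090868-appb-100018
    Figure PCTCN2019090868-appb-100019
    Figure PCTCN2019090868-appb-100020
    Figure PCTCN2019090868-appb-100021
    Step 2.3.6: Substitute the expected values obtained from formulas 16 to 19 into α_t(i) and β_t(i) and iterate, which gives the hidden Markov model parameters A, B, π, D trained from the historical data;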
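    Step 2.4: From the time series obtained from the standardized real-time virtual machine state changes, build the failure rate prediction model and compute the virtual machine failure probability at the next time step, comprising steps 2.4.1 to 2.4.4:
    Step 2.4.1: From the observations, compute the probability f(i,1) of the initial state:
    f(i,1) = p_i * B_i y_1          (20)
    where p_i is the probability of the failure state, B_i is the transition state matrix and y_1 is the observation sequence of the initial state;
    Step 2.4.2: Compute the probability f(j,t) of each state at time t by the following formula:
    f(j,t) = f(j,t) + f(i,t-d) * a_ij * p_j(d) * b_j(O_t)         (21)
    where d = 1, 2, ..., t;
    Step 2.4.3: Compute the transition probability C_ij of each state at time T+1:
    C_ij = C_ij + f(i,T-d) * a_ij * p_j(d)        (22)
    where p_j(d) is the probability of remaining in hidden state S_j at time d;
    Step 2.4.4: Compute the failure probability P_i that the state at time T+1 is S_i:
    P_i = P_i + C_i S_i          (23).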
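  3. The virtual machine hybrid standby dynamic reliability evaluation method based on mode transfer according to claim 1, characterized in that step 3 comprises steps 3.1 to 3.3:
    Step 3.1: Build the MDD for the operating-mode virtual machines, comprising steps 3.1.1 to 3.1.3;
    Step 3.1.1: Discretize the system prediction time interval into m segments, giving a multi-valued decision diagram with m+1 branches;
    Step 3.1.2: The terminal of each branch is a tuple
    Figure PCTCN2019090868-appb-100022
    where
    Figure PCTCN2019090868-appb-100023
    takes the value 1 or 0, indicating whether the A_c-th virtual machine has run up to the d-th time segment;
    Step 3.1.3: Substitute the running parameters of the operating-mode virtual machine into the failure rate prediction model above to obtain the probability P_o(t) that the virtual machine fails at time t:
    Figure PCTCN2019090868-appb-100024
    where
    Figure PCTCN2019090868-appb-100025
    is the probability that operating-mode virtual machine A_c, in the running state from the start of the prediction, executes for d-1 time segments and fails in segment d;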
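    Step 3.2: Build the MDD for the hot-standby-mode virtual machines, comprising steps 3.2.1 to 3.2.4;
    Step 3.2.1: Assume the operating-mode virtual machine fails at T_{d-1} (1≤d≤m); this gives
    Figure PCTCN2019090868-appb-100026
    i.e. hot-backup virtual machine A_c is activated in the d-th time segment and has (m-d+2) branches;
    Step 3.2.2: After d-1 time segments of standby operation, the virtual machine switches from the standby state to the running state at time d and fails after running for d'-d time segments;
    Step 3.2.3: The hot-backup-mode virtual machine
    Figure PCTCN2019090868-appb-100027
    is in the backup state from the start of the prediction; the probability that it runs for d-1 time segments and fails in segment d is denoted
    Figure PCTCN2019090868-appb-100028
    Step 3.2.4: Substitute the running parameters of the hot-backup-mode virtual machine, i.e. CPU utilization, memory utilization, network utilization and disk read/write speed, into the failure rate prediction model to obtain the probability P_h(t) that the virtual machine fails at time t:
    Figure PCTCN2019090868-appb-100029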
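    Step 3.3: Evaluate the system MDD model to obtain the system reliability, comprising steps 3.3.1 to 3.3.5;
    Step 3.3.1: Combine the MDD model of each operating-mode virtual machine with the MDD models of the hot-mode virtual machines to obtain the system MDD model;
    Step 3.3.2: Create node X_1 in the first layer of the system MDD model, with the terminal value set to
    Figure PCTCN2019090868-appb-100030
    Step 3.3.3: Add the operating-mode virtual machine MDDs to the system MDD model and compute the terminal value E^*:
    E^* = min{E_d + (n-c+1), 1≤d≤m}         (26)
    where 2≤c≤k;
    Step 3.3.3.1: If E^* ≥ k, add the MDD of operating-mode virtual machine A_c (2≤c≤k) to the paths of the current system MDD model and update the terminal values of the paths:
    Figure PCTCN2019090868-appb-100031
    Step 3.3.3.2: If E^* < k, do not add it;
    Step 3.3.4: Add the hot-backup-mode virtual machine MDDs to the system MDD model and compute the terminal value E^*:
    E^* = min{E_d + (n-c+1), 1≤d≤m}          (28)
    where k+1≤c≤m;
    Step 3.3.4.1: If E^* ≥ k, find d^*:
    d^* = min(E_d | E_d < k)          (29)
    where d^* indicates that fewer than k virtual machines are working normally in the operating-mode stage;
    Step 3.3.4.2: If d^* exists, add the MDD representation of hot-backup-mode virtual machine A_c to the paths of the system MDD model:
    Figure PCTCN2019090868-appb-100032
    Step 3.3.4.3: If E^* < k, do not add it;
    Step 3.3.5: Merge the paths according to the optimization rules and compute the system reliability, the optimization rules being: 1) if branches of two non-terminal nodes point to the same node, both branches point to that node and the two identical nodes are merged, removing one duplicate node; 2) if all possible state branches of a non-terminal node point to the same next node, that non-terminal node is eliminated from the MDD and the incoming node points directly to the next node;
    Step 3.3.5.1: The probability of a path on which the system remains normal, i.e. does not fail, before time t is computed as:
    Figure PCTCN2019090868-appb-100033
    Step 3.3.5.2: Sum the path probabilities of all paths whose terminal value is 1 to obtain the system reliability.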
PCT/CN2019/090868 2019-05-31 2019-06-12 一种基于模式转移的虚拟机混合备用动态可靠性评估方法 WO2020237729A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910466719.7 2019-05-31
CN201910466719.7A CN110187990B (zh) 2019-05-31 2019-05-31 一种基于模式转移的虚拟机混合备用动态可靠性评估方法

Publications (1)

Publication Number Publication Date
WO2020237729A1 true WO2020237729A1 (zh) 2020-12-03

Family

ID=67719223

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090868 WO2020237729A1 (zh) 2019-05-31 2019-06-12 一种基于模式转移的虚拟机混合备用动态可靠性评估方法

Country Status (2)

Country Link
CN (1) CN110187990B (zh)
WO (1) WO2020237729A1 (zh)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507562A (zh) * 2020-12-15 2021-03-16 中国航空综合技术研究所 基于二维阵列系统进行可靠度评估的方法
CN112734201A (zh) * 2020-12-31 2021-04-30 国网浙江省电力有限公司电力科学研究院 基于预期故障概率的多台设备整体质量评价方法
CN112766657A (zh) * 2020-12-31 2021-05-07 国网浙江省电力有限公司电力科学研究院 基于故障概率和设备状态的单台设备质量评价方法
CN112799890A (zh) * 2020-12-31 2021-05-14 南京航空航天大学 一种总线抗seu的可靠性建模与评估方法
CN114169173A (zh) * 2021-12-09 2022-03-11 浙江大学 一种考虑热失控传播的电池储能系统可靠度计算方法
CN114462252A (zh) * 2022-02-18 2022-05-10 国网浙江省电力有限公司经济技术研究院 基于Lz变换的多状态电网信息物理系统可靠性提高方法
CN114915658A (zh) * 2022-05-11 2022-08-16 朱宝德 一种基于分布式缓存技术的计算机系统缓存优化清理方法
WO2023131257A1 (zh) * 2022-01-10 2023-07-13 华东理工大学 一种基于大数据的炼油过程模式识别及优化方法
CN116520756A (zh) * 2023-06-29 2023-08-01 北京创博联航科技有限公司 数据采集监控系统、航电系统以及无人机
CN117131464A (zh) * 2023-10-25 2023-11-28 湖北华中电力科技开发有限责任公司 一种电网数据的可用性评估方法及系统
CN118101444A (zh) * 2024-02-29 2024-05-28 广州市信息技术职业学校 一种基于分节点的国标设备动态调度方法
CN118426395A (zh) * 2024-07-05 2024-08-02 江苏南极星新能源技术股份有限公司 一种基于协同运行的分布式能源通信控制方法及系统

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102320317B1 (ko) * 2019-11-11 2021-11-02 한국전자기술연구원 클라우드 엣지 환경에서 예측 기반 마이그레이션 후보 및 대상 선정 방법
CN111966449B (zh) * 2020-07-17 2022-05-31 苏州浪潮智能科技有限公司 一种虚拟机备份管理方法、系统、终端及存储介质
CN114662261B (zh) * 2020-12-22 2024-10-18 中核武汉核电运行技术股份有限公司 一种基于分层mmdd与mdd的多状态隔离效应建模方法
CN117309891B (zh) * 2023-11-29 2024-02-06 深圳市润博电子有限公司 一种基于智能反馈机制的玻璃钢化膜检测方法及系统
CN117544419B (zh) * 2024-01-05 2024-05-14 北京数盾信息科技有限公司 用于提高物联网设备间信息通信安全性的高速加密方法
CN117875947B (zh) * 2024-03-11 2024-06-25 浙江大学 k/n负载均担系统的可靠性评估和维修决策方法及系统

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105389434A (zh) * 2015-11-10 2016-03-09 浙江师范大学 一种用于多故障模式云计算平台的可靠性评估方法
CN105677538A (zh) * 2016-01-11 2016-06-15 中国科学院软件研究所 一种基于故障预测的云计算系统自适应监测方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105511944B (zh) * 2016-01-07 2018-09-28 上海海事大学 一种云系统内部虚拟机的异常检测方法
CN106101116B (zh) * 2016-06-29 2019-01-08 东北大学 一种基于主成分分析的用户行为异常检测系统及方法
US10834213B2 (en) * 2017-07-20 2020-11-10 International Business Machines Corporation System and method for measuring user engagement
CN108959039A (zh) * 2018-07-18 2018-12-07 郑州云海信息技术有限公司 一种虚拟机故障预测的方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG, JIANHUA ET AL.: "Approach of Virtual Machine Failure Recovery Based on Hidden Markov Model", JOURNAL OF SOFTWARE, vol. 25, no. 11, 30 November 2014 (2014-11-30), pages 2702 - 2714, XP055762139, ISSN: 1000-9825 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507562A (zh) * 2020-12-15 2021-03-16 中国航空综合技术研究所 基于二维阵列系统进行可靠度评估的方法
CN112507562B (zh) * 2020-12-15 2022-12-09 中国航空综合技术研究所 基于二维阵列系统进行可靠度评估的方法
CN112734201A (zh) * 2020-12-31 2021-04-30 国网浙江省电力有限公司电力科学研究院 基于预期故障概率的多台设备整体质量评价方法
CN112766657A (zh) * 2020-12-31 2021-05-07 国网浙江省电力有限公司电力科学研究院 基于故障概率和设备状态的单台设备质量评价方法
CN112799890A (zh) * 2020-12-31 2021-05-14 南京航空航天大学 一种总线抗seu的可靠性建模与评估方法
CN112766657B (zh) * 2020-12-31 2022-07-05 国网浙江省电力有限公司电力科学研究院 基于故障概率和设备状态的单台设备质量评价方法
CN112734201B (zh) * 2020-12-31 2022-07-05 国网浙江省电力有限公司电力科学研究院 基于预期故障概率的多台设备整体质量评价方法
CN112799890B (zh) * 2020-12-31 2022-10-14 南京航空航天大学 一种总线抗seu的可靠性建模与评估方法
CN114169173A (zh) * 2021-12-09 2022-03-11 浙江大学 一种考虑热失控传播的电池储能系统可靠度计算方法
WO2023131257A1 (zh) * 2022-01-10 2023-07-13 华东理工大学 一种基于大数据的炼油过程模式识别及优化方法
CN114462252A (zh) * 2022-02-18 2022-05-10 国网浙江省电力有限公司经济技术研究院 基于Lz变换的多状态电网信息物理系统可靠性提高方法
CN114915658A (zh) * 2022-05-11 2022-08-16 朱宝德 一种基于分布式缓存技术的计算机系统缓存优化清理方法
CN116520756A (zh) * 2023-06-29 2023-08-01 北京创博联航科技有限公司 数据采集监控系统、航电系统以及无人机
CN116520756B (zh) * 2023-06-29 2023-09-26 北京创博联航科技有限公司 数据采集监控系统、航电系统以及无人机
CN117131464A (zh) * 2023-10-25 2023-11-28 湖北华中电力科技开发有限责任公司 一种电网数据的可用性评估方法及系统
CN117131464B (zh) * 2023-10-25 2024-01-09 湖北华中电力科技开发有限责任公司 一种电网数据的可用性评估方法及系统
CN118101444A (zh) * 2024-02-29 2024-05-28 广州市信息技术职业学校 一种基于分节点的国标设备动态调度方法
CN118426395A (zh) * 2024-07-05 2024-08-02 江苏南极星新能源技术股份有限公司 一种基于协同运行的分布式能源通信控制方法及系统

Also Published As

Publication number Publication date
CN110187990B (zh) 2021-11-16
CN110187990A (zh) 2019-08-30

Similar Documents

Publication Publication Date Title
WO2020237729A1 (zh) 一种基于模式转移的虚拟机混合备用动态可靠性评估方法
CN110413227B (zh) 一种硬盘设备的剩余使用寿命在线预测方法和系统
CN110704542A (zh) 一种基于节点负载的数据动态分区系统
CN111638958B (zh) 云主机负载处理方法、装置、控制设备及存储介质
CN112686464A (zh) 短期风电功率预测方法及装置
CN106776288B (zh) 一种基于Hadoop的分布式系统的健康度量方法
CN110109733B (zh) 面向不同老化场景的虚拟机工作队列和冗余队列更新方法
Yu et al. Integrating clustering and learning for improved workload prediction in the cloud
WO2020220437A1 (zh) 一种基于AdaBoost-Elman的虚拟机软件老化预测方法
CN110413657B (zh) 面向季节型非平稳并发量的平均响应时间评估方法
CN115248757A (zh) 一种硬盘健康评估方法和存储设备
US10248462B2 (en) Management server which constructs a request load model for an object system, load estimation method thereof and storage medium for storing program
CN116909378A (zh) 一种基于深度强化学习的gpu动态能源效率优化运行时方法及系统
Li et al. Resource usage prediction based on BiLSTM-GRU combination model
Liu et al. Failure prediction of tasks in the cloud at an earlier stage: a solution based on domain information mining
CN117556331A (zh) 基于ai增强的空压机维护决策方法及系统
WO2024149388A1 (zh) 能耗模型的训练方法、能耗表征方法及相关设备
CN114510871A (zh) 基于思维进化和lstm的云服务器性能衰退预测方法
CN117435451A (zh) 移动边缘计算中虚拟计算单元的功耗和性能模型建立方法
CN113886454A (zh) 一种基于lstm-rbf的云资源预测方法
CN105260304B (zh) 一种基于qbgsa‑rvr的软件可靠性预测方法
Sun et al. Aledar: An attentions-based encoder-decoder and autoregressive model for workload forecasting of cloud data center
CN108241533A (zh) 一种基于预测和分层抽样的资源池未来负载生成方法
CN112882917A (zh) 一种基于贝叶斯网络迁移的虚拟机服务质量动态预测方法
CN112860531A (zh) 基于深度异构图神经网络的区块链广泛共识性能评测方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19931408

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19931408

Country of ref document: EP

Kind code of ref document: A1