WO2020220437A1 - A virtual machine software aging prediction method based on AdaBoost-Elman - Google Patents

A virtual machine software aging prediction method based on AdaBoost-Elman

Info

Publication number
WO2020220437A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
cur
disk
software aging
elman
Prior art date
Application number
PCT/CN2019/090871
Other languages
English (en)
French (fr)
Inventor
郭军
王馨悦
张斌
刘晨
侯帅
侯凯
李薇
柳波
刘文凤
王嘉怡
张瀚铎
张娅杰
Original Assignee
东北大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学 (Northeastern University)
Publication of WO2020220437A1 (zh)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447: Performance evaluation by modeling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466: Performance evaluation by tracing or monitoring
    • G06F11/3476: Data logging
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00: Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/815: Virtual

Definitions

  • The invention belongs to the technical field of cloud computing and relates to a virtual machine software aging prediction method based on AdaBoost-Elman.
  • In a cloud service system, a virtual machine that processes concurrent business requests without interruption for a long time gradually suffers software aging, which leads to interruption or even failure of cloud services.
  • To guarantee performance and reliability, the virtual machine is usually restarted before its service fails, in order to restore the initial state of the virtual machine's applications and system.
  • Predicting the software aging trend is the key to solving the problem of virtual machine software aging. If measures are taken too early, the high restart cost wastes resources; if they are taken too late, they do not reduce the loss.
  • The aging of virtual machine software is a long and complicated process. All kinds of errors may appear in the virtual machine system and accumulate.
  • For users, the request response time and the number of failed requests are two effective indicators for judging virtual machine software aging. As the software in the virtual machine ages, the request response time grows and the number of failed requests increases.
  • For cloud platform administrators, these two indicators can only be obtained with a delay, whereas the resource indicators of a virtual machine are easier to obtain, and a reduction in available resources is a concrete manifestation of software aging.
  • Memory leakage is one of the most common aging phenomena in cloud service systems. Too little available memory causes the virtual machine to run slowly or even crash outright.
  • The software aging of virtual machines is ultimately caused by a large number of business requests.
  • Existing software aging prediction methods often fit the historical sequence of virtual machine resources directly and do not consider the various types of business running on the virtual machine.
  • All businesses share the virtual machine's resources, and different business requests require different resource types and resource quantities.
  • Earlier methods that fit the virtual machine software aging indicators directly are therefore subject to error.
  • The technical problem to be solved by the present invention is to provide, in view of the above shortcomings of the prior art, a virtual machine software aging prediction method based on AdaBoost-Elman that predicts the software aging of a virtual machine.
  • A virtual machine software aging prediction method based on AdaBoost-Elman includes the following steps:
  • Step 1: Set the levels for evaluating the degree of virtual machine software aging, as follows:
  • Step 1.1: Select the utilization of disk, physical memory, and virtual memory as the evaluation indicators of virtual machine software aging, and calculate the performance losses $wastage_{disk}$, $wastage_{mem}$, and $wastage_{swap}$ of the average utilization of the virtual machine's disk, physical memory, and virtual memory, as shown in the following formulas:
  • $wastage_{disk} = |cur_{disk} - confer_{disk}|$ (1a)
  • $wastage_{mem} = |cur_{mem} - confer_{mem}|$ (1b)
  • $wastage_{swap} = |cur_{swap} - confer_{swap}|$ (1c)
  • where $cur_{disk}$, $cur_{mem}$, and $cur_{swap}$ are the average disk utilization, average physical memory utilization, and average virtual memory utilization of the virtual machine, and $confer_{disk}$, $confer_{mem}$, and $confer_{swap}$ are the benchmark values of the average utilization of disk, physical memory, and virtual memory used for the software aging evaluation;
  • Step 1.2: Calculate the virtual machine software aging degree s, which represents the degree of software aging of the virtual machine, as shown in the following formula:
  • $s = \omega_1 \cdot wastage_{mem} + \omega_2 \cdot wastage_{swap} + \omega_3 \cdot wastage_{disk}$ (2)
  • where $\omega_1$, $\omega_2$, and $\omega_3$ are the weight coefficients of the performance losses of the average utilization of physical memory, virtual memory, and disk;
  • Step 1.3: According to the software aging degree s, divide the health status of the virtual machine into five levels: healthy when $0 \le s < 0.2$; slight software aging when $0.2 \le s < 0.4$; moderate software aging when $0.4 \le s < 0.6$; severe software aging when $0.6 \le s < 0.8$; and failed and unusable when $0.8 \le s \le 1$;
  • Step 2: The offline training process for predicting virtual machine software aging is as follows:
  • Step 2.1: Train the software aging indicator prediction model of the virtual machine;
  • Step 2.1.1: Extract the historical data from the virtual machine performance log library and the virtual machine business concurrency log library, and preprocess the extracted historical data;
  • Step 2.1.1.1: Handle the missing points in the extracted virtual machine business concurrency;
  • Step 2.1.1.2: Adjust the outliers among the extremely large and extremely small samples that show abnormal fluctuations in the collected virtual machine business concurrency;
  • Step 2.1.1.3: Adjust the data interval of the business concurrency and CPU utilization data extracted from the virtual machine log database and the virtual machine business concurrency log library, merging the collected data in units of seconds, minutes, or hours;
  • Step 2.1.1.4: Normalize the data processed in step 2.1.1.3 using min-max normalization;
  • Step 2.1.2: Use an Elman neural network to establish a relationship model between business concurrency and the software aging indicators, that is, a prediction model of the virtual machine software aging indicators;
  • Step 2.1.2.1: Set the number of layers of the Elman neural network to 3;
  • Step 2.1.2.2: With n types of business supported by the virtual machine, set the number of input nodes in of the Elman neural network to n+3 and the number of output nodes out to 3;
  • Step 2.1.2.3: Use Kolmogorov's theorem to obtain the approximate range of the number of hidden nodes hide in the Elman neural network, as shown in the following formula, and then verify the accuracy of each candidate one by one;
  • Step 2.1.2.4: The transfer function of the output layer of the Elman neural network is the ReLU (rectified linear unit) function or the Sigmoid function, and the transfer function of the hidden layer is the Sigmoid function, used to predict the business concurrency and software aging indicators of the virtual machine;
  • Step 2.1.2.5: Input the three types of performance indicators of the virtual machine, $cur_{mem}(t)$, $cur_{swap}(t)$, and $cur_{disk}(t)$, the predicted business concurrency on the virtual machine, $con_i(t+1)$, and the influence factors $\sigma_1$, $\sigma_2$, and $\sigma_3$ among physical memory utilization, virtual memory utilization, and disk utilization, together into the Elman neural network model;
  • Step 2.1.2.6: Output the nonlinear relationships between business concurrency and the virtual machine's average physical memory utilization, average virtual memory utilization, and average disk utilization, as shown in the following formulas:
  • $cur_{mem}(t+1) = f'(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_1$ (4a)
  • $cur_{swap}(t+1) = g(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_2$ (4b)
  • $cur_{disk}(t+1) = h(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_3$ (4c)
  • where $f'()$, $g()$, and $h()$ are the nonlinear functions relating business concurrency to average physical memory utilization, average virtual memory utilization, and average disk utilization, respectively;
  • Step 2.1.3: Use the AdaBoost.RT algorithm to optimize the prediction model of the virtual machine software aging indicators, combining several Elman neural networks, used as weak predictors, into the strong prediction model Ada-Elman;
  • Step 2.1.3.1: Input the training sample set, and initialize the parameters of each Elman neural network predictor f(x), the weights of the training samples, and the threshold of the training error;
  • The weights of the training samples and the threshold of the training error are given by a formula in which m is the number of Elman neural network predictors and the remaining symbols denote the weight of the i-th sample in the t-th iteration and the threshold of the training error;
  • Step 2.1.3.2: Set the average error rate $e_t$ to zero, read the training samples, and train the t-th Elman neural network predictor $f_t(x)$, from which the strong prediction model Ada-Elman is then synthesized;
  • Step 2.1.3.3: Calculate the error of the AdaBoost-Elman model on the training set from the absolute error of the i-th sample in the t-th iteration and the i-th sample value $y_i$;
  • Step 2.1.3.4: If the error satisfies the condition given in the corresponding formula, adjust the average error rate;
  • Step 2.1.3.5: Set the initial value of the average relative error of each Elman neural network to 0.2, with an ideal upper bound of 0.35 and an ideal lower bound of 0.1, as shown in formulas (7) and (8);
  • Step 2.1.3.6: Calculate the weight adjustment factor of the t-th iteration;
  • Step 2.1.3.7: Update the weight of each training sample, where $D_t$ is the normalization factor of the sample weights in the t-th iteration;
  • Step 2.1.3.8: Determine whether the maximum number of iterations has been reached; if not, continue iterating; if so, output the Ada-Elman model as the virtual machine software aging indicator prediction model g(x);
  • Step 2.2: Train the reference prediction model of the unaged virtual machine;
  • Step 2.2.1: Extract the data from the performance log library and business concurrency log library of a newly created and recently started virtual machine, and preprocess the extracted data;
  • Step 2.2.2: Build and train the reference prediction model of the unaged virtual machine, using the Elman neural network method of step 2.1.2 to establish the relationship model and the AdaBoost.RT method of step 2.1.3 to optimize it;
  • Step 3: The online training process for predicting virtual machine software aging is as follows:
  • Step 3.1: Input the predicted business concurrency and the performance data into the software aging indicator prediction model of the virtual machine and the reference prediction model of the unaged virtual machine, both trained in the offline process;
  • Step 3.2: The two models output, respectively, the predicted software aging indicators of the virtual machine and the reference prediction for the unaged virtual machine;
  • Step 3.3: Using the evaluation method of Step 1, evaluate the software aging trend of the virtual machine from the predicted software aging indicators of the virtual machine and the reference prediction for the unaged virtual machine.
  • The present invention provides an Ada-Elman-based virtual machine software aging prediction method. It establishes an Ada-Elman-based virtual machine software aging model and studies, at a fine granularity, the relationship between the concurrency of each type of business and the virtual machine software aging indicators. It then predicts the software aging indicators of the currently working virtual machine and compares them with those of an unaged virtual machine, thereby obtaining the degree of software aging of the virtual machine over the next period so that preventive measures can be taken in advance.
  • Figure 1 is an example topology diagram of the online airline ticket ordering system provided by an embodiment of the present invention;
  • Figure 2 is a schematic diagram of the prediction process of the AdaBoost-Elman-based virtual machine software aging prediction method provided by an embodiment of the present invention;
  • Figure 3 is a comparison of the prediction results of three different models for the average virtual memory utilization of the virtual machine, provided by an embodiment of the present invention;
  • Figure 4 is a comparison of the prediction results for the average physical memory utilization of the virtual machine, provided by an embodiment of the present invention;
  • Figure 5 is a comparison of the prediction results for the average disk utilization of the virtual machine, provided by an embodiment of the present invention.
  • This embodiment uses the online airline ticket ordering system shown in Figure 1 to simulate a PC-side user application. The service system is built on a Sugon server, real business concurrency scenarios are simulated by load-testing the online airline ticket ordering system, and data are collected for different levels of business concurrency. The AdaBoost-Elman-based virtual machine software aging prediction method of the present invention is then used to predict the software aging of the virtual machine.
  • Client 1 uses LoadRunner to generate concurrent business access. It can simulate a large number of users clicking on the pages of the airline ticket ordering system at the same time.
  • After LoadRunner sends the page requests, the Nginx load balancer 2 receives and distributes the business requests. Server 4 runs Tomcat with the online airline ticket booking system deployed, reads and writes the MySQL business database 5, and processes the requests sent by LoadRunner.
  • The open-source monitoring tool collectd periodically collects the performance data of each working virtual machine and saves it in the InfluxDB distributed database. The collected virtual machine data are used to build the Ada-Elman-based model.
  • In this embodiment, the AdaBoost-Elman-based virtual machine software aging prediction method, shown in Figure 2, follows Steps 1 to 3 exactly as listed above.
  • data_health is taken from the 3 hours after a new virtual machine is started, from 9:00 to 12:00 on October 8, 2018; data_aging is taken from 3 hours of monitoring data after the virtual machine has been working for some time, from 18:00 to 21:00 on October 8, 2018.
  • The input of the Ada-Elman and Elman models consists of the concurrency of the eight types of business together with the physical memory, virtual memory, and disk data, whereas the BP model directly fits the historical sequences of the three types of aging indicators.
  • After many experiments, the number of nodes in each layer of the Elman model is set to 11-8-3 with a learning rate of 0.2; the Ada-Elman model uses 10 Elman predictors, each with 11-7-3 nodes per layer and a learning rate of 0.2; the BP model has 11-8-3 nodes per layer and a learning rate of 0.3; and the maximum number of iterations for all three models is 1000.
  • The prediction results of the three models aging_Ada-Elman, aging_Elman, and aging_BP for the virtual machine software aging indicators are shown in Figures 3-5.
  • The prediction result of the aging_Ada-Elman model is closer to the real performance values of the virtual machine. In particular, when predicting the frequently fluctuating virtual memory utilization, the fits of aging_BP and aging_Elman deviate considerably, while aging_Ada-Elman fits better.
  • The errors of the different models' predictions are shown in Table 1.
  • The mean absolute error (MAE) and mean squared error (MSE) of the Ada-Elman predictions are both smaller than those of the Elman predictions, which indicates that the prediction accuracy of the proposed Ada-Elman is higher than that of a single Elman model.
  • The MAE and MSE of the Ada-Elman predictions are also smaller than those of the BP model predictions. This is because Ada-Elman does not model the historical sequences directly but fully considers the relationship between business concurrency and the software aging indicators.
  • Table 1 reflects the time costs of the different models. The results in the table are averages over multiple predictions. The BP model takes the shortest time, and the Ada-Elman and Elman models take longer. This is because the BP model directly models the historical sequences of the software aging indicators and does not take business concurrency as input.
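The two error measures used to compare the models can be sketched as follows. This is a minimal illustration of the standard definitions of mean absolute error and mean squared error; the function names `mae` and `mse` are chosen here for illustration and do not come from the patent, and no values from Table 1 are reproduced.

```python
def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted| over all samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error: average of (actual - predicted)^2 over all samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

Because MSE squares each deviation, it penalizes occasional large misses (such as those on the fluctuating virtual memory series) more heavily than MAE does, which is one reason to report both.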

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

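The present invention provides a virtual machine software aging prediction method based on AdaBoost-Elman and relates to the technical field of cloud computing. The method first sets the levels used to evaluate the degree of virtual machine software aging, and trains a software aging indicator prediction model for the virtual machine and a reference prediction model for an unaged virtual machine. It then inputs the predicted business concurrency and the performance data into the two models trained in the offline process, which output the predicted software aging indicators of the virtual machine and the reference prediction for the unaged virtual machine. Finally, it evaluates the software aging trend of the virtual machine from these two results. The method can predict the software aging indicators of the currently working virtual machine and compare them with those of an unaged virtual machine, thereby obtaining the degree of software aging of the virtual machine over the next period so that preventive measures can be taken in advance.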

Description

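A virtual machine software aging prediction method based on AdaBoost-Elman
Technical Field
The present invention belongs to the technical field of cloud computing and relates to a virtual machine software aging prediction method based on AdaBoost-Elman.
Background Art
In a cloud service system, a virtual machine that processes concurrent business requests without interruption for a long time gradually suffers software aging, which leads to interruption or even failure of cloud services. To guarantee the performance and reliability of cloud services, the virtual machine is usually restarted before its service fails, restoring the initial state of the virtual machine's applications and system. Predicting the software aging trend is the key to solving the virtual machine software aging problem: if measures are taken too early, the high restart cost wastes resources; if they are taken too late, they do not reduce the loss.
The software aging of a virtual machine is a long and complicated process, and all kinds of errors may appear in the virtual machine system and keep accumulating. For users, the request response time and the number of failed requests are two effective indicators for judging virtual machine software aging: as the software in the virtual machine ages, the request response time grows and the number of failed requests increases. For cloud platform administrators, however, these two indicators can only be obtained in real time with a delay, whereas the resource indicators of a virtual machine are easier to obtain, and a reduction in available resources is a concrete manifestation of software aging. Memory leakage is one of the most common aging phenomena in cloud service systems. Too little available memory causes the virtual machine to run slowly or even crash outright, and when the physical memory of the virtual machine system runs low, virtual memory on disk comes into play and occupies more disk resources. Most earlier methods set a fixed aging threshold and then monitor or predict the resource usage of the virtual machine, deciding whether to take measures according to whether the threshold is exceeded. Such single-threshold monitoring cannot accurately reflect the "health" of a virtual machine and is often misled by external load.
In a cloud service system, the software aging of a virtual machine is ultimately caused by a large number of business requests. Existing software aging prediction methods, however, often fit the historical sequence of virtual machine resources directly and do not consider the various types of business on the virtual machine. All businesses share the virtual machine's resources, and different business requests require different resource types and quantities, so earlier methods that fit the software aging indicators directly are subject to error.
Summary of the Invention
The technical problem to be solved by the present invention is to provide, in view of the above shortcomings of the prior art, a virtual machine software aging prediction method based on AdaBoost-Elman that predicts the software aging of a virtual machine.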
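A virtual machine software aging prediction method based on AdaBoost-Elman includes the following steps:
Step 1: Set the levels for evaluating the degree of virtual machine software aging, as follows:
Step 1.1: Select the utilization of disk, physical memory, and virtual memory as the evaluation indicators of virtual machine software aging, and calculate the performance losses $wastage_{disk}$, $wastage_{mem}$, and $wastage_{swap}$ of the average utilization of the virtual machine's disk, physical memory, and virtual memory, as shown in the following formulas:
$wastage_{disk} = |cur_{disk} - confer_{disk}|$   (1a)
$wastage_{mem} = |cur_{mem} - confer_{mem}|$   (1b)
$wastage_{swap} = |cur_{swap} - confer_{swap}|$   (1c)
where $cur_{disk}$, $cur_{mem}$, and $cur_{swap}$ are the average disk utilization, average physical memory utilization, and average virtual memory utilization of the virtual machine, and $confer_{disk}$, $confer_{mem}$, and $confer_{swap}$ are the benchmark values of the average utilization of disk, physical memory, and virtual memory used for the software aging evaluation;
Step 1.2: Calculate the virtual machine software aging degree s, which represents the degree of software aging of the virtual machine, as shown in the following formula:
$s = \omega_1 \cdot wastage_{mem} + \omega_2 \cdot wastage_{swap} + \omega_3 \cdot wastage_{disk}$   (2)
where $\omega_1$, $\omega_2$, and $\omega_3$ are the weight coefficients of the performance losses of the average utilization of physical memory, virtual memory, and disk;
Step 1.3: According to the software aging degree s, divide the health status of the virtual machine into five levels, as follows:
when $0 \le s < 0.2$, the virtual machine is judged to be healthy;
when $0.2 \le s < 0.4$, the virtual machine is judged to be in a state of slight software aging;
when $0.4 \le s < 0.6$, the virtual machine is judged to be in a state of moderate software aging;
when $0.6 \le s < 0.8$, the virtual machine is judged to be in a state of severe software aging;
when $0.8 \le s \le 1$, the virtual machine is judged to have failed and cannot be used normally;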
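Step 1 can be sketched in a few lines. The following is a minimal illustration of formulas (1a)-(1c) and (2) and of the five-level classification; the function names, the dictionary keys, and the example weights are assumptions made for illustration, since the text does not state concrete values for $\omega_1$, $\omega_2$, $\omega_3$.

```python
def wastage(cur, confer):
    """Performance loss of one resource: |cur - confer| (formulas 1a-1c)."""
    return abs(cur - confer)


def aging_degree(cur, confer, weights):
    """Aging degree s per formula (2).

    cur, confer: dicts with keys 'mem', 'swap', 'disk' (utilizations in [0, 1]).
    weights: (w1, w2, w3) for mem, swap, disk; hypothetical values chosen by the caller.
    """
    w1, w2, w3 = weights
    return (w1 * wastage(cur["mem"], confer["mem"])
            + w2 * wastage(cur["swap"], confer["swap"])
            + w3 * wastage(cur["disk"], confer["disk"]))


def health_level(s):
    """Map the aging degree s to one of the five levels of Step 1.3."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s is expected to lie in [0, 1]")
    if s < 0.2:
        return "healthy"
    if s < 0.4:
        return "slight aging"
    if s < 0.6:
        return "moderate aging"
    if s < 0.8:
        return "severe aging"
    return "failed"
```

For instance, with illustrative weights (0.5, 0.3, 0.2), current utilizations of 0.9, 0.5, 0.6 and benchmarks of 0.4, 0.1, 0.5 for memory, swap, and disk, the sketch gives s of about 0.39, which falls in the slight aging band. Weights that sum to 1 keep s inside [0, 1] when the utilizations are fractions.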
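Step 2: The offline training process for predicting virtual machine software aging is as follows:
Step 2.1: Train the software aging indicator prediction model of the virtual machine;
Step 2.1.1: Extract the historical data from the virtual machine performance log library and the virtual machine business concurrency log library, and preprocess the extracted historical data;
Step 2.1.1.1: Handle the missing points in the extracted virtual machine business concurrency;
where individual sampling points are missing, fill them with the average of the business concurrency of the preceding and following periods;
where 90 percent or more of the sampling points are missing, discard all the samples and set the business concurrency for that period to zero;
Step 2.1.1.2: Adjust the outliers among the extremely large and extremely small samples that show abnormal fluctuations in the collected virtual machine business concurrency;
Step 2.1.1.3: Adjust the data interval of the business concurrency and CPU utilization data extracted from the virtual machine log database and the virtual machine business concurrency log library, merging the collected data in units of seconds, minutes, or hours;
Step 2.1.1.4: Normalize the data processed in step 2.1.1.3 using min-max normalization;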
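The missing-value handling of step 2.1.1.1 and the normalization of step 2.1.1.4 can be sketched as below. This is an illustrative reading, not the patent's implementation: missing samples are represented as `None`, a run of consecutive gaps is filled from the nearest available neighbours (an assumption, since the text only describes isolated gaps), and the outlier adjustment of step 2.1.1.2 is omitted because the text does not specify its rule.

```python
def fill_missing(series, drop_ratio=0.9):
    """Fill gaps (None) in a business-concurrency series.

    If at least drop_ratio of the points are missing, the whole period is set
    to zero; otherwise each gap becomes the mean of its nearest known neighbours.
    """
    n = len(series)
    if n == 0:
        return []
    missing = sum(1 for v in series if v is None)
    if missing / n >= drop_ratio:
        return [0.0] * n
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            prev = next((series[j] for j in range(i - 1, -1, -1)
                         if series[j] is not None), None)
            nxt = next((series[j] for j in range(i + 1, n)
                        if series[j] is not None), None)
            neighbours = [x for x in (prev, nxt) if x is not None]
            filled[i] = sum(neighbours) / len(neighbours)
    return filled


def min_max_normalize(values):
    """Scale values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # constant series: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]
```

At prediction time the same minimum and maximum learned from the training data would normally be reused, so that online inputs are scaled consistently with what the model saw during offline training.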
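Step 2.1.2: Use an Elman neural network to establish a relationship model between business concurrency and the software aging indicators, that is, a prediction model of the virtual machine software aging indicators;
Step 2.1.2.1: Set the number of layers of the Elman neural network to 3;
Step 2.1.2.2: With n types of business supported by the virtual machine, set the number of input nodes in of the Elman neural network to n+3 and the number of output nodes out to 3;
Step 2.1.2.3: Use Kolmogorov's theorem to obtain the approximate range of the number of hidden nodes hide in the Elman neural network, as shown in the following formula, and then verify the accuracy of each candidate one by one;
Figure PCTCN2019090871-appb-000001
where $a \in (1, 10)$;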
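The hidden-node formula itself is an image that is not reproduced in this text. A common empirical rule with the same parameter range is $hide = \sqrt{in + out} + a$ with $a \in (1, 10)$; the sketch below assumes that rule, which is an assumption and may differ from the patent's exact formula. It enumerates the integer candidates that step 2.1.2.3 would then validate one by one.

```python
import math


def hidden_node_candidates(n_services, n_out=3, a_low=1, a_high=10):
    """Integer hidden-layer sizes h with sqrt(in + out) + a_low < h < sqrt(in + out) + a_high.

    Assumes the empirical rule hide = sqrt(in + out) + a (not confirmed by the
    source text), with in = n_services + 3 as in step 2.1.2.2.
    """
    n_in = n_services + 3
    base = math.sqrt(n_in + n_out)
    low = math.floor(base + a_low) + 1    # smallest integer strictly above the lower bound
    high = math.ceil(base + a_high) - 1   # largest integer strictly below the upper bound
    return list(range(low, high + 1))
```

Under this assumption, eight business types (11 inputs, 3 outputs) yield candidates from 5 to 13, a range that contains the hidden-layer sizes 7 and 8 reported for the experiments.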
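Step 2.1.2.4: The transfer function of the output layer of the Elman neural network is the ReLU (rectified linear unit) function or the Sigmoid function, and the transfer function of the hidden layer is the Sigmoid function, used to predict the business concurrency and software aging indicators of the virtual machine;
Step 2.1.2.5: Input the three types of performance indicators of the virtual machine, $cur_{mem}(t)$, $cur_{swap}(t)$, and $cur_{disk}(t)$, the predicted business concurrency on the virtual machine, $con_i(t+1)$, and the influence factors $\sigma_1$, $\sigma_2$, and $\sigma_3$ among physical memory utilization, virtual memory utilization, and disk utilization, together into the Elman neural network model;
Step 2.1.2.6: Output the nonlinear relationships between business concurrency and the virtual machine's average physical memory utilization, average virtual memory utilization, and average disk utilization, as shown in the following formulas:
$cur_{mem}(t+1) = f'(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_1$   (4a)
$cur_{swap}(t+1) = g(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_2$   (4b)
$cur_{disk}(t+1) = h(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_3$   (4c)
where $f'()$, $g()$, and $h()$ are the nonlinear functions relating business concurrency to average physical memory utilization, average virtual memory utilization, and average disk utilization, respectively;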
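To make the structure of the three-layer Elman network concrete, here is a minimal forward-pass sketch. It shows the defining feature of an Elman network: a context layer that feeds the previous hidden state back into the hidden layer. The class name, the random weight initialization, the omission of bias terms, and the choice of Sigmoid on the output layer are illustrative assumptions; training is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ElmanSketch:
    """Untrained 3-layer Elman network: input -> hidden (+ context) -> output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)        # input -> hidden
        self.w_ctx = matrix(n_hidden, n_hidden)   # context (previous hidden) -> hidden
        self.w_out = matrix(n_out, n_hidden)      # hidden -> output
        self.context = [0.0] * n_hidden           # hidden state of the previous step

    def step(self, x):
        """Consume one input vector and return the three predicted indicators."""
        hidden = [
            sigmoid(sum(w * v for w, v in zip(self.w_in[j], x))
                    + sum(w * c for w, c in zip(self.w_ctx[j], self.context)))
            for j in range(len(self.w_in))
        ]
        self.context = hidden  # remembered for the next time step
        return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in self.w_out]
```

With eight business types, one input vector would hold the eight predicted concurrencies $con_i(t+1)$ followed by $cur_{mem}(t)$, $cur_{swap}(t)$, and $cur_{disk}(t)$, all min-max normalized, and the three outputs correspond to the left-hand sides of formulas (4a)-(4c).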
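Step 2.1.3: Use the AdaBoost.RT algorithm to optimize the prediction model of the virtual machine software aging indicators, combining several Elman neural networks, used as weak predictors, into the strong prediction model Ada-Elman;
Step 2.1.3.1: Input the training sample set, and initialize the parameters of each Elman neural network predictor f(x), the weights of the training samples, and the threshold of the training error;
The weights of the training samples and the threshold of the training error are given by the following formula:
Figure PCTCN2019090871-appb-000002
where m is the number of Elman neural network predictors,
Figure PCTCN2019090871-appb-000003
is the weight of the i-th sample in the t-th iteration, and
Figure PCTCN2019090871-appb-000004
Figure PCTCN2019090871-appb-000005
is the threshold of the training error;
Step 2.1.3.2: Set the average error rate $e_t$ to zero, read the training samples, and train the t-th Elman neural network predictor $f_t(x)$, from which the strong prediction model Ada-Elman is then synthesized;
Step 2.1.3.3: Calculate the error of the AdaBoost-Elman model on the training set,
Figure PCTCN2019090871-appb-000006
as shown in the following formula:
Figure PCTCN2019090871-appb-000007
where
Figure PCTCN2019090871-appb-000008
is the absolute error of the i-th sample in the t-th iteration, and $y_i$ is the i-th sample value;
Step 2.1.3.4: If
Figure PCTCN2019090871-appb-000009
then adjust the average error rate:
Figure PCTCN2019090871-appb-000010
Step 2.1.3.5: For the average relative error of each Elman neural network,
Figure PCTCN2019090871-appb-000011
set the initial value to 0.2, the ideal upper bound to 0.35, and the ideal lower bound to 0.1, as shown in formulas (7) and (8):
Figure PCTCN2019090871-appb-000012
Figure PCTCN2019090871-appb-000013
where
Figure PCTCN2019090871-appb-000014
is the average relative error, and
Figure PCTCN2019090871-appb-000015
is the threshold of the error of the t-th training sample;
Step 2.1.3.6: Calculate the weight adjustment factor, as shown in the following formula:
Figure PCTCN2019090871-appb-000016
where
Figure PCTCN2019090871-appb-000017
is the weight adjustment factor of the t-th iteration;
Step 2.1.3.7: Update the weight of each training sample, as follows:
If
Figure PCTCN2019090871-appb-000018
increase the weight of that sample, as shown in the following formula:
Figure PCTCN2019090871-appb-000019
where $D_t$ is the normalization factor of the sample weights in the t-th iteration;
If
Figure PCTCN2019090871-appb-000020
adjust the weight of the training sample, as shown in the following formula:
Figure PCTCN2019090871-appb-000021
Step 2.1.3.8: Determine whether the maximum number of iterations has been reached;
if the maximum number of iterations has not been reached, continue iterating;
if the maximum number of iterations has been reached, output the Ada-Elman model, obtaining the virtual machine software aging indicator prediction model g(x), as shown in the following formula:
Figure PCTCN2019090871-appb-000022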
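Because the formulas of step 2.1.3 are images that are not reproduced here, the following sketch shows the commonly published form of AdaBoost.RT rather than the patent's exact variant. In particular, it uses a fixed error threshold `phi` instead of the adaptive threshold with bounds 0.1 and 0.35 described in step 2.1.3.5, and it substitutes a weighted straight-line fit for the Elman weak predictor so that the example stays short and runnable. All function and parameter names are illustrative assumptions.

```python
import math


def fit_weighted_line(xs, ys, weights):
    """Stand-in weak learner: weighted least-squares line (NOT an Elman network)."""
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    var = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = cov / var if var > 0 else 0.0
    return lambda x: my + slope * (x - mx)


def adaboost_rt(xs, ys, fit, rounds=10, phi=0.2, power=2):
    """Standard AdaBoost.RT with a fixed relative-error threshold phi.

    Targets ys must be non-zero because the relative error divides by |y|.
    """
    m = len(xs)
    dist = [1.0 / m] * m                       # initial sample weights
    models, votes = [], []
    for _ in range(rounds):
        model = fit(xs, ys, dist)
        rel_err = [abs(model(x) - y) / abs(y) for x, y in zip(xs, ys)]
        # Error rate: total weight of samples whose relative error exceeds phi.
        eps = sum(d for d, e in zip(dist, rel_err) if e > phi)
        eps = min(max(eps, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        beta = eps ** power
        # Shrink the weights of well-predicted samples, then renormalize, so
        # poorly predicted samples gain relative weight in the next round.
        dist = [d * (beta if e <= phi else 1.0) for d, e in zip(dist, rel_err)]
        z = sum(dist)
        dist = [d / z for d in dist]
        models.append(model)
        votes.append(math.log(1.0 / beta))
    total = sum(votes)
    return lambda x: sum(v * mdl(x) for v, mdl in zip(votes, models)) / total
```

In the setting of the text, `fit` would train one Elman predictor on the weighted samples, `rounds` would be the number of predictors (10 in the reported experiments), and the returned ensemble plays the role of g(x).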
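Step 2.2: Train the reference prediction model of the unaged virtual machine;
Step 2.2.1: Extract the data from the performance log library and business concurrency log library of a newly created and recently started virtual machine, and preprocess the extracted data;
Step 2.2.2: Build and train the reference prediction model of the unaged virtual machine, using the Elman neural network method of step 2.1.2 to establish the relationship model and the AdaBoost.RT method of step 2.1.3 to optimize it;
Step 3: The online training process for predicting virtual machine software aging is as follows:
Step 3.1: Input the predicted business concurrency and the performance data into the software aging indicator prediction model of the virtual machine and the reference prediction model of the unaged virtual machine, both trained in the offline process;
Step 3.2: The software aging indicator prediction model of the virtual machine and the reference prediction model of the unaged virtual machine output, respectively, the predicted software aging indicators of the virtual machine and the reference prediction for the unaged virtual machine;
Step 3.3: Using the method of evaluating virtual machine software aging in Step 1, evaluate the software aging trend of the virtual machine from the predicted software aging indicators of the virtual machine and the reference prediction for the unaged virtual machine.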
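One way to read step 3.3 is that the reference prediction of the unaged virtual machine supplies the benchmark values $confer$ of formulas (1a)-(1c), while the aging model's prediction supplies $cur$. The sketch below follows that reading, which is an interpretation rather than something the text states explicitly; the function name, the tuple layout (mem, swap, disk), and the weights are illustrative assumptions.

```python
def aging_trend(aging_pred, reference_pred, weights):
    """Aging degree s for each future time step.

    aging_pred, reference_pred: sequences of (mem, swap, disk) utilizations
    predicted by the aging model and by the unaged reference model.
    weights: (w1, w2, w3) for mem, swap, disk; hypothetical values.
    """
    return [
        sum(w * abs(aged - ref) for w, aged, ref in zip(weights, aged_step, ref_step))
        for aged_step, ref_step in zip(aging_pred, reference_pred)
    ]
```

A rising sequence of s values across the prediction horizon would indicate that the working virtual machine is drifting away from the unaged reference, which is the signal used to schedule a restart ahead of failure.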
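The beneficial effects of the above technical solution are as follows. The present invention provides an Ada-Elman-based virtual machine software aging prediction method. It establishes an Ada-Elman-based virtual machine software aging model and studies, at a fine granularity, the relationship between the concurrency of each type of business and the virtual machine software aging indicators. It then predicts the software aging indicators of the currently working virtual machine and compares them with those of an unaged virtual machine, thereby obtaining the degree of software aging of the virtual machine over the next period so that preventive measures can be taken in advance.
Brief Description of the Drawings
Figure 1 is an example topology diagram of the online airline ticket ordering system provided by an embodiment of the present invention;
Figure 2 is a schematic diagram of the prediction process of the AdaBoost-Elman-based virtual machine software aging prediction method provided by an embodiment of the present invention;
Figure 3 is a comparison of the prediction results of three different models for the average virtual memory utilization of the virtual machine, provided by an embodiment of the present invention;
Figure 4 is a comparison of the prediction results for the average physical memory utilization of the virtual machine, provided by an embodiment of the present invention;
Figure 5 is a comparison of the prediction results for the average disk utilization of the virtual machine, provided by an embodiment of the present invention.
In the figures: 1, client; 2, Nginx load balancer; 3, switch; 4, server; 5, MySQL business database.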
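Detailed Description of the Embodiments
The specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the present invention but do not limit its scope.
This embodiment uses the online airline ticket ordering system shown in Figure 1 to simulate a PC-side user application. The service system is built on a Sugon server, real business concurrency scenarios are simulated by load-testing the online airline ticket ordering system, and data are collected for different levels of business concurrency. The AdaBoost-Elman-based virtual machine software aging prediction method of the present invention is then used to predict the software aging of the virtual machine. In this online airline ticket ordering system, client 1 uses LoadRunner to generate concurrent business access, simulating a large number of users clicking on the pages of the ordering system at the same time. After LoadRunner sends the page requests, the Nginx load balancer 2 receives and distributes the business requests. Finally, server 4 runs Tomcat with the online airline ticket booking system deployed, reads and writes the MySQL business database 5, and processes the requests sent by LoadRunner. The open-source monitoring tool collectd periodically collects the performance data of each working virtual machine and saves it in the InfluxDB distributed database, and the collected virtual machine data are used to build the Ada-Elman-based model.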
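A virtual machine software aging prediction method based on AdaBoost-Elman, as shown in Figure 2, includes the following steps:
Step 1: Set the levels for evaluating the degree of virtual machine software aging, as follows:
Step 1.1: Select the utilization of disk, physical memory, and virtual memory as the evaluation indicators of virtual machine software aging, and calculate the performance losses $wastage_{disk}$, $wastage_{mem}$, and $wastage_{swap}$ of the average utilization of the virtual machine's disk, physical memory, and virtual memory, as shown in the following formulas:
$wastage_{disk} = |cur_{disk} - confer_{disk}|$   (1a)
$wastage_{mem} = |cur_{mem} - confer_{mem}|$   (1b)
$wastage_{swap} = |cur_{swap} - confer_{swap}|$   (1c)
where $cur_{disk}$, $cur_{mem}$, and $cur_{swap}$ are the average disk utilization, average physical memory utilization, and average virtual memory utilization of the virtual machine, and $confer_{disk}$, $confer_{mem}$, and $confer_{swap}$ are the benchmark values of the average utilization of disk, physical memory, and virtual memory used for the software aging evaluation;
Step 1.2: Calculate the virtual machine software aging degree s, which represents the degree of software aging of the virtual machine, as shown in the following formula:
$s = \omega_1 \cdot wastage_{mem} + \omega_2 \cdot wastage_{swap} + \omega_3 \cdot wastage_{disk}$   (2)
where $\omega_1$, $\omega_2$, and $\omega_3$ are the weight coefficients of the performance losses of the average utilization of physical memory, virtual memory, and disk;
Step 1.3: According to the software aging degree s, divide the health status of the virtual machine into five levels, as follows:
when $0 \le s < 0.2$, the virtual machine is judged to be healthy;
when $0.2 \le s < 0.4$, the virtual machine is judged to be in a state of slight software aging;
when $0.4 \le s < 0.6$, the virtual machine is judged to be in a state of moderate software aging;
when $0.6 \le s < 0.8$, the virtual machine is judged to be in a state of severe software aging;
when $0.8 \le s \le 1$, the virtual machine is judged to have failed and cannot be used normally;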
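Step 2: The offline training process for predicting virtual machine software aging, as follows:
Step 2.1: Train the software aging indicator prediction model of the virtual machine;
Step 2.1.1: Extract the historical data from the virtual machine performance log library and the virtual machine business concurrency log library, and preprocess the extracted data;
Step 2.1.1.1: Handle missing points in the extracted virtual machine business concurrency;
Where individual sampling points are missing, fill each gap with the average of the business concurrency in the preceding and following periods;
Where ninety percent or more of the sampling points are missing, discard all samples and set the business concurrency for that period to zero;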
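The two missing-value rules of step 2.1.1.1 can be sketched as follows. Representing a missing sample as `None` is an assumption, and so is the decision to leave longer gaps (neither isolated nor covering 90% of the period) untouched, since the source does not say how to treat them.

```python
from typing import List, Optional


def fill_concurrency(samples: List[Optional[float]]) -> List[Optional[float]]:
    """Apply the two rules of step 2.1.1.1 to one period of concurrency samples."""
    n = len(samples)
    if n == 0:
        return []
    missing = sum(v is None for v in samples)
    # Rule 2: 90% or more missing -> discard everything, treat the period as zero load.
    if missing / n >= 0.9:
        return [0.0] * n
    filled = list(samples)
    # Rule 1: an isolated gap is filled with the mean of its two neighbours.
    for i in range(1, n - 1):
        prev, nxt = samples[i - 1], samples[i + 1]
        if samples[i] is None and prev is not None and nxt is not None:
            filled[i] = (prev + nxt) / 2
    return filled
```

For example, `fill_concurrency([10.0, None, 20.0])` fills the gap with 15.0.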
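Step 2.1.1.2: Adjust outliers in the collected virtual machine business concurrency, that is, extremely large or small samples that fluctuate abnormally;
Step 2.1.1.3: Adjust the data interval of the business concurrency and CPU utilization data extracted from the virtual machine log database and the virtual machine business concurrency log library, merging the collected data in units of seconds, minutes, or hours;
Step 2.1.1.4: Normalize the data produced by step 2.1.1.3 using min-max normalization;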
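Steps 2.1.1.3 and 2.1.1.4 can be sketched as below. Merging by averaging each run of k consecutive samples is an assumption (the source only says the data are merged by time unit), and both function names are illustrative.

```python
def merge_interval(values, k):
    """Merge every k consecutive samples into one by averaging (assumed merge rule)."""
    merged = []
    for i in range(0, len(values), k):
        chunk = values[i:i + k]
        merged.append(sum(chunk) / len(chunk))
    return merged


def min_max(values):
    """Min-max normalization to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


merged = merge_interval([1, 2, 3, 4, 5, 6], 3)   # two merged samples
normalized = min_max([2.0, 4.0, 6.0])
```

Normalizing the concurrency and the utilization series to the same range keeps inputs of very different magnitude from dominating network training.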
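Step 2.1.2: Use an Elman neural network to build the relationship model between business concurrency and the software aging indicators, that is, the prediction model for the virtual machine software aging indicators;
Step 2.1.2.1: Set the number of layers of the Elman neural network to 3;
Step 2.1.2.2: With n types of business supported by the virtual machine, set the number of input nodes in of the Elman neural network to n+3 and the number of output nodes out to 3;
Step 2.1.2.3: Use Kolmogorov's theorem to obtain the approximate range of the number of hidden nodes hide of the Elman neural network, as given by the following formula, then verify the accuracy of each candidate in turn;
Figure PCTCN2019090871-appb-000023
where a ∈ (1,10);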
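The formula for the hidden-node range is an image that is not reproduced in this text. The sketch below therefore assumes the widely used empirical rule hide = sqrt(in + out) + a with a in the open interval (1, 10); this is an assumption, not a statement of the patent's formula. It enumerates the integer candidates that would then be checked one by one.

```python
import math


def hidden_node_candidates(n_business: int) -> list:
    """Integer hidden-layer sizes under the ASSUMED rule hide = sqrt(in + out) + a, 1 < a < 10."""
    n_in = n_business + 3   # n concurrency inputs plus 3 performance indicators (step 2.1.2.2)
    n_out = 3               # memory, swap, and disk utilization
    base = math.sqrt(n_in + n_out)
    low = math.floor(base + 1) + 1    # smallest integer strictly above base + 1
    high = math.ceil(base + 10) - 1   # largest integer strictly below base + 10
    return list(range(low, high + 1))


candidates = hidden_node_candidates(8)   # eight business types, as in the embodiment
```

Under this assumption, the hidden-layer sizes 7 and 8 reported for the embodiment both fall inside the candidate range for n = 8.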
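Step 2.1.2.4: The transfer function of the output layer of the Elman neural network is the ReLU (rectified linear unit) function or the Sigmoid function, and the transfer function of the hidden layer is the Sigmoid function, in order to predict the business concurrency and software aging indicators of the virtual machine;
Step 2.1.2.5: Input into the Elman neural network model the three performance indicators of the virtual machine, $cur_{mem}(t)$, $cur_{swap}(t)$, and $cur_{disk}(t)$, the predicted business concurrency on the virtual machine, $con_i(t+1)$, and the influence factors $\sigma_1$, $\sigma_2$, $\sigma_3$ among physical memory utilization, virtual memory utilization, and disk utilization;
Step 2.1.2.6: Output the nonlinear relationship between business concurrency and the average physical memory utilization, average virtual memory utilization, and average disk utilization of the virtual machine, as given by:
$cur_{mem}(t+1) = f'(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_1$   (4a)
$cur_{swap}(t+1) = g(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_2$   (4b)
$cur_{disk}(t+1) = h(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_3$   (4c)
where $f'()$, $g()$, and $h()$ are the nonlinear functions relating business concurrency to the average physical memory utilization, average virtual memory utilization, and average disk utilization, respectively;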
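The structure described in steps 2.1.2.1 to 2.1.2.6 can be illustrated with a forward pass of a three-layer Elman network. This is a minimal, untrained sketch: the class name, the random initial weights, the absence of bias terms and of the influence factors, and the choice of Sigmoid for the output layer are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class ElmanCell:
    """Three-layer Elman network: input, hidden (fed by a context layer), output."""

    def __init__(self, n_in: int, n_hidden: int, n_out: int, seed: int = 0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = mat(n_hidden, n_in)        # input -> hidden
        self.w_ctx = mat(n_hidden, n_hidden)   # context -> hidden
        self.w_out = mat(n_out, n_hidden)      # hidden -> output
        self.context = [0.0] * n_hidden        # copy of the previous hidden state

    def step(self, x):
        hidden = [
            sigmoid(sum(w * v for w, v in zip(wi, x))
                    + sum(w * c for w, c in zip(wc, self.context)))
            for wi, wc in zip(self.w_in, self.w_ctx)
        ]
        self.context = hidden                  # remembered for the next time step
        return [sigmoid(sum(w * h for w, h in zip(wo, hidden))) for wo in self.w_out]


# n = 8 business types: 8 predicted concurrency values plus mem, swap, disk at time t.
cell = ElmanCell(n_in=11, n_hidden=8, n_out=3)
out = cell.step([0.5] * 11)   # untrained estimate of (mem, swap, disk) at t+1
```

The context layer is what distinguishes the Elman network from a plain feed-forward (BP) network: the hidden state at time t feeds back into the hidden layer at time t+1, which suits time-dependent utilization series.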
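Step 2.1.3: Use the AdaBoost.RT algorithm to optimize the prediction model for the virtual machine software aging indicators, combining several Elman neural networks, each acting as a weak prediction model, into the strong prediction model Ada-Elman;
Step 2.1.3.1: Input the training sample set, and initialize the parameters of each Elman neural network predictor f(x), the weights of the training samples, and the training error threshold;
The weights of the training samples and the training error threshold are given by:
Figure PCTCN2019090871-appb-000024
where m is the number of Elman neural network predictors,
Figure PCTCN2019090871-appb-000025
is the weight of the i-th sample at iteration t,
Figure PCTCN2019090871-appb-000026
Figure PCTCN2019090871-appb-000027
is the training error threshold;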
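Step 2.1.3.2: Set the average error rate $e_t$ to zero, read the training samples, and train the t-th Elman neural network predictor $f_t(x)$, which is then combined into the strong prediction model Ada-Elman;
Step 2.1.3.3: Compute the error of the AdaBoost-Elman model on the training set,
Figure PCTCN2019090871-appb-000028
as given by:
Figure PCTCN2019090871-appb-000029
where
Figure PCTCN2019090871-appb-000030
is the absolute error of the i-th sample at iteration t, and $y_i$ is the value of the i-th sample;
Step 2.1.3.4: If
Figure PCTCN2019090871-appb-000031
then adjust the average error rate
Figure PCTCN2019090871-appb-000032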
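Step 2.1.3.5: Set the initial value of the average relative error of each Elman neural network,
Figure PCTCN2019090871-appb-000033
to 0.2, with an ideal upper bound of 0.35 and an ideal lower bound of 0.1, as given by formulas (7) and (8):
Figure PCTCN2019090871-appb-000034
Figure PCTCN2019090871-appb-000035
where
Figure PCTCN2019090871-appb-000036
is the average relative error, and
Figure PCTCN2019090871-appb-000037
is the threshold of the t-th training sample error;
Step 2.1.3.6: Compute the weight adjustment factor, as given by:
Figure PCTCN2019090871-appb-000038
where
Figure PCTCN2019090871-appb-000039
is the weight adjustment factor of iteration t;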
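Step 2.1.3.7: Update the weight of each training sample, specifically:
If
Figure PCTCN2019090871-appb-000040
increase the weight of that sample, as given by:
Figure PCTCN2019090871-appb-000041
where $D_t$ is the normalization factor of the sample weights at iteration t;
If
Figure PCTCN2019090871-appb-000042
adjust the weight of the training sample, as given by:
Figure PCTCN2019090871-appb-000043
Step 2.1.3.8: Check whether the maximum number of iterations has been reached;
If the maximum number of iterations has not been reached, continue iterating;
If the maximum number of iterations has been reached, output the Ada-Elman model, obtaining the virtual machine software aging indicator prediction model g(x), as given by:
Figure PCTCN2019090871-appb-000044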
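The update formulas of step 2.1.3 are images that are not reproduced in this text, so the following sketch does not claim to match them. It follows the AdaBoost.RT scheme as commonly published (relative-error threshold, error rate raised to a power as the adjustment factor, log-weighted combination), uses a fixed threshold of 0.2 rather than the adaptive threshold described in step 2.1.3.5, and substitutes a weighted least-squares line for the Elman predictor so that the example stays short. All names are illustrative.

```python
import math


def fit_weighted_line(xs, ys, w):
    """Stand-in weak learner (NOT an Elman network): weighted least-squares line."""
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    var = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    cov = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    a = cov / var if var > 0 else 0.0
    b = my - a * mx
    return lambda x: a * x + b


def adaboost_rt(xs, ys, rounds=10, phi=0.2, power=2):
    """Commonly published AdaBoost.RT with a fixed threshold phi; targets must be nonzero."""
    m = len(xs)
    d = [1.0 / m] * m                       # initial sample weights
    learners, alphas = [], []
    for _ in range(rounds):
        f = fit_weighted_line(xs, ys, d)
        are = [abs(f(x) - y) / abs(y) for x, y in zip(xs, ys)]   # absolute relative error
        eps = sum(di for di, e in zip(d, are) if e > phi)        # weighted error rate
        eps = min(max(eps, 1e-10), 1.0 - 1e-10)                  # keep the logarithm finite
        beta = eps ** power                                      # weight adjustment factor
        # Well-predicted samples are scaled down, so poorly predicted ones gain weight.
        d = [di * (beta if e <= phi else 1.0) for di, e in zip(d, are)]
        z = sum(d)
        d = [di / z for di in d]                                 # renormalize
        learners.append(f)
        alphas.append(math.log(1.0 / beta))
    total = sum(alphas)
    return lambda x: sum(a * f(x) for a, f in zip(alphas, learners)) / total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x + 1.0 for x in xs]
model = adaboost_rt(xs, ys)
prediction = model(10.0)
```

In the method described here, each weak learner would be an Elman network trained on the reweighted samples, and the threshold would be adjusted between the stated bounds of 0.1 and 0.35.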
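Step 2.2: Train the reference prediction model for a non-aged virtual machine;
Step 2.2.1: Extract the data from the performance log library and business concurrency log library of a virtual machine that has been newly created and only recently started, and preprocess the extracted data;
Step 2.2.2: Build and train the reference prediction model for a non-aged virtual machine, using the Elman neural network modeling method of step 2.1.2 and the AdaBoost.RT optimization method of step 2.1.3;
Step 3: The online training process for predicting virtual machine software aging, as follows:
Step 3.1: Input the predicted business concurrency and the performance data into the virtual machine software aging indicator prediction model and the non-aged virtual machine reference prediction model trained in the offline process;
Step 3.2: The software aging indicator prediction model and the non-aged virtual machine reference prediction model output, respectively, the predicted software aging indicators of the virtual machine and the reference prediction for a non-aged virtual machine;
Step 3.3: Using the evaluation method of step 1, assess the software aging trend of the virtual machine from the predicted software aging indicators and the non-aged reference prediction.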
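Steps 3.1 to 3.3 can be sketched as a single function that feeds the same inputs to both trained models and grades the gap between their outputs. The two models are passed in as plain callables returning a dictionary of predicted utilizations; that interface, the weights, and the English grade labels are assumptions made for illustration.

```python
def assess_aging(aging_model, reference_model, inputs, weights):
    """Return (s, grade) from the predictions of the aging and reference models."""
    cur = aging_model(inputs)          # predicted indicators of the working VM
    confer = reference_model(inputs)   # predicted indicators of a non-aged VM
    s = sum(weights[k] * abs(cur[k] - confer[k]) for k in ("mem", "swap", "disk"))
    for upper, label in [(0.2, "healthy"), (0.4, "slight aging"),
                         (0.6, "moderate aging"), (0.8, "severe aging")]:
        if s < upper:
            return s, label
    return s, "failed"


# Hypothetical stand-ins for the two trained Ada-Elman models.
aging = lambda _: {"mem": 0.9, "swap": 0.7, "disk": 0.5}
reference = lambda _: {"mem": 0.4, "swap": 0.2, "disk": 0.3}

s, grade = assess_aging(aging, reference, inputs=None,
                        weights={"mem": 0.4, "swap": 0.4, "disk": 0.2})
```

Here the reference model's output plays the role of the baseline values in formulas (1a) to (1c).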
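In this embodiment, virtual machine aging is simulated and predicted as follows:
(1) First, a static HashSet is allocated in the Servlet of the ticket booking web page; then, in the doPost method, a certain number of objects are allocated and added to that HashSet. Repeatedly invoking the booking page while the virtual machine is running consumes the available memory, and the rate of memory consumption is determined by the number of objects allocated in doPost. This process is used to simulate virtual machine software aging.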
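(2) LoadRunner is used to apply load to the server-side virtual machine and to collect data. The collected data are normalized, the data interval is set to 15 seconds, and the data are divided into two groups, data_health and data_aging. data_aging is used to build the software aging indicator prediction model of the current virtual machine, and data_health is used to build the reference prediction model. data_health covers the 3 hours after a newly created virtual machine is started, from 09:00 to 12:00 on October 8, 2018; data_aging covers 3 hours of monitoring data taken after the virtual machine has been working continuously for some time, from 18:00 to 21:00 on October 8, 2018.
(3) By analyzing the first 165 minutes of monitoring data, software aging indicator prediction models are built for the virtual machine using Ada-Elman, Elman, and BP neural networks, denoted aging_Ada-Elman, aging_Elman, and aging_BP, to predict the software aging indicators for the following 15 minutes. Likewise, reference models are built with Ada-Elman, Elman, and BP neural networks, denoted confer_Ada-Elman, confer_Elman, and confer_BP, and the outputs of all models are compared with the true values. The experiment defines eight types of business, including login, ticket refund, browsing, and registration, so the inputs of the Ada-Elman and Elman models comprise the concurrent access volume of the eight business types plus the physical memory, virtual memory, and disk data, whereas the BP model fits the historical series of the three aging indicators directly. After repeated experiments, the layer sizes of the Elman model were set to 11-8-3 with a learning rate of 0.2; the Ada-Elman model uses 10 Elman predictors, each with layer sizes 11-7-3 and a learning rate of 0.2; the BP model has layer sizes 11-8-3 and a learning rate of 0.3; and the maximum number of iterations for all three models is 1000.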
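The sample counts implied by this setup follow from simple arithmetic: 3 hours at a 15-second interval gives 720 samples, of which the first 165 minutes (660 samples) are used for training and the last 15 minutes (60 samples) for evaluation. A small sketch, with illustrative names:

```python
INTERVAL_S = 15  # sampling interval stated in the embodiment


def split_train_test(samples, train_minutes=165, interval_s=INTERVAL_S):
    """Split a time-ordered series into the training window and the remaining test window."""
    n_train = train_minutes * 60 // interval_s
    return samples[:n_train], samples[n_train:]


samples = list(range(3 * 3600 // INTERVAL_S))   # placeholder for 3 hours of samples
train, test = split_train_test(samples)
```

The split is chronological rather than random, since the models are meant to predict the period that follows the training window.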
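In this embodiment, the predictions of the three models aging_Ada-Elman, aging_Elman, and aging_BP for the virtual machine software aging indicators are shown in Figures 3 to 5. The predictions of the aging_Ada-Elman model are closer to the true performance values of the virtual machine. In particular, for the frequently fluctuating virtual memory utilization, the fits of aging_BP and aging_Elman deviate considerably, while aging_Ada-Elman fits better.
In this embodiment, the prediction errors of the different models are shown in Table 1. For all three software aging indicators of the virtual machine, the mean absolute error (MAE) and mean squared error (MSE) of the Ada-Elman predictions are smaller than those of the Elman predictions, which shows that the Ada-Elman proposed here is more accurate than a single Elman model. The MAE and MSE of the Ada-Elman predictions are likewise smaller than those of the BP model, because Ada-Elman does not model the historical series directly but takes full account of the relationship between business concurrency and the software aging indicators.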
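For reference, the two error measures named above can be computed as follows. The values in the example are made up for illustration and are not results from the experiment.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up utilization values, only to show the calculation.
y_true = [0.5, 0.5]
y_pred = [0.25, 1.0]
example_mae = mae(y_true, y_pred)   # (0.25 + 0.5) / 2
example_mse = mse(y_true, y_pred)   # (0.0625 + 0.25) / 2
```

MSE penalizes large deviations more heavily than MAE, so reporting both shows whether a model's error comes from many small misses or a few large ones.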
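Table 1. Prediction errors of the different methods for the software aging indicators
Figure PCTCN2019090871-appb-000045
Table 1 reflects the time overhead of the different models; the results in the table are averages over multiple predictions. The BP model takes the least time, while the Ada-Elman and Elman models take longer, because the BP model fits the historical series of the software aging indicators directly and does not take business concurrency as input.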
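Finally, it should be noted that the above embodiment serves only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiment, a person of ordinary skill in the art should understand that the technical solution described in the foregoing embodiment may still be modified, or some or all of its technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solution to depart from the scope defined by the claims of the present invention.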

Claims (6)

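  1. An AdaBoost-Elman-based virtual machine software aging prediction method, characterized in that it comprises the following steps:
    Step 1: Define the grades used to evaluate the degree of virtual machine software aging, as follows:
    Step 1.1: Select the utilization of disk, physical memory, and virtual memory as the evaluation indicators of virtual machine software aging, and compute the performance loss of the average disk, physical memory, and virtual memory utilization of the virtual machine, $wastage_{disk}$, $wastage_{mem}$, and $wastage_{swap}$;
    Step 1.2: Compute the virtual machine software aging degree $s$, which represents how far the software of the virtual machine has aged;
    Step 1.3: Divide the health state of the virtual machine into five grades according to the software aging degree $s$, specifically:
    When 0 ≤ s < 0.2, the virtual machine is judged to be healthy;
    When 0.2 ≤ s < 0.4, the virtual machine is judged to be in a state of slight software aging;
    When 0.4 ≤ s < 0.6, the virtual machine is judged to be in a state of moderate software aging;
    When 0.6 ≤ s < 0.8, the virtual machine is judged to be in a state of severe software aging;
    When 0.8 ≤ s ≤ 1, the virtual machine is judged to have failed and cannot be used normally;
    Step 2: The offline training process for predicting virtual machine software aging, as follows:
    Step 2.1: Train the software aging indicator prediction model of the virtual machine;
    Step 2.1.1: Extract the historical data from the virtual machine performance log library and the virtual machine business concurrency log library, and preprocess the extracted data;
    Step 2.1.2: Use an Elman neural network to build the relationship model between business concurrency and the software aging indicators, that is, the prediction model for the virtual machine software aging indicators;
    Step 2.1.3: Use the AdaBoost.RT algorithm to optimize the prediction model for the virtual machine software aging indicators, combining several Elman neural networks, each acting as a weak prediction model, into the strong prediction model Ada-Elman;
    Step 2.2: Train the reference prediction model for a non-aged virtual machine;
    Step 2.2.1: Extract the data from the performance log library and business concurrency log library of a virtual machine that has been newly created and only recently started, and preprocess the extracted data;
    Step 2.2.2: Build and train the reference prediction model for a non-aged virtual machine, using the Elman neural network modeling method of step 2.1.2 and the AdaBoost.RT optimization method of step 2.1.3;
    Step 3: The online training process for predicting virtual machine software aging, as follows:
    Step 3.1: Input the predicted business concurrency and the performance data into the virtual machine software aging indicator prediction model and the non-aged virtual machine reference prediction model trained in the offline process;
    Step 3.2: The software aging indicator prediction model and the non-aged virtual machine reference prediction model output, respectively, the predicted software aging indicators of the virtual machine and the reference prediction for a non-aged virtual machine;
    Step 3.3: Using the evaluation method of step 1, assess the software aging trend of the virtual machine from the predicted software aging indicators and the non-aged reference prediction.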
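  2. The AdaBoost-Elman-based virtual machine software aging prediction method according to claim 1, characterized in that the performance loss of the average disk, physical memory, and virtual memory utilization of the virtual machine in step 1.1, $wastage_{disk}$, $wastage_{mem}$, and $wastage_{swap}$, is computed as follows:
    $wastage_{disk} = |cur_{disk} - confer_{disk}|$     (1a)
    $wastage_{mem} = |cur_{mem} - confer_{mem}|$     (1b)
    $wastage_{swap} = |cur_{swap} - confer_{swap}|$     (1c)
    where $cur_{disk}$, $cur_{mem}$, and $cur_{swap}$ are the average disk utilization, average physical memory utilization, and average virtual memory utilization of the virtual machine, and $confer_{disk}$, $confer_{mem}$, and $confer_{swap}$ are the baseline values of the average disk, physical memory, and virtual memory utilization used for the software aging evaluation.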
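  3. The AdaBoost-Elman-based virtual machine software aging prediction method according to claim 2, characterized in that the virtual machine software aging degree $s$ in step 1.2 is given by:
    $s = \omega_1 \cdot wastage_{mem} + \omega_2 \cdot wastage_{swap} + \omega_3 \cdot wastage_{disk}$    (2)
    where $\omega_1$, $\omega_2$, and $\omega_3$ are the weight coefficients of the performance loss of the average physical memory, virtual memory, and disk utilization, respectively.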
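  4. The AdaBoost-Elman-based virtual machine software aging prediction method according to claim 3, characterized in that step 2.1.1 is carried out as follows:
    Step 2.1.1.1: Handle missing points in the extracted virtual machine business concurrency;
    Where individual sampling points are missing, fill each gap with the average of the business concurrency in the preceding and following periods;
    Where ninety percent or more of the sampling points are missing, discard all samples and set the business concurrency for that period to zero;
    Step 2.1.1.2: Adjust outliers in the collected virtual machine business concurrency, that is, extremely large or small samples that fluctuate abnormally;
    Step 2.1.1.3: Adjust the data interval of the business concurrency and CPU utilization data extracted from the virtual machine log database and the virtual machine business concurrency log library, merging the collected data in units of seconds, minutes, or hours;
    Step 2.1.1.4: Normalize the data produced by step 2.1.1.3 using min-max normalization.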
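  5. The AdaBoost-Elman-based virtual machine software aging prediction method according to claim 4, characterized in that step 2.1.2 is carried out as follows:
    Step 2.1.2.1: Set the number of layers of the Elman neural network to 3;
    Step 2.1.2.2: With n types of business supported by the virtual machine, set the number of input nodes in of the Elman neural network to n+3 and the number of output nodes out to 3;
    Step 2.1.2.3: Use Kolmogorov's theorem to obtain the approximate range of the number of hidden nodes hide of the Elman neural network, as given by the following formula, then verify the accuracy of each candidate in turn;
    Figure PCTCN2019090871-appb-100001
    where a ∈ (1,10);
    Step 2.1.2.4: The transfer function of the output layer of the Elman neural network is the ReLU (rectified linear unit) function or the Sigmoid function, and the transfer function of the hidden layer is the Sigmoid function, in order to predict the business concurrency and software aging indicators of the virtual machine;
    Step 2.1.2.5: Input into the Elman neural network model the three performance indicators of the virtual machine, $cur_{mem}(t)$, $cur_{swap}(t)$, and $cur_{disk}(t)$, the predicted business concurrency on the virtual machine, $con_i(t+1)$, and the influence factors $\sigma_1$, $\sigma_2$, $\sigma_3$ among physical memory utilization, virtual memory utilization, and disk utilization;
    Step 2.1.2.6: Output the nonlinear relationship between business concurrency and the average physical memory utilization, average virtual memory utilization, and average disk utilization of the virtual machine, as given by:
    $cur_{mem}(t+1) = f'(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_1$    (4a)
    $cur_{swap}(t+1) = g(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_2$     (4b)
    $cur_{disk}(t+1) = h(con_i(t+1), cur_{mem}(t), cur_{swap}(t), cur_{disk}(t)) + \sigma_3$    (4c)
    where $f'()$, $g()$, and $h()$ are the nonlinear functions relating business concurrency to the average physical memory utilization, average virtual memory utilization, and average disk utilization, respectively.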
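  6. The AdaBoost-Elman-based virtual machine software aging prediction method according to claim 5, characterized in that step 2.1.3 is carried out as follows:
    Step 2.1.3.1: Input the training sample set, and initialize the parameters of each Elman neural network predictor f(x), the weights of the training samples, and the training error threshold;
    The weights of the training samples and the training error threshold are given by:
    Figure PCTCN2019090871-appb-100002
    where m is the number of Elman neural network predictors,
    Figure PCTCN2019090871-appb-100003
    is the weight of the i-th sample at iteration t, i=1,…,m,
    Figure PCTCN2019090871-appb-100004
    is the training error threshold;
    Step 2.1.3.2: Set the average error rate $e_t$ to zero, read the training samples, and train the t-th Elman neural network predictor $f_t(x)$, which is then combined into the strong prediction model Ada-Elman;
    Step 2.1.3.3: Compute the error of the AdaBoost-Elman model on the training set,
    Figure PCTCN2019090871-appb-100005
    as given by:
    Figure PCTCN2019090871-appb-100006
    where
    Figure PCTCN2019090871-appb-100007
    is the absolute error of the i-th sample at iteration t, and $y_i$ is the value of the i-th sample;
    Step 2.1.3.4: If
    Figure PCTCN2019090871-appb-100008
    then adjust the average error rate
    Figure PCTCN2019090871-appb-100009
    Step 2.1.3.5: Set the initial value of the average relative error of each Elman neural network,
    Figure PCTCN2019090871-appb-100010
    to 0.2, with an ideal upper bound of 0.35 and an ideal lower bound of 0.1, as given by formulas (7) and (8):
    Figure PCTCN2019090871-appb-100011
    Figure PCTCN2019090871-appb-100012
    where
    Figure PCTCN2019090871-appb-100013
    is the average relative error, and
    Figure PCTCN2019090871-appb-100014
    is the threshold of the t-th training sample error;
    Step 2.1.3.6: Compute the weight adjustment factor, as given by:
    Figure PCTCN2019090871-appb-100015
    where
    Figure PCTCN2019090871-appb-100016
    is the weight adjustment factor of iteration t;
    Step 2.1.3.7: Update the weight of each training sample, specifically:
    If
    Figure PCTCN2019090871-appb-100017
    increase the weight of that sample, as given by:
    Figure PCTCN2019090871-appb-100018
    where $D_t$ is the normalization factor of the sample weights at iteration t;
    If
    Figure PCTCN2019090871-appb-100019
    adjust the weight of the training sample, as given by:
    Figure PCTCN2019090871-appb-100020
    Step 2.1.3.8: Check whether the maximum number of iterations has been reached;
    If the maximum number of iterations has not been reached, continue iterating;
    If the maximum number of iterations has been reached, output the Ada-Elman model, obtaining the virtual machine software aging indicator prediction model g(x), as given by:
    Figure PCTCN2019090871-appb-100021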
PCT/CN2019/090871 2019-04-29 2019-06-12 AdaBoost-Elman-based virtual machine software aging prediction method WO2020220437A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910354685.2A CN110083518B (zh) 2019-04-29 2019-04-29 AdaBoost-Elman-based virtual machine software aging prediction method
CN201910354685.2 2019-04-29

Publications (1)

Publication Number Publication Date
WO2020220437A1 (zh) 2020-11-05

Family

ID=67417651

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090871 WO2020220437A1 (zh) 2019-04-29 2019-06-12 AdaBoost-Elman-based virtual machine software aging prediction method

Country Status (2)

Country Link
CN (1) CN110083518B (zh)
WO (1) WO2020220437A1 (zh)

Also Published As

Publication number Publication date
CN110083518B (zh) 2021-11-16
CN110083518A (zh) 2019-08-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19927427

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19927427

Country of ref document: EP

Kind code of ref document: A1