CN106484496B - Method for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network

Info

Publication number
CN106484496B
Authority
CN
China
Prior art keywords
performance
virtual machine
feature
bottom layer
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610956901.7A
Other languages
Chinese (zh)
Other versions
CN106484496A (en)
Inventor
张彬彬
岳昆
郝佳
王娟
武浩
吴鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU
Priority to CN201610956901.7A
Publication of CN106484496A
Application granted
Publication of CN106484496B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Abstract

The invention discloses a method for analyzing the underlying environment features of virtual machines and measuring their performance based on a Bayesian network. According to the specific hardware and software configuration of the virtualization platform to be assessed, underlying environment features that may affect virtual machine performance are extracted from four aspects: hardware features, software features, configuration features, and runtime environment features. The performance indicators to be measured are then determined. Virtual machines with different combinations of underlying environment features are configured on the virtualization platform to be assessed, and the values of the required performance indicators are obtained by running benchmark programs, yielding a feature-performance data sample set for each performance indicator. A feature-performance Bayesian network is constructed for each performance indicator from its feature-performance data sample set, and the performance of a virtual machine is finally measured according to the conditional probability table of the performance indicator node. The invention uses the Bayesian network to represent the dependencies between underlying environment features and performance indicators, thereby achieving an accurate measurement of virtual machine performance.

Description

Method for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network
Technical field
The invention belongs to the technical field of cloud computing and, more specifically, relates to a method for analyzing the underlying environment features of virtual machines and measuring their performance based on a Bayesian network.
Background art
Infrastructure as a Service (IaaS) cloud computing platforms provide resources to users in the form of virtual machines. Both resource providers and users need an accurate understanding of virtual machine performance in order to use resources and deploy services more rationally, and commercial cloud computing platforms usually set the rental price of a virtual machine according to its performance. How to measure virtual machine performance accurately has therefore become an important problem.
To measure virtual machine performance accurately from the underlying environment features of the virtual machine, the following three kinds of relationships among underlying environment features and virtual machine performance need to be assessed:
1. The relationship between each hardware and software configuration feature of a virtual machine and its performance. In a virtualized environment, the hardware and software configuration features are fixed when a virtual machine is configured; for example, the number of virtual CPUs, the memory size, the type of virtualization technology used by the virtual machine monitor (Virtual Machine Monitor, VMM), and the CPU scheduling algorithm in use can all be set precisely. These features jointly affect virtual machine performance, but each feature affects performance to a different degree, and the same feature affects different performance indicators, such as computing performance, memory access performance, and I/O performance, to different degrees. The relationship between each feature and performance needs to be made explicit.
2. The relationship between virtual machine performance and the interference caused by other virtual machines running on the same physical host. On a virtualization platform, the virtual machines on one physical host share the underlying physical resources, so the performance of a virtual machine is disturbed by the way other virtual machines access those resources. This performance interference is highly uncertain and difficult to assess accurately. An effective model is needed to assess the effect of inter-VM interference on virtual machine performance.
3. The relationships among the hardware and software configuration features themselves. Features such as the hardware and software configuration of a virtual machine and the way each virtual machine on the same platform uses resources are not independent of one another; they influence and depend on each other. The correlations among these features need to be mined.
Analyzing and quantitatively describing these three kinds of relationships is the basis of, and the key to, measuring virtual machine performance.
Known research on virtual machine performance evaluation and modeling mainly looks for relationships between some of a virtual machine's configuration parameters or its resource utilization and the performance of the virtual machine or of the applications running in it, in order to help predict virtual machine performance or to optimize virtual machine resource allocation so as to guarantee application quality of service. Li Fengze et al. (Computer Systems & Applications, 2015) collected, as modeling parameters, the individual effects on virtual machine application performance of four hardware resources: the periodic CPU time weight, the number of virtual CPUs, memory, and I/O contention. They modeled the relationship between hardware and virtual machine performance using a method that combines feature expansion based on singular value decomposition with a nonlinear model. Wang Rui (master's thesis, Shanghai Jiao Tong University, 2011) used recursive least squares to build a model of the relationship between application performance and the resources used by a virtual machine; based on this model, the optimal resource allocation for a virtual machine can be determined through a differential error function to reach the desired performance target. However, virtual machine performance is jointly affected by many factors, including hardware parameters, the software environment, configuration features, and the runtime environment, so the relationships between virtual machine performance and all the features that may affect it are difficult to analyze. In addition, virtual machine performance is uncertain, and the above methods have difficulty representing and reasoning about this uncertainty.
Known assessments of performance interference between virtual machines mainly model the relationship between performance interference and application load characteristics or some hardware performance parameters. Wang Sa et al. (Journal of Software, 2015) found that last-level cache miss rates (LLC miss rates) have different correlations with performance interference for CPU-intensive and network-intensive applications, and on this basis built a virtual machine performance interference estimation model to estimate virtual machine performance. Meng Fanxin (master's thesis, Shandong University, 2014) designed an application-type-based model for predicting virtual machine performance interference: applications are classified with a decision tree by monitoring the different characteristics of how they access resources, and a performance interference prediction model is then built for each application type using linear regression. Wang Jin (master's thesis, Northeastern University, 2013) designed a virtual machine performance mutual-interference model based on multiple linear regression analysis. The model is established by mining the relationship between the degree of performance mutual interference and the background load parameters, and its parameters are solved by multiple linear regression; here the background load parameters are resource usage information, such as CPU utilization and memory utilization, of the other virtual machines running at the same time. Based on this model, the performance mutual interference of a virtual machine under other background loads can be predicted. Performance interference between virtual machines is the main reason why virtual machine performance appears uncertain, and quantitatively describing the uncertainty of virtual machine performance is another way of measuring inter-VM performance interference.
Known work that applies Bayesian methods to the virtual machine field mainly uses naive Bayes classifiers to classify application types in virtual machines, virtual machine security levels, and the like, as a basis for application performance prediction and application deployment. For example, Yang Guang (master's thesis, Beijing University of Posts and Telecommunications, 2013) analyzed the utilization, and the rate of change of utilization, of virtual machine CPU, memory, and network bandwidth obtained by real-time monitoring, and used a naive Bayes classifier to divide virtual machines into three classes: expand resources, reduce resources, and no adjustment needed. The resources used by a virtual machine are then adjusted dynamically according to the classification result. Shen Dian (master's thesis, Southeast University, 2012) analyzed virtual machine behavior in cloud computing environments and, based on the naive Bayes classification algorithm, proposed a method for classifying virtual machines by security level: according to user behavior, virtual machines are assigned to four preset trust levels to guide virtual machine deployment. The naive Bayes method assumes that the condition variables are mutually independent, whereas in a virtualized environment the underlying features that affect virtual machine performance may depend on one another, which is difficult to handle with the naive Bayes method.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art and provide a method for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network, which uses the Bayesian network to represent the dependencies between underlying environment features and performance indicators, thereby achieving an accurate measurement of virtual machine performance.
To achieve the above object, the method of the invention for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network comprises the following steps:
S1: according to the specific hardware and software configuration of the virtualization platform to be assessed, extract the underlying environment features that may affect virtual machine performance from four aspects: hardware features, software features, configuration features, and runtime environment features. Denote the number of extracted underlying environment features by N. When it can be determined that no dependency exists between two underlying environment features X_i and X_j, set the dependency mark r_ij = 0; otherwise set r_ij = 1, where i = 1, 2, …, N, j = 1, 2, …, N, and i ≠ j;
S2: determine, as required, the performance indicators Y_k to be measured, k = 1, 2, …, K, where K denotes the number of performance indicators;
S3: according to the possible values of each underlying environment feature from step S1, configure virtual machines with different combinations of underlying environment features on the virtualization platform to be assessed; then select a set of benchmark programs according to the main types of application to be deployed in the virtualized environment to be assessed, run the benchmark programs on each virtual machine, and record the values of the corresponding performance indicators determined in step S2, thereby obtaining, for each performance indicator Y_k, a data sample set D_k consisting of feature-performance records (x_1, x_2, …, x_N, y_k) of the virtual machines, where x_i denotes the value of underlying environment feature X_i and y_k denotes the value of performance indicator Y_k;
S4: for the combination of the underlying environment features with each performance indicator Y_k, build the corresponding feature-performance Bayesian network from its data sample set D_k and compute the conditional probability table of each node. The feature-performance Bayesian network is built as follows:
Initialize an edgeless graph structure G = (V, E) with {X_1, X_2, …, X_N, Y_k} as the node set V, i.e., V = {X_1, X_2, …, X_N, Y_k} and E = ∅, and initialize the node pair list L = ∅;
For each node pair (X_i, X_j) consisting of two underlying environment features, first check the dependency mark r_ij from step S1; if r_ij = 0, do nothing, otherwise compute the mutual information of the pair. Compute the mutual information of each node pair (X_i, Y_k) consisting of an underlying environment feature and the performance indicator. For node pairs whose mutual information has been computed, put the pair into the list L if the mutual information is greater than the threshold ε, and otherwise do nothing. Sort all node pairs in L in descending order of mutual information;
Add edges according to the node pairs in list L and remove redundant edges to obtain the feature-performance Bayesian network, then compute the conditional probability table of each node;
S5: when a virtual machine is configured on the virtualization platform to be assessed, obtain the values of its underlying environment features from the configuration information. Then, in the conditional probability table of the performance indicator node of the feature-performance Bayesian network obtained in step S4 for each performance indicator, look up the maximum conditional probability of that performance indicator given the virtual machine's underlying environment features. The performance indicator value corresponding to the maximum conditional probability measures the performance of the virtual machine with respect to that indicator; the possible values of the performance indicator that correspond to the virtual machine's underlying environment features in the conditional probability table indicate the range over which the virtual machine's performance fluctuates; and the probability distribution over those values reflects the degree of the fluctuation.
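The lookup in step S5 can be illustrated with a minimal Python sketch. It assumes that the performance indicator has already been discretized into levels and that the parents of the performance indicator node are known; the function names `estimate_cpt` and `measure`, the feature columns, and the sample values are all hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict

def estimate_cpt(samples, parent_idx, target_idx):
    """Estimate P(target | parents) by relative frequency over the samples."""
    counts = defaultdict(Counter)
    for row in samples:
        key = tuple(row[i] for i in parent_idx)
        counts[key][row[target_idx]] += 1
    cpt = {}
    for key, ctr in counts.items():
        total = sum(ctr.values())
        cpt[key] = {value: n / total for value, n in ctr.items()}
    return cpt

def measure(cpt, parent_values):
    """Return (most probable indicator value, its probability, full distribution)."""
    dist = cpt[tuple(parent_values)]
    best = max(dist, key=dist.get)
    return best, dist[best], dist

# Hypothetical samples: (vcpus, co_located_vms, discretized runtime level)
samples = [
    (2, 1, "fast"), (2, 1, "fast"), (2, 1, "fast"), (2, 1, "medium"),
    (2, 4, "medium"), (2, 4, "slow"), (2, 4, "slow"), (2, 4, "slow"),
]
cpt = estimate_cpt(samples, parent_idx=(0, 1), target_idx=2)
best, prob, dist = measure(cpt, (2, 1))
print(best, prob, dist)
```

In this sketch the returned value plays the role of the measured performance, the keys of `dist` give the fluctuation range, and the probabilities in `dist` give the degree of fluctuation.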
In the method of the invention for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network, underlying environment features that may affect virtual machine performance are extracted, according to the specific hardware and software configuration of the virtualization platform to be assessed, from four aspects: hardware features, software features, configuration features, and runtime environment features. The performance indicators to be measured are then determined, virtual machines with different combinations of underlying environment features are configured on the virtualization platform to be assessed, and the values of the required performance indicators are obtained by running benchmark programs, yielding a feature-performance data sample set for each performance indicator. A feature-performance Bayesian network is constructed for each performance indicator from its feature-performance data sample set, and the performance of a virtual machine is finally measured according to the conditional probability table of the performance indicator node.
The invention has the following technical effects:
(1) The invention models the underlying environment features and performance data of virtual machines with a Bayesian network, which can effectively express the dependencies between performance and features. In particular, the performance fluctuation caused by resource contention among multiple virtual machines on the same physical host can be described quantitatively in probabilistic form, which addresses the difficulty of accurately assessing virtual machine performance in cloud computing environments due to its uncertainty;
(2) The invention analyzes measured virtual machine performance data. It can discover from the data the relationships that may exist among the underlying environment features of virtual machines and analyze the dependencies between performance and each feature, so that relationships that are hard to capture by manual analysis can be found;
(3) On the basis of the Bayesian network model of the relationship between underlying environment features and performance, the invention can measure the performance of a particular virtual machine, recommend to a user, according to the user's requirements, a virtual machine configuration that can reach the agreed performance, and guide dynamic resource allocation among virtual machines, providing important basic information for leasing virtual machines and deploying applications in cloud computing environments.
Brief description of the drawings
Fig. 1 is a flow chart of a specific embodiment of the method of the invention for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network;
Fig. 2 is a flow chart of the construction of the feature-performance Bayesian network in the invention;
Fig. 3 is the feature-performance Bayesian network obtained in this embodiment.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the main content of the invention.
Fig. 1 is a flow chart of a specific embodiment of the method of the invention for analyzing underlying environment features and measuring performance of virtual machines based on a Bayesian network. As shown in Fig. 1, the method comprises the following steps:
S101: extract the underlying environment features that may affect virtual machine performance:
According to the specific hardware and software configuration of the virtualization platform to be assessed, extract the underlying environment features that may affect virtual machine performance from four aspects: hardware features, software features, configuration features, and runtime environment features. Denote the number of extracted underlying environment features by N. When it can be determined that no dependency exists between two underlying environment features X_i and X_j, set the dependency mark r_ij = 0; otherwise set r_ij = 1, where i = 1, 2, …, N, j = 1, 2, …, N, and i ≠ j.
On current virtualization platforms, each type of underlying environment feature includes the following features:
● Hardware features include:
1) the physical CPU architecture, mainly the relationship among cores and how the multi-level cache is shared among cores;
2) the clock frequency of the physical CPU;
3) the capacity and frequency of the memory;
4) the hard disk type and bandwidth, where the type is mainly IDE, SATA, SCSI, SSD, etc., or network storage such as NFS, NAS, SAN, etc.;
5) the network bandwidth.
● Software features include:
1) the VMM software used, mainly the mainstream Xen, KVM, VMware ESXi, etc.;
2) the CPU virtualization mode, including paravirtualization, full virtualization without hardware virtualization support, full virtualization based on hardware virtualization support, etc.;
3) the CPU scheduling algorithm used by the virtualization platform; if scheduling is priority-based, this also includes the scheduling priority configured for each virtual CPU;
4) the memory virtualization mode, including shadow page tables, full virtualization, etc.;
5) the I/O device virtualization mode, including I/O access proxied by a privileged virtual machine, software-emulated I/O devices, direct access to physical devices bypassing the VMM, etc.
● Configuration features include:
1) the number of vCPUs configured for the virtual machine and the way each vCPU is bound to physical CPU cores;
2) the memory size used by the virtual machine;
3) the capacity of the CPU cache used by the virtual machine;
4) the virtual disk type used by the virtual machine, mainly physical partition, LVM logical volume, file (loop device), etc.
● Runtime environment features include:
1) the number of virtual machines running on the same physical host;
2) the load types of the other virtual machines on the same physical host;
3) the frequency and duration of resource configuration adjustments across hosts (virtual machine migration, remote memory access, etc.).
Whether a dependency exists between underlying environment features can be determined from common knowledge in the virtual machine field. For example, it can be determined that no dependency exists between the physical CPU architecture and the hard disk type, or between the memory size of a virtual machine and its number of vCPUs, whereas dependencies may exist between the VMM software used and the CPU virtualization mode, and between the I/O device virtualization mode and the virtual disk type used by the virtual machine.
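As an illustration of how the dependency marks r_ij might be recorded, the following Python sketch stores one mark per unordered feature pair. The feature names and the helper `dependency_marks` are hypothetical; only the two independent pairs are taken from the examples in the text, and every pair not known to be independent defaults to r_ij = 1.

```python
from itertools import combinations

# Hypothetical short names for a few of the features listed above.
features = ["cpu_arch", "disk_type", "vmm", "cpu_virt_mode", "vm_memory", "vcpus"]

# Pairs judged independent from domain knowledge (r_ij = 0), per the text's examples.
independent = {
    frozenset(("cpu_arch", "disk_type")),
    frozenset(("vm_memory", "vcpus")),
}

def dependency_marks(features, independent):
    """Return {frozenset((Xi, Xj)): r_ij} with r_ij = 0 only for known-independent pairs."""
    marks = {}
    for a, b in combinations(features, 2):
        pair = frozenset((a, b))
        marks[pair] = 0 if pair in independent else 1
    return marks

r = dependency_marks(features, independent)
print(r[frozenset(("cpu_arch", "disk_type"))], r[frozenset(("vmm", "cpu_virt_mode"))])
```

Using unordered pairs as keys reflects that r_ij and r_ji carry the same information in this sketch.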
S102: determine the performance indicators:
Determine, as required, the performance indicators Y_k to be measured, k = 1, 2, …, K, where K denotes the number of performance indicators. For example, the running time of a benchmark program can be used as the performance indicator for the computing performance and memory access performance of a virtual machine, and throughput, response time, and the like can be used as performance indicators for its I/O performance.
S103: obtain the feature-performance data:
According to the possible values of each underlying environment feature from step S101, configure virtual machines with different combinations of underlying environment features on the virtualization platform to be assessed. That is, the underlying environment features of these virtual machines must include all the underlying environment features from step S101, and their feature values must cover all possible combinations of the underlying environment features from step S101.
Then select a set of benchmark programs according to the main types of application to be deployed in the virtualized environment to be assessed, run the benchmark programs on each virtual machine, and record the values of the corresponding performance indicators determined in step S102, thereby obtaining, for each performance indicator Y_k, a data sample set D_k consisting of feature-performance records (x_1, x_2, …, x_N, y_k) of the virtual machines, where x_i denotes the value of underlying environment feature X_i and y_k denotes the value of performance indicator Y_k. For example, benchmarks such as SPEC CPU and PARSEC can be chosen to test the processor computing performance and memory access performance of a virtual machine, with program running time as the corresponding indicator; I/O-intensive benchmarks such as Bonnie++ and netperf can also be chosen, with throughput and response time as the corresponding indicators.
Because virtual machine performance is affected by the runtime environment and is therefore uncertain, the benchmark programs are run multiple times on each virtual machine to record the variation in performance; each run yields one feature-performance record.
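A rough Python sketch of this data collection step follows. It enumerates every combination of feature values and repeats a measurement for each one. The feature names and values are invented for illustration, and `run_benchmark` is a hypothetical stand-in that returns a synthetic running time; a real setup would configure the virtual machine and run an actual benchmark such as those named above.

```python
from itertools import product
import random

# Hypothetical features and their possible values.
feature_values = {
    "vcpus": [1, 2],
    "memory_gb": [2, 4],
    "co_located_vms": [1, 4],
}

def run_benchmark(config, rng):
    """Placeholder for a real benchmark run; returns a synthetic runtime in seconds."""
    base = 100.0 / config["vcpus"]
    noise = rng.uniform(0, 5) * config["co_located_vms"]
    return base + noise

def collect_samples(feature_values, repeats, seed=0):
    """Build D_k: one (x_1, ..., x_N, y_k) record per run of each configuration."""
    rng = random.Random(seed)
    names = list(feature_values)
    samples = []
    for combo in product(*(feature_values[n] for n in names)):
        config = dict(zip(names, combo))
        for _ in range(repeats):
            samples.append(combo + (run_benchmark(config, rng),))
    return names, samples

names, D_k = collect_samples(feature_values, repeats=3)
print(len(D_k), D_k[0])
```

Exhaustive enumeration grows multiplicatively with the number of feature values, so in practice the set of features and values would need to be kept small enough to be measurable.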
S104: build the feature-performance Bayesian network:
A Bayesian network (BN) is a directed acyclic graph with conditional probability tables. It is one of the most effective tools for representing and reasoning about uncertain knowledge and is widely used in fields such as data analysis, medical diagnosis, and economic forecasting. The invention uses a Bayesian network to express the dependencies among the underlying environment features of a virtual machine and between those features and the virtual machine performance indicators, and to represent the uncertainty of those dependencies. The feature-performance Bayesian network is represented by a directed acyclic graph G = (V, E), where V = {X_1, X_2, …, X_N, Y_k} is the set of all nodes in the feature-performance Bayesian network and E is the set of directed edges between features and between features and the performance indicator. For each node in V that has parent nodes, a conditional probability distribution quantifies the node's probabilistic dependence on its parent set. Each performance indicator has its own feature-performance Bayesian network; that is, for each performance indicator Y_k, a feature-performance Bayesian network is constructed from its data sample set D_k. Fig. 2 is a flow chart of the construction of the feature-performance Bayesian network in the invention. As shown in Fig. 2, the construction comprises the following steps:
S201: initialize the Bayesian network:
Initialize an edgeless graph structure G = (V, E) with {X_1, X_2, …, X_N, Y_k} as the node set V, i.e., V = {X_1, X_2, …, X_N, Y_k} and E = ∅, and initialize the node pair list L = ∅. The list L stores the node pairs that pass screening.
Regarding the order of the nodes in V = {X_1, X_2, …, X_N, Y_k}: dependencies may exist among the underlying environment features of a virtual machine. The values of hardware features affect the values of software features; for example, if the hardware CPU does not support hardware virtualization extensions, KVM cannot be used as the VMM software. The values of software features in turn affect the values of configuration features; for example, if "direct access to physical devices bypassing the VMM" is selected as the I/O device virtualization mode, the virtual disk type used by the virtual machine will not be "file (loop device)". In addition, every underlying environment feature affects virtual machine performance. The preferred approach is therefore to arrange the feature nodes in the order hardware features, software features, configuration features, runtime environment features, with the performance indicator node of the virtual machine last. An order can also be specified within each feature type; for example, hardware features can be arranged in the order CPU features, memory features, I/O device features.
S202: screening node pair:
In screening node clock synchronization, need to calculate the mutual information between node.In the present invention, since two feature nodes may There is no dependences, then for these nodes pair, so that it may directly skip, do not need calculate mutual information, only exist according to Two feature nodes of the relationship of relying, just need to calculate mutual information, to quantify its degree of dependence.And for feature node and performance knot The node pair that point is constituted, then need to calculate the mutual information of whole nodes pair.Therefore, the specific method of node pair is screened in the present invention Are as follows:
For the node that is made of two-by-two BOTTOM LAYER ENVIRONMENT feature to (Xi,Xj), it is identified first according to the dependence in step S101 rijDetermined, if rij=0, then do not make any operation, if rij=1, then its mutual information is calculated according to the following formula:
Wherein, P (xi) and P (xj) respectively indicate feature XiValue xi, feature XjValue xjIn set of data samples DkIn The probability of appearance, P (xi,xj) indicate feature XiValue xiWith feature XjValue xjSimultaneously in set of data samples DkMiddle appearance Probability.
For each node pair (X_i, Y_k) formed by an underlying environment feature and a performance indicator, the mutual information is computed by the following formula:

$$I(X_i, Y_k) = \sum_{x_i, y_k} P(x_i, y_k) \log \frac{P(x_i, y_k)}{P(x_i)\,P(y_k)}$$

where P(y_k) is the probability that value y_k of performance indicator Y_k occurs in the data sample set D_k, and P(x_i, y_k) is the probability that value x_i of feature X_i and value y_k of performance indicator Y_k occur together in D_k.
For every node pair whose mutual information has been computed, the threshold ε decides whether the pair is added to the node pair list L: if the mutual information is greater than ε, the pair is placed in L; otherwise no operation is performed. The size of ε thus determines how weak a dependency can be while still being retained in the final model; its value is generally in the range 0.01≤ε≤0.05. All node pairs in L are then sorted in descending order of mutual information. The larger the mutual information, the stronger the probabilistic dependency between the two nodes; in particular, the larger I(X_i, Y_k), the greater the influence of feature X_i on performance indicator Y_k.
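The screening step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the data layout (one list of values per node) and the toy data are assumptions, and base-2 logarithms are assumed because the text does not state the base.

```python
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (bits) between two equally long value lists."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def screen_pairs(features, perf, depends, eps=0.01):
    """Return [(pair, mi)] with mi > eps, sorted by mi in descending order.

    features: dict name -> list of values (one entry per sample)
    perf:     (name, list of values) for the performance indicator
    depends:  set of frozenset({a, b}) for feature pairs with r_ij = 1
    """
    perf_name, perf_values = perf
    scored = []
    for a, b in combinations(features, 2):
        if frozenset((a, b)) in depends:      # r_ij = 1: compute MI; r_ij = 0: skip
            scored.append(((a, b), mutual_information(features[a], features[b])))
    for a in features:                        # feature-performance pairs: always computed
        scored.append(((a, perf_name), mutual_information(features[a], perf_values)))
    kept = [(pair, mi) for pair, mi in scored if mi > eps]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Toy data (hypothetical): A and B are flagged as dependent, C is independent of Y.
features = {"A": [0, 0, 1, 1], "B": [0, 0, 1, 1], "C": [0, 1, 0, 1]}
perf = ("Y", [0, 0, 1, 1])
depends = {frozenset(("A", "B"))}
ranked = screen_pairs(features, perf, depends)
print(ranked)
```

Pairs with r_ij = 0 never reach `mutual_information`, which is where the saving over an unrestricted pairwise scan comes from.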
S203: Add edges:
Check each node pair in list L in order. For a pair such as (X_i, X_j), find the minimal cut set C of X_i and X_j in the current G=(V, E). (If every path between X_i and X_j in G is blocked by the node set C, then C is a cut set of X_i and X_j. If no set C′ obtained by removing any one node from C blocks all paths between X_i and X_j, then C is called a minimal cut set of X_i and X_j.) Then compute the conditional mutual information I(X_i, X_j | C) of X_i and X_j given the minimal cut set C by the following formula:

$$I(X_i, X_j \mid C) = \sum_{x_i, x_j, c} P(x_i, x_j, c) \log \frac{P(x_i, x_j \mid c)}{P(x_i \mid c)\,P(x_j \mid c)} \tag{3}$$

where P(x_i, x_j, c) is the probability that value x_i of feature X_i, value x_j of feature X_j and value c of C occur together in the data sample set D_k; P(x_i, x_j | c) is the probability that x_i and x_j occur together in D_k given that the minimal cut set takes value c; and P(x_i | c) and P(x_j | c) are the probabilities that x_i and x_j, respectively, occur in D_k given that the minimal cut set takes value c.
If I(X_i, X_j | C) > ε, the edge (X_i, X_j) is added to the edge set E. Continue with the next node pair in L, adding the edge for each qualifying pair to E, until all node pairs have been checked.
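The conditional mutual information test can be sketched as below. The function name and data layout are assumptions, base-2 logarithms are assumed, and finding the minimal cut set itself is left out: the caller supplies one cut-set value per sample (an empty tuple when C is empty). The X3 and X4 columns are taken from Table 1 of the embodiment.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(xs, ys, cs):
    """Empirical I(X;Y|C) in bits; cs holds one hashable cut-set value per sample."""
    n = len(xs)
    pc = Counter(cs)
    pxc = Counter(zip(xs, cs))
    pyc = Counter(zip(ys, cs))
    pxyc = Counter(zip(xs, ys, cs))
    total = 0.0
    for (x, y, c), cnt in pxyc.items():
        p_xy_given_c = cnt / pc[c]
        p_x_given_c = pxc[(x, c)] / pc[c]
        p_y_given_c = pyc[(y, c)] / pc[c]
        total += (cnt / n) * log2(p_xy_given_c / (p_x_given_c * p_y_given_c))
    return total

# Columns X3 and X4 of Table 1 (samples D1..D16).
X3 = [0, 1, 1, 1] * 4
X4 = [0, 1, 2, 2, 0, 1, 1, 2, 0, 1, 2, 2, 0, 1, 1, 2]

# Empty cut set: every sample carries the same (empty) conditioning value.
no_cut = [()] * len(X3)
cmi_empty = conditional_mutual_information(X3, X4, no_cut)

eps = 0.01
add_edge = cmi_empty > eps   # decision rule of step S203
print(round(cmi_empty, 2), add_edge)
```

With an empty cut set the quantity reduces to the ordinary mutual information, which is why the first pair in the embodiment can be tested before any edge exists.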
S204: Remove redundant edges:
Examine each edge in E in turn. For an edge such as (X_i, X_j), if no path other than the edge (X_i, X_j) itself exists between X_i and X_j, continue with the next edge. Otherwise, temporarily delete (X_i, X_j) from E to obtain the edge set E′, find the minimal cut set C′ of X_i and X_j in (V, E′), and use formula (3) to compute the conditional mutual information I(X_i, X_j | C′) of X_i and X_j given C′. If I(X_i, X_j | C′) < ε, set E=E′, that is, permanently delete (X_i, X_j) from E; otherwise restore the edge (X_i, X_j). Continue with the next edge in E. The directed acyclic graph G=(V, E) finally obtained is the feature-performance Bayesian network of the underlying environment features of the virtual machine and the current performance indicator.
S205: Compute the conditional probability tables of the nodes:
Next, the conditional probability table of each node in the Bayesian network is computed. The conditional probability table is an important component of a Bayesian network: for node X_i, each row of its table gives the conditional probability of one value of X_i given one possible value combination of its parent node set Pa(X_i). In the present embodiment, each probability parameter in the table is computed by maximum likelihood estimation, as follows. Denote the node set of the feature-performance Bayesian network by V={A_1, A_2, …, A_{N+1}}, where A_1, A_2, …, A_N are the underlying environment features of the virtual machine and A_{N+1} is the performance indicator. Node A_n has R_n possible values in the data sample set D_k, n=1,2,…,N+1, and the nodes in its parent node set Pa(A_n) have Q_n possible value combinations. The conditional probability parameter of A_n is α_nrq = P(A_n=r | Pa(A_n)=q), where r is the index of a possible value of A_n and q is the index of a possible value combination of Pa(A_n). Its maximum likelihood estimate is computed by the following formula:

$$\hat{\alpha}_{nrq} = \frac{W_{nrq}}{W_{nq}}$$

where W_nrq is the number of samples in D_k that satisfy both A_n=r and Pa(A_n)=q, and W_nq is the number of samples in D_k that satisfy Pa(A_n)=q.
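The counting estimate can be sketched in a few lines. The function name and the dictionary layout are assumptions; the data are the 16 samples of Table 1 of the embodiment, with Y's parents taken to be X1..X4 as in the embodiment's final network.

```python
from collections import Counter, defaultdict

def estimate_cpt(samples, child, parents):
    """Maximum-likelihood CPT: P(child = r | parents = q) = W_nrq / W_nq.

    samples: list of dicts mapping node name -> observed value.
    Returns {parent_value_tuple: {child_value: probability}}.
    """
    joint = Counter()      # W_nrq: samples with child = r and parents = q
    marginal = Counter()   # W_nq:  samples with parents = q
    for s in samples:
        q = tuple(s[p] for p in parents)
        joint[(q, s[child])] += 1
        marginal[q] += 1
    cpt = defaultdict(dict)
    for (q, r), w in joint.items():
        cpt[q][r] = w / marginal[q]
    return dict(cpt)

# Table 1 of the embodiment: (X1, X2, X3, X4, Y) for samples D1..D16.
ROWS = [
    (3.3, 1000, 0, 0, 65), (3.3, 1000, 1, 1, 70), (3.3, 1000, 1, 2, 65), (3.3, 1000, 1, 2, 70),
    (3.3, 2000, 0, 0, 60), (3.3, 2000, 1, 1, 65), (3.3, 2000, 1, 1, 60), (3.3, 2000, 1, 2, 65),
    (3.4, 1000, 0, 0, 60), (3.4, 1000, 1, 1, 65), (3.4, 1000, 1, 2, 65), (3.4, 1000, 1, 2, 65),
    (3.4, 2000, 0, 0, 55), (3.4, 2000, 1, 1, 60), (3.4, 2000, 1, 1, 55), (3.4, 2000, 1, 2, 60),
]
NAMES = ("X1", "X2", "X3", "X4", "Y")
samples = [dict(zip(NAMES, row)) for row in ROWS]

cpt_y = estimate_cpt(samples, "Y", ["X1", "X2", "X3", "X4"])
print(cpt_y[(3.3, 2000, 0, 0)])
print(cpt_y[(3.4, 2000, 1, 1)])
```

Parent combinations that never occur in D_k get no row in this sketch; how such combinations should be handled is not addressed in the text.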
As the above steps show, compared with the construction of a conventional Bayesian network, the present invention adapts the node pair screening step to the characteristics of the objects it targets, which simplifies the construction process and reduces the computational complexity of building the Bayesian network.
S105: Measure virtual machine performance:
When a virtual machine is configured on the virtualization platform under evaluation, the values of its underlying environment features are obtained from the configuration information. Then, in the conditional probability table of the performance indicator node in the feature-performance Bayesian network obtained for each performance indicator in step S104, the maximum conditional probability of each performance indicator for those feature values is looked up. The performance indicator value corresponding to the maximum conditional probability is the measured performance of the virtual machine under that indicator. The possible values of the performance indicator listed for those feature values in the conditional probability table give the range over which the virtual machine's performance fluctuates, and the probability distribution over those values reflects the degree of the fluctuation.
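The lookup can be sketched as follows. The function name and return shape are assumptions; the two rows are the conditional probabilities quoted for hosts 1 and 2 in the embodiment. Returning every value that attains the maximum is a design choice made here, since the text does not say how ties are resolved.

```python
def measure(cpt_row):
    """Summarise one row of the performance node's conditional probability table.

    cpt_row: {performance_value: probability} for the VM's feature values.
    Returns (most probable values, distribution sorted by performance value).
    """
    best = max(cpt_row.values())
    modes = sorted(v for v, p in cpt_row.items() if p == best)
    return modes, dict(sorted(cpt_row.items()))

# Rows quoted in the embodiment for the performance node Y.
row_host1 = {60: 1.0}            # X1=3.3, X2=2000, X3=0, X4=0
row_host2 = {55: 0.5, 60: 0.5}   # X1=3.4, X2=2000, X3=1, X4=1

print(measure(row_host1))
print(measure(row_host2))
```

A single mode with probability 1.0 means no observed fluctuation for that configuration, while several values with comparable probabilities indicate the fluctuation range described in the text.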
Embodiment
To better illustrate the technical effect of the present invention, it is verified using a specific embodiment. In a certain virtualized environment there are three hosts configured as follows, and the computing performance of virtual machines in this environment needs to be measured.
Host 1: Intel Core i5-6600 (3.3GHz), DDR4 2133 16GB memory, SATA3 500GB hard disk;
Host 2: Intel Core i7-6700 (3.4GHz), DDR4 2133 16GB memory, SATA3 500GB hard disk;
Host 3: AMD A10-7850K (3.7GHz), DDR3 1866 16GB memory, SSD 120GB hard disk.
First, the underlying environment features that affect the computing performance of the virtual machine are extracted:
Hardware features: CPU core code name (values Skylake, Kaveri), CPU frequency (values 3.3GHz, 3.4GHz, 3.7GHz), memory frequency (values DDR4 2133MHz, DDR3 1866MHz), hard disk type (values SATA3, SSD).
Software features: CPU virtualization type (values full virtualization, paravirtualization), CPU scheduling algorithm (values credit, credit2).
Configuration features: number of virtual CPUs (values 1, 2, 3), virtual machine memory capacity (values 500MB, 1000MB, 2000MB), virtual hard disk type (values partition, file).
Runtime environment features: number of virtual machines running concurrently (values 0, 1, 2, 4), load type of the concurrently running virtual machines (values "no load", "compute-intensive", "data-intensive", "I/O-intensive", "mixed").
In this embodiment, to simplify the description, the above features are screened and each selected feature uses only a small value domain. The selected features are: X1, the CPU frequency (values 3.3, 3.4); X2, the virtual machine memory capacity (values 1000, 2000); X3, the number of virtual machines running concurrently (values 0, 1); and X4, the load type of the concurrently running virtual machines (0 means no load is running, 1 means compute-intensive, 2 means data-intensive). For ease of description, only one performance indicator is selected: the discretized running time of the benchmark program serves as performance indicator Y. The nodes are arranged in the order hardware features, software features, configuration features, runtime environment features, performance indicator, giving the node order X1, X2, X3, X4, Y. From common knowledge of the virtual machine field, the node pairs (X1,X2), (X1,X3), (X1,X4), (X2,X3) and (X2,X4) are judged to be unrelated, that is, no dependency exists between them, so their dependency flags are 0; the remaining feature pair (X3,X4) has dependency flag 1.
Virtual machines with different combinations of underlying environment features are configured, and a group of compute-intensive benchmarks is run on each, with total program running time as the indicator of computing performance. Clearly, the larger the feature-performance data sample set, the more accurate the performance measurement. For ease of description, the present embodiment uses 16 data samples. Table 1 shows the data sample set acquired in the present embodiment.
Sample  X1   X2    X3  X4  Y
D1      3.3  1000  0   0   65
D2      3.3  1000  1   1   70
D3      3.3  1000  1   2   65
D4      3.3  1000  1   2   70
D5      3.3  2000  0   0   60
D6      3.3  2000  1   1   65
D7      3.3  2000  1   1   60
D8      3.3  2000  1   2   65
D9      3.4  1000  0   0   60
D10     3.4  1000  1   1   65
D11     3.4  1000  1   2   65
D12     3.4  1000  1   2   65
D13     3.4  2000  0   0   55
D14     3.4  2000  1   1   60
D15     3.4  2000  1   1   55
D16     3.4  2000  1   2   60
Table 1
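The distinct feature combinations behind Table 1 can be enumerated with a short sketch. The variable names are hypothetical, and the validity rule (a load type other than 0 occurs exactly when one co-resident virtual machine is running) is an assumption read off the rows of Table 1 rather than a rule stated in the text.

```python
from itertools import product

# Value domains of the selected features in the embodiment.
cpu_frequency = [3.3, 3.4]       # X1
vm_memory = [1000, 2000]         # X2
co_resident_vms = [0, 1]         # X3
load_type = [0, 1, 2]            # X4: 0 none, 1 compute-intensive, 2 data-intensive

def valid(x3, x4):
    """Assumed constraint: load type is 0 exactly when no other VM is running."""
    return (x3 == 0) == (x4 == 0)

configs = [
    (x1, x2, x3, x4)
    for x1, x2, x3, x4 in product(cpu_frequency, vm_memory, co_resident_vms, load_type)
    if valid(x3, x4)
]
print(len(configs))
```

Under this assumption there are fewer distinct configurations than samples in Table 1, which is consistent with some configurations appearing more than once there with different running times.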
For node pairs (X_i, X_j) formed by two underlying environment features, the dependency flag is checked first. In this embodiment the only feature pair with dependency flag 1 is (X3,X4), so its mutual information is obtained from the data in Table 1 as I(X3,X4)=0.81.
The mutual information of the node pairs (X_i, Y_k) formed by an underlying environment feature and the performance indicator is then computed. Taking the pair (X1,Y) as an example, its mutual information I(X1,Y) is computed from the data in Table 1.
Similarly: I(X2,Y)=0.40, I(X3,Y)=0.12, I(X4,Y)=0.16.
In the present embodiment the threshold is set to ε=0.01. Node pairs whose mutual information is greater than ε are placed in list L and sorted in descending order of mutual information, giving L={(X3,X4), (X2,Y), (X1,Y), (X4,Y), (X3,Y)}.
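The ranking can be re-derived from Table 1 with a short sketch based on the identity I(A;B) = H(A) + H(B) - H(A,B). The helper names are assumptions, and base-2 logarithms are assumed; that choice reproduces the quoted 0.81, 0.40 and 0.12.

```python
from collections import Counter
from math import log2

def entropy(*columns):
    """Joint empirical entropy (bits) of one or more aligned columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum((c / n) * log2(c / n) for c in counts.values())

def mi(a, b):
    """Mutual information via I(A;B) = H(A) + H(B) - H(A,B)."""
    return entropy(a) + entropy(b) - entropy(a, b)

# Columns of Table 1 (samples D1..D16).
X1 = [3.3] * 8 + [3.4] * 8
X2 = ([1000] * 4 + [2000] * 4) * 2
X3 = [0, 1, 1, 1] * 4
X4 = [0, 1, 2, 2, 0, 1, 1, 2, 0, 1, 2, 2, 0, 1, 1, 2]
Y = [65, 70, 65, 70, 60, 65, 60, 65, 60, 65, 65, 65, 55, 60, 55, 60]

eps = 0.01
scores = {
    ("X3", "X4"): mi(X3, X4),   # the only feature pair with r_ij = 1
    ("X1", "Y"): mi(X1, Y),
    ("X2", "Y"): mi(X2, Y),
    ("X3", "Y"): mi(X3, Y),
    ("X4", "Y"): mi(X4, Y),
}
L = sorted((p for p, s in scores.items() if s > eps), key=scores.get, reverse=True)
print(L)
```

On the table as reproduced here, this sketch yields the same ordering of L as the text. Its value for I(X4,Y) comes out near 0.23 rather than the quoted 0.16, so that one figure should be treated with caution.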
Each node pair in list L is checked in order. For (X3,X4), the minimal cut set of X3 and X4 in the current G=(V, E) is C=∅, and the conditional mutual information is I(X3,X4|C)=0.81. Since I(X3,X4|C)>ε, the edge (X3,X4) is added to the edge set E. Continuing in the same way with the following pairs in L, (X2,Y), (X1,Y) and (X4,Y) are added to E in turn. When (X3,Y) is checked, the minimal cut set of X3 and Y in the current G=(V, E) is C={X4}, and the conditional mutual information is I(X3,Y|{X4})=0.07. Since I(X3,Y|{X4})>ε, (X3,Y) is added to E. All node pairs in L have now been checked, giving E={(X3,X4), (X2,Y), (X1,Y), (X4,Y), (X3,Y)}.
Then each edge in E is checked in order. For (X3,X4), no other path is found between X3 and X4, so checking continues with the next edge. When (X3,Y) is checked, another path is found between X3 and Y, so (X3,Y) is temporarily deleted from E to obtain E′. The minimal cut set of X3 and Y in G′=(V, E′) is C′={X4}, and the conditional mutual information is I(X3,Y|{X4})=0.07. Since I(X3,Y|{X4})>ε, (X3,Y) is retained in E. After all edges have been checked, the final edge set is E={(X3,X4), (X2,Y), (X1,Y), (X4,Y), (X3,Y)}. The current G=(V, E) is the Bayesian directed acyclic graph structure of the virtual machine features and the performance indicator.
The conditional probability table of each node is then computed by maximum likelihood estimation. Table 2 is the conditional probability table of the performance indicator in the present embodiment.
Table 2
The conditional probability tables of the other nodes are obtained in the same way. Fig. 3 shows the feature-performance Bayesian network obtained in the present embodiment. As shown in Fig. 3, the Bayesian network expresses the dependencies among the underlying environment features of the virtual machine and between those features and performance. The Bayesian network can then be used to measure virtual machine performance.
A virtual machine is configured on host 1, so its CPU frequency is 3.3GHz, i.e. X1=3.3; the virtual machine uses 2000MB of memory, i.e. X2=2000; and it runs alone on host 1, i.e. X3=0, X4=0. In the conditional probability table of node Y, the maximum conditional probability in P(Y | X1,X2,X3,X4) for X1=3.3, X2=2000, X3=0, X4=0 is looked up, giving P(Y=60 | X1=3.3, X2=2000, X3=0, X4=0)=1.0. With benchmark running time as the performance indicator, the measured performance of this virtual machine is therefore 60.
Similarly, a virtual machine is configured on host 2 with X1=3.4, X2=2000, X3=1, X4=1. The conditional probability table of node Y gives P(Y=55 | X1=3.4, X2=2000, X3=1, X4=1)=0.5 and P(Y=60 | X1=3.4, X2=2000, X3=1, X4=1)=0.5. With benchmark running time as the performance indicator, the measured performance of this virtual machine therefore fluctuates over {55, 60}, taking each value with probability 50%.
Although illustrative specific embodiments of the present invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of those embodiments. To those of ordinary skill in the art, various changes are obvious so long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all innovations that make use of the inventive concept are protected.

Claims (3)

1. A method for virtual machine underlying environment feature analysis and performance measurement based on a Bayesian network, characterized by comprising the following steps:
S1: according to the specific software and hardware configuration of the virtualization platform under evaluation, extract the underlying environment features that affect virtual machine performance from four aspects, namely hardware features, software features, configuration features and runtime environment features; denote the number of extracted underlying environment features by N; when it can be determined that no dependency exists between two underlying environment features X_i and X_j, set their dependency flag r_ij=0, otherwise set r_ij=1, where i=1,2,…,N, j=1,2,…,N, i≠j;
S2: determine, as required, the performance indicators Y_k to be measured, k=1,2,…,K, where K is the number of performance indicators;
S3: according to the values of each underlying environment feature from step S1, configure virtual machines with different combinations of underlying environment features on the virtualization platform under evaluation; select a group of benchmarks according to the main application type to be deployed in the virtualized environment under evaluation; run the benchmarks on each virtual machine and record the values of the corresponding performance indicators determined in step S2, thereby obtaining, for each performance indicator Y_k, a data sample set D_k consisting of the feature-performance data (x_1, x_2, …, x_N, y_k) of a series of virtual machines, where x_i is the value of underlying environment feature X_i and y_k is the value of performance indicator Y_k;
S4: for the combination of the underlying environment features and each performance indicator Y_k, establish the corresponding feature-performance Bayesian network from its data sample set D_k, and compute the conditional probability table of each node; the feature-performance Bayesian network is established as follows:
initialize an edgeless graph structure G=(V, E) with {X_1, X_2, …, X_N, Y_k} as node set V, that is, V={X_1, X_2, …, X_N, Y_k}, E=∅; initialize the node pair list L=∅;
for each node pair (X_i, X_j) formed by two underlying environment features, first check the dependency flag r_ij from step S1; if r_ij=0, perform no operation, otherwise compute its mutual information; compute the mutual information of each node pair (X_i, Y_k) formed by an underlying environment feature and the performance indicator; for each node pair whose mutual information has been computed, place it in the node pair list L if its mutual information is greater than the threshold ε, otherwise perform no operation; sort all node pairs in L in descending order of mutual information;
add edges and remove redundant edges according to the node pairs in the node pair list L to obtain the feature-performance Bayesian network, and compute the conditional probability table of each node;
S5: when a virtual machine is configured on the virtualization platform under evaluation, obtain the values of its underlying environment features from the configuration information; then, in the conditional probability table of the performance indicator node in the feature-performance Bayesian network obtained for each performance indicator in step S4, look up the maximum conditional probability of each performance indicator for those feature values; the performance indicator value corresponding to the maximum conditional probability is the measured performance of the virtual machine under that indicator; the possible values of the performance indicator listed for those feature values in the conditional probability table of the performance indicator node give the range over which the virtual machine's performance fluctuates, and the probability distribution over those values reflects the degree of the fluctuation.
2. The method for virtual machine underlying environment feature analysis and performance measurement according to claim 1, characterized in that, when the graph structure G=(V, E) is initialized in step S4, the nodes in V={X_1, X_2, …, X_N, Y_k} are arranged in the order hardware features, software features, configuration features, runtime environment features of the virtual machine, followed by the performance indicator.
3. The method for virtual machine underlying environment feature analysis and performance measurement according to claim 1, characterized in that the value range of the threshold ε in step S4 is 0.01≤ε≤0.05.
Application CN201610956901.7A, priority date 2016-10-28, filing date 2016-10-28: Method for virtual machine underlying environment feature analysis and performance measurement based on a Bayesian network. Status: Active. Publication: CN106484496B (en)


Publications (2)

Publication Number    Publication Date
CN106484496A          2017-03-08
CN106484496B          2019-08-20

Family ID: 58271759
Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610956901.7A Active CN106484496B (en) 2016-10-28 2016-10-28 Virtual machine BOTTOM LAYER ENVIRONMENT signature analysis and performance metric method based on Bayesian network

Country Status (1)

Country Link
CN (1) CN106484496B (en)



