CN109308225A - Virtual machine anomaly detection method, apparatus, device and storage medium - Google Patents

Virtual machine anomaly detection method, apparatus, device and storage medium

Info

Publication number
CN109308225A
Authority
CN
China
Prior art keywords
gaussian
virtual machine
data
residual error
pivot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710627200.3A
Other languages
Chinese (zh)
Other versions
CN109308225B (en
Inventor
陈力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongxing Software Co Ltd
Original Assignee
Shanghai Zhongxing Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhongxing Software Co Ltd filed Critical Shanghai Zhongxing Software Co Ltd
Priority to CN201710627200.3A priority Critical patent/CN109308225B/en
Priority to PCT/CN2017/106655 priority patent/WO2019019429A1/en
Publication of CN109308225A publication Critical patent/CN109308225A/en
Application granted granted Critical
Publication of CN109308225B publication Critical patent/CN109308225B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0712 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a virtual computing platform, e.g. logically partitioned systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2134 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on separation criteria, e.g. independent component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45591 Monitoring or debugging support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/815 Virtual

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a virtual machine anomaly detection method, apparatus, device and storage medium, relating to the field of information and communication technology. The method comprises: obtaining non-Gaussian residual data of a virtual machine; and performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally. Because the embodiments of the invention apply independent component anomaly detection to residual data, the detection results are more accurate and effective.

Description

Virtual machine anomaly detection method, apparatus, device and storage medium
Technical field
The present invention relates to the field of computer performance-metric monitoring and anomaly detection within information and communication technologies (ICT), and in particular to a virtual machine anomaly detection method, apparatus, device and storage medium.
Background art
Cloud computing uses technologies such as virtualization to consolidate existing hardware resources into a shared resource pool, so that business systems can obtain computing, storage and network resources on demand, which effectively solves the problems of traditional IT infrastructure. The virtual machine is the core component of a cloud platform; it provides computing and storage resources to business systems and so ensures that they run normally. However, as the types and number of business systems keep growing, cloud platforms keep expanding in scale and become increasingly complex, which makes virtual machines prone to anomalies at run time. A virtual machine anomaly not only prevents business systems from running normally and causes losses that are hard to estimate, but also makes enterprises wary of cloud computing, hindering its development and adoption. It is therefore necessary to introduce virtual machine anomaly detection technology that finds abnormal virtual machine behavior promptly and reminds administrators to take the necessary measures, so that virtual machines keep running normally.
Because a virtual machine usually has multiple system-resource monitoring metrics, the multivariate statistical analysis methods widely studied in industry in recent years can be applied to process monitoring and fault diagnosis. Traditional multivariate statistical monitoring mostly uses principal component analysis (PCA), which decomposes the data space into a principal component subspace and a residual subspace. Each measurement can be projected onto both subspaces, and two statistics are introduced, one per subspace, to monitor for faults: Hotelling's T² (which measures the amount of information contained in the principal component model) and the squared prediction error SPE (which measures the amount of information not described by the principal component model). T² is generally considered to reflect systematic variation and SPE non-systematic variation; in other words, the SPE, being based on the residual subspace, reflects abnormal characteristics better. The problem with PCA is that it analyzes only the second-order statistics of a signal and generally has to assume that the process variables follow a Gaussian distribution. An anomaly alarm detection system using the PCA algorithm is shown in Fig. 1: the PCA algorithm service receives time-series source data (i.e. time series data), processes it, and outputs the detected abnormal time points, which also serve as input to the alarm service to generate anomaly alarms.
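The two PCA statistics described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `pca_t2_spe`, the per-variable standardisation and the 85% energy ratio used to choose the number of components are assumptions made for the example.

```python
import numpy as np

def pca_t2_spe(X, energy=0.85):
    """Return (T2, SPE) per sample for a PCA model fitted on X (n samples x m variables)."""
    # Standardise each variable (assumed preprocessing, not stated in the text).
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma
    # Principal directions and variances from the SVD of the standardised data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s ** 2 / (len(Z) - 1)
    # Keep the smallest number of components whose variance share reaches `energy`.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    p = min(int(np.searchsorted(ratio, energy)) + 1, len(eigvals))
    P = Vt[:p].T                                   # loadings, m x p
    T = Z @ P                                      # scores, n x p
    t2 = np.sum(T ** 2 / eigvals[:p], axis=1)      # variation inside the model
    resid = Z - T @ P.T                            # part the model does not describe
    spe = np.sum(resid ** 2, axis=1)
    return t2, spe
```

A sample is then flagged when either statistic exceeds a control limit estimated from normal training data.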
Another widely used method is independent component analysis (ICA). Unlike PCA, it is based on the higher-order statistics of a signal: its purpose is to apply a linear decomposition to the observed data and, exploiting the independence and non-Gaussianity of the source signals, break it down into statistically independent components. When ICA is applied to anomaly detection, two statistics are introduced in the same way as for PCA to monitor for faults: I² (which measures the amount of information contained in the independent component model) and the squared prediction error SPE (which measures the amount of information that the independent component model cannot describe). The problem with ICA is its premise that the independent components must be non-Gaussian; otherwise the mixing matrix cannot be determined. An anomaly alarm detection system using the ICA algorithm is shown in Fig. 2: the ICA algorithm service receives time-series source data, processes it, and outputs the detected abnormal time points, which also serve as input to the alarm service to generate anomaly alarms.
Because the types of business carried on virtual machines and the behavior of applications vary widely, the data observed on a real system is often far from ideal and exhibits both Gaussian and non-Gaussian characteristics. Using only the traditional PCA or ICA method may therefore produce false alarms and missed detections. In published patents and literature, some researchers have tried to use the ICA algorithm to separate Gaussian from non-Gaussian signals, but this does not actually overcome the premise of the ICA algorithm, and it lacks a good guiding criterion for the separation. Other researchers have taken the temporal correlation of the series into account and used a sliding window to split the data into local segments. Although the data within a window may then no longer form a complicated distribution, the number of samples drops sharply, so such windows are in practice unsuitable for statistical algorithms such as PCA and ICA.
Summary of the invention
Embodiments of the present invention provide a virtual machine anomaly detection method, apparatus, device and storage medium, which solve the problem that the prior art cannot accurately detect the time points at which a virtual machine behaves abnormally.
A virtual machine anomaly detection method provided according to an embodiment of the present invention comprises:
obtaining non-Gaussian residual data of a virtual machine;
performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally.
A virtual machine anomaly detection apparatus provided according to an embodiment of the present invention comprises:
a residual obtaining module, configured to obtain non-Gaussian residual data of a virtual machine;
an anomaly determining module, configured to perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally.
A virtual machine anomaly detection device provided according to an embodiment of the present invention comprises:
a processor, configured to obtain non-Gaussian residual data of a virtual machine, perform independent component analysis on the non-Gaussian residual data, and determine the time points at which the virtual machine behaves abnormally;
a memory, configured to store the program executed by the processor.
A storage medium provided according to an embodiment of the present invention stores a processor-executable program that causes the processor to perform the following steps:
obtaining non-Gaussian residual data of a virtual machine;
performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
1. The embodiments extract non-Gaussian independent components by ICA in the PCA residual space, so the detection results are more accurate and effective.
2. The processed PCA residual space retains non-Gaussian information to a certain degree, so abnormal information is captured more fully.
Brief description of the drawings
Fig. 1 is a block diagram of an anomaly alarm detection system using the PCA algorithm;
Fig. 2 is a block diagram of an anomaly alarm detection system using the ICA algorithm;
Fig. 3 is a flow chart of the virtual machine anomaly detection method provided by an embodiment of the present invention;
Fig. 4 is a diagram of the virtual machine anomaly detection system provided by an embodiment of the present invention in actual operation;
Fig. 5 is the processing flow chart of the PCA algorithm service of Fig. 4;
Fig. 6 is the processing flow chart of the ICA algorithm service of Fig. 4;
Fig. 7 is a block diagram of the virtual machine anomaly detection apparatus provided by an embodiment of the present invention;
Fig. 8 shows one data set processed by an embodiment of the present invention, containing 6 dimensions of data such as CPU, disk read/write, network I/O and memory; the training set is on the left and the test set on the right;
Fig. 9 shows the results of processing the Fig. 8 data with the traditional PCA method, for the training set on the left and the test set on the right;
Fig. 10 shows the results of processing the Fig. 8 data with the ICA algorithm based on PCA residuals, for the training set on the left and the test set on the right;
Fig. 11 shows another data set processed by an embodiment of the present invention, likewise containing 6 dimensions of data such as CPU, disk read/write, network I/O and memory; the training set is on the left and the test set on the right;
Fig. 12 shows the results of processing the Fig. 11 data with the traditional PCA method, for the training set on the left and the test set on the right;
Fig. 13 shows the results of processing the Fig. 11 data with the ICA algorithm based on PCA residuals, for the training set on the left and the test set on the right.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments described below serve only to illustrate and explain the present invention and are not intended to limit it.
The embodiments of the present invention are suitable for detecting abnormal virtual machine behavior. In a concrete application, independent component analysis is performed on the non-Gaussian residual data obtained by processing the virtual machine's time series data, which yields the time points at which the virtual machine behaves abnormally.
Fig. 3 is a flow chart of the virtual machine anomaly detection method provided by an embodiment of the present invention. As shown in Fig. 3, the steps include:
Step S10: obtain non-Gaussian residual data of the virtual machine.
Step S10 includes:
Step S101: perform principal component analysis on the time series data of the virtual machine to obtain the strongly Gaussian principal components of the time series data.
Specifically, a principal component decomposition is applied to the time series data to obtain its principal components; the strongly Gaussian components are extracted from these principal components and together form the strongly Gaussian principal components of the time series data.
Extracting the strongly Gaussian components from the principal components of the time series data includes: computing, for each principal component, a statistic that characterizes how Gaussian it is (the JB value); computing the sum of the statistic over all components; sorting the components by statistic in ascending order and computing, for each component in the sorted sequence, the cumulative sum of the statistics of that component and all components ranked before it; computing the Gaussian-component ratio from each cumulative sum and the sum over all components; and determining the strongly Gaussian components from that ratio.
Step S10 further includes:
Step S102: obtain the non-Gaussian residual data from the strongly Gaussian principal components and the time series data.
Specifically, the strongly Gaussian principal components are used to reconstruct the data, giving strongly Gaussian reconstructed time series data; the non-Gaussian residual data is then obtained from the time series data and the reconstructed time series data.
Step S20: perform independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally, i.e. the abnormal time points of the time series data.
Step S20 includes:
Step S201: perform independent component analysis on the non-Gaussian residual data to obtain a statistic measuring the amount of information contained in the independent component model (I²) and a statistic measuring the amount of information that the independent component model cannot describe (SPE).
Step S202: determine the time points at which the virtual machine behaves abnormally from I² and SPE. Specifically, the abnormal time points extracted using I² and those extracted using SPE are merged and taken as the abnormal time points of the virtual machine.
A person of ordinary skill in the art will understand that the method of the above embodiment can be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs steps S10 to S20. Furthermore, the present invention may also provide a storage medium storing a computer program which, when executed by a processor, performs at least the following steps: obtaining non-Gaussian residual data of a virtual machine; and performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally. The storage medium may include ROM/RAM, a magnetic disk, an optical disc or a USB flash drive.
Fig. 4 illustrates the virtual machine system in actual operation. The time series data source first flows as input into the PCA algorithm service module, which extracts the PCA residual data. The residual data then flows into the ICA algorithm service module, which outputs the abnormal time points detected by the I² and SPE statistics; these flow into the alarm service module, which generates alarms. The processing flow of the PCA algorithm service module is shown in Fig. 5, and that of the ICA algorithm service module in Fig. 6.
The invention is further described below with reference to Fig. 4 to Fig. 6.
Fig. 4 shows the virtual machine anomaly detection system provided by an embodiment of the present invention in actual operation. The specific scheme is as follows:
Step 1: The PCA algorithm service in the system receives time series data (i.e. the raw data) from the data source as input.
Step 2: Let the raw data be X ∈ R^(n×m), where n is the number of samples and m is the number of variables (also called the dimension). Running the PCA algorithm on X gives the principal components X_T ∈ R^(n×p), where p is the number of principal components.
Step 3: The more strongly Gaussian components are further extracted from the principal components X_T, as follows:
Step 3.1: Compute the JB (Jarque-Bera) statistic for each principal component. JB is defined as: JB = n (S²/6 + (K - 3)²/24),
where n is the number of samples, S is the sample skewness and K is the sample kurtosis. The larger the JB value, the stronger the non-Gaussianity and the weaker the Gaussianity.
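The JB formula in Step 3.1 can be evaluated directly from the central moments of one component. The sketch below is illustrative only: the function name `jarque_bera` is hypothetical, and it assumes the biased (divide-by-n) sample moments, which the text does not specify.

```python
def jarque_bera(values):
    """JB = n * (S^2 / 6 + (K - 3)^2 / 24) for one principal component."""
    n = len(values)
    mean = sum(values) / n
    # Biased central moments (an assumption of this sketch).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5        # S
    kurtosis = m4 / m2 ** 2          # K (equals 3 for a Gaussian)
    return n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)
```

A component whose skewness is near 0 and whose kurtosis is near 3 gets a small JB value, which is why the ascending sort in the next step puts the most Gaussian components first.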
Step 3.2: Sort the JB values of the components in ascending order to obtain a sequence, for example JB = [JB1, JB2, ..., JBp], and record which principal component each sequence value corresponds to, for example JB1 ↔ X_T[i], where X_T[i] denotes the i-th principal component of X_T and JB1 is its JB value.
Step 3.3: For the sorted JB sequence, compute the ratio of each cumulative sum to the total: [JB1/sum(JB), (JB1+JB2)/sum(JB), ..., (JB1+...+JBp)/sum(JB)]. This gives a sequence of ratios with values in (0, 1]. Set a Gaussian-component ratio threshold, keep the ratios in the sequence that are below the threshold, and extract the principal components corresponding to those sequence values to form the new principal components X_Tnew.
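Steps 3.2 and 3.3 amount to a sort followed by a cumulative-ratio cut. A minimal sketch is given below; the function name `select_gaussian_components` is hypothetical, and the default threshold of 0.85 is borrowed from the application example rather than prescribed by the steps themselves.

```python
def select_gaussian_components(jb_values, threshold=0.85):
    """Return indices of the components whose cumulative JB share is below `threshold`."""
    # Step 3.2: component indices ordered by ascending JB value.
    order = sorted(range(len(jb_values)), key=lambda i: jb_values[i])
    total = sum(jb_values)
    selected, running = [], 0.0
    for i in order:
        # Step 3.3: cumulative sum divided by the total, compared with the threshold.
        running += jb_values[i]
        if running / total < threshold:
            selected.append(i)
    return selected
```

For example, `select_gaussian_components([5.0, 80.0, 3.0, 12.0])` gives cumulative ratios 0.03, 0.08, 0.20 and 1.00 in sorted order, so components 2, 0 and 3 are kept and component 1 is dropped.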
Step 4: Map the principal components X_Tnew back to the original space to obtain X_Recover, and compute the residual X_Res = X - X_Recover, where X_Res ∈ R^(n×m). X_Res is the output of the PCA algorithm service.
The embodiment thus implements an improved way of computing the PCA residual. Specifically, the PCA residual data is computed after the PCA principal components have been further screened by Gaussianity to form new principal components, which differs from the traditional PCA algorithm, where the residual is computed directly after principal components are extracted by energy.
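The reconstruction and residual of Step 4 can be sketched with NumPy as follows. The function name `gaussian_pca_residual` and the mean-centering without scaling are assumptions made for this illustration; `keep` stands for the indices of the principal components retained by the Gaussianity screening.

```python
import numpy as np

def gaussian_pca_residual(X, keep):
    """Residual X_Res of X after reconstructing from the principal components in `keep`."""
    mu = X.mean(axis=0)
    Xc = X - mu                           # centre the data (assumed preprocessing)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P_new = Vt[keep].T                    # loadings of the retained components, m x k
    X_Tnew = Xc @ P_new                   # new principal components, n x k
    X_Recover = X_Tnew @ P_new.T + mu     # mapped back to the original space
    return X - X_Recover                  # X_Res, n x m
```

Because the retained components need not be the leading ones, the residual can contain high-variance directions that a purely energy-based selection would have kept in the model.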
Step 5: The ICA algorithm service in the system receives the output X_Res from the PCA algorithm service, runs the ICA algorithm on X_Res to decompose it into independent components, and computes the I² and SPE statistics. A detection threshold is set for each of I² and SPE, abnormal time points are extracted for each, and the anomaly detection results of I² and SPE are then merged as the output of the ICA algorithm service.
Regarding the input/output interfaces of the PCA and ICA algorithm services in this embodiment: the PCA service does not output abnormal time points directly, but only the PCA residual data. Likewise, the input of the ICA algorithm service is not the raw data but the PCA residual data, and the final detection result comes from ICA processing of the PCA residual data.
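The statistics and the merge in Step 5 can be sketched as below. The text does not give formulas for I² and SPE, so this example assumes the definitions commonly used in ICA-based monitoring: I² is the squared norm of the extracted independent components and SPE is the squared norm of what the ICA model fails to reconstruct. The function name `ica_monitoring` is hypothetical, and the demixing matrix `W` and mixing matrix `A` are assumed to come from an ICA model fitted beforehand on the training residuals.

```python
import numpy as np

def ica_monitoring(X_res, W, A, i2_limit, spe_limit):
    """Return the sorted sample indices flagged by I^2 or by SPE.

    X_res: residual data (n x m); W: demixing matrix (d x m); A: mixing matrix (m x d).
    """
    S = X_res @ W.T                          # independent components, n x d
    i2 = np.sum(S ** 2, axis=1)              # I^2 per sample
    E = X_res - S @ A.T                      # part not described by the ICA model
    spe = np.sum(E ** 2, axis=1)             # SPE per sample
    by_i2 = np.flatnonzero(i2 > i2_limit)    # abnormal points found by I^2
    by_spe = np.flatnonzero(spe > spe_limit) # abnormal points found by SPE
    return np.union1d(by_i2, by_spe)         # merged detection result
```

Taking the union means a time point is reported if either statistic exceeds its threshold, which matches the merging of the two detection results described in Step 5.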
Step 6: The alarm service in the system receives the output of the ICA algorithm service, i.e. the abnormal time points, and generates the corresponding alarms.
Fig. 5 is the processing flow chart of the PCA algorithm service of Fig. 4. As shown in Fig. 5, the flow is: first run the PCA algorithm on the raw data X ∈ R^(n×m) to extract the principal components X_T; then further extract the more strongly Gaussian components from X_T to form the new principal components X_Tnew; finally map X_Tnew back to the original data space, compute the residual X_Res ∈ R^(n×m) and output it.
Fig. 6 is the processing flow chart of the ICA algorithm service of Fig. 4. As shown in Fig. 6, the flow is: first run the ICA algorithm on the residual X_Res ∈ R^(n×m) to decompose it into independent components; then compute the I² and SPE statistics and extract anomalies for each; finally merge the anomaly detection results of I² and SPE and output them.
In this embodiment, the residual space obtained by decomposing the raw data (i.e. the time series data) with PCA reflects abnormal characteristics better than the principal component space, so the embodiment takes the PCA residual space as the basis for further analysis. Furthermore, given ICA's strength in handling non-Gaussian source signals, the PCA residual is not taken directly from the traditional PCA algorithm. Instead, the PCA principal components are first further screened by Gaussianity and mapped back to the original data space before the PCA residual is computed. Independent components are then extracted by ICA in the PCA residual space, the I² and SPE statistics are computed to detect anomalies, and the detection results are finally merged.
Fig. 7 is a block diagram of the virtual machine anomaly detection apparatus provided by an embodiment of the present invention. As shown in Fig. 7, it includes a residual obtaining module and an anomaly determining module.
The residual obtaining module is configured to obtain the non-Gaussian residual data of the virtual machine. It further includes a principal component computation submodule and a residual computation submodule: the principal component computation submodule performs principal component analysis on the time series data of the virtual machine to obtain the strongly Gaussian principal components of the time series data; the residual computation submodule obtains the non-Gaussian residual data from the strongly Gaussian principal components and the time series data.
The anomaly determining module is configured to perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally, i.e. the abnormal time points of the time series data.
The apparatus works as follows. The principal component computation submodule applies a principal component decomposition to the time series data to obtain its principal components, extracts the strongly Gaussian components from them, and forms the strongly Gaussian principal components of the time series data from those components. The residual computation submodule uses the strongly Gaussian principal components to reconstruct the data, giving strongly Gaussian reconstructed time series data, and obtains the non-Gaussian residual data from the time series data and the reconstructed time series data. The anomaly determining module performs independent component analysis on the non-Gaussian residual data to obtain the I² and SPE statistics and determines the abnormal time points of the time series data.
Here, the principal component computation submodule computes the JB value of each principal component of the time series data and the sum of the JB values of all components, sorts the components by JB value in ascending order, computes for each component in the sorted sequence the cumulative sum of the JB values of that component and all components ranked before it, then computes the Gaussian-component ratio from each cumulative sum and the sum of all JB values, and determines the strongly Gaussian components from that ratio.
This embodiment provides a virtual machine anomaly detection device, comprising:
a processor, configured to obtain non-Gaussian residual data of a virtual machine, perform independent component analysis on the non-Gaussian residual data, and determine the time points at which the virtual machine behaves abnormally;
a memory, configured to store the program executed by the processor, which may be coupled to the processor.
The improvement of the algorithm of this embodiment over the traditional algorithm is evaluated by using the same training set and test set for both, where the test set is a period in which, according to feedback from the data collection site, anomalies were relatively concentrated. The same threshold criterion is applied to the detection statistics, and the evaluation examines whether the algorithm of this embodiment detects more abnormal data points within the known abnormal period.
Application example 1
The data collected in Fig. 8 cover the period 2016.10.1 to 2016.11.11. The site reported that the service was abnormal several times between 18:00 on November 7 and 12:00 the next day. The period 2016.11.7 18:00 to 2016.11.8 12:00 is set as the test set, and the remaining data, after removing this part, are set as the training set.
The abnormality detection result of the traditional PCA algorithm is shown in Fig. 9. The PCA principal-component energy ratio is set to 85%, the probability densities of the detection statistics T² and SPE are estimated by the kernel density method, and the threshold limit for extracting anomalies is taken at a cumulative probability distribution value of 99.7%. The results show that, in the test set, PCA T² detects no anomaly and PCA SPE detects an anomaly lasting for a period of time.
The abnormality detection result of the ICA algorithm based on the PCA residual is shown in Fig. 10. The PCA principal-component energy ratio threshold is likewise set to 85%, giving four principal components X_T[0], X_T[1], X_T[2] and X_T[3]. The JB values of the four principal components are calculated and sorted in ascending order, and the ratio of the cumulative sum to the total is then calculated, as shown in Table 1.
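The thresholding step described above (kernel density estimate of a detection statistic, threshold at a cumulative probability of 99.7%) can be sketched as follows. This is a minimal sketch, not the patent's implementation: the Gaussian kernel, the rule-of-thumb bandwidth and the function names `kde_cdf` and `kde_threshold` are assumptions introduced for illustration.

```python
import math
from statistics import stdev


def kde_cdf(x, samples, h):
    """Cumulative distribution of a Gaussian kernel density estimate at x."""
    root2 = math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / (h * root2)))
               for s in samples) / len(samples)


def kde_threshold(samples, prob=0.997, h=None):
    """Value at which the KDE cumulative probability reaches `prob`.

    samples: training-set values of a detection statistic (e.g. T2 or SPE).
    h: kernel bandwidth; if omitted, a normal-reference rule of thumb is
       used (an assumption, the source does not state the bandwidth).
    """
    n = len(samples)
    if h is None:
        h = 1.06 * stdev(samples) * n ** (-0.2)
    lo = min(samples) - 10.0 * h   # CDF is practically 0 here
    hi = max(samples) + 10.0 * h   # CDF is practically 1 here
    for _ in range(100):           # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < prob:
            lo = mid
        else:
            hi = mid
    return hi


# Synthetic stand-in for training-set statistic values.
train_stat = [float(i) for i in range(100)]
limit = kde_threshold(train_stat, prob=0.997, h=2.0)
# Test-set points whose statistic exceeds `limit` would be flagged as abnormal.
flagged = [v for v in [50.0, 99.0, 120.0] if v > limit]
```

Applying the same `prob` to both the traditional PCA statistics and the ICA statistics corresponds to the "same threshold criterion" used in the comparison.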
Table 1. Cumulative sum / total for application example 1
Principal component    JB value        Cumulative sum / total
X_T[3]                 4.745843e+02    9.973862e-08
X_T[0]                 4.537954e+06    9.537958e-04
X_T[2]                 1.088366e+07    3.241106e-03
X_T[1]                 4.742859e+09    1.000000e+00
With the Gaussian-component ratio threshold of the principal components set to 85%, the principal components actually extracted are X_T[0], X_T[2] and X_T[3], while X_T[1] is rejected because of its relatively strong non-Gaussianity. The new principal component space formed by X_T[0], X_T[2] and X_T[3] is mapped back to the original data space, and the PCA residual is calculated.
The threshold of the detection statistics is taken at a cumulative probability distribution value of 99.7%. The results show that, in the test set, ICA I² and ICA SPE each detect an anomaly lasting for a period of time, and the I² detection result is fairly consistent with the period detected by PCA SPE.
Taken together, the method of the embodiment of the present invention detects more abnormal points than the traditional PCA method, and the original data show that, in the period missed by PCA, the system resources did indeed change considerably.
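The residual computation described above can be illustrated with a short sketch. It assumes that the data have already been centered (and scaled if required), that the PCA loading vectors are orthonormal, and that reconstruction is the projection onto the retained loading vectors; the function name `pca_residual` and its arguments are hypothetical.

```python
def pca_residual(X, loadings, keep):
    """Residual of X after reconstruction from the retained components.

    X:        list of samples, each a list of m preprocessed values.
    loadings: list of orthonormal loading vectors, each of length m.
    keep:     indices of the retained (strongly Gaussian) components.
    """
    residual = []
    for row in X:
        recon = [0.0] * len(row)
        for k in keep:
            p = loadings[k]
            score = sum(a * b for a, b in zip(row, p))  # projection onto p
            for j, pj in enumerate(p):
                recon[j] += score * pj                  # map back to data space
        residual.append([a - b for a, b in zip(row, recon)])
    return residual


# Toy check with axis-aligned loadings: dropping component 1 leaves
# exactly the part of the sample that lies along that direction.
loadings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
res = pca_residual([[1.0, 2.0, 3.0]], loadings, keep=[0, 2])
print(res)  # [[0.0, 2.0, 0.0]]
```

Because the rejected component is excluded from the reconstruction, its contribution remains in the residual, which is how the residual acquires the stronger non-Gaussianity that the subsequent ICA step relies on.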
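For orientation, the two ICA statistics mentioned above can be sketched for a single sample. The definitions below follow a form commonly used in ICA-based monitoring (I² as the squared norm of the retained independent components, SPE as the squared reconstruction error) and are an assumption, since this passage only names the statistics. The demixing rows `W` and mixing matrix `A` are hypothetical inputs assumed to have been estimated beforehand from the training residuals.

```python
def ica_statistics(x, W, A):
    """Return (I2, SPE) for one residual sample x.

    x: list of m residual values for one time point.
    W: d demixing rows (each of length m) for the retained components.
    A: m rows (each of length d) of the corresponding mixing matrix.
    """
    # Retained independent components s = W x
    s = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in W]
    # I2: information captured by the independent component model
    i2 = sum(v * v for v in s)
    # Reconstruction x_hat = A s and the part the model cannot describe
    x_hat = [sum(a_jk * s_k for a_jk, s_k in zip(a_row, s)) for a_row in A]
    spe = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return i2, spe


# Toy check: two retained components aligned with the first two axes.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
i2, spe = ica_statistics([3.0, 4.0, 2.0], W, A)
print(i2, spe)  # 25.0 4.0
```

Each statistic would then be compared with its own threshold, and a time point flagged when either limit is exceeded.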
Application example 2
The data collected in Fig. 11 cover the period 2017.1.1 to 2017.2.28. The site reported that the service experience was abnormal between 8:00 and 12:00 on February 25. The period 2017.2.25 8:00 to 2017.2.25 12:00 is set as the test set, and the remaining data, after removing this part, are set as the training set.
The abnormality detection result of the traditional PCA algorithm is shown in Fig. 12. The PCA principal-component energy ratio threshold is set to 85%, the probability densities of the detection statistics T² and SPE are estimated by the kernel density method, and the threshold for extracting anomalies is taken at a cumulative probability distribution value of 99.7%. The results show that, in the test set, neither PCA T² nor PCA SPE detects any anomaly, which is completely inconsistent with the service experience.
The abnormality detection result of the ICA algorithm based on the PCA residual is shown in Fig. 13. The PCA principal-component energy ratio is likewise set to 85%, giving four principal components X_T[0], X_T[1], X_T[2] and X_T[3]. The JB values of the four principal components are calculated and sorted in ascending order, and the ratio of the cumulative sum to the total is then calculated, as shown in Table 2.
Table 2. Cumulative sum / total for application example 2
Principal component    JB value        Cumulative sum / total
X_T[2]                 1.316693e+04    0.000001
X_T[3]                 3.613565e+04    0.000004
X_T[0]                 9.596462e+05    0.000088
X_T[1]                 1.152558e+10    1.000000
With the Gaussian-component ratio threshold of the principal components set to 85%, the principal components actually extracted are X_T[0], X_T[2] and X_T[3], while X_T[1] is rejected because of its relatively strong non-Gaussianity. The new principal component space formed by X_T[0], X_T[2] and X_T[3] is mapped back to the original data space, and the PCA residual is calculated.
The threshold limit of the detection statistics is taken at a cumulative probability distribution value of 99.7%. The results show that, in the test set, ICA SPE detects a fairly dense abnormal period.
Taken together, the method of the present invention detects more abnormal points than the traditional PCA method, and the original data show that, in the period covered by the test set, the system resources did indeed fluctuate abnormally and rather violently.
In conclusion, the embodiment of the present invention is an improvement on the traditional PCA and ICA abnormality detection methods. Compared with the traditional methods, the embodiment of the present invention has the following technical effects:
1. When extracting principal components, the traditional PCA algorithm considers only the energy magnitude and does not take the data distribution into account. The algorithm of the embodiment further screens the principal components extracted by traditional PCA by Gaussianity, that is, it retains the more strongly Gaussian components among the PCA principal components as the actual PCA principal components.
2. The residual space obtained by the traditional PCA algorithm reflects only energy features. With the algorithm of the embodiment, the non-Gaussianity of the residual space is also enhanced, which brings two benefits. First, the PCA residual reflects non-systematic variation, so anomalies are easier to detect in it than in the principal components. Second, anomalies are typically bursty and few in number, which are non-Gaussian features, so the enhanced non-Gaussianity indicates that the residual space captures anomalies more fully, and detecting anomalies in a PCA residual space with stronger non-Gaussianity gives better results.
3. The traditional ICA algorithm is suited to processing non-Gaussian source signals. Therefore, compared with feeding in the original signal directly, the PCA residual data with stronger non-Gaussianity obtained by the embodiment is better suited to ICA processing, and the detection results obtained are more accurate and effective.
Although the present invention has been described in detail above, it is not limited thereto, and those skilled in the art may make various modifications in accordance with the principles of the present invention. Therefore, all modifications made in accordance with the principles of the present invention should be understood as falling within the protection scope of the present invention.

Claims (10)

1. A virtual machine abnormality detection method, comprising:
Obtaining non-Gaussian residual data of a virtual machine;
Performing independent component analysis on the non-Gaussian residual data, and determining the time points at which the virtual machine behaves abnormally.
2. The method according to claim 1, wherein obtaining the non-Gaussian residual data of the virtual machine comprises:
Performing principal component analysis on time-series data of the virtual machine to obtain strongly Gaussian principal components of the time-series data;
Obtaining the non-Gaussian residual data according to the strongly Gaussian principal components and the time-series data.
3. The method according to claim 2, wherein performing principal component analysis on the time-series data of the virtual machine to obtain the strongly Gaussian principal components of the time-series data comprises:
Performing principal component decomposition on the time-series data to obtain the principal components of the time-series data;
Extracting strongly Gaussian components from the principal components of the time-series data, and forming the strongly Gaussian principal components of the time-series data from the strongly Gaussian components.
4. The method according to claim 3, wherein extracting the strongly Gaussian components from the principal components of the time-series data comprises:
Calculating, for each component of the principal components of the time-series data, a statistic characterizing the strength of its Gaussianity;
Determining the strongly Gaussian components among the principal components of the time-series data according to the statistic of each component.
5. The method according to claim 4, wherein determining the strongly Gaussian components among the principal components of the time-series data according to the statistic of each component comprises:
Calculating the total of the statistics of all components;
Sorting the components in ascending order of the statistic, and calculating, for each component in the sorted sequence, the cumulative sum of the statistics of that component and of the components ranked before it;
Calculating a Gaussian-component ratio from the cumulative sum of the statistics of each component and the components ranked before it and from the total of the statistics of all components, and determining the strongly Gaussian components according to the Gaussian-component ratio.
6. The method according to claim 2, wherein obtaining the non-Gaussian residual data according to the strongly Gaussian principal components and the time-series data comprises:
Reconstructing the data using the strongly Gaussian principal components to obtain strongly Gaussian reconstructed time-series data;
Obtaining the non-Gaussian residual data according to the time-series data and the reconstructed time-series data.
7. The method according to claim 1, wherein performing independent component analysis on the non-Gaussian residual data and determining the time points at which the virtual machine behaves abnormally comprises:
Performing independent component analysis on the non-Gaussian residual data to obtain a statistic measuring the amount of information contained in the independent component model and a statistic measuring the amount of information that cannot be described by the independent component model;
Determining the time points at which the virtual machine behaves abnormally according to the statistic measuring the amount of information contained in the independent component model and the statistic measuring the amount of information that cannot be described by the independent component model.
8. A virtual machine abnormality detection device, comprising:
A residual obtaining module, configured to obtain non-Gaussian residual data of a virtual machine;
An abnormality determination module, configured to perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally.
9. Virtual machine abnormality detection equipment, comprising:
A processor, configured to obtain non-Gaussian residual data of a virtual machine, perform independent component analysis on the non-Gaussian residual data, and determine the time points at which the virtual machine behaves abnormally;
A memory, configured to store the program executed by the processor.
10. A storage medium storing a program executable by a processor, the program causing the processor to perform the following steps:
Obtaining non-Gaussian residual data of a virtual machine;
Performing independent component analysis on the non-Gaussian residual data, and determining the time points at which the virtual machine behaves abnormally.
CN201710627200.3A 2017-07-28 2017-07-28 Virtual machine abnormality detection method, device, equipment and storage medium Active CN109308225B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710627200.3A CN109308225B (en) 2017-07-28 2017-07-28 Virtual machine abnormality detection method, device, equipment and storage medium
PCT/CN2017/106655 WO2019019429A1 (en) 2017-07-28 2017-10-18 Anomaly detection method, device and apparatus for virtual machine, and storage medium

Publications (2)

Publication Number Publication Date
CN109308225A (en) 2019-02-05
CN109308225B (en) 2024-04-16

Family

ID=65039486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710627200.3A Active CN109308225B (en) 2017-07-28 2017-07-28 Virtual machine abnormality detection method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN109308225B (en)
WO (1) WO2019019429A1 (en)

Also Published As

Publication number Publication date
CN109308225B (en) 2024-04-16
WO2019019429A1 (en) 2019-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant