CN109308225A - Virtual machine anomaly detection method, apparatus, device and storage medium - Google Patents
- Publication number
- CN109308225A CN201710627200.3A
- Authority
- CN
- China
- Prior art keywords
- gaussian
- virtual machine
- data
- residual error
- pivot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0712—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a virtual computing platform, e.g. logically partitioned systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2134—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on separation criteria, e.g. independent component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45591—Monitoring or debugging support
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/815—Virtual
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Quality & Reliability (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a virtual machine anomaly detection method, apparatus, device and storage medium, relating to the field of information and communication technology. The method comprises: obtaining non-Gaussian residual data of a virtual machine; and performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally. Embodiments of the invention use independent-component anomaly detection based on residual data, and the resulting detection results are more accurate and effective.
Description
Technical field
The present invention relates to the field of computer performance indicator monitoring and anomaly detection in Information and Communication Technologies (ICT), and in particular to a virtual machine anomaly detection method, apparatus, device and storage medium.
Background art
Cloud computing uses technologies such as virtualization to integrate existing hardware resources into a shared resource pool, so that business systems can obtain computing, storage and network resources on demand, which effectively solves the problems of traditional IT infrastructure. The virtual machine is the core component of a cloud platform: it provides computing and storage resources to business systems so that they run normally. However, as the types and number of business systems keep growing, cloud platforms keep expanding in scale and become increasingly complex, so virtual machines are prone to anomalies during operation. Virtual machine anomalies not only prevent business systems from running normally and cause losses that are hard to estimate, but also raise enterprise concerns about cloud computing and hinder its development and adoption. It is therefore necessary to introduce virtual machine anomaly detection technology that finds abnormal virtual machine behavior in time and reminds the administrator to take the necessary measures, so as to keep virtual machines running normally.
Since a virtual machine usually has multiple system resource monitoring indicators, multivariate statistical analysis, which has been widely studied in industry in recent years, can be applied to process monitoring and fault diagnosis. Traditional multivariate statistical monitoring methods mostly use Principal Component Analysis (PCA), which decomposes the data space into a principal component subspace and a residual subspace. Each group of measurement data can be projected onto these two subspaces, and two statistics are introduced, one in each subspace, to monitor the occurrence of faults: Hotelling's T² (which measures the amount of information contained in the principal component model) and the squared prediction error SPE (Squared Prediction Error, which measures the amount of information not described by the principal component model). It is generally considered that T² reflects systematic variation while SPE reflects non-systematic variation; in other words, the SPE based on the residual space better reflects abnormal features. The problem with PCA is that it is an analysis method based on the second-order statistics of a signal and generally has to assume that the process variables follow a Gaussian distribution. An abnormality alarm detection system using the PCA algorithm is shown in Fig. 1: the PCA algorithm service receives time series source data (i.e. time series data), processes it, and outputs the detected abnormal time points, which also serve as the input of the alarm service to generate abnormality alarms.
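The two PCA statistics described above can be illustrated with a minimal Python sketch. It assumes NumPy is available, standardizes each metric before the decomposition (a common choice that the text does not specify), and runs on synthetic data; the function name `pca_monitoring_stats` and the demo values are hypothetical.

```python
import numpy as np

def pca_monitoring_stats(X, n_components):
    """Return Hotelling's T^2 and SPE for every row (time point) of X."""
    # Standardize each monitored metric (assumes no constant column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    n = Z.shape[0]
    # SVD of the standardized data: rows of Vt are the loading vectors.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                    # loadings, shape (m, p)
    eigvals = s[:n_components] ** 2 / (n - 1)  # variance of each score column
    T = Z @ P                                  # scores in the principal subspace
    t2 = np.sum(T ** 2 / eigvals, axis=1)      # variation inside the PCA model
    residual = Z - T @ P.T                     # part the PCA model cannot describe
    spe = np.sum(residual ** 2, axis=1)
    return t2, spe

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))   # synthetic stand-in for 6 VM metrics
X_demo[150] += 8.0                   # inject one obvious outlier
t2, spe = pca_monitoring_stats(X_demo, n_components=3)
```

In a monitoring setting, each statistic would then be compared against a control limit estimated from normal operating data; how that limit is chosen is outside what the text describes.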
Another widely used method is Independent Component Analysis (ICA). Unlike PCA, it is an analysis method based on the higher-order statistics of a signal. Its purpose is to apply a linear decomposition to the observed data and, using the independence and non-Gaussianity of the source signals, resolve the data into statistically independent components. When ICA is applied to anomaly detection, two statistics are introduced, as with PCA, to monitor the occurrence of faults: I² (which measures the amount of information contained in the independent component model) and the squared prediction error SPE (Squared Prediction Error, which measures the amount of information that cannot be described by the independent component model). The problem with ICA is its premise that the independent components must follow non-Gaussian distributions; otherwise the mixing matrix cannot be determined. An abnormality alarm detection system using the ICA algorithm is shown in Fig. 2: the ICA algorithm service receives time series source data, processes it, and outputs the detected abnormal time points, which also serve as the input of the alarm service to generate abnormality alarms.
Because the service types and application behaviors carried on virtual machines vary widely, the data observed from a real system is often far from ideal and exhibits both Gaussian and non-Gaussian distribution characteristics. Using only the traditional PCA or ICA method may therefore cause false alarms and missed detections of faults. Judging from published patents and literature, some researchers have tried to use the ICA algorithm to separate Gaussian and non-Gaussian signals, but this does not actually overcome the premise of the ICA algorithm, and it lacks a good guiding criterion for dividing Gaussian from non-Gaussian signals. Other researchers take the correlation of the time series into account and use a sliding window to divide the data into local segments. Although the data within a window may then not form a complex distribution, the number of samples is greatly reduced, so the windows are in practice not suitable for statistical algorithms such as PCA and ICA.
Summary of the invention
The virtual machine anomaly detection method, apparatus, device and storage medium provided by embodiments of the present invention solve the problem that the prior art cannot accurately detect the time points at which a virtual machine behaves abnormally.
A virtual machine anomaly detection method provided according to an embodiment of the present invention comprises:
obtaining non-Gaussian residual data of a virtual machine;
performing independent component analysis on the non-Gaussian residual data, and determining the time points at which the virtual machine behaves abnormally.
A virtual machine anomaly detection apparatus provided according to an embodiment of the present invention comprises:
a residual obtaining module, configured to obtain non-Gaussian residual data of a virtual machine;
an anomaly determining module, configured to perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally.
A virtual machine anomaly detection device provided according to an embodiment of the present invention comprises:
a processor, configured to obtain non-Gaussian residual data of a virtual machine, perform independent component analysis on the non-Gaussian residual data, and determine the time points at which the virtual machine behaves abnormally;
a memory, configured to store the program executed by the processor.
A storage medium provided according to an embodiment of the present invention stores a program executable by a processor, the program causing the processor to execute the following steps:
obtaining non-Gaussian residual data of a virtual machine;
performing independent component analysis on the non-Gaussian residual data, and determining the time points at which the virtual machine behaves abnormally.
The technical solution provided by embodiments of the present invention has the following beneficial effects:
1. Embodiments of the invention extract non-Gaussian independent components by ICA in the PCA residual space, and the resulting detection results are more accurate and effective;
2. The processed PCA residual space retains non-Gaussian information to a certain degree, so abnormal information can be captured more fully.
Brief description of the drawings
Fig. 1 is a block diagram of an abnormality alarm detection system using the PCA algorithm;
Fig. 2 is a block diagram of an abnormality alarm detection system using the ICA algorithm;
Fig. 3 is a flowchart of the virtual machine anomaly detection method provided by an embodiment of the present invention;
Fig. 4 is a diagram of the virtual machine anomaly detection system provided by an embodiment of the present invention in actual operation;
Fig. 5 is a flowchart of the PCA algorithm service processing of Fig. 4;
Fig. 6 is a flowchart of the ICA algorithm service processing of Fig. 4;
Fig. 7 is a block diagram of the virtual machine anomaly detection apparatus provided by an embodiment of the present invention;
Fig. 8 shows one group of data processed by an embodiment of the present invention, containing data in 6 dimensions such as CPU, disk read/write, network I/O and memory; the left side is the training set and the right side is the test set;
Fig. 9 shows the result of processing the Fig. 8 data with the traditional PCA method; the left side is for the training set data and the right side is for the test set data;
Fig. 10 shows the result of processing the Fig. 8 data with the ICA algorithm based on the PCA residual; the left side is for the training set data and the right side is for the test set data;
Fig. 11 shows another group of data processed by an embodiment of the present invention, likewise containing data in 6 dimensions such as CPU, disk read/write, network I/O and memory; the left side is the training set and the right side is the test set;
Fig. 12 shows the result of processing the Fig. 11 data with the traditional PCA method; the left side is for the training set data and the right side is for the test set data;
Fig. 13 shows the result of processing the Fig. 11 data with the ICA algorithm based on the PCA residual; the left side is for the training set data and the right side is for the test set data.
Detailed description of the embodiments
A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiment described below is only intended to illustrate and explain the present invention and is not intended to limit it.
Embodiments of the present invention are suitable for detecting abnormal virtual machine behavior. In a specific application, independent component analysis is performed on the non-Gaussian residual data obtained by processing the time series data of the virtual machine, yielding the time points of abnormal virtual machine behavior.
Fig. 3 is a flowchart of the virtual machine anomaly detection method provided by an embodiment of the present invention. As shown in Fig. 3, the steps include:
Step S10: obtain the non-Gaussian residual data of the virtual machine.
Step S10 includes:
Step S101: perform principal component analysis on the time series data of the virtual machine to obtain the strongly Gaussian principal components of the time series data.
Specifically, principal component decomposition is performed on the time series data to obtain its principal components; strongly Gaussian components are then extracted from those principal components, and together they constitute the strongly Gaussian principal components of the time series data.
Extracting the strongly Gaussian components from the principal components of the time series data includes: calculating, for each component of the principal components, a statistic characterizing the strength of its Gaussianity (i.e. the JB value); calculating the sum of the statistics of all components; sorting the components in ascending order of the statistic and, for each component in the sorted sequence, calculating the cumulative sum of the statistics of that component and of all components ranked before it; calculating the Gaussian proportion from each cumulative sum and the sum over all components; and determining the strongly Gaussian components according to the Gaussian proportion.
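The selection rule just described can be sketched in plain Python as follows. The function name `select_strong_gaussian`, the demo JB values and the threshold of 0.2 are hypothetical; the sketch assumes all JB values are non-negative with a positive sum.

```python
def select_strong_gaussian(jb_values, threshold):
    """Indices of components whose cumulative JB share stays below threshold."""
    total = sum(jb_values)
    # Sort component indices by JB value, smallest (most Gaussian) first.
    order = sorted(range(len(jb_values)), key=lambda i: jb_values[i])
    selected, running = [], 0.0
    for i in order:
        running += jb_values[i]
        if running / total >= threshold:
            break  # this and all later components are left to the residual
        selected.append(i)
    return selected

jb_demo = [40.0, 1.0, 9.0, 50.0]   # hypothetical JB values for 4 components
kept = select_strong_gaussian(jb_demo, threshold=0.2)
```

With these demo values the cumulative shares are 0.01, 0.10, 0.50 and 1.00, so only the two components with the smallest JB values (indices 1 and 2) are kept as strongly Gaussian.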
Step S10 further includes:
Step S102: obtain the non-Gaussian residual data from the strongly Gaussian principal components and the time series data.
Specifically, data recovery is performed using the strongly Gaussian principal components to obtain strongly Gaussian recovered time series data; the non-Gaussian residual data is then obtained from the time series data and the recovered time series data.
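A minimal sketch of the recovery and residual calculation is shown below. It assumes NumPy, centered input data and a PCA loading matrix with orthonormal columns (obtained here from an SVD); the function name `gaussian_screened_residual` and the indices passed as `keep` are hypothetical.

```python
import numpy as np

def gaussian_screened_residual(X, P, keep):
    """Residual of X after removing the part spanned by the kept loadings.

    X: centered data (n x m); P: PCA loading matrix (m x p) with orthonormal
    columns; keep: indices of the strongly Gaussian components.
    """
    P_new = P[:, keep]             # loadings of the retained components
    T_new = X @ P_new              # strongly Gaussian scores
    X_recover = T_new @ P_new.T    # map the scores back to the original space
    return X - X_recover           # residual, same shape as X

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(50, 4))
X_demo -= X_demo.mean(axis=0)                       # center the synthetic data
_, _, Vt = np.linalg.svd(X_demo, full_matrices=False)
P_demo = Vt.T                                       # columns are loading vectors
X_res = gaussian_screened_residual(X_demo, P_demo, keep=[0, 2])
```

Because the loadings are orthonormal, the residual has no remaining projection onto the kept components, which is what leaves the less Gaussian structure for the later ICA step.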
Step S20: perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally, i.e. the abnormal time points of the time series data.
Step S20 includes:
Step S201: perform independent component analysis on the non-Gaussian residual data to obtain a statistic that measures the amount of information contained in the independent component model (i.e. I²) and a statistic that measures the amount of information that cannot be described by the independent component model (i.e. SPE).
Step S202: determine the time points at which the virtual machine behaves abnormally according to I² and SPE. Specifically, the abnormal time points extracted using I² are merged with the abnormal time points extracted using SPE, and the result is taken as the abnormal time points of the virtual machine.
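The merging in Step S202 can be sketched as a set union of the time indices flagged by each statistic. This is one reading of "merged"; the text does not say how the detection limits are chosen, so the limits, the demo values and the function name `abnormal_time_points` below are hypothetical.

```python
def abnormal_time_points(i2, spe, i2_limit, spe_limit):
    """Union of the time indices flagged by I^2 and by SPE."""
    by_i2 = {t for t, v in enumerate(i2) if v > i2_limit}
    by_spe = {t for t, v in enumerate(spe) if v > spe_limit}
    return sorted(by_i2 | by_spe)

i2_demo = [0.5, 0.7, 9.1, 0.4, 0.6]    # hypothetical I^2 values per time point
spe_demo = [0.1, 0.2, 0.1, 0.1, 4.3]   # hypothetical SPE values per time point
flagged = abnormal_time_points(i2_demo, spe_demo, i2_limit=3.0, spe_limit=1.0)
```

Here time point 2 is flagged only by I² and time point 4 only by SPE, so the merged result contains both.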
Those of ordinary skill in the art will appreciate that all or part of the steps of the method of the above embodiment can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, includes steps S10 to S20. Furthermore, the present invention can also provide a storage medium on which a computer program is stored, the program implementing at least the following steps when executed by a processor: obtaining non-Gaussian residual data of a virtual machine; and performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally. The storage medium may include ROM/RAM, a magnetic disk, an optical disc or a USB flash drive.
Fig. 4 shows the virtual machine anomaly detection system in actual operation. The time series data source first flows into the PCA algorithm service module as input, where the PCA residual data is extracted. The residual data then flows into the ICA algorithm service module, and the abnormal time points detected by the I² and SPE statistics flow into the alarm service module, which generates alarms. The processing flow of the PCA algorithm service module is shown in Fig. 5, and the processing flow of the ICA algorithm service module is shown in Fig. 6.
The invention is further described below with reference to Figs. 4 to 6.
Fig. 4 is a diagram of the virtual machine anomaly detection system provided by an embodiment of the present invention in actual operation. As shown in Fig. 4, the specific scheme is as follows:
Step 1: the PCA algorithm service in the system receives time series data (i.e. the raw data) from the data source as input.
Step 2: let the raw data be X ∈ R^{n×m}, where n is the number of samples and m is the number of variables (also called the dimension). Execute the PCA algorithm on X to obtain the principal components X_T ∈ R^{n×p}, where p is the number of principal components.
Step 3: further extract the more strongly Gaussian components from the principal components X_T. The specific procedure is as follows:
Step 3.1: calculate the value of the JB (Jarque-Bera) statistic for each component of the principal components. JB is defined as $JB = n\left(\frac{S^2}{6} + \frac{(K-3)^2}{24}\right)$,
where n is the number of sample points, S is the sample skewness and K is the sample kurtosis. The larger the JB value, the stronger the non-Gaussianity and the weaker the Gaussianity.
Step 3.2: sort the JB values of the components in ascending order to obtain a sequence, for example JB = [JB1, JB2, ..., JBp], and record the correspondence between each principal component and its value in the sequence, for example JB1 ↔ X_T[i], where X_T[i] denotes the i-th principal component of X_T, whose JB value is JB1.
Step 3.3: for the sorted JB sequence, calculate the ratio of each cumulative sum to the total: [JB1/sum(JB), (JB1+JB2)/sum(JB), ..., (JB1+...+JBp)/sum(JB)]. This gives a sequence of ratios in the range (0, 1]. Set a threshold on the Gaussian proportion, retain the values in the ratio sequence that are less than the threshold, and extract the principal components corresponding to those values to form the new principal components X_Tnew.
Step 4: restore the principal components X_Tnew to the original space to obtain X_Recover, and calculate the residual X_Res = X - X_Recover, where X_Res ∈ R^{n×m}, as the output of the PCA algorithm service.
Embodiments of the invention implement an improved PCA residual algorithm. Specifically, the PCA residual data is the residual calculated after the PCA principal components have been further screened by Gaussianity to form new principal components. This differs from the traditional PCA algorithm, which calculates the residual directly after extracting principal components by energy magnitude.
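The JB statistic that drives this Gaussianity screening (Step 3.1) can be computed directly from sample moments. The sketch below uses plain Python and the population (biased) moment estimators, which is one common convention that the text does not specify; the function name `jarque_bera` and the demo samples are hypothetical.

```python
def jarque_bera(values):
    """JB = n * (S^2 / 6 + (K - 3)^2 / 24) from sample moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2          # equals 3 for a Gaussian distribution
    return n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]   # zero skewness
skewed = [0.0, 0.0, 0.0, 0.0, 10.0]       # strongly right-skewed
jb_symmetric = jarque_bera(symmetric)
jb_skewed = jarque_bera(skewed)
```

The skewed sample yields the larger JB value, matching the statement that a larger JB value indicates stronger non-Gaussianity. Applied to each column of X_T, such values would feed the sorting in Step 3.2.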
Step 5: the ICA algorithm service in the system receives the output X_Res from the PCA algorithm service, executes the ICA algorithm on X_Res to decompose it into independent components, and calculates the I² and SPE statistics. Detection thresholds are set for the I² and SPE statistics, abnormal time points are extracted for each, and the anomaly detection results of I² and SPE are then merged as the output of the ICA algorithm service.
Regarding the input/output interfaces of the PCA and ICA algorithm services in embodiments of the invention: the PCA service does not output abnormal time points directly, but only outputs the PCA residual data. The input of the ICA algorithm service is likewise not the raw data but the PCA residual data, and the final detection result comes from the ICA processing of the PCA residual data.
Step 6: the alarm service in the system receives the output of the ICA algorithm service, i.e. the abnormal time points, and generates the corresponding alarms.
Fig. 5 is a flowchart of the PCA algorithm service processing of Fig. 4. As shown in Fig. 5, it includes: first executing the PCA algorithm on the raw data X ∈ R^{n×m} to extract the principal components X_T; then further extracting the more strongly Gaussian components from X_T to form the new principal components X_Tnew; and finally restoring X_Tnew to the original data space, calculating the residual X_Res ∈ R^{n×m} and outputting it.
Fig. 6 is a flowchart of the ICA algorithm service processing of Fig. 4. As shown in Fig. 6, it includes: first executing the ICA algorithm on the residual X_Res ∈ R^{n×m} to decompose it into independent components; then calculating the I² and SPE statistics and extracting anomalies for each; and finally merging the anomaly detection results of I² and SPE and outputting them.
In this embodiment, the residual space obtained by decomposing the raw data (i.e. the time series data) with PCA reflects abnormal features better than the principal component space, so embodiments of the invention use the PCA residual space as the basis for further analysis. Furthermore, considering the advantage of ICA in processing non-Gaussian source signals, the PCA residual is not taken directly from the traditional PCA algorithm. Instead, the PCA principal components are first further screened by Gaussianity and restored to the original data space, and the PCA residual is calculated afterwards. Independent components are then extracted by ICA in the PCA residual space, the I² and SPE statistics are calculated to detect anomalies, and the detection results are finally merged.
Fig. 7 is a block diagram of the virtual machine anomaly detection apparatus provided by an embodiment of the present invention. As shown in Fig. 7, it includes a residual obtaining module and an anomaly determining module.
The residual obtaining module is configured to obtain the non-Gaussian residual data of the virtual machine. It further includes a principal component calculation submodule and a residual calculation submodule. The principal component calculation submodule is configured to perform principal component analysis on the time series data of the virtual machine to obtain the strongly Gaussian principal components of the time series data; the residual calculation submodule is configured to obtain the non-Gaussian residual data from the strongly Gaussian principal components and the time series data.
The anomaly determining module is configured to perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally, i.e. the abnormal time points of the time series data.
The apparatus works as follows. The principal component calculation submodule performs principal component decomposition on the time series data to obtain its principal components, extracts the strongly Gaussian components from those principal components, and forms the strongly Gaussian principal components of the time series data from the extracted components. The residual calculation submodule uses the strongly Gaussian principal components to reconstruct the data, obtaining strongly Gaussian reconstructed time series data, and obtains the non-Gaussian residual data from the time series data and the reconstructed time series data. The anomaly determination module performs independent component analysis on the non-Gaussian residual data, obtains the I² and SPE statistics, and determines the abnormal time points of the time series data.
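The two statistics produced by the anomaly determination module can be sketched as follows. The patent text here does not give their formulas, so this sketch assumes the definitions commonly used in ICA-based monitoring: I² is the squared norm of the estimated independent components, and SPE is the squared norm of the part of the sample that the independent component model does not reconstruct. The function name and the `demixing` and `mixing` arguments are hypothetical.

```python
def ica_statistics(x, demixing, mixing):
    """Return (I2, SPE) for one residual sample x.

    demixing: d rows of length m, so that s_k = demixing[k] . x
    mixing:   m rows of length d, so that x_hat[j] = mixing[j] . s
    Both matrices are assumed to come from an ICA model fitted elsewhere.
    """
    # Estimated independent components for this sample.
    s = [sum(w * xi for w, xi in zip(row, x)) for row in demixing]
    # Part of the sample described by the independent component model.
    x_hat = [sum(a * sk for a, sk in zip(row, s)) for row in mixing]
    i2 = sum(sk * sk for sk in s)                            # inside the model
    spe = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))    # outside the model
    return i2, spe
```

A time point would then be reported as abnormal when either value exceeds its threshold.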
Specifically, the principal component calculation submodule computes the JB value of each component of the principal components of the time series data and the total of the JB values of all components, sorts the components in ascending order of JB value, and computes, for each component in the sorted sequence, the cumulative sum of the JB values of that component and of all components ranked before it. It then computes the Gaussian component ratio from this cumulative sum and the total of the JB values of all components, and determines the strongly Gaussian components according to the Gaussian component ratio.
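The JB value of a single component can be sketched as follows, assuming that "JB" denotes the Jarque-Bera normality statistic computed from the sample skewness and kurtosis (this section does not spell out the formula). The function name `jarque_bera` is hypothetical, and the sketch assumes the series is not constant, since a zero variance would cause a division by zero.

```python
def jarque_bera(values):
    """Jarque-Bera statistic of a series; smaller means closer to Gaussian.

    Uses biased central moments: JB = n/6 * (S^2 + (K - 3)^2 / 4),
    where S is the skewness and K is the kurtosis.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance (biased)
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
```

Sorting components by this value in ascending order places the most Gaussian-like components first, which is why the cumulative sum is taken from the smallest JB value upward.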
This embodiment provides a virtual machine anomaly detection device, comprising:
a processor, configured to obtain the non-Gaussian residual data of a virtual machine, perform independent component analysis on the non-Gaussian residual data, and determine the time points at which the virtual machine behaves abnormally; and
a memory, configured to store a program to be executed by the processor, the memory being couplable to the processor.
The improvement of the algorithm of the embodiment of the present invention over the traditional algorithms is evaluated as follows. The same training set and test set are used, where the test set is a period in which anomalies were relatively concentrated according to feedback from the data collection site. The same threshold criterion is applied to the detection statistics, and it is examined whether the algorithm of the embodiment can detect more abnormal data points within the known abnormal period.
Application example 1
The data collected in Fig. 8 cover the period 2016.10.1~2016.11.11. According to on-site feedback, the service experienced multiple anomalies between 18:00 on November 7 and 12:00 the next day. The period 2016.11.7 18:00~2016.11.8 12:00 is set as the test set, and the data remaining after this portion is removed are set as the training set.
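The split described above can be sketched as follows. This is a minimal sketch assuming the data are available as (timestamp, value) pairs and that the window boundaries are inclusive; neither detail is stated in the text, and `split_by_window`, `TEST_START`, and `TEST_END` are hypothetical names.

```python
from datetime import datetime


def split_by_window(samples, start, end):
    """Split (timestamp, value) pairs into training and test sets.

    Samples inside [start, end] form the test set; all others form the
    training set. Inclusive boundaries are an assumption.
    """
    test = [s for s in samples if start <= s[0] <= end]
    train = [s for s in samples if not (start <= s[0] <= end)]
    return train, test


# Test window of application example 1, as given in the text.
TEST_START = datetime(2016, 11, 7, 18, 0)
TEST_END = datetime(2016, 11, 8, 12, 0)
```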
The anomaly detection result of the traditional PCA algorithm is shown in Fig. 9. The PCA principal component energy ratio is set to 85%, the probability densities of the detection statistics T² and SPE are estimated by the kernel density method, and the value at a cumulative probability of 99.7% is taken as the threshold limit for extracting anomalies. The results show that, on the test set, PCA T² detects no anomaly, while PCA SPE detects anomalies over one period.
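The thresholding step described above can be sketched as follows. This is a minimal sketch, not the patent's implementation: it assumes a Gaussian kernel and a rule-of-thumb bandwidth, neither of which is specified in the text, and it assumes the training values are not all identical. The names `kde_cdf` and `kde_threshold` are hypothetical.

```python
import math


def kde_cdf(x, samples, bandwidth):
    """Cumulative distribution of a Gaussian-kernel density estimate at x."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def kde_threshold(samples, probability=0.997, bandwidth=None):
    """Value at which the estimated cumulative probability reaches `probability`."""
    n = len(samples)
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
        bandwidth = 1.06 * std * n ** (-0.2)   # rule-of-thumb choice (assumption)
    # The CDF is monotone, so bisection between two safe bounds finds the threshold.
    lo = min(samples) - 10.0 * bandwidth
    hi = max(samples) + 10.0 * bandwidth
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, bandwidth) < probability:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The threshold would be estimated from the statistic on the training set and then applied unchanged to the test set, so that both algorithms are compared under the same criterion.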
The anomaly detection result of the ICA algorithm based on the PCA residual is shown in Fig. 10. The PCA principal component energy ratio threshold is likewise set to 85%, which yields four principal components X_T[0], X_T[1], X_T[2], and X_T[3]. The JB values of the four components are computed and sorted in ascending order, and the ratio of the cumulative sum to the total is then computed, as shown in Table 1.
Table 1. Ratio of cumulative sum to total for application example 1
Principal component | JB | Cumulative sum / total |
---|---|---|
X_T[3] | 4.745843e+02 | 9.973862e-08 |
X_T[0] | 4.537954e+06 | 9.537958e-04 |
X_T[2] | 1.088366e+07 | 3.241106e-03 |
X_T[1] | 4.742859e+09 | 1.000000e+00 |
With the Gaussian component ratio threshold for the principal components set to 85%, the principal components actually extracted are X_T[0], X_T[2], and X_T[3], while X_T[1] is rejected because of its strong non-Gaussianity. The new principal component space formed by X_T[0], X_T[2], and X_T[3] is mapped back to the original data space, and the PCA residual is computed.
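The ratios in Table 1 and the resulting selection can be reproduced with a short sketch. The JB values below are copied from Table 1; the selection rule (keep a component while its cumulative ratio does not exceed the threshold) is an assumption that is consistent with the reported outcome, and the function names are hypothetical.

```python
def cumulative_gaussian_ratio(jb_by_component):
    """Sort components by ascending JB and return (name, cumulative sum / total)."""
    total = sum(jb_by_component.values())
    ranked = sorted(jb_by_component.items(), key=lambda item: item[1])
    running = 0.0
    ratios = []
    for name, jb in ranked:
        running += jb
        ratios.append((name, running / total))
    return ratios


def select_strongly_gaussian(jb_by_component, threshold=0.85):
    """Keep components whose cumulative ratio stays at or below the threshold.

    Assumed rule: the text gives the threshold and the outcome, not the
    exact comparison.
    """
    return [name for name, ratio in cumulative_gaussian_ratio(jb_by_component)
            if ratio <= threshold]


# JB values of the four principal components, taken from Table 1.
TABLE_1_JB = {
    "X_T[0]": 4.537954e+06,
    "X_T[1]": 4.742859e+09,
    "X_T[2]": 1.088366e+07,
    "X_T[3]": 4.745843e+02,
}
```

Under this rule, `select_strongly_gaussian(TABLE_1_JB)` keeps X_T[3], X_T[0], and X_T[2] and drops X_T[1], whose JB value alone accounts for almost the entire total.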
The detection statistics use the value at a cumulative probability of 99.7% as the threshold. The results show that, on the test set, ICA I² and SPE each detect anomalies over one period, and the I² result is fairly consistent with the period detected by PCA SPE.
Judging from the combined results, the method of the embodiment of the present invention detects more abnormal points than the traditional PCA method, and the original data show that system resources did vary considerably during the period missed by PCA.
Application example 2
The data collected in Fig. 11 cover the period 2017.1.1~2017.2.28. According to on-site feedback, the service experience was abnormal between 8:00 and 12:00 on February 25. The period 2017.2.25 8:00~2017.2.25 12:00 is set as the test set, and the data remaining after this portion is removed are set as the training set.
The anomaly detection result of the traditional PCA algorithm is shown in Fig. 12. The PCA principal component energy ratio threshold is set to 85%, the probability densities of the detection statistics T² and SPE are estimated by the kernel density method, and the value at a cumulative probability of 99.7% is taken as the threshold for extracting anomalies. The results show that, on the test set, neither PCA T² nor PCA SPE detects any anomaly, which is entirely inconsistent with the service experience.
The anomaly detection result of the ICA algorithm based on the PCA residual is shown in Fig. 13. The PCA principal component energy ratio is likewise set to 85%, which yields four principal components X_T[0], X_T[1], X_T[2], and X_T[3]. The JB values of the four components are computed and sorted in ascending order, and the ratio of the cumulative sum to the total is then computed, as shown in Table 2.
Table 2. Ratio of cumulative sum to total for application example 2
Principal component | JB | Cumulative sum / total |
---|---|---|
X_T[2] | 1.316693e+04 | 0.000001 |
X_T[3] | 3.613565e+04 | 0.000004 |
X_T[0] | 9.596462e+05 | 0.000088 |
X_T[1] | 1.152558e+10 | 1.000000 |
With the Gaussian component ratio threshold for the principal components set to 85%, the principal components actually extracted are X_T[0], X_T[2], and X_T[3], while X_T[1] is rejected because of its strong non-Gaussianity. The new principal component space formed by X_T[0], X_T[2], and X_T[3] is mapped back to the original data space, and the PCA residual is computed.
The detection statistics use the value at a cumulative probability of 99.7% as the threshold limit. The results show that, on the test set, ICA SPE detects a relatively dense set of abnormal periods.
Judging from the combined results, the method of the present invention detects more abnormal points than the traditional PCA method, and the original data show that system resources did fluctuate quite violently during the period covered by the test set.
In summary, the embodiment of the present invention is an improvement on the traditional PCA and ICA anomaly detection methods. Compared with the traditional methods, the embodiment of the present invention has the following technical effects:
1. The traditional PCA algorithm considers only the magnitude of energy when extracting principal components and does not take the data distribution into account. The algorithm of the embodiment of the present invention further screens the principal components extracted by traditional PCA according to Gaussianity, that is, it retains the more strongly Gaussian components among the PCA principal components as the actual PCA principal components.
2. The residual space obtained by the traditional PCA algorithm reflects only energy features. With the algorithm of the embodiment of the present invention, the non-Gaussianity of the resulting residual space is also enhanced. This brings two benefits. First, the PCA residual reflects non-systematic variation, in which anomalies are easier to detect than in the principal components. Second, anomalies are often bursty and infrequent, which are non-Gaussian characteristics, so enhanced non-Gaussianity indicates that the residual space captures anomalies more fully, and detecting anomalies in a more strongly non-Gaussian PCA residual space works better.
3. The traditional ICA algorithm is suited to processing non-Gaussian source signals. Therefore, compared with feeding in the original signal directly, the more strongly non-Gaussian PCA residual data obtained by the embodiment of the present invention are better suited to the ICA algorithm, and the resulting detection results are more accurate and effective.
Although the present invention has been described in detail above, it is not limited thereto, and those skilled in the art may make various modifications according to the principles of the present invention. Therefore, all modifications made according to the principles of the present invention should be understood as falling within the protection scope of the present invention.
Claims (10)
1. A virtual machine anomaly detection method, comprising:
obtaining non-Gaussian residual data of a virtual machine; and
performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally.
2. The method according to claim 1, wherein obtaining the non-Gaussian residual data of the virtual machine comprises:
performing principal component analysis on time series data of the virtual machine to obtain strongly Gaussian principal components of the time series data; and
obtaining the non-Gaussian residual data according to the strongly Gaussian principal components and the time series data.
3. The method according to claim 2, wherein performing principal component analysis on the time series data of the virtual machine to obtain the strongly Gaussian principal components of the time series data comprises:
performing principal component decomposition on the time series data to obtain the principal components of the time series data; and
extracting strongly Gaussian components from the principal components of the time series data, the strongly Gaussian components forming the strongly Gaussian principal components of the time series data.
4. The method according to claim 3, wherein extracting the strongly Gaussian components from the principal components of the time series data comprises:
calculating, for each component of the principal components of the time series data, a statistic characterizing the strength of its Gaussianity; and
determining the strongly Gaussian components among the principal components of the time series data according to the statistic of each component.
5. The method according to claim 4, wherein determining the strongly Gaussian components among the principal components of the time series data according to the statistic of each component comprises:
calculating the total of the statistics of all components;
sorting the components in ascending order of the statistic, and calculating, for each component in the sorted sequence, the cumulative sum of the statistics of that component and of the components ranked before it; and
calculating a Gaussian component ratio according to the cumulative sum of the statistics of each component and of the components ranked before it and the total of the statistics of all components, and determining the strongly Gaussian components according to the Gaussian component ratio.
6. The method according to claim 2, wherein obtaining the non-Gaussian residual data according to the strongly Gaussian principal components and the time series data comprises:
reconstructing the data using the strongly Gaussian principal components to obtain strongly Gaussian reconstructed time series data; and
obtaining the non-Gaussian residual data according to the time series data and the reconstructed time series data.
7. The method according to claim 1, wherein performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally comprises:
performing independent component analysis on the non-Gaussian residual data to obtain a statistic for measuring the amount of information contained in the independent component model and a statistic for measuring the amount of information that cannot be described by the independent component model; and
determining the time points at which the virtual machine behaves abnormally according to the statistic for measuring the amount of information contained in the independent component model and the statistic for measuring the amount of information that cannot be described by the independent component model.
8. A virtual machine anomaly detection apparatus, comprising:
a residual acquisition module, configured to obtain non-Gaussian residual data of a virtual machine; and
an anomaly determination module, configured to perform independent component analysis on the non-Gaussian residual data and determine the time points at which the virtual machine behaves abnormally.
9. A virtual machine anomaly detection device, comprising:
a processor, configured to obtain non-Gaussian residual data of a virtual machine, perform independent component analysis on the non-Gaussian residual data, and determine the time points at which the virtual machine behaves abnormally; and
a memory, configured to store a program to be executed by the processor.
10. A storage medium storing a program executable by a processor, the program causing the processor to perform the following steps:
obtaining non-Gaussian residual data of a virtual machine; and
performing independent component analysis on the non-Gaussian residual data to determine the time points at which the virtual machine behaves abnormally.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710627200.3A CN109308225B (en) | 2017-07-28 | 2017-07-28 | Virtual machine abnormality detection method, device, equipment and storage medium |
PCT/CN2017/106655 WO2019019429A1 (en) | 2017-07-28 | 2017-10-18 | Anomaly detection method, device and apparatus for virtual machine, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710627200.3A CN109308225B (en) | 2017-07-28 | 2017-07-28 | Virtual machine abnormality detection method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109308225A true CN109308225A (en) | 2019-02-05 |
CN109308225B CN109308225B (en) | 2024-04-16 |
Family
ID=65039486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710627200.3A Active CN109308225B (en) | 2017-07-28 | 2017-07-28 | Virtual machine abnormality detection method, device, equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109308225B (en) |
WO (1) | WO2019019429A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115147203A (en) * | 2022-06-08 | 2022-10-04 | 南京金威诚融科技开发有限公司 | Financial risk intelligent analysis method based on big data |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11060885B2 (en) | 2019-09-30 | 2021-07-13 | Oracle International Corporation | Univariate anomaly detection in a sensor network |
US11216247B2 (en) | 2020-03-02 | 2022-01-04 | Oracle International Corporation | Automatic asset anomaly detection in a multi-sensor network |
US11762956B2 (en) | 2021-02-05 | 2023-09-19 | Oracle International Corporation | Adaptive pattern recognition for a sensor network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101158693A (en) * | 2007-09-26 | 2008-04-09 | 东北大学 | Bulk production process malfunction detection method based on multiple nucleus independent elements analyse |
CN101403923A (en) * | 2008-10-31 | 2009-04-08 | 浙江大学 | Course monitoring method based on non-gauss component extraction and support vector description |
US20090216393A1 (en) * | 2008-02-27 | 2009-08-27 | James Schimert | Data-driven anomaly detection to anticipate flight deck effects |
CN104656635A (en) * | 2014-12-31 | 2015-05-27 | 重庆科技学院 | Abnormity detection and diagnosis method for non-gaussian dynamic high-sulfur natural gas purification process |
CN106483847A (en) * | 2016-09-20 | 2017-03-08 | 北京工业大学 | A kind of handpiece Water Chilling Units fault detection method based on self adaptation ICA |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8036325B2 (en) * | 2006-03-09 | 2011-10-11 | Interdigital Technology Corporation | Wireless communication method and apparatus for performing knowledge-based and blind interference cancellation |
CN106778533A (en) * | 2016-11-28 | 2017-05-31 | 国网上海市电力公司 | PCA KSICA energy-storage system typical condition recognition methods based on kernel function |
-
2017
- 2017-07-28 CN CN201710627200.3A patent/CN109308225B/en active Active
- 2017-10-18 WO PCT/CN2017/106655 patent/WO2019019429A1/en active Application Filing
Non-Patent Citations (3)
Title |
---|
景源: ""一种基于非高斯性测度的认知无线电频谱感知新方法"", 《小型微型计算机系统》 * |
田学民: ""一种基于KICA-GMM的过程故障检测方法"", 《化工学报》 * |
谭帅: ""多模态过程统计建模及在线监测方法研究"", 《中国博士学位论文全文数据库信息科技辑》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115147203A (en) * | 2022-06-08 | 2022-10-04 | 南京金威诚融科技开发有限公司 | Financial risk intelligent analysis method based on big data |
CN115147203B (en) * | 2022-06-08 | 2024-03-15 | 阿尔法时刻科技(深圳)有限公司 | Financial risk analysis method based on big data |
Also Published As
Publication number | Publication date |
---|---|
CN109308225B (en) | 2024-04-16 |
WO2019019429A1 (en) | 2019-01-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |