CN104063747A: Performance anomaly prediction method and system in a distributed system
Publication number: CN104063747A
Application number: CN201410294472.2A
Authority: CN (China)
Description
Technical field
The present invention relates to a method and system for detecting and predicting performance anomalies, and in particular to a performance anomaly prediction method and system for distributed systems.
Background art
In a distributed system the computers are independent of one another. They may be physically adjacent or geographically dispersed, and they are connected by a network or other means to form a whole. From a research perspective, distributed computing has the following characteristics: 1. resource sharing; 2. scalability; 3. fault tolerance; 4. concurrency.
To make better use of the data-processing power of distributed computing, monitoring the distributed computing environment becomes particularly important. The system must coordinate the running of tasks and allocate resources sensibly so that resources are fully used and the performance of the whole system improves. Normally the system uses a scheduler to manage these tasks. The scheduler collects information about the various resources in the system to determine whether each resource is available; the scheduling algorithm then sets task priorities and assigns available resources according to resource availability, task running time, and similar factors. As tasks run, however, the state of resources such as CPU load, free memory, and remaining disk space can change at any time. If it were possible to predict, before scheduling, whether a resource will still be usable during some future period, and to avoid using the resource during its abnormal periods, the scheduling result would be better. It is therefore important to monitor the resources in the system in real time and to detect the precursors of an anomaly before it occurs.
A system performance anomaly is the phenomenon in which, while software is running, resource exhaustion or the gradual accumulation of runtime errors causes the performance of the computer system to decline gradually until it reaches an intolerable level. It usually means that the system state (for example, CPU load or memory usage) can no longer sustain the existing applications. Most anomaly prediction models are based on regression techniques, which have specific limitations: such models are either applicable only to particular data or have large prediction errors. Existing classification-based anomaly prediction models still require historical data to be labeled manually, so their degree of automation is low, and they observe only the values of variables without fully considering the features of those variables, so their predictions carry a certain error.
Summary of the invention
The object of the present invention is to provide a performance anomaly prediction method and system for a distributed system that address two problems: the low degree of automation of performance prediction in distributed environments, and the fact that existing approaches observe only the values of variables without fully considering their features.
To address these problems, the present invention relates to a performance anomaly prediction method in a distributed system, comprising the following steps:
S1: extract target data values from the historical performance data obtained from several monitoring nodes of the monitoring system as the training data source, and calculate the feature values of each historical data pattern in the data source;
S2: from the feature values of each historical data pattern, obtain the prior probability distribution of each historical data pattern under each state, and count the probability distribution of the states, thereby training a Bayesian model of the states of the data patterns;
S3: calculate the feature values of the current data pattern from the real-time performance data obtained by the monitoring system;
S4: find, among the historical data patterns, the data pattern most similar to the current data pattern;
S5: using the output of S4, predict with the Bayesian model trained in S2 to obtain the probability distribution of the states;
S6: according to the result of S5, set a confidence factor and an abnormal threshold, and predict the abnormal state if the confidence factor exceeds the abnormal threshold.
Preferably, the feature values comprise the performance value change, the performance value change rate, and the performance value.
Preferably, in S2, the standard deviations of each feature value over all historical data patterns are sorted by magnitude and divided into several subspaces, and the prior probability of a particular state is calculated for the feature-value standard deviations corresponding to each subspace.
Preferably, in S2, the Bayesian model of each historical data pattern is trained from the feature values of that pattern, yielding the prior probability of each state for each pattern.
Preferably, S4 further comprises:
calculating the standard deviation of each feature value between the current data pattern and each historical normal pattern;
taking the historical data pattern whose sum of standard deviations with respect to the current data pattern is smallest as the similar pattern of the current data pattern.
Preferably, the states are the abnormal state, the alert state, and the normal state.
Preferably, S6 also comprises setting an alert threshold: if the confidence factor lies between the alert threshold and the abnormal threshold, the alert state is predicted; if the confidence factor is below the alert threshold, the normal state is predicted.
To address the above problems, the invention also relates to a performance anomaly prediction system in a distributed system, connected to the monitoring system of the distributed system and comprising:
a historical feature value computing module, which extracts target data values from the historical performance data obtained from several monitoring nodes of the monitoring system as the training data source, and calculates the feature values of each historical data pattern in the data source;
a prior probability module, connected to the output of the historical feature value computing module, which obtains the prior probability distribution of each historical data pattern under each state from the feature values of each historical data pattern and counts the probability distribution of the states, thereby training a Bayesian model of the states of the data patterns;
a real-time feature value computing module, which calculates the feature values of the current data pattern from the real-time performance data obtained from several monitoring nodes of the monitoring system;
a similar pattern module, connected to the output of the historical feature value computing module and the output of the real-time feature value computing module, which finds, among the historical data patterns, the data pattern most similar to the current data pattern;
a probability calculation module, which uses the output of the similar pattern module to predict with the Bayesian model trained in the prior probability module and obtains the probability distribution of the states; and
an anomaly alarm module, which sets a confidence factor and an abnormal threshold according to the result of the probability calculation module and predicts the abnormal state if the confidence factor exceeds the abnormal threshold.
Preferably, the feature values comprise the performance value change, the performance value change rate, and the performance value.
Preferably, the states comprise the abnormal state, the alert state, and the normal state.
By adopting the above technical scheme, the present invention has the following advantages and positive effects compared with the prior art:
1) The invention targets performance anomaly prediction in distributed systems. It analyzes the performance of distributed nodes through feature values and a division into data patterns, so the features of the variables are considered comprehensively and accuracy is higher.
2) The invention uses a machine learning method, a Bayesian model, to guide prediction. It detects performance anomalies in real time, evaluates each detected prediction with the previously trained Bayesian model, and provides a confidence level for the prediction. The degree of automation is high, and the reliability and practicality of the prediction are improved.
3) The invention discretizes the feature-value standard deviations of the historical data patterns into multiple subspaces and trains the Bayesian model with these subspaces as parameters, calculating the prior probability of a particular state for each subspace. This further improves the accuracy of anomaly prediction.
Brief description of the drawings
Fig. 1 is a flow chart of the performance anomaly prediction method in a distributed system according to the present invention;
Fig. 2 is a system block diagram of the performance anomaly prediction system in a distributed system according to the present invention.
Embodiment
The technical scheme in the embodiments of the present invention is described clearly and completely below with reference to the accompanying drawings. The embodiments described here are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative work fall within the protection scope of the present invention.
To aid understanding of the embodiments of the present invention, specific embodiments are further explained below with reference to the drawings. The embodiments do not limit the present invention.
Embodiment 1
Referring to Fig. 1, the invention provides a performance anomaly prediction method in a distributed system, which mainly comprises the following steps:
S1: extract target data values from the historical performance data obtained from several monitoring nodes of the monitoring system as the training data source, and calculate the feature values of each historical data pattern in the data source;
In this embodiment a data point is described by three feature values: the performance value change (Change Value, CV), the performance value change rate (Change Rate, CR), and the performance value (Value, V). The performance value is the value of the performance metric at a moment $t_i$.
The performance value change is the difference between the values of the performance metric at two moments, $t_i$ and $t_{i-1}$:
$$CV(t_i) = V(t_i) - V(t_{i-1})$$
where $V(t_i)$ is the value of the performance metric at moment $t_i$, $i = 0, 1, \ldots, n$;
$V(t_{i-1})$ is the value of the performance metric at moment $t_{i-1}$, $i = 1, \ldots, n$.
The performance value change rate is the relative change of the performance metric, equal to the performance value change divided by the performance value at the current moment $t_i$:
$$CR(t_i) = \frac{V(t_i) - V(t_{i-1})}{V(t_i)}$$
where $V(t_i)$ is the value of the performance metric at moment $t_i$, $i = 0, 1, \ldots, n$;
$V(t_{i-1})$ is the value of the performance metric at moment $t_{i-1}$, $i = 1, \ldots, n$.
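The three feature values can be illustrated with a minimal Python sketch. It assumes that CV is the difference between consecutive samples and that CR divides that difference by the current value, as the text states. The function name `extract_features` and the handling of a zero performance value are illustrative assumptions, not part of the patent.

```python
from typing import List, Tuple


def extract_features(values: List[float]) -> List[Tuple[float, float, float]]:
    """Return (CV, CR, V) for every sample that has a predecessor."""
    features = []
    for i in range(1, len(values)):
        v = values[i]
        cv = v - values[i - 1]  # performance value change
        # Assumption: CR is undefined when V(t_i) == 0; 0.0 is used as a placeholder.
        cr = cv / v if v != 0 else 0.0
        features.append((cv, cr, v))
    return features
```

For a CPU-load series such as `[50.0, 60.0, 45.0]`, the first feature tuple has CV = 10.0 and V = 60.0, and the second has CV = -15.0 and V = 45.0.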
S2: from the feature values of each historical data pattern, obtain the prior probability distribution of each historical data pattern under each state, and count the probability distribution of the states, thereby training a Bayesian model of the states of the data patterns;
Based on the feature results of S1, the historical data is divided into multiple patterns, and each pattern is labeled with one of three states: abnormal, alert, or normal. The prior probability distributions are then trained from the three states, and the probability distribution of each pattern over each state is counted, which trains the Bayesian model of the patterns. To further improve model accuracy, the features of the patterns are discretized into multiple subspaces, and these subspaces are used as the parameters for training the Bayesian model.
In S2, the Bayesian model of each historical data pattern can be trained from the feature values of that pattern, yielding the prior probability of each state for each pattern. The states can be the abnormal state, the alert state, and the normal state.
A classification model is built with a naive Bayes classifier. Naive Bayes classification requires the parameters to be mutually independent; the three parameters of the patterns obtained here are mutually independent, so the requirement is met.
Suppose the current moment is $t_i$. The feature values of all data in the period from $t_{i-L}$ to $t_i$ then form the current data pattern, where $L$ is the length of the current data pattern.
During training, a label indicating the state is added to each pattern in the training data, so a pattern can be expressed as $(V_{t_1}, V_{t_2}, \ldots, V_{t_n}, \mathit{Status})$. From the labeled training data set, the prior probability distribution (prior distribution) of all patterns under the three states can be obtained:
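$$P((SD_{CV}, SD_{CR}, SD_{V}) \mid \mathit{status})$$
where $\mathit{status}$ is the normal state (normal), the alert state (alert), or the abnormal state (abnormal).
This is the probability associated with state $\mathit{status}$ for a similar pattern whose three standard deviations are $SD_{CV}$, $SD_{CR}$, and $SD_{V}$. From the training data, the distribution $P(\mathit{status})$ of each state can also be obtained.
From these prior probabilities, the probability of a particular state given these standard deviation values can be calculated with Bayes' rule:
$$P(\mathit{status} \mid (SD_{CV}, SD_{CR}, SD_{V})) = \frac{P((SD_{CV}, SD_{CR}, SD_{V}) \mid \mathit{status})\, P(\mathit{status})}{P((SD_{CV}, SD_{CR}, SD_{V}))}$$
As noted above, the three parameters are mutually independent, so this can be expressed as:
$$P(\mathit{status} \mid (SD_{CV}, SD_{CR}, SD_{V})) = \frac{P(SD_{CV} \mid \mathit{status})\, P(SD_{CR} \mid \mathit{status})\, P(SD_{V} \mid \mathit{status})\, P(\mathit{status})}{P((SD_{CV}, SD_{CR}, SD_{V}))}$$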
To further improve model accuracy, the standard deviations of each feature value over all historical data patterns can also be sorted by magnitude and divided into several subspaces, and the prior probability of a particular state can be calculated for the feature-value standard deviations corresponding to each subspace. The particular state can be the abnormal state, the alert state, or the normal state.
The model space is divided into several subspaces, each containing all feature values that fall within one continuous range. This yields several discrete subspaces, which serve as the parameters of the naive Bayes classification. For example, suppose the full range of the change-rate standard deviation $SD_{CR}$ is $r = [a, b]$, where $a$ is its minimum value and $b$ is its maximum value. Dividing this range into $m$ subspaces gives each subspace the length:
$$\Delta r = \frac{b - a}{m}$$
Each subspace can then be expressed as:
$$S_{SD_{CR},1} = [a, a + \Delta r],\quad S_{SD_{CR},2} = [a + \Delta r, a + 2\Delta r],\quad \ldots,\quad S_{SD_{CR},m} = [b - \Delta r, b]$$
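The equal-width division can be sketched in Python as follows. The function name `subspace_index`, the zero-based indexing, and the clamping of values that fall outside $[a, b]$ into the first or last subspace are illustrative assumptions; the patent only specifies the subspace length $(b - a)/m$.

```python
def subspace_index(x: float, a: float, b: float, m: int) -> int:
    """Map a standard deviation x in [a, b] to one of m equal-width subspaces (0-based)."""
    if m <= 0 or b <= a:
        raise ValueError("need m > 0 and b > a")
    delta_r = (b - a) / m  # length of each subspace
    idx = int((x - a) / delta_r)
    # Assumption: values on or beyond the boundaries are clamped into the range.
    return min(max(idx, 0), m - 1)
```

With `a = 0.0`, `b = 1.0`, and `m = 4`, a value of 0.3 falls into subspace 1 and the maximum value 1.0 is placed in the last subspace, index 3.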
Each change-rate standard deviation only needs to be placed in the appropriate subspace. It is therefore unnecessary to calculate the prior probability of a particular state for every individual standard deviation; only the prior probability of that state for the corresponding subspaces needs to be calculated, where:
$S_{SD_{CV},i}$ is the subspace corresponding to the performance value change standard deviation $SD_{CV}$;
$S_{SD_{CR},j}$ is the subspace corresponding to the performance value change rate standard deviation $SD_{CR}$;
$S_{SD_{V},k}$ is the subspace corresponding to the performance value standard deviation $SD_{V}$;
$\mathit{status}$ is a particular state: normal, alert, or abnormal.
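A minimal Python sketch of naive Bayes training over subspace indices is given here for illustration. It assumes each labeled pattern has already been reduced to three subspace indices, one each for $SD_{CV}$, $SD_{CR}$, and $SD_{V}$. The function names `train` and `posterior` and the use of add-one (Laplace) smoothing are assumptions introduced for the example; the patent does not specify how zero counts are handled.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pattern = Tuple[int, int, int]  # subspace indices for (SD_CV, SD_CR, SD_V)
STATES = ("normal", "alert", "abnormal")


def train(samples: List[Tuple[Pattern, str]], m: int):
    """Count how often each state and each (state, feature, subspace) occurs."""
    state_counts = Counter(status for _, status in samples)
    feature_counts = {s: [Counter() for _ in range(3)] for s in STATES}
    for pattern, status in samples:
        for pos, subspace in enumerate(pattern):
            feature_counts[status][pos][subspace] += 1
    return state_counts, feature_counts, m


def posterior(model, pattern: Pattern) -> Dict[str, float]:
    """Return P(status | pattern) under the independence assumption."""
    state_counts, feature_counts, m = model
    total = sum(state_counts.values())
    scores = {}
    for s in STATES:
        # Assumption: add-one smoothing so unseen subspaces do not give zero probability.
        p = (state_counts[s] + 1) / (total + len(STATES))
        for pos, subspace in enumerate(pattern):
            p *= (feature_counts[s][pos][subspace] + 1) / (state_counts[s] + m)
        scores[s] = p
    norm = sum(scores.values())  # plays the role of the denominator in Bayes' rule
    return {s: scores[s] / norm for s in STATES}
```

Normalizing by the sum of the three scores avoids estimating the joint probability of the subspaces directly, which is a common way to apply Bayes' rule when only the relative sizes of the state probabilities matter.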
S3: calculate the feature values of the current data pattern from the real-time performance data obtained by the monitoring system. Suppose the current moment is $t_i$. The features of all data in the period from $t_{i-L}$ to $t_i$ then form the current data pattern, where $L$ is the length of the current data pattern.
S4: find, among the historical data patterns, the data pattern most similar to the current data pattern;
Specifically:
S41: calculate the standard deviation of each feature value between the current data pattern and each historical normal pattern;
The data at each moment $t_i$ has three features, namely $(CV(t_i), CR(t_i), V(t_i))$. Suppose the current moment is $t_i$. The features of all data in the period from $t_{i-L}$ to $t_i$ then form the pattern of the current performance metric, where $L$ is the length of the current data pattern.
As shown in Fig. 2, the current pattern is compared with the historical normal patterns to find the historical normal pattern most similar to the current data pattern. The standard deviation (Standard Deviation) of each feature between the current data pattern and each historical normal pattern is calculated. If a historical data pattern starts at moment $t_{j-L}$ and ends at $t_j$, the performance value change standard deviation between the current data pattern and this historical data pattern is denoted $SD_{CV}(t_j)$, the performance value change rate standard deviation is denoted $SD_{CR}(t_j)$, and the performance value standard deviation is denoted $SD_{V}(t_j)$. The current data pattern is compared one by one with the earlier historical data patterns.
S42: the historical data pattern for which the sum of all standard deviations with respect to the current data pattern is smallest is taken as the similar pattern of the current data pattern.
A pattern in the historical data is selected when it satisfies the following formula:
$$SD_{CV}(t_k) + SD_{CR}(t_k) + SD_{V}(t_k) = \mathrm{Min}\{SD_{CV}(t_j) + SD_{CR}(t_j) + SD_{V}(t_j)\}$$
where $\{SD_{CV}(t_j) + SD_{CR}(t_j) + SD_{V}(t_j)\}$ is the set formed by the sums of the feature standard deviations between the current data pattern and all historical data patterns;
$\mathrm{Min}$ is the minimum value in the set.
That is, when the sum of all standard deviations between the current data pattern and a historical data pattern is the smallest, that historical data pattern is called the similar pattern of the current data pattern. For each current data pattern, therefore, a most similar historical pattern can be found:
$$(SD_{CV}(t_k),\ SD_{CR}(t_k),\ SD_{V}(t_k))$$
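The search for the similar pattern can be sketched in Python as a sliding-window comparison. The text does not give the exact formula for the standard deviation between two patterns, so this sketch assumes the root-mean-square of the pointwise differences of each feature; it also assumes `history` holds only features from periods labeled normal. The names `feature_sds` and `most_similar` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Feature = Tuple[float, float, float]  # (CV, CR, V) at one moment


def feature_sds(current: Sequence[Feature],
                historical: Sequence[Feature]) -> Tuple[float, float, float]:
    """Assumed measure: per-feature root-mean-square difference of two equal-length patterns."""
    n = len(current)
    sds = []
    for k in range(3):
        squared = sum((c[k] - h[k]) ** 2 for c, h in zip(current, historical))
        sds.append(math.sqrt(squared / n))
    return sds[0], sds[1], sds[2]


def most_similar(current: List[Feature],
                 history: List[Feature]) -> Tuple[int, Tuple[float, float, float]]:
    """Return the start index and (SD_CV, SD_CR, SD_V) of the closest historical window."""
    length = len(current)
    if length == 0 or len(history) < length:
        raise ValueError("history must be at least as long as the current pattern")
    best_start, best_sds = 0, feature_sds(current, history[:length])
    for start in range(1, len(history) - length + 1):
        sds = feature_sds(current, history[start:start + length])
        if sum(sds) < sum(best_sds):  # minimize SD_CV + SD_CR + SD_V
            best_start, best_sds = start, sds
    return best_start, best_sds
```

The returned triple corresponds to the three standard deviations of the similar pattern, which are the inputs to the Bayesian model in the next step.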
S5: using the output of S4, predict with the Bayesian model trained in S2 to obtain the probability distribution of the states;
In this embodiment, based on the similar pattern $(SD_{CV}(t_k), SD_{CR}(t_k), SD_{V}(t_k))$ from S4, the Bayesian model trained in S2 guides the prediction, yielding the probability of each state for the pattern, $P(\mathit{status} \mid (SD_{CV}(t_k), SD_{CR}(t_k), SD_{V}(t_k)))$.
The state of the pattern is determined from these probabilities. Judging the state of the pattern accurately makes it possible to capture the precursors of an anomaly and thereby predict it.
S6: according to the result of S5, set a confidence factor and an abnormal threshold, and predict the abnormal state if the confidence factor exceeds the abnormal threshold.
This step also comprises setting an alert threshold: if the confidence factor lies between the alert threshold and the abnormal threshold, the alert state is predicted; if the confidence factor is below the alert threshold, the normal state is predicted. An alarm mechanism also needs to be established, and defensive measures are taken after an alarm according to the preset alarm mechanism.
In this embodiment, for the current pattern $(SD_{CV}, SD_{CR}, SD_{V})$, the probabilities of the three states are obtained by the above method:
$$P(\mathrm{normal} \mid (SD_{CV}, SD_{CR}, SD_{V}))$$
$$P(\mathrm{alert} \mid (SD_{CV}, SD_{CR}, SD_{V}))$$
$$P(\mathrm{abnormal} \mid (SD_{CV}, SD_{CR}, SD_{V}))$$
To determine which state the pattern is in, the three state probabilities are compared:
$$\delta_1 = \log P(\mathrm{alert} \mid (SD_{CV}, SD_{CR}, SD_{V})) - \log P(\mathrm{normal} \mid (SD_{CV}, SD_{CR}, SD_{V}))$$
$$\delta_2 = \log P(\mathrm{alert} \mid (SD_{CV}, SD_{CR}, SD_{V})) - \log P(\mathrm{abnormal} \mid (SD_{CV}, SD_{CR}, SD_{V}))$$
If the following condition holds, the current data pattern is judged to be in the alert state, and an anomaly may occur next:
$$\delta_1 \ge 0 \ \text{and}\ \delta_2 \ge 0$$
$\delta_1$ indicates which is larger: the likelihood that the current data pattern is in the alert state or the likelihood that it is in the normal state. $\delta_2$ indicates which is larger: the likelihood that the current data pattern is in the alert state or the likelihood that it is in the abnormal state. If the above condition holds, the current data pattern is more likely to be in the alert state than in either the normal or the abnormal state, and it can be judged that an anomaly is likely to occur next.
When a predicted-anomaly alarm is issued, if $\delta_1 \ge 0$ and $\delta_1$ is large, the pattern is significantly more likely to be in the alert state than in the normal state. Similarly, if $\delta_2 \ge 0$ and $\delta_2$ is large, the pattern is significantly more likely to be in the alert state than in the abnormal state. The larger the values of $|\delta_1|$ and $|\delta_2|$, the higher the confidence in the prediction, so $|\delta_1|$ and $|\delta_2|$ can serve as reference indicators of the credibility of an anomaly prediction. Each anomaly prediction is assigned a confidence factor (Confidence Factor, CF), calculated as follows:
$$CF = |\delta_1| + |\delta_2|$$
Clearly, the more likely the pattern is to be in the alert state, the larger the CF value, so CF is an effective measure of the confidence of an anomaly prediction. The CF value indicates how confident the prediction is, and the alert threshold is set according to this confidence: if the confidence factor lies between the alert threshold and the abnormal threshold, the alert state is predicted; if the confidence factor is below the alert threshold, the normal state is predicted. An alarm mechanism also needs to be established so that, in the alert and abnormal states, defensive measures are taken according to the preset alarm mechanism to prevent the anomaly from occurring or to reduce the loss it causes.
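The comparison of state probabilities and the confidence factor can be sketched in Python as follows. The sketch assumes CF is the sum of the absolute values of the two log differences and that all three probabilities are strictly positive. The function names, the natural logarithm, and the treatment of a CF exactly equal to a threshold are assumptions made for illustration.

```python
import math
from typing import Tuple


def confidence(p_normal: float, p_alert: float,
               p_abnormal: float) -> Tuple[float, float, bool, float]:
    """Return (delta1, delta2, is_alert, CF) for one pattern's state probabilities."""
    # Assumption: probabilities are > 0 so the logarithms are defined.
    delta1 = math.log(p_alert) - math.log(p_normal)
    delta2 = math.log(p_alert) - math.log(p_abnormal)
    is_alert = delta1 >= 0 and delta2 >= 0
    cf = abs(delta1) + abs(delta2)
    return delta1, delta2, is_alert, cf


def classify(cf: float, alert_threshold: float, abnormal_threshold: float) -> str:
    """Apply the two thresholds described in S6 to a confidence factor."""
    if cf > abnormal_threshold:
        return "abnormal"
    # Assumption: a CF equal to either threshold is treated as the alert state.
    if cf >= alert_threshold:
        return "alert"
    return "normal"
```

With probabilities of 0.2 (normal), 0.7 (alert), and 0.1 (abnormal), both differences are positive, so the pattern is judged to be in the alert state; the threshold values passed to `classify` would have to be chosen for the monitored system.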
Embodiment 2
Referring to Fig. 2, the invention provides a performance anomaly prediction system in a distributed system, connected to the monitoring system of the distributed system. It mainly comprises a historical feature value computing module, a prior probability module, a real-time feature value computing module, a similar pattern module, a probability calculation module, and an anomaly alarm module.
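The data flow between the six modules can be outlined with a short Python sketch. Each field stands in for one module and is a hypothetical placeholder callable; the class name `PredictionPipeline`, the field names, and the `run` method are assumptions for illustration and do not describe the actual implementation of the modules.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence


@dataclass
class PredictionPipeline:
    # One placeholder callable per module (all names are hypothetical).
    historical_features: Callable[[Sequence[float]], Any]
    train_prior: Callable[[Any], Any]
    realtime_features: Callable[[Sequence[float]], Any]
    find_similar: Callable[[Any, Any], Any]
    state_probabilities: Callable[[Any, Any], Dict[str, float]]
    raise_alarm: Callable[[Dict[str, float]], str]

    def run(self, history: Sequence[float], live: Sequence[float]) -> str:
        hist = self.historical_features(history)       # historical feature value module
        model = self.train_prior(hist)                  # prior probability module
        current = self.realtime_features(live)          # real-time feature value module
        similar = self.find_similar(current, hist)      # similar pattern module
        probs = self.state_probabilities(model, similar)  # probability calculation module
        return self.raise_alarm(probs)                  # anomaly alarm module
```

The order of calls mirrors the connections described for the modules: the similar pattern module consumes the outputs of both feature value modules, and the alarm module consumes the state probabilities.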
The historical feature value computing module extracts target data values from the historical performance data obtained from several monitoring nodes of the monitoring system as the training data source, and calculates the feature values of each historical data pattern in the data source;
In this embodiment a data point is described by three feature values: the performance value change (Change Value, CV), the performance value change rate (Change Rate, CR), and the performance value (Value, V). The performance value is the value of the performance metric at a moment $t_i$.
The performance value change is the difference between the values of the performance metric at two moments, $t_i$ and $t_{i-1}$:
$$CV(t_i) = V(t_i) - V(t_{i-1})$$
where $V(t_i)$ is the value of the performance metric at moment $t_i$, $i = 0, 1, \ldots, n$;
$V(t_{i-1})$ is the value of the performance metric at moment $t_{i-1}$, $i = 1, \ldots, n$.
The performance value change rate is the relative change of the performance metric, equal to the performance value change divided by the performance value at the current moment $t_i$:
$$CR(t_i) = \frac{V(t_i) - V(t_{i-1})}{V(t_i)}$$
where $V(t_i)$ is the value of the performance metric at moment $t_i$, $i = 0, 1, \ldots, n$;
$V(t_{i-1})$ is the value of the performance metric at moment $t_{i-1}$, $i = 1, \ldots, n$.
The prior probability module is connected to the output of the historical feature value computing module. From the feature values of each historical data pattern it obtains the prior probability distribution of each historical data pattern under each state and counts the probability distribution of the states, thereby training a Bayesian model of the states of the data patterns;
Based on the feature results output by the historical feature value computing module, the historical data is divided into multiple patterns, and each pattern is labeled with one of three states: abnormal, alert, or normal. The prior probability distributions are then trained from the three states, and the probability distribution of each pattern over each state is counted, which trains the Bayesian model of the patterns. To further improve model accuracy, the features of the patterns are discretized into multiple subspaces, and these subspaces are used as the parameters for training the Bayesian model.
In the prior probability module, the Bayesian model of each historical data pattern can be trained from the feature values of that pattern, yielding the prior probability of each state for each pattern. The states can be the abnormal state, the alert state, and the normal state.
A classification model is built with a naive Bayes classifier. Naive Bayes classification requires the parameters to be mutually independent; the three parameters of the patterns obtained here are mutually independent, so the requirement is met.
Suppose the current moment is $t_i$. The feature values of all data in the period from $t_{i-L}$ to $t_i$ then form the current data pattern, where $L$ is the length of the current data pattern.
During training, a label indicating the state is added to each pattern in the training data, so a pattern can be expressed as $(V_{t_1}, V_{t_2}, \ldots, V_{t_n}, \mathit{Status})$. From the labeled training data set, the prior probability distribution (prior distribution) of all patterns under the three states can be obtained:
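$$P((SD_{CV}, SD_{CR}, SD_{V}) \mid \mathit{status})$$
where $\mathit{status}$ is the normal state (normal), the alert state (alert), or the abnormal state (abnormal).
This is the probability associated with state $\mathit{status}$ for a similar pattern whose three standard deviations are $SD_{CV}$, $SD_{CR}$, and $SD_{V}$. From the training data, the distribution $P(\mathit{status})$ of each state can also be obtained.
From these prior probabilities, the probability of a particular state given these standard deviation values can be calculated with Bayes' rule:
$$P(\mathit{status} \mid (SD_{CV}, SD_{CR}, SD_{V})) = \frac{P((SD_{CV}, SD_{CR}, SD_{V}) \mid \mathit{status})\, P(\mathit{status})}{P((SD_{CV}, SD_{CR}, SD_{V}))}$$
As noted above, the three parameters are mutually independent, so this can be expressed as:
$$P(\mathit{status} \mid (SD_{CV}, SD_{CR}, SD_{V})) = \frac{P(SD_{CV} \mid \mathit{status})\, P(SD_{CR} \mid \mathit{status})\, P(SD_{V} \mid \mathit{status})\, P(\mathit{status})}{P((SD_{CV}, SD_{CR}, SD_{V}))}$$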
To further improve model accuracy, the standard deviations of each feature value over all historical data patterns can also be sorted by magnitude and divided into several subspaces, and the prior probability of a particular state can be calculated for the feature-value standard deviations corresponding to each subspace. The particular state can be the abnormal state, the alert state, or the normal state.
The model space is divided into several subspaces, each containing all feature values that fall within one continuous range. This yields several discrete subspaces, which serve as the parameters of the naive Bayes classification. For example, suppose the full range of the change-rate standard deviation $SD_{CR}$ is $r = [a, b]$, where $a$ is its minimum value and $b$ is its maximum value. Dividing this range into $m$ subspaces gives each subspace the length:
$$\Delta r = \frac{b - a}{m}$$
Each subspace can then be expressed as:
$$S_{SD_{CR},1} = [a, a + \Delta r],\quad S_{SD_{CR},2} = [a + \Delta r, a + 2\Delta r],\quad \ldots,\quad S_{SD_{CR},m} = [b - \Delta r, b]$$
Each change-rate standard deviation only needs to be placed in the appropriate subspace. It is therefore unnecessary to calculate the prior probability of a particular state for every individual standard deviation; only the prior probability of that state for the corresponding subspaces needs to be calculated, where:
$S_{SD_{CV},i}$ is the subspace corresponding to the performance value change standard deviation $SD_{CV}$;
$S_{SD_{CR},j}$ is the subspace corresponding to the performance value change rate standard deviation $SD_{CR}$;
$S_{SD_{V},k}$ is the subspace corresponding to the performance value standard deviation $SD_{V}$;
$\mathit{status}$ is a particular state: the normal state (normal), the alert state (alert), or the abnormal state (abnormal).
The real-time feature value computing module calculates the feature values of the current data pattern from the real-time performance data obtained from several monitoring nodes of the monitoring system. Suppose the current moment is $t_i$. The features of all data in the period from $t_{i-L}$ to $t_i$ then form the current data pattern, where $L$ is the length of the current data pattern.
The similar pattern module is connected to the real-time feature value computing module and the historical feature value computing module, and finds, among the historical data patterns, the data pattern most similar to the current data pattern;
Specifically, it comprises:
a historical data comparison module, connected to the real-time feature value computing module and the historical feature value computing module, which calculates the standard deviation of each feature value between the current data pattern and each historical normal pattern;
The data at each moment $t_i$ has three features, namely $(CV(t_i), CR(t_i), V(t_i))$. Suppose the current moment is $t_i$. The features of all data in the period from $t_{i-L}$ to $t_i$ then form the pattern of the current performance metric, where $L$ is the length of the current data pattern.
As shown in Fig. 2, the current pattern is compared with the historical normal patterns to find the historical normal pattern most similar to the current data pattern. The standard deviation (Standard Deviation) of each feature between the current data pattern and each historical normal pattern is calculated. If a historical data pattern starts at moment $t_{j-L}$ and ends at $t_j$, the performance value change standard deviation between the current data pattern and this historical data pattern is denoted $SD_{CV}(t_j)$, the performance value change rate standard deviation is denoted $SD_{CR}(t_j)$, and the performance value standard deviation is denoted $SD_{V}(t_j)$. The current data pattern is compared one by one with the earlier historical data patterns.
a minimum variance acquisition module, connected to the output of the historical data comparison module, which takes the historical data pattern for which the sum of all standard deviations with respect to the current data pattern is smallest as the similar pattern of the current data pattern.
A pattern in the historical data, ending at moment $t_k$, is selected when it satisfies the following formula:
$$SD_{CV}(t_k) + SD_{CR}(t_k) + SD_V(t_k) = \mathrm{Min}\{SD_{CV}(t_j) + SD_{CR}(t_j) + SD_V(t_j)\}$$
where $\{SD_{CV}(t_j) + SD_{CR}(t_j) + SD_V(t_j)\}$ is the set of feature standard deviation sums between the current data pattern and all historical data patterns, and Min is the minimum value in the set.
That is, when the sum of all standard deviations between the current data pattern and this historical data pattern is the smallest, this historical data pattern is called the similar pattern of the current data pattern. Therefore, for each current data pattern, the most similar pattern in the history can be found:
$(SD_{CV}(t_k), SD_{CR}(t_k), SD_V(t_k))$.
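The search for the most similar historical pattern can be sketched as below. This is an illustration under stated assumptions, not the patent's own implementation: the description does not give the formula for the standard deviation between two patterns, so the hypothetical helper `feature_sd` uses the root mean square of the pointwise differences, and the names `Pattern` and `most_similar` are likewise invented for the example.

```python
import math
from typing import List, Sequence, Tuple

# One (CV, CR, V) triple per time step.
Pattern = Sequence[Tuple[float, float, float]]


def feature_sd(current: Pattern, past: Pattern, k: int) -> float:
    """Deviation of feature k between two equal-length patterns.

    Assumption: root mean square of the pointwise differences.
    """
    diffs = [c[k] - p[k] for c, p in zip(current, past)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def most_similar(current: Pattern, history: List[Pattern]):
    """Return (index, (SD_CV, SD_CR, SD_V)) of the pattern with the smallest SD sum."""
    if not current or not history:
        raise ValueError("need a non-empty current pattern and history")
    best_idx, best_sds = -1, None
    for j, past in enumerate(history):
        if len(past) != len(current):
            raise ValueError("patterns must have the same length")
        sds = tuple(feature_sd(current, past, k) for k in range(3))
        if best_sds is None or sum(sds) < sum(best_sds):
            best_idx, best_sds = j, sds
    return best_idx, best_sds


# Illustrative data only.
current = [(1.0, 0.1, 10.0), (1.0, 0.1, 11.0)]
history = [
    [(5.0, 0.5, 20.0), (5.0, 0.5, 25.0)],
    [(1.0, 0.1, 10.0), (1.0, 0.1, 12.0)],
]
idx, sds = most_similar(current, history)
print(idx, sds)
```

Keeping the three deviations separate, rather than returning only their sum, matches the text: the triple for the best match is what the later probability step consumes.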
Probability calculation module: based on the output of the similar pattern module, makes a prediction using the Bayesian model trained in the prior probability module, and obtains the probability of each state.
In the present embodiment, the similar pattern obtained by the minimum variance acquisition module,
$(SD_{CV}(t_k), SD_{CR}(t_k), SD_V(t_k))$, is passed to the Bayesian model trained in the prior probability module, which guides the prediction and yields the probability of each state of the pattern.
The obtained probabilities are used to determine the state of the pattern. If the state of the pattern is judged accurately, the precursors of an anomaly can be captured, which makes anomaly prediction possible.
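To make the probability step concrete, here is a small sketch of how state probabilities could be produced from a triple of deviations. The form of the Bayesian model is not spelled out at this point in the text, so Gaussian naive Bayes is used purely as a stand-in; the functions `train` and `posterior`, the variance floor, and the training data are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Features = Tuple[float, float, float]  # (SD_CV, SD_CR, SD_V)


def train(examples: List[Tuple[Features, str]]) -> Dict[str, dict]:
    """Fit a prior plus per-feature mean and variance for each state."""
    model = {}
    for state in {s for _, s in examples}:
        rows = [f for f, s in examples if s == state]
        means = [sum(r[k] for r in rows) / len(rows) for k in range(3)]
        # A small floor keeps the variance positive when samples are few.
        varis = [
            max(sum((r[k] - means[k]) ** 2 for r in rows) / len(rows), 1e-6)
            for k in range(3)
        ]
        model[state] = {"prior": len(rows) / len(examples), "mean": means, "var": varis}
    return model


def posterior(model: Dict[str, dict], x: Features) -> Dict[str, float]:
    """Return the normalised probability of each state for the features x."""
    log_scores = {}
    for state, p in model.items():
        score = math.log(p["prior"])
        for k in range(3):
            var = p["var"][k]
            score += -0.5 * math.log(2 * math.pi * var) - (x[k] - p["mean"][k]) ** 2 / (2 * var)
        log_scores[state] = score
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp(v - top) for s, v in log_scores.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Illustrative labelled history only.
examples = [
    ((0.1, 0.1, 0.2), "normal"), ((0.2, 0.1, 0.1), "normal"),
    ((1.0, 0.9, 1.1), "alert"), ((1.1, 1.0, 0.9), "alert"),
    ((3.0, 2.8, 3.1), "abnormal"), ((2.9, 3.1, 3.0), "abnormal"),
]
probs = posterior(train(examples), (1.0, 1.0, 1.0))
print(max(probs, key=probs.get))
```

With so little training data the posterior is close to one-hot; a real deployment would need far more labelled patterns per state.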
Abnormal alarm module: sets a confidence factor and an abnormal threshold according to the output of the probability calculation module; if the confidence factor exceeds the abnormal threshold, the abnormal state is predicted.
Typically an alarm threshold is also set: if the confidence factor lies between the alarm threshold and the abnormal threshold, the alert state is predicted; if the confidence factor is below the alarm threshold, the normal state is predicted. An alarm mechanism also needs to be established, and defensive measures are taken after an alarm according to the preset alarm mechanism.
In the present embodiment, for the current pattern $(SD_{CV}, SD_{CR}, SD_V)$, the probabilities of the three corresponding states are obtained by the above method:
$P(\mathrm{normal}(SD_{CV}, SD_{CR}, SD_V))$
$P(\mathrm{alert}(SD_{CV}, SD_{CR}, SD_V))$
$P(\mathrm{abnormal}(SD_{CV}, SD_{CR}, SD_V))$
To determine which state the pattern is in, the three state probabilities are compared as follows:
$\delta_1 = \log P(\mathrm{alert}(SD_{CV}, SD_{CR}, SD_V)) - \log P(\mathrm{normal}(SD_{CV}, SD_{CR}, SD_V))$
$\delta_2 = \log P(\mathrm{alert}(SD_{CV}, SD_{CR}, SD_V)) - \log P(\mathrm{abnormal}(SD_{CV}, SD_{CR}, SD_V))$
If the following condition is satisfied, the current data pattern is judged to be in the alert state, and an anomaly may occur next:
$\delta_1 \geq 0$ and $\delta_2 \geq 0$
$\delta_1$ indicates which is larger: the likelihood that the current data pattern is in the alert state or the likelihood that it is in the normal state. $\delta_2$ indicates which is larger: the likelihood of the alert state or the likelihood of the abnormal state. If both $\delta_1 \geq 0$ and $\delta_2 \geq 0$ hold, the current data pattern is more likely to be in the alert state than in either the normal or the abnormal state, so an anomaly can be judged likely to occur next.
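The comparison can be sketched in a few lines of Python. The function names `deltas` and `is_alert` are hypothetical, and the small floor `eps` that guards against taking the logarithm of zero is an added assumption, not part of the described method.

```python
import math
from typing import Tuple


def deltas(p_normal: float, p_alert: float, p_abnormal: float,
           eps: float = 1e-12) -> Tuple[float, float]:
    """Return (delta_1, delta_2) as differences of log probabilities."""
    log_alert = math.log(max(p_alert, eps))
    delta_1 = log_alert - math.log(max(p_normal, eps))
    delta_2 = log_alert - math.log(max(p_abnormal, eps))
    return delta_1, delta_2


def is_alert(p_normal: float, p_alert: float, p_abnormal: float) -> bool:
    """True when the alert state is at least as likely as both other states."""
    delta_1, delta_2 = deltas(p_normal, p_alert, p_abnormal)
    return delta_1 >= 0 and delta_2 >= 0


# Illustrative probabilities only.
print(is_alert(0.2, 0.6, 0.2))  # alert is the most likely state
print(is_alert(0.7, 0.2, 0.1))  # normal is the most likely state
```

Working in log space means each delta is the logarithm of a probability ratio, so a value of zero corresponds to two equally likely states.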
When an anomaly-prediction alarm is issued, if $\delta_1 \geq 0$ and $\delta_1$ is large, the pattern is significantly more likely to be in the alert state than in the normal state. Similarly, if $\delta_2 \geq 0$ and $\delta_2$ is large, the pattern is significantly more likely to be in the alert state than in the abnormal state. In other words, the larger the values of $\delta_1$ and $\delta_2$, the more credible the prediction, so $\delta_1$ and $\delta_2$ can serve as reference indicators of the credibility of an anomaly prediction. Each anomaly prediction is assigned a confidence factor (CF), calculated as follows:
$$CF = \delta_1 + \delta_2$$
Clearly, the more likely the pattern is to be in the alert state, the larger the CF value, so CF is an effective measure of the credibility of an anomaly prediction. The CF value shows how credible the prediction is, and the alarm threshold is chosen according to this credibility. If the confidence factor lies between the alarm threshold and the abnormal threshold, the alert state is predicted; if it is below the alarm threshold, the normal state is predicted. An alarm mechanism also needs to be established so that, in the alert and abnormal states, the preset alarm mechanism triggers defensive measures that prevent the anomaly or reduce the loss it causes.
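A minimal sketch of the threshold logic follows. The threshold values are invented for the example, the function names are hypothetical, and treating a CF exactly equal to either threshold as the alert state is an assumption drawn from the wording ("exceeds", "is less than").

```python
def confidence_factor(delta_1: float, delta_2: float) -> float:
    """CF = delta_1 + delta_2, as given in the text."""
    return delta_1 + delta_2


def classify(cf: float, alarm_threshold: float, abnormal_threshold: float) -> str:
    """Map a confidence factor to 'normal', 'alert' or 'abnormal'."""
    if alarm_threshold > abnormal_threshold:
        raise ValueError("alarm threshold must not exceed abnormal threshold")
    if cf > abnormal_threshold:
        return "abnormal"  # CF exceeds the abnormal threshold
    if cf >= alarm_threshold:
        return "alert"     # CF lies between the two thresholds
    return "normal"        # CF is below the alarm threshold


# Hypothetical thresholds chosen only for illustration.
ALARM, ABNORMAL = 1.0, 3.0
cf = confidence_factor(0.8, 0.9)
print(cf, classify(cf, ALARM, ABNORMAL))
```

In practice the two thresholds would be tuned against observed false-alarm and missed-anomaly rates before any defensive measures are wired to them.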
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.
Claims (10)
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title
CN201410294472.2A  2014-06-26  2014-06-26  Performance abnormality prediction method in distributed system and system
Publications (1)
Publication Number  Publication Date
CN104063747A  2014-09-24
Family
ID=51551447
Family Applications (1)
Application Number  Title  Priority Date  Filing Date
CN201410294472.2A  Performance abnormality prediction method in distributed system and system  2014-06-26  2014-06-26
Country Status (1)
Country  Link
CN (1)  CN104063747A (en)
Citations (2)
Publication number  Priority date  Publication date  Assignee  Title
CN103324155A (en) *  2012-03-19  2013-09-25  通用电气航空系统有限公司  System monitoring
KR20140056952A (en) *  2012-11-02  2014-05-12  주식회사 세이프티아  Method and system for evaluating abnormality detection

2014-06-26  CN  CN201410294472.2A  patent/CN104063747A/en  not active (Application Discontinuation)
NonPatent Citations (1)
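Title
Qiu Yi (仇沂): "Performance anomaly prediction and monitoring in distributed environments" (分布式环境中的性能异常预测监控), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全书数据库 信息科技辑》) *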
Legal Events
Date  Code  Title  Description
C06  Publication
PB01  Publication
C10  Entry into substantive examination
SE01  Entry into force of request for substantive examination
RJ01  Rejection of invention patent application after publication
RJ01  Rejection of invention patent application after publication
Application publication date: 2014-09-24