CN105871634B - Method and application for detecting cluster anomalies, and system for managing a cluster - Google Patents


Info

Publication number
CN105871634B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610380755.8A
Other languages
Chinese (zh)
Other versions
CN105871634A (en)
Inventor
吴海珊
阮松松
刘麒贇
傅乐琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruixiang Technology Co ltd
Original Assignee
Beijing Oneapm Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Oneapm Communication Technology Co Ltd
Priority to CN201610380755.8A
Publication of CN105871634A
Application granted
Publication of CN105871634B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements
    • H04L43/55 Testing of service level quality, e.g. simulating service usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a method and an application for detecting cluster anomalies, and a system for managing a cluster. The method for detecting cluster anomalies includes the following steps. A performance data point to be detected, which indicates cluster performance, is obtained. The performance data class with the highest similarity to the performance data point is determined. It is judged whether the similarity between the performance data point and the determined performance data class exceeds a similarity threshold. If the threshold is exceeded, the performance data point is aggregated into the determined performance data class, and it is calculated whether the ratio of the total number of data points in that class to the total number of data points in all current performance data classes exceeds an abnormal class threshold. If the abnormal class threshold is not exceeded, the distances between the performance data point and the center particle in each performance-indicator dimension are sorted, and it is calculated whether the ratio of the sum of a predetermined proportion of the largest distances to the sum of the distances in all dimensions is greater than a distance distribution threshold. If it is greater than the distance distribution threshold, the performance data point to be detected is determined to be an abnormal point.

Description

Method and application for detecting cluster anomalies, and system for managing a cluster
Technical field
The present invention relates to the Internet field, and in particular to a method and an application for detecting cluster anomalies, and a system for managing a cluster.
Background art
With the progress of Internet technology, clusters based on cloud computing architectures are increasingly applied in various fields. A cluster usually includes multiple computing devices (for example, application servers or database servers). A cluster can be configured to execute a distributed application, or to provide multiple similar computing services in a load-balanced manner. Clusters are highly scalable and usually have a large number of device nodes. To maintain cluster performance, it is very important to monitor that performance.
Given the large volume of cluster performance data, highly automated and highly accurate performance detection means are needed. At present, some published performance detection means (also called anomaly detection means) use machine learning to classify performance data and identify abnormal data. Machine learning for performance detection includes supervised and unsupervised learning on performance data. For example, performance data can be clustered and checked for anomalies using a clustering algorithm based on k-means. However, existing anomaly detection means remain inadequate in accuracy and other respects.
Therefore, the present invention proposes a new anomaly detection scheme.
Summary of the invention
To this end, the present invention provides a new anomaly detection scheme that effectively solves at least one of the above problems.
According to one aspect of the present invention, a method for detecting cluster anomalies is provided, including the following steps. A performance data point to be detected, which indicates cluster performance, is obtained. The performance data point includes normalized multidimensional performance indicators. From the existing performance data classes, which were generated by aggregating previously obtained performance data points, the performance data class with the highest similarity to the performance data point to be detected is determined. It is judged whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that class. If the current similarity threshold is exceeded, the performance data point to be detected is aggregated into the determined performance data class, and it is calculated whether the ratio of the total number of data points in that class after aggregation to the total number of data points in all current performance data classes exceeds an abnormal class threshold. If the abnormal class threshold is not exceeded, the distances between the performance data point to be detected and the center particle of the performance data class in each performance-indicator dimension are sorted, and it is calculated whether the ratio of the sum of a predetermined proportion of the largest distances to the sum of the distances in all dimensions is greater than a distance distribution threshold. If it is greater than the distance distribution threshold, the performance data point to be detected is determined to be an abnormal point.
According to another aspect of the present invention, an application for detecting cluster anomalies is provided, including a data acquisition unit, a similarity calculation unit, a first judging unit, an aggregation unit, a second judging unit and a third judging unit. The data acquisition unit is adapted to obtain a performance data point to be detected that indicates cluster performance. The performance data point includes normalized multidimensional performance indicators. The similarity calculation unit is adapted to determine, from the existing performance data classes generated by aggregating previously obtained performance data points, the performance data class with the highest similarity to the performance data point to be detected. The first judging unit is adapted to judge whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that class. The aggregation unit is adapted to aggregate the performance data point to be detected into the determined performance data class when the first judging unit determines that the current similarity threshold is exceeded. The second judging unit is adapted to calculate whether the ratio of the total number of data points in that class after aggregation to the total number of data points in all current performance data classes exceeds an abnormal class threshold. The third judging unit is adapted, when the abnormal class threshold is not exceeded, to sort the distances between the performance data point to be detected and the center particle of the performance data class in each performance-indicator dimension, and to calculate whether the ratio of the sum of a predetermined proportion of the largest distances to the sum of the distances in all dimensions is greater than a distance distribution threshold. When the ratio is greater than the distance distribution threshold, the third judging unit determines that the performance data point to be detected is an abnormal point.
Optionally, in the application for detecting cluster anomalies according to the present invention, the data acquisition unit further includes a receiving module and a normalization module. The receiving module is adapted to receive a performance data group, collected by a performance collector, that indicates cluster performance. The performance data group includes multidimensional performance indicators. The normalization module is adapted to normalize the performance data group into a performance data point. The multidimensional performance indicators include at least one of memory usage, CPU utilization, task throughput, task response time and garbage collection frequency in the cluster.
Optionally, in the application for detecting cluster anomalies according to the present invention, the similarity calculation unit is adapted to determine the performance data class with the highest similarity to the performance data point to be detected in the following manner. The distance between the performance data point to be detected and the center particle of each existing performance data class is calculated. From the distance to the center particle of each performance data class, the similarity between the performance data point to be detected and that performance data class is calculated. The performance data class with the highest similarity to the performance data point to be detected is then determined. The similarity calculation unit is adapted to calculate the distance between the performance data point to be detected and the center particle of each existing performance data class as the Euclidean distance between the two.
Optionally, in the application for detecting cluster anomalies according to the present invention, the similarity calculation unit is adapted to calculate the similarity between the performance data point to be detected and a performance data class according to the following formula:
Where d is the calculated distance between the performance data point to be detected and the center particle of the performance data class, and sim is the similarity to that performance data class.
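The nearest-class lookup described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and because the formula that maps the distance d to sim is not reproduced in this text, the sketch substitutes 1/(1+d) purely as a placeholder assumption.

```python
import math


def euclidean(pt, cr):
    """Euclidean distance between a data point and a center particle."""
    return math.sqrt(sum((n - c) ** 2 for n, c in zip(pt, cr)))


def similarity(d):
    # ASSUMPTION: placeholder mapping from distance to a similarity in (0, 1].
    # The patent's own distance-to-similarity formula is not shown in this text.
    return 1.0 / (1.0 + d)


def most_similar_class(pt, centers):
    """Return (index, similarity) of the class whose center particle is most similar to pt."""
    sims = [similarity(euclidean(pt, cr)) for cr in centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]
```

Any monotonically decreasing mapping from distance to similarity would select the same class here; only the comparison against the similarity threshold depends on the exact formula.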
Optionally, in the application for detecting cluster anomalies according to the present invention, the aggregation unit is further adapted to update, according to the following formulas, the center particle and the similarity threshold of the performance data class to which the performance data point to be detected has been added:
cr=(pt+cr*np)/(np+1)
Where cr is the center particle, np is the total number of data points in the class, pt is the added performance data point, sim is the similarity between pt and the performance data class, th is the abnormal class threshold, and lr is the learning-rate threshold used to adjust th.
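The center particle update can be sketched as a per-dimension running mean. The function name `update_center` and the parameter name `np_` (used to avoid clashing with the common NumPy alias) are hypothetical; the threshold update is left out because its formula is not reproduced in this text.

```python
def update_center(cr, np_, pt):
    """Return the center particle after adding pt to a class that held np_ points.

    Applies cr = (pt + cr * np) / (np + 1) independently in every dimension.
    """
    return [(p + c * np_) / (np_ + 1) for p, c in zip(pt, cr)]


# Example: a class with one point at (0.2, 0.4) receives the point (0.4, 0.8).
new_center = update_center([0.2, 0.4], 1, [0.4, 0.8])
```

Because the update only needs the previous center and the point count, the class never has to store its individual data points, which suits incremental clustering.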
Optionally, in the application for detecting cluster anomalies according to the present invention, the third judging unit is adapted to perform, in the following manner, the sorting of the distances between the performance data point to be detected and the center particle of the performance data class in each performance-indicator dimension, and the calculation of whether the ratio of the sum of the predetermined proportion of the largest distances to the sum of the distances in all dimensions is greater than the distance distribution threshold:
$pt = \{n_1, \dots, n_N\}$, $cr = \{c_1, \dots, c_N\}$, $d_i = |n_i - c_i|$, where $n_i$ is the i-th dimension performance indicator of the performance data point pt to be detected, $c_i$ is the i-th dimension value of the center particle cr, and $d_i$ is the distance between the i-th dimension of pt and the i-th dimension of cr.
The $d_i$ of all dimensions are sorted, and $pr = \sum_{j=1}^{M} d_{(j)} \big/ \sum_{i=1}^{N} d_i$ is calculated, where N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, $d_{(j)}$ is the j-th largest of the N distances, $\sum_{j=1}^{M} d_{(j)}$ is the sum of the M largest of the N distances, and $\sum_{i=1}^{N} d_i$ is the sum of all N distances.
It is judged whether pr is greater than the distance distribution threshold.
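The ratio pr can be sketched as below. The function name is hypothetical, and two details are assumptions because the text does not specify them: M is taken as the predetermined proportion of N rounded up (with a minimum of 1), and a point that coincides with the center particle is given a ratio of 0.

```python
import math


def distance_ratio(pt, cr, proportion=0.3):
    """Share of the total per-dimension distance carried by the M largest distances."""
    d = sorted((abs(n - c) for n, c in zip(pt, cr)), reverse=True)
    total = sum(d)
    if total == 0:
        # ASSUMPTION: a point identical to the center particle yields ratio 0.
        return 0.0
    # ASSUMPTION: M is the predetermined proportion of N, rounded up, at least 1.
    m = max(1, math.ceil(proportion * len(d)))
    return sum(d[:m]) / total
```

A point whose deviation is concentrated in a few dimensions gives a ratio near 1, while a point that deviates evenly in every dimension gives a ratio near M/N. Comparing pr with the distance distribution threshold therefore separates the two cases.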
Optionally, the application for detecting cluster anomalies according to the present invention further includes a window judging unit, adapted to add the performance data point to be detected to a sliding window. The sliding window holds a predetermined number of the most recently obtained performance data points. When the third judging unit determines that the performance data point to be detected is an abnormal point, the window judging unit judges whether the proportion of abnormal points in the sliding window exceeds a window threshold.
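A sliding window of this kind can be sketched with a bounded deque. The class name `AnomalyWindow` is hypothetical, and the sketch assumes that the proportion is computed over the points currently held, even before the window has filled up.

```python
from collections import deque


class AnomalyWindow:
    """Keeps the abnormal/normal flags of the most recent `size` data points."""

    def __init__(self, size, window_threshold):
        self.flags = deque(maxlen=size)  # the oldest flag is dropped automatically
        self.window_threshold = window_threshold

    def add(self, is_abnormal):
        """Record one point and report whether the abnormal share exceeds the threshold."""
        self.flags.append(bool(is_abnormal))
        return sum(self.flags) / len(self.flags) > self.window_threshold


window = AnomalyWindow(size=4, window_threshold=0.5)
results = [window.add(flag) for flag in [False, True, True, False, False, False]]
```

Requiring a sustained share of abnormal points, rather than alarming on a single one, is what lets the window suppress isolated outliers.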
Optionally, the application for detecting cluster anomalies according to the present invention further includes an alarm unit. The alarm unit is adapted, when the window judging unit determines that the window threshold is exceeded, to determine the abnormal performance indicators in the performance data point to be detected according to the distance in each performance-indicator dimension.
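One way to pick out the abnormal indicators is to rank the dimensions by their distance from the center particle. The text does not say how many dimensions are reported or by what rule, so the `top_k` cut-off, the function name and the indicator names below are all illustrative assumptions.

```python
def abnormal_indicators(pt, cr, names, top_k=1):
    """Return the names of the top_k indicators farthest from the center particle."""
    d = [abs(n - c) for n, c in zip(pt, cr)]
    order = sorted(range(len(d)), key=lambda i: d[i], reverse=True)
    return [names[i] for i in order[:top_k]]


# Hypothetical indicator names, for illustration only.
names = ["memory_usage", "cpu_utilization", "gc_frequency"]
worst = abnormal_indicators([0.5, 0.95, 0.2], [0.5, 0.3, 0.25], names)
```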
Optionally, in the application for detecting cluster anomalies according to the present invention, when the first judging unit determines that the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class, the aggregation unit is further adapted to generate a new performance data class from the performance data point to be detected and add this class to the existing performance data classes. The aggregation unit is further adapted to judge whether the total number of current performance data classes exceeds a class-count threshold, and, if so, to merge the two nearest performance data classes into one. The aggregation unit is adapted to merge the two nearest performance data classes in the following manner: the distance between the center particles of every pair of performance data classes is calculated, and the two nearest classes cl1 and cl2 are determined. The classes cl1 and cl2 are merged into a class cl3. The center particle, similarity threshold and total number of data points of cl3 are determined according to the following formulas:
cr3=(cr1*np1+cr2*np2)/(np1+np2)
th3=(np1*th1+np2*th2)/(np1+np2)
np3=np1+np2
Where cr3 is the center particle of cl3, cr2 is the center particle of cl2, cr1 is the center particle of cl1, np1 is the total number of data points of cl1, np2 is the total number of data points of cl2, th1 is the similarity threshold of cl1, th2 is the similarity threshold of cl2, th3 is the similarity threshold of cl3, and np3 is the total number of data points of cl3.
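The merge step can be sketched as follows. The dictionary layout with keys `cr`, `th` and `np` is a hypothetical representation of a performance data class. The merged center is computed as the point-count-weighted mean of the two centers, an assumption chosen so that the result remains the per-dimension mean of all merged data points.

```python
import math
from itertools import combinations


def merge_nearest(classes):
    """Merge the two classes with the closest center particles.

    classes: list of dicts with keys 'cr' (list of floats), 'th' (float), 'np' (int).
    Returns a new list in which the merged class is appended last.
    """
    i, j = min(
        combinations(range(len(classes)), 2),
        key=lambda ij: math.dist(classes[ij[0]]["cr"], classes[ij[1]]["cr"]),
    )
    a, b = classes[i], classes[j]
    n = a["np"] + b["np"]
    merged = {
        # ASSUMPTION: weighted mean of the two center particles.
        "cr": [(x * a["np"] + y * b["np"]) / n for x, y in zip(a["cr"], b["cr"])],
        "th": (a["np"] * a["th"] + b["np"] * b["th"]) / n,
        "np": n,
    }
    return [c for k, c in enumerate(classes) if k not in (i, j)] + [merged]
```

Comparing every pair of centers costs a number of distance computations quadratic in the class count, which stays small precisely because the class-count threshold bounds it.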
Optionally, in the application for detecting cluster anomalies according to the present invention, the second judging unit is further adapted to determine, when the abnormal class threshold is exceeded, that the performance data point to be detected is not an abnormal point. The third judging unit is further adapted to determine, when the distance distribution threshold is not exceeded, that the performance data point to be detected is not an abnormal point.
Optionally, the application for detecting cluster anomalies according to the present invention further includes a class detection unit. The class detection unit is adapted, before the similarity calculation unit determines the performance data class with the highest similarity to the performance data point to be detected, to judge whether the total number of currently existing performance data classes is non-zero, and/or to judge whether the dimensionality of the performance data point to be detected is consistent with that of the existing performance data classes.
Optionally, in the application for detecting cluster anomalies according to the present invention, the class detection unit is further adapted, when it determines that the total number of currently existing performance data classes is zero, or that the dimensionality is inconsistent with the existing performance data classes, to instruct the aggregation unit to generate a performance data class from the performance data point to be detected.
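The two checks made by the class detection unit can be sketched as a single predicate. The function name is hypothetical, and each existing class is represented only by its center particle.

```python
def needs_new_class(pt, centers):
    """True if pt must seed a new class rather than be compared with existing ones."""
    if not centers:
        # No performance data class exists yet.
        return True
    # The dimensionality of pt differs from that of the existing classes.
    return any(len(cr) != len(pt) for cr in centers)
```

For example, `needs_new_class([0.1, 0.2], [])` is true because no class exists, and `needs_new_class([0.1, 0.2, 0.3], [[0.0, 0.0]])` is true because the point has three dimensions while the existing class has two.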
According to a further aspect of the present invention, a system for managing a cluster is provided, including a performance collector, an application for detecting cluster anomalies, and a resource management application. The performance collector is adapted to collect the performance indicators of the cluster. The resource management application is adapted to adjust the resource configuration of the cluster according to alarm messages generated by the application for detecting cluster anomalies.
The anomaly detection scheme according to the present invention can perform incremental clustering on performance data points, obtained in real time, that include multidimensional performance indicators, and can use adaptive thresholds during clustering to judge whether the class that a performance data point has joined is an abnormal class. In this way, both the classes aggregated by the anomaly detection scheme of the invention and the abnormal point detection it performs are robust in their accuracy. Further, by statistically evaluating the distances between a performance data point and the class center particle in each dimension, the anomaly detection scheme of the invention can better distinguish points of high similarity from points of low similarity within a class. In this way, the anomaly detection scheme can reduce the false alarm rate. In addition, by using a sliding window to judge the proportion of abnormal points in the window, the anomaly detection scheme of the invention can further improve the accuracy of anomaly alarms. The anomaly detection scheme of the invention can also control the total number of classes in the clustering model and re-create the clustering model promptly when the data dimensionality changes, thereby ensuring the stability of anomaly detection.
Brief description of the drawings
To accomplish the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the accompanying drawings. These aspects indicate various ways in which the principles disclosed herein can be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout the disclosure, the same reference numerals generally refer to the same components or elements.
Fig. 1 shows a schematic diagram of a cluster 100 according to some embodiments of the present invention;
Fig. 2 shows a schematic diagram of an application 200 for detecting cluster anomalies according to some embodiments of the present invention;
Fig. 3 shows a schematic diagram of an application 300 for detecting cluster anomalies according to some embodiments of the present invention;
Fig. 4 shows a flow chart of a method 400 for detecting cluster anomalies according to some embodiments of the present invention; and
Fig. 5 shows a flow chart of a method 500 for detecting cluster anomalies according to some embodiments of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a schematic diagram of a cluster 100 according to some embodiments of the present invention.
As shown in Fig. 1, the cluster 100 includes multiple computing devices. Each computing device is a device node in the cluster. The cluster 100 includes application servers 110 and 120, database servers 130 and 140, a management server 150 and a monitoring server 160, but is not limited thereto. A resource management application 151 resides in the management server 150. A performance collector 161 and an application 162 for detecting cluster anomalies reside in the monitoring server 160.
The resource management application 151 is adapted to perform resource scheduling and management for the device nodes in the cluster 100, for example, instructing a device node to create a server instance, isolating a device node, or adding a new device node to the cluster. Depending on the architecture of the cluster 100 (such as Hadoop or Spark), the resource management application 151 can be any of a variety of well-known cluster management applications, which are not described further here.
The performance collector 161 is adapted to collect performance indicator data for at least part of the cluster 100. The performance indicator data can be indicator data of various kinds relating to device node hardware, operating systems, applications and so on. The types of performance indicator data include, for example, memory usage, CPU utilization, disk usage, task throughput, task response time and garbage collection frequency, but are not limited thereto. Task throughput can be the number of tasks (for example, access requests or computing tasks) that a device node can process per unit time. In an embodiment according to the present invention, the performance collector 161 can collect performance data groups periodically. Each performance data group includes performance indicators of multiple dimensions. Here, the performance indicators of the dimensions can all be of the same type, such as the memory usage of multiple device nodes. A performance data group can also include performance indicators of multiple types. For example, a performance data group can include multiple performance indicator values of one device node. As another example, a performance data group can include multiple performance indicators of each of multiple device nodes. In addition, the performance collector 161 can collect data in a variety of well-known ways, for example, by deploying in each device node a probe (agent) that collects performance indicator data.
The probes can aggregate the collected performance indicator data to the performance collector 161. From the performance indicator data of multiple device nodes, the performance collector 161 is configured to generate a performance data group including multiple dimensions. In general, the performance indicators of all dimensions in a performance data group are collected at the same time, although some timing error may exist. To simplify the description, the various known implementations of the performance collector are not described further here, and all of them can be applied in the present invention.
The application 162 for detecting cluster anomalies is adapted to perform anomaly detection based on cluster learning on the performance data groups collected by the performance collector. When the application 162 determines that the cluster 100 is abnormal, it can also generate a corresponding alarm message and transmit it to the resource management application 151. In this way, the resource management application 151 can perform operations such as cluster resource management and scheduling according to the alarm message.
It should be noted that although the application 162 for detecting cluster anomalies and the performance collector 161 shown in Fig. 1 both reside in the monitoring server 160, the present invention does not unduly limit this. In one embodiment, the performance collector 161 and the application 162 are distributed on different device nodes. For example, the performance collector 161 can be configured to reside in the management server 150. In addition, the applications 151, 161 and 162 according to the present invention are not limited to residing on a single node device. In another embodiment according to the present invention, each application can be a distributed application. For example, the application 162 for detecting cluster anomalies is distributed over multiple device nodes. In this way, the application 162 can complete the detection of performance data groups with high real-time performance. The application for detecting cluster anomalies according to the present invention is described in more detail below with reference to Fig. 2.
Fig. 2 shows a schematic diagram of an application 200 for detecting cluster anomalies according to some embodiments of the present invention. It should be noted that the application 200 can reside in a single computing device or can be a distributed application; to simplify the description, this is not elaborated further below.
As shown in Fig. 2, the application 200 includes a data acquisition unit 210, a similarity calculation unit 220, a first judging unit 230, an aggregation unit 240, a second judging unit 250 and a third judging unit 260.
The data acquisition unit 210 is adapted to obtain a performance data point to be detected that indicates the performance of the cluster (100). Because of the subsequent similarity calculation, the performance data point here includes normalized multidimensional performance indicators.
In an embodiment according to the present invention, the data acquisition unit 210 can obtain performance data groups from the performance collector (161). A performance data group includes multidimensional performance indicators. The performance indicators in the performance data group are themselves measured on a normalized scale (each performance indicator takes a value from 0 to 1). In this way, the data acquisition unit 210 can directly use each performance data group from the performance collector as a performance data point including multidimensional attribute values.
In yet another embodiment, at least some of the performance indicators in the performance data group from the performance collector are not on a normalized scale. In other words, the value range of at least some of the performance indicators is not limited to the interval from 0 to 1. To this end, the data acquisition unit 210 can be configured to include a receiving module (not shown) and a normalization module (not shown). The receiving module is adapted to receive the performance data groups collected by the performance collector that indicate cluster performance. The normalization module is adapted to normalize each performance data group into a performance data point. For example,
pt = {n1, ..., ni}, where pt is a performance data point including performance indicators of i dimensions. Each performance indicator takes a value in the interval [0, 1].
The similarity calculation unit 220 is adapted to calculate and determine the existing performance data class with the highest similarity to the current performance data point to be detected. Before this performance data point to be detected arrives, the aggregation unit 240 has usually already generated at least one performance data class. Each performance data class includes one or more performance data points. To distinguish them from the current performance data point to be detected, the performance data points in each performance data class are referred to in the present invention as detected data points. Here, the existing performance data classes constitute the incremental clustering model that the application 200 builds for performance data points. Specifically, the similarity calculation unit 220 can calculate the similarity between the performance data point to be detected and each existing performance data class, and then determine the performance data class with the highest similarity. In one embodiment, the similarity calculation unit 220 first calculates the distance between the performance data point to be detected and the center particle of a performance data class. The center particle has the same dimensions as each performance data point in the performance data class. The value of each dimension of the center particle is the mean, in that dimension, of all performance data points in the class. In other words, the center particle is the centroid of the class. The distance here can be the Euclidean distance, or can be determined by other well-known distance calculations. In addition, the present invention can also use a well-known similarity measure such as cosine similarity to determine the similarity between the performance data point to be detected and a performance data class, which is not described further here.
After determining the distance between the performance data point to be detected and the center particle of a performance data class, the similarity calculation unit 220 can calculate the similarity between the performance data point to be detected and that performance data class from this distance. In one embodiment of the present invention, the similarity calculation unit calculates the similarity between a performance data point and a performance data class according to the following formula.
Where d is the calculated distance between the performance data point to be detected and the center particle of the performance data class, and sim is the similarity to that performance data class.
After the similarity calculation unit 220 determines the performance data class with the highest similarity to the performance data point to be detected, the first judging unit 230 is adapted to judge whether this highest similarity exceeds the current similarity threshold of that class. The similarity threshold may be a fixed threshold, or may be configured as an adaptively adjusted threshold. When a performance data class is created, it is configured with an initial similarity threshold, for example 0.5. Each time a performance data point is added to the class, the similarity threshold is adjusted once. The similarity threshold is described in more detail below.
When the first judging unit 230 determines that this highest similarity exceeds the current similarity threshold, the aggregation unit 240 is adapted to aggregate the performance data point to be detected into that performance data class.
The second judging unit 250 is adapted to calculate whether, after aggregation, the ratio of the number of data points in that performance data class to the total number of data points in all current performance data classes exceeds the abnormal-class threshold. In general, normal data points make up the larger share of the performance data points in use. The higher the calculated ratio, the lower the probability that the class is an abnormal class.
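The ratio test performed by the second judging unit can be illustrated with a short sketch. The function names and the example threshold of 0.2 are assumptions for illustration only; the text does not give a value for the abnormal-class threshold.

```python
def class_share(class_sizes, index):
    """Share of all clustered data points that belong to class `index`."""
    return class_sizes[index] / sum(class_sizes)


def exceeds_abnormal_class_threshold(class_sizes, index, threshold=0.2):
    # threshold=0.2 is a hypothetical value, not taken from the patent text
    return class_share(class_sizes, index) > threshold
```

With class sizes `[90, 10]`, the second class holds 10% of the points, so it does not exceed a threshold of 0.2 and the point goes on to the per-dimension distance check.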
When the second judging unit 250 determines that the ratio does not exceed the abnormal-class threshold, the third judging unit 260 sorts the per-dimension distances between the performance data point to be detected and the centroid of the performance data class it joined. The third judging unit 260 extracts a predetermined proportion (for example, 30%) of the largest distances, and judges whether the ratio of the sum of the extracted distances to the sum of the distances over all dimensions is greater than the distance distribution threshold. According to one embodiment of the invention, the third judging unit 260 makes this judgment as follows.
$$pt = \{n_1, \ldots, n_i\}, \qquad cr = \{c_1, \ldots, c_i\}, \qquad d_i = |n_i - c_i|$$
where pt is a performance data point, cr is a centroid, $n_i$ is the i-th dimension performance indicator of the performance data point pt to be detected, $c_i$ is the i-th dimension value of the centroid cr, and $d_i$ is the distance between the i-th dimension of pt and the i-th dimension of cr.
The third judging unit 260 sorts the $d_i$ of all dimensions and calculates
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
where N is the total number of dimensions, M is the number of dimensions that make up the predetermined proportion of N, $\sum_{j=1}^{M} d_{(j)}$ is the sum of the M largest of the N per-dimension distances ($d_{(j)}$ denoting the j-th largest distance), and $\sum_{i=1}^{N} d_i$ is the sum of the N distances. Finally, the third judging unit 260 judges whether pr is greater than the distance distribution threshold. A value below the distance distribution threshold indicates that the data in the individual dimensions of the performance data point are fairly even, so the probability that this data point is a normal data point is higher. A normal data point indicates that the cluster is not abnormal. When pr is greater than the distance distribution threshold, the third judging unit 260 determines that the performance data point to be detected is an abnormal point. The application 200 may also generate an alarm message according to the abnormal point and notify the resource management application (151). In this way, by judging whether pr is greater than the distance distribution threshold, the application 200 for detecting cluster anomalies according to the invention can improve the accuracy of abnormal-point detection.
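The ratio pr described above can be computed as in the following sketch. Two details are assumptions because the text does not settle them: M is rounded up and is at least 1, and a point that coincides with the centroid (all distances zero) is given pr = 0.

```python
import math


def distance_distribution_ratio(point, centroid, top_fraction=0.3):
    """Sum of the M largest per-dimension distances over the sum of all N."""
    d = sorted((abs(n - c) for n, c in zip(point, centroid)), reverse=True)
    total = sum(d)
    if total == 0:
        return 0.0  # assumption: no deviation at all counts as evenly spread
    m = max(1, math.ceil(top_fraction * len(d)))  # assumption: round M up
    return sum(d[:m]) / total
```

A point that deviates in a single dimension yields pr close to 1, while a point that deviates equally in every dimension yields pr close to the chosen proportion, which is why a large pr points to a localized anomaly.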
Fig. 3 shows a schematic diagram of an application 300 for detecting cluster anomalies according to some embodiments of the invention. As shown in Fig. 3, the application 300 includes a data acquisition unit 310, a similarity calculation unit 320, a first judging unit 330, an aggregation unit 340, a second judging unit 350, a third judging unit 360, a class detection unit 370, a window judging unit 380 and an alarm unit 390.
The data acquisition unit 310 works in the same way as the data acquisition unit 210 in Fig. 2, which is not repeated here.
In one embodiment, when the data acquisition unit 310 obtains a performance data point to be detected, the class detection unit 370 may judge whether the total number of existing performance data classes is non-zero. If it is zero (indicating that no incremental clustering model has been established yet), the class detection unit 370 is adapted to instruct the aggregation unit 340 to generate a performance data class from this data point (that is, to establish a new clustering model). The generated class then serves as an existing performance data class, on the basis of which subsequently obtained data points to be detected are aggregated and detected.
In yet another embodiment, the class detection unit 370 is adapted to judge whether the dimensions of the performance data point to be detected are consistent with those of the existing performance data classes. If the dimensions are inconsistent, the data point to be detected is not suitable for clustering with the existing performance data classes, so the similarity calculation unit 320 does not need to operate on it. In this case, the application 300 is adapted to regenerate the performance data classes. In other words, the application 300 is adapted to clear the existing performance data classes (that is, to abandon the established clustering model). For example, the class detection unit 370 may delete the existing performance data classes and instruct the aggregation unit 340 to generate a performance data class from the data point to be detected.
In yet another embodiment, the class detection unit 370 may judge both whether the total number of performance data classes is non-zero and whether the dimensions are consistent. When the class detection unit 370 determines that the total number of existing performance data classes is non-zero and the dimensions are consistent, the similarity calculation unit 320 may perform on the data point to be detected the same operation as the similarity calculation unit 220, which is not repeated here.
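The two checks made by the class detection unit amount to a small routing decision, sketched below. The function name and the returned labels are hypothetical; the sketch assumes every existing class has a centroid of the same length, so comparing against the first centroid is enough.

```python
def route_point(point, centroids):
    """Decide what to do with a new data point before any similarity is computed."""
    if not centroids:
        return "create_first_class"  # no clustering model exists yet
    if len(point) != len(centroids[0]):
        return "reset_model"  # dimension mismatch: discard classes, start over
    return "compare"  # hand over to the similarity calculation
```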
The first judging unit 330, the aggregation unit 340, the second judging unit 350 and the third judging unit 360 may implement the same functions as the first judging unit 230, the aggregation unit 240, the second judging unit 250 and the third judging unit 260, which are not repeated here.
In addition, after the performance data point to be detected has been added to a performance data class, when the second judging unit 350 determines that the ratio of the number of data points in that class to the total number of data points in all current performance data classes exceeds the abnormal-class threshold, it determines that the data point to be detected is a normal data point (not an abnormal point). When the third judging unit 360 determines that pr (see the description of the third judging unit 260 above) does not exceed the distance distribution threshold, it determines that the data point to be detected is not an abnormal point.
In addition, when the first judging unit 330 determines that the similarity between the performance data point to be detected and every performance data class is below the current similarity threshold, the aggregation unit 340 is further adapted to generate a new performance data class from this data point. The aggregation unit 340 judges whether the total number of performance data classes after the new class is added exceeds the class threshold. When it does, the aggregation unit 340 is adapted to merge the two nearest performance data classes among all performance data classes into one. In this way, the application 300 of the invention keeps the total number of classes under control and avoids an excessive number of classes. According to one embodiment of the invention, the aggregation unit 340 first calculates the distance between the centroids of every pair of performance data classes, then determines the two nearest classes $cl_1$ and $cl_2$, and merges $cl_1$ and $cl_2$ into class $cl_3$. The aggregation unit 340 may determine the centroid, similarity threshold and total number of data points of $cl_3$ according to the following formulas.
$$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$$
$$th_3 = \frac{np_1 \cdot th_1 + np_2 \cdot th_2}{np_1 + np_2}$$
$$np_3 = np_1 + np_2$$
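where $cr_3$ is the centroid of $cl_3$, $cr_2$ is the centroid of $cl_2$, $cr_1$ is the centroid of $cl_1$, $np_1$ is the total number of data points in $cl_1$, $np_2$ is the total number of data points in $cl_2$, $th_1$ is the similarity threshold of $cl_1$, $th_2$ is the similarity threshold of $cl_2$, $th_3$ is the similarity threshold of $cl_3$, and $np_3$ is the total number of data points in $cl_3$.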
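Merging the two nearest classes can be sketched as follows. The dictionary layout (`cr`, `np`, `th`) is a hypothetical representation. Note one deliberate deviation, labeled as an assumption: the centroid formula as printed is a weighted sum without a denominator, whereas the sketch divides by $np_1 + np_2$ so that the merged centroid remains a mean, in line with how the centroid is defined elsewhere in the description.

```python
import math
from itertools import combinations


def merge_nearest(classes):
    """Merge the two classes with the closest centroids; return the new class list.

    Each class is a dict with 'cr' (centroid), 'np' (point count), 'th' (threshold).
    """
    i, j = min(
        combinations(range(len(classes)), 2),
        key=lambda ij: math.dist(classes[ij[0]]["cr"], classes[ij[1]]["cr"]),
    )
    a, b = classes[i], classes[j]
    n = a["np"] + b["np"]
    merged = {
        # ASSUMPTION: normalized by n so the result is a weighted mean
        "cr": [(x * a["np"] + y * b["np"]) / n for x, y in zip(a["cr"], b["cr"])],
        "th": (a["np"] * a["th"] + b["np"] * b["th"]) / n,
        "np": n,
    }
    return [c for k, c in enumerate(classes) if k not in (i, j)] + [merged]
```

The pairwise search is quadratic in the number of classes, which stays cheap as long as the class threshold keeps the class count small.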
In addition, the aggregation unit 340 is further adapted to update the centroid and the similarity threshold of a performance data class after the performance data point to be detected has been added to it. In one embodiment, the aggregation unit 340 updates the centroid and the similarity threshold according to the following formulas.
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
where cr is the centroid, np is the total number of data points in the class, pt is the performance data point being added, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th. When sim is greater than the pre-update th, the updated similarity threshold th increases, which raises the bar for adding data points. Conversely, when sim is less than the pre-update th, the updated th decreases. By adaptively adjusting the similarity threshold in this way, the application 300 according to the invention is robust when detecting performance data points.
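The incremental update can be sketched as below. The centroid update follows the formula in the text. The threshold update formula is not reproduced in this text, so `th + lr * (sim - th)` is an assumption chosen only because it matches the described behavior (th rises when sim exceeds it and falls otherwise); the function name and the default `lr` are hypothetical.

```python
def update_class(cr, np_, th, pt, sim, lr=0.1):
    """Return (centroid, point count, threshold) after adding pt to the class."""
    new_cr = [(p + c * np_) / (np_ + 1) for p, c in zip(pt, cr)]
    # ASSUMPTION: stand-in for the threshold formula missing from the text
    new_th = th + lr * (sim - th)
    return new_cr, np_ + 1, new_th
```

Because the centroid is a running mean, no earlier data points need to be stored; the class is fully described by its centroid, its point count and its threshold.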
In summary, each time the data acquisition unit 310 obtains a performance data point to be detected, the class detection unit 370 is adapted to judge whether the number of existing performance data classes is zero.
If it is zero, the aggregation unit 340 generates a performance data class from this data point. In other words, the application 300 starts training a new clustering model based on this data point.
If it is not zero, the class detection unit 370 may also detect whether the dimensions of this data point are consistent with those of the performance data classes. If they are inconsistent, the class detection unit 370 clears the existing performance data classes. In other words, the application 300 abandons the existing clustering model and starts training a new one based on this data point.
If the number of existing performance data classes is not zero and the dimensions of the data point to be detected are consistent with those of the existing classes, the application 300 uses the similarity calculation unit 320, the first judging unit 330, the aggregation unit 340, the second judging unit 350 and the third judging unit 360 to judge whether this data point is an abnormal point.
In addition, the window judging unit 380 maintains a sliding window. Each time the data acquisition unit 310 obtains a performance data point to be detected, the window judging unit 380 adds this data point to the sliding window. The sliding window therefore always holds the predetermined number of performance data points most recently obtained by the application 300. After the window judging unit 380 has added a performance data point to the sliding window, if the third judging unit 360 determines that this data point is an abnormal point, the window judging unit 380 is adapted to judge whether the ratio of abnormal points in the current sliding window exceeds the window threshold (for example 0.5, but not limited thereto). If the window threshold is exceeded, the alarm unit 390 may determine the abnormal performance indicators of the abnormal point newly added to the sliding window according to its $d_i$ (see above). In other words, the alarm unit 390 can determine which one or more of the multidimensional performance indicators are abnormal. On this basis, the alarm unit 390 can generate an alarm message for the abnormal performance indicators. The resource management application according to the invention can then use the alarm message to locate the abnormal device node accurately and take the corresponding resource management action. For example, suppose a performance data point includes 10 CPU usage indicators and the alarm unit 390 determines that the 5th indicator value is abnormal. After receiving the message that the 5th indicator value is abnormal, the resource management application can determine that the device node corresponding to the 5th indicator is abnormal.
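The sliding window and the selection of abnormal indicators can be sketched as follows. Several details are assumptions, since the text leaves them open: the ratio is taken over the points currently in the window (which may not yet be full), and the abnormal indicators are taken to be the `top_k` dimensions with the largest $d_i$. The class and function names are hypothetical.

```python
from collections import deque


class SlidingWindow:
    """Keeps the abnormal/normal flags of the most recent `size` data points."""

    def __init__(self, size, window_threshold=0.5):
        self.flags = deque(maxlen=size)  # the oldest flag is dropped automatically
        self.window_threshold = window_threshold

    def add(self, is_abnormal):
        """Record one point; return True when an alarm should be raised."""
        self.flags.append(bool(is_abnormal))
        ratio = sum(self.flags) / len(self.flags)
        return bool(is_abnormal) and ratio > self.window_threshold


def abnormal_indicators(point, centroid, top_k=1):
    """Indices of the top_k dimensions with the largest |n_i - c_i| (assumption)."""
    d = [abs(n - c) for n, c in zip(point, centroid)]
    return sorted(range(len(d)), key=d.__getitem__, reverse=True)[:top_k]
```

In the example from the text, a point whose 5th CPU usage indicator deviates most from the centroid would be reported with index 4 (zero-based).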
Fig. 4 shows a flowchart of a method 400 for detecting cluster anomalies according to some embodiments of the invention. The method 400 is suitable for execution in a monitoring server according to the invention.
As shown in Fig. 4, the method 400 starts at step S410. In step S410, a performance data point to be detected, which indicates the performance of the cluster, is obtained; the data point includes normalized multidimensional performance indicators. According to one embodiment of the invention, in step S410 a performance data group may be obtained from the performance collector (161). The performance data group includes multidimensional performance indicators. According to the invention, the multidimensional performance indicators include at least one of memory usage, CPU utilization, task throughput, task response time and garbage collection frequency in the cluster. In this embodiment, the performance indicators in the performance data group are already on a normalized scale (each indicator takes values from 0 to 1), so the method 400 can treat each performance data group from the performance collector as a performance data point comprising multidimensional performance values. In yet another embodiment, at least some of the performance indicators in the performance data group from the performance collector are not normalized. In other words, the value range of at least some of the indicators is not limited to the interval 0 to 1. In that case, step S410 also needs to normalize the performance data group into a performance data point to be detected.
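The normalization in step S410 is not specified further in the text. The sketch below uses min-max scaling with known value ranges, which is only one possible choice; the metric names and ranges are hypothetical.

```python
def normalize_group(group, ranges):
    """Turn a raw performance data group into a point with values in [0, 1].

    group:  dict mapping metric name -> raw value
    ranges: dict mapping metric name -> (lowest, highest) expected raw value
    """
    point = []
    for name, (lo, hi) in ranges.items():
        x = (group[name] - lo) / (hi - lo)
        point.append(min(1.0, max(0.0, x)))  # clamp values outside the range
    return point
```

The order of the entries in `ranges` fixes the order of the dimensions, so the same `ranges` mapping should be used for every data group fed to the same clustering model.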
After step S410 obtains a performance data point to be detected, the method 400 may execute step S420. In step S420, from the existing performance data classes generated by aggregating previously obtained performance data points, the class with the highest similarity to the data point to be detected is determined. Here, the existing performance data classes are in effect an established clustering model.
According to one embodiment of the invention, step S420 includes the following process. First, the distance between the data point to be detected and the centroid of each existing performance data class is calculated. Then, from the distance to the centroid of each class, the similarity between the data point to be detected and that class is calculated. Finally, the performance data class with the highest similarity to the data point to be detected is determined. The distance calculated is, for example, the Euclidean distance, but is not limited thereto. The similarity may be calculated as follows.
where d is the calculated distance between the performance data point to be detected and the centroid of the performance data class, and sim is the similarity to that class. In addition, step S420 may also use another well-known similarity measure, such as cosine similarity, to determine the similarity between the data point to be detected and a performance data class, which is not described further here.
After step S420 determines the performance data class with the highest similarity to the data point to be detected, the method 400 proceeds to step S430. In step S430, it is judged whether the similarity between the data point to be detected and the determined performance data class exceeds the current similarity threshold of that class.
When step S430 determines that the current similarity threshold is exceeded, the method 400 executes step S440: the data point to be detected is aggregated into the determined performance data class, and it is calculated whether, after aggregation, the ratio of the number of data points in that class to the total number of data points in all current performance data classes exceeds the abnormal-class threshold.
When step S440 determines that the abnormal-class threshold is not exceeded, the method 400 proceeds to step S450. In step S450, the distances between the data point to be detected and the centroid of the performance data class in each dimension's performance indicator are sorted, and it is judged whether the ratio of the sum of the predetermined proportion of largest distances to the sum of the distances over all dimensions is greater than the distance distribution threshold. The operation of step S450 is illustrated more specifically below with formulas.
$$pt = \{n_1, \ldots, n_i\}, \qquad cr = \{c_1, \ldots, c_i\}$$
First, $d_i = |n_i - c_i|$ is calculated, where $n_i$ is the i-th dimension performance indicator of the performance data point pt to be detected, $c_i$ is the i-th dimension value of the centroid cr, and $d_i$ is the distance between the i-th dimension of pt and the i-th dimension of cr.
Then, the $d_i$ of all dimensions are sorted and the following ratio is calculated:
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
where N is the total number of dimensions, M is the number of dimensions that make up the predetermined proportion of N, $\sum_{j=1}^{M} d_{(j)}$ is the sum of the M largest of the N per-dimension distances ($d_{(j)}$ denoting the j-th largest distance), and $\sum_{i=1}^{N} d_i$ is the sum of the N distances. Finally, it is judged whether pr is greater than the distance distribution threshold.
When step S450 determines that pr is greater than the distance distribution threshold, the method 400 proceeds to step S460, in which the performance data point to be detected is determined to be an abnormal point. More specific implementation details of the method 400 are consistent with the application 200 and are not repeated here.
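Steps S420 to S460 can be put together in one function, sketched below under stated assumptions. The similarity mapping `1 / (1 + d)` stands in for the formula missing from this text, all threshold values are hypothetical defaults, and the return label `"no_match"` covers the case (not detailed for method 400) in which the best similarity does not exceed the class's threshold.

```python
import math


def detect(point, classes, abnormal_class_threshold=0.2,
           top_fraction=0.3, distance_threshold=0.6):
    """classes: list of dicts {'cr': centroid, 'np': count, 'th': threshold}.

    Note: the matched class's point count is incremented in place (S440).
    """
    # S420: most similar class (ASSUMED similarity: 1 / (1 + Euclidean distance))
    sims = [1.0 / (1.0 + math.dist(point, c["cr"])) for c in classes]
    best = max(range(len(sims)), key=sims.__getitem__)
    cls = classes[best]
    # S430: compare with the class's current similarity threshold
    if sims[best] <= cls["th"]:
        return "no_match"
    # S440: aggregate, then check the class's share of all data points
    cls["np"] += 1
    if cls["np"] / sum(c["np"] for c in classes) > abnormal_class_threshold:
        return "normal"
    # S450: share of the M largest per-dimension distances
    d = sorted((abs(n - c) for n, c in zip(point, cls["cr"])), reverse=True)
    m = max(1, math.ceil(top_fraction * len(d)))
    total = sum(d)
    pr = sum(d[:m]) / total if total else 0.0
    # S460: abnormal when the deviation is concentrated in a few dimensions
    return "abnormal" if pr > distance_threshold else "normal"
```

A point is therefore flagged only if it lands in a small class and its deviation from that class's centroid is concentrated in a few dimensions.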
Fig. 5 shows a flowchart of a method 500 for detecting cluster anomalies according to some embodiments of the invention. The method 500 is suitable for execution in a monitoring server according to the invention.
As shown in Fig. 5, the method 500 starts at step S501. Step S501 is executed in the same way as step S410, which is not repeated here.
The method then proceeds to step S502, in which it is judged whether the total number of existing performance data classes is non-zero.
When step S502 determines that the total number of existing performance data classes is non-zero, the method 500 may execute step S503. In step S503, it is judged whether the dimensions of the performance data point to be detected are consistent with those of the existing performance data classes.
When step S503 determines that the dimensions are inconsistent, the method 500 executes step S504, in which the existing performance data classes are abandoned and a performance data class is generated from the data point to be detected. In other words, step S504 abandons the existing clustering model and starts a new clustering learning process.
When step S502 determines that the total number of existing performance data classes is zero (that is, there is no clustering model), the method executes step S505. In step S505, a performance data class is generated from the data point to be detected, and a new clustering learning process is started.
When step S503 determines that the dimensions are consistent, the method 500 executes step S506. Step S506 is implemented in the same way as step S420, which is not repeated here. It should be noted that, in an embodiment according to the invention in which the dimensions of the performance data points remain stable, the method 500 may skip step S503. That is, when step S502 determines that the total number of existing performance data classes is non-zero, step S506 is executed directly.
After step S506 determines the performance data class with the highest similarity to the data point to be detected, the method 500 proceeds to step S507. Step S507 is implemented in the same way as step S430, which is not repeated here.
When step S507 determines that the similarity does not exceed the current similarity threshold of the performance data class, the method 500 executes step S508. In step S508, a new performance data class is generated from the data point to be detected and is added to the existing performance data classes. To control the total number of classes in the clustering model, the method 500 also executes step S509, which judges whether the current total number of classes (after the new class is added) is greater than the class threshold and, if so, merges the two nearest performance data classes into one. According to one embodiment of the invention, step S509 is implemented as the following process, but is not limited thereto.
First, the distance between the centroids of every pair of performance data classes is calculated, and the two nearest classes $cl_1$ and $cl_2$ are determined.
Then, $cl_1$ and $cl_2$ are merged into class $cl_3$.
Finally, the centroid, similarity threshold and total number of data points of $cl_3$ are determined according to the following formulas:
$$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$$
$$th_3 = \frac{np_1 \cdot th_1 + np_2 \cdot th_2}{np_1 + np_2}$$
$$np_3 = np_1 + np_2$$
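where $cr_3$ is the centroid of $cl_3$, $cr_2$ is the centroid of $cl_2$, $cr_1$ is the centroid of $cl_1$, $np_1$ is the total number of data points in $cl_1$, $np_2$ is the total number of data points in $cl_2$, $th_1$ is the similarity threshold of $cl_1$, $th_2$ is the similarity threshold of $cl_2$, $th_3$ is the similarity threshold of $cl_3$, and $np_3$ is the total number of data points in $cl_3$.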
When step S507 determines that the current similarity threshold is exceeded, the method 500 executes step S510. Step S510 is implemented in the same way as step S440, which is not repeated here.
After executing step S510, the method 500 also executes step S518. In step S518, the centroid and the similarity threshold of the performance data class are updated. According to one embodiment of the invention, in step S518 the centroid and the similarity threshold of the performance data class to which the data point to be detected has been added are updated according to the following formulas:
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
where cr is the centroid, np is the total number of data points in the class, pt is the performance data point being added, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th.
When step S510 determines that the ratio exceeds the abnormal-class threshold, the method executes step S511, in which the performance data point to be detected is determined not to be an abnormal point.
When step S510 determines that the ratio does not exceed the abnormal-class threshold, the method 500 proceeds to step S512. Step S512 is implemented in the same way as step S450, which is not repeated here.
When step S512 determines that pr is not greater than the distance distribution threshold, the method 500 executes step S511.
When step S512 determines that pr is greater than the distance distribution threshold, the method 500 proceeds to step S513. Step S513 is implemented in the same way as step S460, which is not repeated here.
In summary, in steps S511 and S513 the method 500 determines whether the performance data point to be detected is an abnormal point. Through steps S504 and S505, the method 500 creates a performance data class, which is the first class of the clustering model. Through steps S508 and S509, the method 500 can generate a new performance data class from a data point to be detected and keep the total number of classes in the clustering model within the class threshold.
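The control flow of steps S502 to S513 and S518 can be sketched as a small incremental detector. This is an illustration under assumptions, not the patented implementation: the similarity mapping `1 / (1 + d)` and the threshold update `th + lr * (sim - th)` stand in for formulas missing from this text, the merged centroid is normalized so that it stays a mean, step S512 is evaluated against the already updated centroid, and all names, labels and default values are hypothetical.

```python
import math
from itertools import combinations


class IncrementalDetector:
    def __init__(self, init_th=0.5, lr=0.1, max_classes=10,
                 abnormal_class_threshold=0.2, top_fraction=0.3,
                 distance_threshold=0.6):
        self.classes = []  # each class: {"cr": centroid, "np": count, "th": threshold}
        self.init_th, self.lr, self.max_classes = init_th, lr, max_classes
        self.abnormal_class_threshold = abnormal_class_threshold
        self.top_fraction = top_fraction
        self.distance_threshold = distance_threshold

    def _new_class(self, pt):
        self.classes.append({"cr": list(pt), "np": 1, "th": self.init_th})

    def _merge_nearest(self):  # S509
        i, j = min(combinations(range(len(self.classes)), 2),
                   key=lambda ij: math.dist(self.classes[ij[0]]["cr"],
                                            self.classes[ij[1]]["cr"]))
        a, b = self.classes[i], self.classes[j]
        n = a["np"] + b["np"]
        merged = {"cr": [(x * a["np"] + y * b["np"]) / n
                         for x, y in zip(a["cr"], b["cr"])],
                  "th": (a["np"] * a["th"] + b["np"] * b["th"]) / n,
                  "np": n}
        self.classes = [c for k, c in enumerate(self.classes) if k not in (i, j)]
        self.classes.append(merged)

    def process(self, pt):
        # S502-S505: no model yet, or dimension mismatch -> start a new model
        if not self.classes or len(pt) != len(self.classes[0]["cr"]):
            self.classes = []
            self._new_class(pt)
            return "new_model"
        # S506: most similar class (assumed similarity formula)
        sims = [1.0 / (1.0 + math.dist(pt, c["cr"])) for c in self.classes]
        best = max(range(len(sims)), key=sims.__getitem__)
        cls = self.classes[best]
        # S507-S509: below threshold -> new class, merge if too many classes
        if sims[best] <= cls["th"]:
            self._new_class(pt)
            if len(self.classes) > self.max_classes:
                self._merge_nearest()
            return "new_class"
        # S510 and S518: aggregate, update centroid and threshold
        n = cls["np"]
        cls["cr"] = [(p + c * n) / (n + 1) for p, c in zip(pt, cls["cr"])]
        cls["np"] = n + 1
        cls["th"] += self.lr * (sims[best] - cls["th"])  # assumed update rule
        share = cls["np"] / sum(c["np"] for c in self.classes)
        if share > self.abnormal_class_threshold:
            return "normal"  # S511
        # S512: share of the M largest per-dimension distances
        d = sorted((abs(p - c) for p, c in zip(pt, cls["cr"])), reverse=True)
        m = max(1, math.ceil(self.top_fraction * len(d)))
        total = sum(d)
        pr = sum(d[:m]) / total if total else 0.0
        return "abnormal" if pr > self.distance_threshold else "normal"  # S513 / S511
```

Each call to `process` handles one data point and never revisits earlier points, which is what makes the clustering incremental.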
Optionally, the method 500 further includes step S514. In step S514, the performance data point to be detected is added to a sliding window. The sliding window normally retains the predetermined number (that is, the preset window width) of performance data points most recently obtained during execution of the method 500 (the data points obtained in step S501). It should be noted that, when the method 500 executes step S504 or S505, step S514 removes from the sliding window the data points that precede the data point in the newly created performance data class.
In addition, when the performance data point added in step S514 is an abnormal point, the method 500 also executes step S515. In step S515, it is judged whether the ratio of abnormal points in the sliding window exceeds the window threshold.
When step S515 determines that the window threshold is exceeded, the method 500 executes step S516, in which the abnormal performance indicators of the data point to be detected are determined according to the distance between the abnormal point and the centroid in each dimension.
The method 500 further includes step S517, in which an alarm message for the abnormal performance indicators is generated. More specific embodiments of the method 500 are consistent with the application 300 and are not repeated here.
A9. The method as described in any one of A1-A8, further comprising: adding the performance data point to be detected to a sliding window, the sliding window holding the predetermined number of most recently obtained performance data points; and, when the performance data point to be detected is determined to be an abnormal point, judging whether the ratio of abnormal points in the sliding window exceeds a window threshold.
A10. The method as described in A9, further comprising: when the window threshold is exceeded, determining the abnormal performance indicators of the performance data point to be detected according to the distance in each dimension's performance indicator.
A11. The method as described in any one of A1-A10, wherein, when the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class, the method further comprises: generating a new performance data class from the performance data point to be detected and adding this class to the existing performance data classes; and judging whether the total number of all current performance data classes exceeds a class threshold and, if so, merging the two nearest performance data classes into one.
A12. The method as described in A11, wherein the operation of merging the two nearest performance data classes into one comprises: calculating the distance between the centroids of every pair of performance data classes and determining the two nearest classes $cl_1$ and $cl_2$,
merging $cl_1$ and $cl_2$ into class $cl_3$,
and determining the centroid, similarity threshold and total number of data points of $cl_3$ according to the following formulas:
$$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$$
$$th_3 = \frac{np_1 \cdot th_1 + np_2 \cdot th_2}{np_1 + np_2}$$
$$np_3 = np_1 + np_2$$
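where $cr_3$ is the centroid of $cl_3$, $cr_2$ is the centroid of $cl_2$, $cr_1$ is the centroid of $cl_1$, $np_1$ is the total number of data points in $cl_1$, $np_2$ is the total number of data points in $cl_2$, $th_1$ is the similarity threshold of $cl_1$, $th_2$ is the similarity threshold of $cl_2$, $th_3$ is the similarity threshold of $cl_3$, and $np_3$ is the total number of data points in $cl_3$.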
A13. The method as described in any one of A1-A12, further comprising: when the abnormal-class threshold is exceeded, determining that the performance data point to be detected is not an abnormal point; and when the distance distribution threshold is not exceeded, determining that the performance data point to be detected is not an abnormal point.
A14. The method as described in any one of A1-A13, wherein, before determining the performance data class with the highest similarity to the performance data point to be detected, the method further comprises: judging whether the total number of existing performance data classes is non-zero; and/or judging whether the dimensions of the performance data point to be detected are consistent with those of the existing performance data classes.
A15. The method as described in A14, further comprising: when it is determined that the total number of existing performance data classes is zero, or that the dimensions are inconsistent with those of the existing performance data classes, generating a performance data class from the performance data point to be detected.
A17. The application as described in A16, wherein the data acquisition unit further comprises: a receiving module adapted to receive, from a performance collector, a performance data group collected to indicate cluster performance, the performance data group including multidimensional performance indicators; and a normalization module adapted to normalize the performance data group into the performance data point.
A18. The application as described in A16 or A17, wherein the multidimensional performance indicators include at least one of memory usage, CPU utilization, task throughput, task response time and garbage collection frequency in the cluster.
A19. The application as described in any one of A16-A18, wherein the similarity calculation unit is adapted to determine the performance data class with the highest similarity to the performance data point to be detected as follows: calculating the distance between the performance data point to be detected and the centroid of each existing performance data class; calculating, from the distance to the centroid of each performance data class, the similarity between the performance data point to be detected and that class; and determining the performance data class with the highest similarity to the performance data point to be detected.
A20. The application as described in A19, wherein the similarity calculation unit is adapted to calculate the distance between the performance data point to be detected and the centroid of each existing performance data class as follows: calculating the Euclidean distance between the performance data point to be detected and the centroid of each performance data class.
A21. The application as described in A19 or A20, wherein the similarity calculation unit is adapted to calculate the similarity between the performance data point to be detected and a performance data class according to the following formula:
where d is the calculated distance between the performance data point to be detected and the centroid of the performance data class, and sim is the similarity to that class.
A22. The application as described in any one of A16-A21, wherein the aggregation unit is further adapted to update, according to the following formulas, the centroid and the similarity threshold of the performance data class to which the performance data point to be detected has been added:
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
where cr is the centroid, np is the total number of data points in the class, pt is the performance data point being added, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th.
A23. The application of any one of A16-A22, wherein the third judging unit is adapted to sort the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and to judge whether the ratio of the sum of the largest distances making up a predetermined proportion to the sum of the distances over all dimensions is greater than the distance distribution threshold, as follows:
$pt = \{n_1, \ldots, n_N\}$, $cr = \{c_1, \ldots, c_N\}$, $d_i = |n_i - c_i|$, where $n_i$ is the i-th performance indicator of the performance data point pt to be detected, $c_i$ is the i-th value of the centroid cr, and $d_i$ is the distance between pt and cr in the i-th dimension;
sort the $d_i$ of all dimensions and calculate
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
where N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, $d_{(j)}$ is the j-th largest distance, so that the numerator is the sum of the M largest of the N distances and the denominator is the sum of all N distances;
judge whether pr is greater than the distance distribution threshold.
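A minimal Python sketch of the ratio pr follows. The function name `distance_ratio` is hypothetical, and deriving M by rounding the proportion of N up to at least one dimension is an assumption, since the text does not state a rounding rule.

```python
import math
from typing import Sequence


def distance_ratio(pt: Sequence[float], cr: Sequence[float], proportion: float) -> float:
    """Return pr: sum of the M largest per-dimension distances over the sum of all.

    Assumption: M = ceil(proportion * N), with a minimum of 1.
    """
    # d_i = |n_i - c_i|, sorted from largest to smallest
    d = sorted((abs(n - c) for n, c in zip(pt, cr)), reverse=True)
    total = sum(d)
    if total == 0:
        return 0.0  # point coincides with the centroid; nothing stands out
    m = max(1, math.ceil(proportion * len(d)))
    return sum(d[:m]) / total
```

A value of pr close to 1 means the deviation is concentrated in a few indicators; a value close to the proportion itself means the deviation is spread evenly across dimensions.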
A24. The application of any one of A16-A23, further comprising a window judging unit adapted to:
add the performance data point to be detected to a sliding window, the sliding window holding a predetermined number of the most recently obtained performance data points; and
when the third judging unit determines that the performance data point to be detected is an abnormal point, judge whether the proportion of abnormal points in the sliding window exceeds a window threshold.
A25. The application of A24, further comprising an alarm unit adapted to determine, when the window judging unit determines that the window threshold is exceeded, the abnormal performance indicator in the performance data point to be detected according to the distances of the per-dimension performance indicators.
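The window check and the alarm step can be sketched in Python as follows. `AnomalyWindow` and `abnormal_indicator` are hypothetical names; computing the proportion over the points currently in the window, and reporting the indicator with the largest per-dimension distance, are both assumptions about details the text leaves open.

```python
from collections import deque
from typing import Sequence


class AnomalyWindow:
    """Keeps anomaly flags for the most recent `size` performance data points."""

    def __init__(self, size: int, window_threshold: float) -> None:
        self.flags = deque(maxlen=size)  # oldest flag is dropped automatically
        self.window_threshold = window_threshold

    def add(self, is_anomaly: bool) -> bool:
        """Record a point; return True if an alarm should be raised."""
        self.flags.append(is_anomaly)
        if not is_anomaly:
            return False  # the window is only evaluated for abnormal points
        return sum(self.flags) / len(self.flags) > self.window_threshold


def abnormal_indicator(pt: Sequence[float], cr: Sequence[float], names: Sequence[str]) -> str:
    """Assumed rule: report the indicator farthest from the centroid."""
    distances = [abs(n - c) for n, c in zip(pt, cr)]
    return names[distances.index(max(distances))]
```

Requiring a proportion of abnormal points within the window, rather than a single abnormal point, keeps an isolated outlier from triggering an alarm.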
A26. The application of any one of A16-A25, wherein the aggregation unit is further adapted to, when the first judging unit determines that the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class:
generate the performance data point to be detected as a new performance data class and add this class to the existing performance data classes; and
judge whether the total number of current performance data classes exceeds a class-count threshold and, if so, merge the two closest performance data classes into one.
A27. The application of A26, wherein the aggregation unit is adapted to merge the two closest performance data classes into one as follows:
calculate the distance between the centroids of every pair of performance data classes, and determine the two closest classes $cl_1$ and $cl_2$;
merge the classes $cl_1$ and $cl_2$ into a class $cl_3$;
determine the centroid, similarity threshold and total number of data points of $cl_3$ according to the following formulas:
$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$
$th_3 = (np_1 \cdot th_1 + np_2 \cdot th_2)/(np_1 + np_2)$
$np_3 = np_1 + np_2$
where $cr_3$, $cr_2$ and $cr_1$ are the centroids of $cl_3$, $cl_2$ and $cl_1$; $np_1$ and $np_2$ are the total numbers of data points of $cl_1$ and $cl_2$; $th_1$ and $th_2$ are the similarity thresholds of $cl_1$ and $cl_2$; $th_3$ is the similarity threshold of $cl_3$; and $np_3$ is the total number of data points of $cl_3$.
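The merge step can be sketched in Python as follows. The function name `merge_nearest` and the dictionary layout are hypothetical. Note that the centroid formula as printed has no division by np1 + np2; the sketch divides by it, on the assumption that a point-count-weighted mean is intended, in line with the threshold formula.

```python
import math
from itertools import combinations
from typing import Dict, List


def merge_nearest(classes: List[Dict]) -> List[Dict]:
    """Merge the two classes whose centroids are closest (Euclidean distance).

    Each class is a dict with keys "cr" (centroid), "np" (point count), "th" (threshold).
    Assumption: the merged centroid is normalized by np1 + np2.
    """
    # Examine every pair of classes and keep the pair with the smallest distance.
    i, j = min(
        combinations(range(len(classes)), 2),
        key=lambda p: math.dist(classes[p[0]]["cr"], classes[p[1]]["cr"]),
    )
    a, b = classes[i], classes[j]
    n = a["np"] + b["np"]
    merged = {
        "cr": [(x * a["np"] + y * b["np"]) / n for x, y in zip(a["cr"], b["cr"])],
        "th": (a["np"] * a["th"] + b["np"] * b["th"]) / n,
        "np": n,
    }
    rest = [c for k, c in enumerate(classes) if k not in (i, j)]
    return rest + [merged]
```

The pairwise search is quadratic in the number of classes, which is acceptable here because the class-count threshold keeps that number small.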
A28. The application of any one of A16-A27, wherein:
the second judging unit is further adapted to determine, when the abnormal class threshold is exceeded, that the performance data point to be detected is a non-abnormal point; and
the third judging unit is further adapted to determine, when the distance distribution threshold is not exceeded, that the performance data point to be detected is a non-abnormal point.
A29. The application of any one of A16-A28, further comprising a class detection unit adapted to, before the similarity calculation unit determines the performance data class with the highest similarity to the performance data point to be detected:
judge whether the total number of currently existing performance data classes is non-zero; and/or
judge whether the dimensionality of the performance data point to be detected is consistent with that of the existing performance data classes.
A30. The application of A29, wherein the class detection unit is further adapted to, upon determining that the total number of currently existing performance data classes is zero, or that the dimensionality is inconsistent with the existing performance data classes, instruct the aggregation unit to generate the performance data point to be detected as a performance data class.
Numerous specific details are set forth in the description provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail, so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single embodiment disclosed above. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art should understand that the modules, units or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiment, or alternatively may be located in one or more devices different from the device in the example. The modules in the foregoing examples may be combined into one module or may be divided into multiple sub-modules.
Those skilled in the art will understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components of the embodiments may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
In addition, some of the embodiments are described herein as a method, or a combination of method elements, that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor with the necessary instructions for carrying out such a method or method element forms a means for carrying out the method or method element. Furthermore, an element of a device embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinals "first", "second", "third", etc. to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner.
Although the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be devised within the scope of the invention thus described. It should also be noted that the language used in this specification has been selected principally for readability and instructional purposes, and not to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the invention is intended to be illustrative, not limiting, of the scope of the invention, which is defined by the appended claims.

Claims (29)

1. A method for detecting cluster anomalies, comprising:
obtaining a performance data point to be detected that indicates the performance of the cluster, the performance data point comprising normalized multidimensional performance indicators;
determining, from existing performance data classes generated by aggregating previously obtained performance data points, the performance data class with the highest similarity to the performance data point to be detected;
judging whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that performance data class;
when the current similarity threshold is exceeded, aggregating the performance data point to be detected into the determined performance data class, and calculating whether, after aggregation, the ratio of the total number of data points in that class to the total number of data points in all current performance data classes exceeds an abnormal class threshold;
updating the centroid and similarity threshold of the performance data class after the performance data point to be detected is added, according to the following formulas:
$cr = (pt + cr \cdot np)/(np + 1)$
$th = th + \frac{lr}{np}(sim - th)$
where cr is the centroid, np is the total number of data points in the class, pt is the added performance data point, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th;
when the abnormal class threshold is not exceeded, sorting the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and calculating whether the ratio of the sum of the largest distances making up a predetermined proportion to the sum of the distances over all dimensions is greater than a distance distribution threshold; and
when the ratio is greater than the distance distribution threshold, determining that the performance data point to be detected is an abnormal point.
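The three successive checks of the method can be summarized in a short Python sketch. The names `Verdict` and `decide` are hypothetical; the inputs are assumed to have been computed beforehand, and the "new class" and "normal" outcomes are taken from dependent claims 10 and 12 rather than from claim 1 itself.

```python
from enum import Enum


class Verdict(Enum):
    NEW_CLASS = "new_class"  # point did not fit any class (claim 10)
    NORMAL = "normal"        # non-abnormal point (claim 12)
    ANOMALY = "anomaly"      # abnormal point (claim 1)


def decide(sim: float, sim_th: float,
           class_share: float, class_share_th: float,
           pr: float, dist_th: float) -> Verdict:
    """Apply the similarity, class-share and distance-distribution checks in order."""
    if sim <= sim_th:
        # Similarity does not exceed the class's current threshold.
        return Verdict.NEW_CLASS
    if class_share > class_share_th:
        # The class holds a large share of all points, so it is treated as normal.
        return Verdict.NORMAL
    # Small class: abnormal only if the deviation is concentrated in few dimensions.
    return Verdict.ANOMALY if pr > dist_th else Verdict.NORMAL
```

For example, a point that joins a class holding 10% of all points (against a 30% threshold) and has pr = 0.9 (against a 0.6 threshold) would be reported as abnormal under this sketch.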
2. The method of claim 1, wherein the step of obtaining a performance data point to be detected that indicates the performance of the cluster comprises:
receiving, from a performance collection device, a performance data group indicating the collected cluster performance, the performance data group comprising multidimensional performance indicators; and
normalizing the performance data group into the performance data point.
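The claim does not say how the performance data group is normalized. The following Python sketch assumes min-max scaling to [0, 1] against per-indicator bounds; the function name `normalize`, the `bounds` parameter and the indicator names in the usage note are all hypothetical.

```python
from typing import Dict, List, Tuple


def normalize(group: Dict[str, float], bounds: Dict[str, Tuple[float, float]]) -> List[float]:
    """Turn a raw performance data group into a normalized performance data point.

    Assumption: min-max scaling with known (low, high) bounds per indicator,
    clipped to [0, 1]; indicators are ordered by name so dimensions stay aligned.
    """
    point = []
    for name in sorted(group):
        low, high = bounds[name]
        scaled = (group[name] - low) / (high - low)
        point.append(min(1.0, max(0.0, scaled)))
    return point
```

For instance, `normalize({"cpu": 50.0, "mem": 4.0}, {"cpu": (0.0, 100.0), "mem": (0.0, 16.0)})` yields a two-dimensional point in which both indicators share the same scale, so that no single indicator dominates the Euclidean distance.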
3. The method of claim 1 or 2, wherein the multidimensional performance indicators include at least one of memory utilization, CPU utilization, task throughput, task response time, and garbage collection frequency in the cluster.
4. The method of claim 1 or 2, wherein the step of determining the performance data class with the highest similarity to the performance data point to be detected comprises:
calculating the distance between the performance data point to be detected and the centroid of each existing performance data class;
calculating, from the distance to the centroid of each performance data class, the similarity between the performance data point to be detected and that class; and
determining the performance data class with the highest similarity to the performance data point to be detected.
5. The method of claim 4, wherein the step of calculating the distance between the performance data point to be detected and the centroid of each existing performance data class comprises:
calculating the Euclidean distance between the performance data point to be detected and the centroid of each performance data class.
6. The method of claim 4, wherein calculating the similarity between the performance data point to be detected and a performance data class comprises:
[formula not reproduced in this text]
where d is the calculated distance between the performance data point to be detected and the centroid of the performance data class, and sim is the similarity to that class.
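The class selection can be sketched in Python as follows. The similarity formula itself is not reproduced in this text, so the mapping sim = 1/(1 + d) used below is purely an assumption, chosen only because it decreases as the distance grows; the function names are hypothetical as well.

```python
import math
from typing import List, Sequence, Tuple


def similarity(d: float) -> float:
    """Assumed distance-to-similarity mapping (NOT the patent's formula)."""
    return 1.0 / (1.0 + d)


def most_similar(pt: Sequence[float], centroids: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, similarity) of the class whose centroid is most similar to pt."""
    # Euclidean distance to every centroid, converted to a similarity score.
    sims = [similarity(math.dist(pt, cr)) for cr in centroids]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]
```

Any monotonically decreasing mapping would select the same class; the choice of mapping matters only when the similarity is compared with the class threshold.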
7. The method of claim 1 or 2, wherein the step of sorting the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and calculating whether the ratio of the sum of the largest distances making up a predetermined proportion to the sum of the distances over all dimensions is greater than the distance distribution threshold, comprises:
$pt = \{n_1, \ldots, n_N\}$, $cr = \{c_1, \ldots, c_N\}$, $d_i = |n_i - c_i|$, where $n_i$ is the i-th performance indicator of the performance data point pt to be detected, $c_i$ is the i-th value of the centroid cr, and $d_i$ is the distance between pt and cr in the i-th dimension;
sorting the $d_i$ of all dimensions and calculating
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
where N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, $d_{(j)}$ is the j-th largest distance, so that the numerator is the sum of the M largest of the N distances and the denominator is the sum of all N distances; and
judging whether pr is greater than the distance distribution threshold.
8. The method of claim 1 or 2, further comprising:
adding the performance data point to be detected to a sliding window, the sliding window holding a predetermined number of the most recently obtained performance data points; and
when the performance data point to be detected is determined to be an abnormal point, judging whether the proportion of abnormal points in the sliding window exceeds a window threshold.
9. The method of claim 8, further comprising: when the window threshold is exceeded, determining the abnormal performance indicator in the performance data point to be detected according to the distances of the per-dimension performance indicators.
10. The method of claim 1 or 2, wherein, when the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class, the method further comprises:
generating the performance data point to be detected as a new performance data class and adding this class to the existing performance data classes; and
judging whether the total number of current performance data classes exceeds a class-count threshold and, if so, merging the two closest performance data classes into one.
11. The method of claim 10, wherein merging the two closest performance data classes into one comprises:
calculating the distance between the centroids of every pair of performance data classes, and determining the two closest classes $cl_1$ and $cl_2$;
merging the classes $cl_1$ and $cl_2$ into a class $cl_3$;
determining the centroid, similarity threshold and total number of data points of $cl_3$ according to the following formulas:
$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$
$th_3 = (np_1 \cdot th_1 + np_2 \cdot th_2)/(np_1 + np_2)$
$np_3 = np_1 + np_2$
where $cr_3$, $cr_2$ and $cr_1$ are the centroids of $cl_3$, $cl_2$ and $cl_1$; $np_1$ and $np_2$ are the total numbers of data points of $cl_1$ and $cl_2$; $th_1$ and $th_2$ are the similarity thresholds of $cl_1$ and $cl_2$; $th_3$ is the similarity threshold of $cl_3$; and $np_3$ is the total number of data points of $cl_3$.
12. The method of claim 1 or 2, further comprising:
when the abnormal class threshold is exceeded, determining that the performance data point to be detected is a non-abnormal point; and
when the distance distribution threshold is not exceeded, determining that the performance data point to be detected is a non-abnormal point.
13. The method of claim 1 or 2, wherein, before determining the performance data class with the highest similarity to the performance data point to be detected, the method further comprises:
judging whether the total number of currently existing performance data classes is non-zero; and/or
judging whether the dimensionality of the performance data point to be detected is consistent with that of the existing performance data classes.
14. The method of claim 13, further comprising:
upon determining that the total number of currently existing performance data classes is zero, or that the dimensionality is inconsistent with the existing performance data classes, generating the performance data point to be detected as a performance data class.
15. A device for detecting cluster anomalies, comprising:
a data acquisition unit adapted to obtain a performance data point to be detected that indicates the performance of the cluster, the performance data point comprising normalized multidimensional performance indicators;
a similarity calculation unit adapted to determine, from existing performance data classes generated by aggregating previously obtained performance data points, the performance data class with the highest similarity to the performance data point to be detected;
a first judging unit adapted to judge whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that performance data class;
an aggregation unit adapted to, when the first judging unit determines that the current similarity threshold is exceeded, aggregate the performance data point to be detected into the determined performance data class, and update the centroid and similarity threshold of the performance data class after the performance data point to be detected is added, according to the following formulas:
$cr = (pt + cr \cdot np)/(np + 1)$
$th = th + \frac{lr}{np}(sim - th)$
where cr is the centroid, np is the total number of data points in the class, pt is the added performance data point, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th;
a second judging unit adapted to calculate whether, after aggregation, the ratio of the total number of data points in the performance data class to the total number of data points in all current performance data classes exceeds an abnormal class threshold; and
a third judging unit adapted to, when the abnormal class threshold is not exceeded, sort the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and calculate whether the ratio of the sum of the largest distances making up a predetermined proportion to the sum of the distances over all dimensions is greater than a distance distribution threshold,
and, when the ratio is greater than the distance distribution threshold, determine that the performance data point to be detected is an abnormal point.
16. The device of claim 15, wherein the data acquisition unit further comprises:
a receiving module adapted to receive, from a performance collection device, a performance data group indicating the collected cluster performance, the performance data group comprising multidimensional performance indicators; and
a normalization module adapted to normalize the performance data group into the performance data point.
17. The device of claim 15 or 16, wherein the multidimensional performance indicators include at least one of memory utilization, CPU utilization, task throughput, task response time, and garbage collection frequency in the cluster.
18. The device of claim 15 or 16, wherein the similarity calculation unit is adapted to determine the performance data class with the highest similarity to the performance data point to be detected as follows:
calculating the distance between the performance data point to be detected and the centroid of each existing performance data class;
calculating, from the distance to the centroid of each performance data class, the similarity between the performance data point to be detected and that class; and
determining the performance data class with the highest similarity to the performance data point to be detected.
19. The device of claim 18, wherein the similarity calculation unit is adapted to calculate the distance between the performance data point to be detected and the centroid of each existing performance data class as follows:
calculating the Euclidean distance between the performance data point to be detected and the centroid of each performance data class.
20. The device of claim 18, wherein the similarity calculation unit is adapted to calculate the similarity between the performance data point to be detected and a performance data class according to the following formula:
[formula not reproduced in this text]
where d is the calculated distance between the performance data point to be detected and the centroid of the performance data class, and sim is the similarity to that class.
21. The device of claim 15 or 16, wherein the third judging unit is adapted to sort the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and to calculate whether the ratio of the sum of the largest distances making up a predetermined proportion to the sum of the distances over all dimensions is greater than the distance distribution threshold, as follows:
$pt = \{n_1, \ldots, n_N\}$, $cr = \{c_1, \ldots, c_N\}$, $d_i = |n_i - c_i|$, where $n_i$ is the i-th performance indicator of the performance data point pt to be detected, $c_i$ is the i-th value of the centroid cr, and $d_i$ is the distance between pt and cr in the i-th dimension;
sort the $d_i$ of all dimensions and calculate
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
where N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, $d_{(j)}$ is the j-th largest distance, so that the numerator is the sum of the M largest of the N distances and the denominator is the sum of all N distances;
judge whether pr is greater than the distance distribution threshold.
22. The device of claim 15 or 16, further comprising a window judging unit adapted to:
add the performance data point to be detected to a sliding window, the sliding window holding a predetermined number of the most recently obtained performance data points; and
when the third judging unit determines that the performance data point to be detected is an abnormal point, judge whether the proportion of abnormal points in the sliding window exceeds a window threshold.
23. The device of claim 22, further comprising an alarm unit adapted to determine, when the window judging unit determines that the window threshold is exceeded, the abnormal performance indicator in the performance data point to be detected according to the distances of the per-dimension performance indicators.
24. The device of claim 15 or 16, wherein the aggregation unit is further adapted to, when the first judging unit determines that the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class:
generate the performance data point to be detected as a new performance data class and add this class to the existing performance data classes; and
judge whether the total number of current performance data classes exceeds a class-count threshold and, if so, merge the two closest performance data classes into one.
25. The device of claim 24, wherein the aggregation unit is adapted to merge the two closest performance data classes into one as follows:
calculate the distance between the centroids of every pair of performance data classes, and determine the two closest classes $cl_1$ and $cl_2$;
merge the classes $cl_1$ and $cl_2$ into a class $cl_3$;
determine the centroid, similarity threshold and total number of data points of $cl_3$ according to the following formulas:
$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$
$th_3 = (np_1 \cdot th_1 + np_2 \cdot th_2)/(np_1 + np_2)$
$np_3 = np_1 + np_2$
where $cr_3$, $cr_2$ and $cr_1$ are the centroids of $cl_3$, $cl_2$ and $cl_1$; $np_1$ and $np_2$ are the total numbers of data points of $cl_1$ and $cl_2$; $th_1$ and $th_2$ are the similarity thresholds of $cl_1$ and $cl_2$; $th_3$ is the similarity threshold of $cl_3$; and $np_3$ is the total number of data points of $cl_3$.
26. The device of claim 15 or 16, wherein:
the second judging unit is further adapted to determine, when the abnormal class threshold is exceeded, that the performance data point to be detected is a non-abnormal point; and
the third judging unit is further adapted to determine, when the distance distribution threshold is not exceeded, that the performance data point to be detected is a non-abnormal point.
27. The device of claim 15 or 16, further comprising a class detection unit adapted to, before the similarity calculation unit determines the performance data class with the highest similarity to the performance data point to be detected:
judge whether the total number of currently existing performance data classes is non-zero; and/or
judge whether the dimensionality of the performance data point to be detected is consistent with that of the existing performance data classes.
28. The device of claim 27, wherein the class detection unit is further adapted to, upon determining that the total number of currently existing performance data classes is zero, or that the dimensionality is inconsistent with the existing performance data classes, instruct the aggregation unit to generate the performance data point to be detected as a performance data class.
29. A system for managing a cluster, comprising:
a performance collection device adapted to collect performance indicators of the cluster;
the device for detecting cluster anomalies of any one of claims 15-28; and
a resource management device adapted to adjust the resource configuration of the cluster according to alarm information generated by the device for detecting cluster anomalies.
CN201610380755.8A 2016-06-01 2016-06-01 Detect the method for cluster exception and the system of application, management cluster Active CN105871634B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610380755.8A CN105871634B (en) 2016-06-01 2016-06-01 Detect the method for cluster exception and the system of application, management cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610380755.8A CN105871634B (en) 2016-06-01 2016-06-01 Detect the method for cluster exception and the system of application, management cluster

Publications (2)

Publication Number Publication Date
CN105871634A CN105871634A (en) 2016-08-17
CN105871634B true CN105871634B (en) 2019-02-15

Family

ID=56675631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610380755.8A Active CN105871634B (en) 2016-06-01 2016-06-01 Detect the method for cluster exception and the system of application, management cluster

Country Status (1)

Country Link
CN (1) CN105871634B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228442B (en) * 2016-12-14 2020-10-27 华为技术有限公司 Abnormal node detection method and device
CN108206813B (en) * 2016-12-19 2021-08-06 中国移动通信集团山西有限公司 Security audit method and device based on k-means clustering algorithm and server
CN107238407B (en) * 2017-05-03 2019-10-08 华北水利水电大学 Project of South-to-North water diversion secure data abnormal patterns find method and system
CN109271289B (en) * 2017-07-18 2022-05-03 车伯乐(北京)信息科技有限公司 Application interface monitoring method, device, equipment and computer readable medium
CN107528904B (en) * 2017-09-01 2020-02-18 星环信息科技(上海)有限公司 Method and apparatus for data distributed anomaly detection
CN107835098B (en) * 2017-11-28 2021-01-29 车智互联(北京)科技有限公司 Network fault detection method and system
CN107995030B (en) * 2017-11-28 2021-09-14 车智互联(北京)科技有限公司 Network detection method, network fault detection method and system
CN109374063B (en) * 2018-12-04 2021-04-23 广东电网有限责任公司 Cluster management-based transformer anomaly detection method, device and equipment
US10977112B2 (en) 2019-01-22 2021-04-13 International Business Machines Corporation Performance anomaly detection
CN110502346A (en) * 2019-08-28 2019-11-26 高瑶 Resource information management system and method under a kind of cluster environment
CN111612038B (en) * 2020-04-24 2024-04-26 平安直通咨询有限公司上海分公司 Abnormal user detection method and device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102104611A (en) * 2011-03-31 2011-06-22 中国人民解放军信息工程大学 Promiscuous mode-based DDoS (Distributed Denial of Service) attack detection method and device
CN102547715A (en) * 2012-02-07 2012-07-04 上海交通大学 Method for detecting wireless mesh network attack
CN103001825A (en) * 2012-11-15 2013-03-27 中国科学院计算机网络信息中心 Method and system for detecting DNS (domain name system) traffic abnormality
CN104536996A (en) * 2014-12-12 2015-04-22 南京理工大学 Computational node anomaly detection method in isomorphic environments

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150219530A1 (en) * 2013-12-23 2015-08-06 Exxonmobil Research And Engineering Company Systems and methods for event detection and diagnosis

Also Published As

Publication number Publication date
CN105871634A (en) 2016-08-17

Similar Documents

Publication Publication Date Title
CN105871634B (en) Detect the method for cluster exception and the system of application, management cluster
CN109542740B (en) Abnormality detection method and apparatus
CN107871190B (en) Service index monitoring method and device
CN112188531B (en) Abnormality detection method, abnormality detection device, electronic apparatus, and computer storage medium
JP6354755B2 (en) System analysis apparatus, system analysis method, and system analysis program
KR101948604B1 (en) Method and device for equipment health monitoring based on sensor clustering
CN108429649B (en) System for comprehensive abnormality judgment based on multiple single-type acquisition results
CN106600115A (en) Intelligent operation and maintenance analysis method for enterprise information system
CN105071983A (en) Abnormal load detection method for cloud computing online services
CN108984376B (en) System anomaly detection method, device and equipment
CN101908065A (en) On-line attribute abnormal point detecting method for supporting dynamic update
JP2014032657A (en) Abnormality detecting method and device thereof
CN111367747B (en) Metric anomaly detection and early-warning device based on time annotation
CN103154904B (en) Operational administrative equipment, operation management method and program
KR20170084445A (en) Method and apparatus for detecting abnormality using time-series data
JP5928104B2 (en) Performance monitoring device, performance monitoring method, and program thereof
CN116976707B (en) User electricity consumption data anomaly analysis method and system based on electricity consumption data acquisition
CN109685140A (en) Gantry crane state classification method using the DBSCAN algorithm based on principal component analysis
CN115858303B (en) Zabbix-based server performance monitoring method and system
CN114327964A (en) Method, device, equipment and storage medium for processing fault reasons of service system
CN111444060A (en) Anomaly detection model training method, anomaly detection method and related device
CN111240943A (en) Method, device and equipment for monitoring temperature of machine room and storage medium
KR101960755B1 (en) Method and apparatus of generating unacquired power data
CN109976986B (en) Abnormal equipment detection method and device
CN110889597A (en) Method and device for detecting anomalies in business time-series indicators

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220720

Address after: 100193 Room 101-216, 2nd Floor, Building 4, East District, Yard 10, Xibeiwang East Road, Haidian District, Beijing

Patentee after: Beijing Ruixiang Technology Co.,Ltd.

Address before: 100191 Floors 3 and 4, Building A-5, Dongsheng Science Park, Zhongguancun, No. 66 Xixiaokou Road, Haidian District, Beijing

Patentee before: BEIJING ONEAPM Co.,Ltd.