CN105871634A - Method and application for detecting cluster anomalies and cluster managing system - Google Patents


Info

Publication number
CN105871634A
Authority
CN
China
Prior art keywords
performance
class
performance data
detected
strong point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610380755.8A
Other languages
Chinese (zh)
Other versions
CN105871634B (en)
Inventor
吴海珊
阮松松
刘麒贇
傅乐琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruixiang Technology Co ltd
Original Assignee
Beijing Oneapm Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Oneapm Communication Technology Co Ltd
Priority to CN201610380755.8A
Publication of CN105871634A
Application granted
Publication of CN105871634B
Legal status: Active
Anticipated expiration


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0631: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/50: Testing arrangements
    • H04L43/55: Testing of service level quality, e.g. simulating service usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a method and an application for detecting cluster anomalies, and a cluster management system. The method for detecting cluster anomalies includes the following steps. A performance data point to be detected, which indicates cluster performance, is obtained. The performance data class with the highest similarity to the performance data point is determined. Whether the similarity between the performance data point and the determined performance data class exceeds a similarity threshold is judged. When the similarity exceeds the threshold, the performance data point is aggregated into the determined performance data class, and it is calculated whether the proportion of the number of data points in that class to the total number of data points in all performance data classes exceeds an anomalous-class threshold. When the proportion does not exceed the anomalous-class threshold, the per-dimension distances between the performance data point and the class centroid are sorted, and it is calculated whether the ratio of the sum of the largest distances in a predetermined proportion to the sum of the distances over all dimensions is larger than a distance distribution threshold. When the ratio is larger than the distance distribution threshold, the performance data point to be detected is an anomalous point.

Description

Method and application for detecting cluster anomalies, and system for managing a cluster
Technical field
The present invention relates to the Internet field, and in particular to a method and an application for detecting cluster anomalies, and a system for managing a cluster.
Background
With the progress of Internet technology, clusters based on cloud computing architectures are increasingly used in various fields. A cluster generally includes multiple computing devices (for example, application servers or database servers). A cluster can be configured to run distributed applications, or to provide multiple similar computing services in a load-balanced manner. Clusters are highly scalable and usually contain a large number of device nodes. To maintain cluster performance, it is very important to monitor that performance.
Given the large volume of cluster performance data, highly automated and highly accurate performance detection means are urgently needed. At present, some disclosed performance detection means (also called anomaly detection means) use machine learning to classify performance data and identify anomalous data. Machine learning for performance detection includes supervised and unsupervised learning on performance data. For example, a clustering algorithm based on k-means clusters the performance data and detects anomalies. However, existing anomaly detection means still fall short in aspects such as accuracy and stability.
Therefore, the present invention proposes a new anomaly detection scheme.
Summary of the invention
To this end, the present invention provides a new anomaly detection scheme that effectively solves at least one of the above problems.
According to one aspect of the present invention, a method for detecting cluster anomalies is provided, including the following steps. A performance data point to be detected, which indicates cluster performance, is obtained. The performance data point includes normalized multi-dimensional performance indicators. From the existing performance data classes generated by aggregating previously obtained performance data points, the performance data class with the highest similarity to the performance data point to be detected is determined. Whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that class is judged. When the current similarity threshold is exceeded, the performance data point to be detected is aggregated into the determined performance data class, and it is calculated whether, after aggregation, the proportion of the number of data points in that class to the total number of data points in all current performance data classes exceeds an anomalous-class threshold. When the anomalous-class threshold is not exceeded, the per-dimension distances between the performance data point to be detected and the centroid of the performance data class are sorted, and it is calculated whether the ratio of the sum of the largest distances in a predetermined proportion to the sum of the distances over all dimensions is greater than a distance distribution threshold. When the ratio is greater than the distance distribution threshold, the performance data point to be detected is determined to be an anomalous point.
According to another aspect of the present invention, an application for detecting cluster anomalies is provided, including a data acquisition unit, a similarity calculation unit, a first judging unit, an aggregation unit, a second judging unit and a third judging unit. The data acquisition unit is adapted to obtain a performance data point to be detected, which indicates cluster performance. The performance data point includes normalized multi-dimensional performance indicators. The similarity calculation unit is adapted to determine, from the existing performance data classes generated by aggregating previously obtained performance data points, the performance data class with the highest similarity to the performance data point to be detected. The first judging unit is adapted to judge whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that class. The aggregation unit is adapted to aggregate the performance data point to be detected into the determined performance data class when the first judging unit determines that the current similarity threshold is exceeded. The second judging unit is adapted to calculate whether, after aggregation, the proportion of the number of data points in that class to the total number of data points in all current performance data classes exceeds an anomalous-class threshold. The third judging unit is adapted, when the anomalous-class threshold is not exceeded, to sort the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and to calculate whether the ratio of the sum of the largest distances in a predetermined proportion to the sum of the distances over all dimensions is greater than a distance distribution threshold. When the ratio is greater than the distance distribution threshold, the third judging unit determines that the performance data point to be detected is an anomalous point.
Optionally, in the application for detecting cluster anomalies according to the present invention, the data acquisition unit further includes a receiving module and a normalization module. The receiving module is adapted to receive a performance data group, collected by a performance collector, that indicates cluster performance. The performance data group includes multi-dimensional performance indicators. The normalization module is adapted to normalize the performance data group into a performance data point. The multi-dimensional performance indicators include at least one of memory utilization, CPU utilization, task throughput, task response time and garbage collection frequency of the cluster.
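The text does not say how the normalization module maps raw indicators into a common range. As a minimal sketch, assuming min-max scaling with per-indicator bounds that an operator supplies (the function name `normalize_group` and the `bounds` parameter are hypothetical, not from the patent), the step could look like this:

```python
def normalize_group(group, bounds):
    """Scale each raw indicator into [0, 1].

    group  -- raw performance data group, one value per dimension
    bounds -- assumed (low, high) range per dimension; not specified by the source
    """
    point = []
    for value, (low, high) in zip(group, bounds):
        if high <= low:
            raise ValueError("each bound must satisfy low < high")
        scaled = (value - low) / (high - low)
        # Clamp readings that fall outside the assumed range.
        point.append(min(1.0, max(0.0, scaled)))
    return point


# Example: CPU utilization in percent and a response time in seconds.
pt = normalize_group([50.0, 8.0], [(0.0, 100.0), (0.0, 16.0)])
```

Any other mapping into [0, 1] would serve equally well, as long as every dimension ends up on the same scale before distances are computed.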
Optionally, in the application for detecting cluster anomalies according to the present invention, the similarity calculation unit is adapted to determine the performance data class with the highest similarity to the performance data point to be detected in the following manner. The distance between the performance data point to be detected and the centroid of each existing performance data class is calculated. From the distance to the centroid of each performance data class, the similarity between the performance data point to be detected and that class is calculated. The performance data class with the highest similarity to the performance data point to be detected is then determined. Here, the similarity calculation unit calculates the distance between the performance data point to be detected and the centroid of each existing performance data class as the Euclidean distance between the point and the centroid.
Optionally, in the application for detecting cluster anomalies according to the present invention, the similarity calculation unit is adapted to calculate the similarity between the performance data point to be detected and a performance data class according to the following formula:
$$\mathrm{sim} = \frac{1}{1 + d}$$
where d is the calculated distance between the performance data point to be detected and the centroid of the performance data class, and sim is the similarity to that class.
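The similarity formula and the search for the most similar class can be sketched as follows. This is only an illustration: the function names are hypothetical, and each class is represented here by nothing more than its center `cr` as a list of floats.

```python
import math


def euclidean(pt, cr):
    """Euclidean distance d between a data point and a class center."""
    return math.sqrt(sum((n - c) ** 2 for n, c in zip(pt, cr)))


def similarity(pt, cr):
    """sim = 1 / (1 + d): equals 1 at distance 0 and decays toward 0."""
    return 1.0 / (1.0 + euclidean(pt, cr))


def most_similar_class(pt, centers):
    """Return (index, sim) of the class center most similar to pt."""
    sims = [similarity(pt, cr) for cr in centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]


best, sim = most_similar_class([0.1, 0.1], [[0.9, 0.9], [0.0, 0.0]])
```

Because sim is a strictly decreasing function of d, picking the highest similarity is the same as picking the nearest center.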
Optionally, in the application for detecting cluster anomalies according to the present invention, the aggregation unit is further adapted to update, according to the following formulas, the centroid and the similarity threshold of the performance data class after the performance data point to be detected has been added:
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
$$th = th + \frac{lr}{np} \cdot (\mathrm{sim} - th)$$
where cr is the centroid, np is the total number of data points in the class, pt is the added performance data point, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th.
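A minimal sketch of this incremental update, assuming that np in both formulas is the number of points in the class before pt is added (the text does not say which count the threshold update uses). The function name `add_point` and the argument `n_points` (standing in for np) are hypothetical.

```python
def add_point(cr, n_points, th, pt, sim, lr):
    """Fold pt into a class and return the updated (cr, np, th).

    cr       -- current class center, one value per dimension
    n_points -- np, points in the class before adding pt (assumed >= 1)
    th       -- current similarity threshold of the class
    sim      -- similarity between pt and the class
    lr       -- learning rate that controls how fast th moves toward sim
    """
    # cr = (pt + cr * np) / (np + 1): running mean over all points in the class.
    new_cr = [(p + c * n_points) / (n_points + 1) for p, c in zip(pt, cr)]
    # th = th + (lr / np) * (sim - th): the step shrinks as the class grows.
    new_th = th + (lr / n_points) * (sim - th)
    return new_cr, n_points + 1, new_th


cr, n_points, th = add_point([0.0, 0.0], 1, 0.5, [1.0, 1.0], sim=0.75, lr=0.5)
```

Dividing the learning rate by np means large, well-established classes change their threshold only slightly with each new point.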
Optionally, in the application for detecting cluster anomalies according to the present invention, the third judging unit is adapted to sort the per-dimension distances between the performance data point to be detected and the centroid of the performance data class, and to judge whether the ratio of the sum of the largest distances in a predetermined proportion to the sum of the distances over all dimensions is greater than the distance distribution threshold, in the following manner:
$$pt = \{n_1, \ldots, n_i\}, \quad cr = \{c_1, \ldots, c_i\}, \quad d_i = |n_i - c_i|$$
where n_i is the i-th dimension performance indicator of the performance data point pt to be detected, c_i is the i-th dimension value of the centroid cr, and d_i is the distance between pt and cr in the i-th dimension.
The d_i of all dimensions are sorted, and the following ratio is calculated:
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
where N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, the numerator is the sum of the M largest of the N per-dimension distances (d_(j) denotes the j-th largest distance), and the denominator is the sum of all N distances.
Whether pr is greater than the distance distribution threshold is then judged.
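The ratio pr can be sketched as below. Two details are assumptions rather than statements from the text: M is obtained by rounding the predetermined proportion of N up to at least one dimension, and pr is taken as 0 when the point coincides with the center. The function name `distance_ratio` is hypothetical.

```python
import math


def distance_ratio(pt, cr, keep_ratio=0.3):
    """pr: share of the total per-dimension distance held by the M largest distances."""
    # d_i = |n_i - c_i|, sorted from largest to smallest.
    d = sorted((abs(n - c) for n, c in zip(pt, cr)), reverse=True)
    total = sum(d)
    if total == 0:
        return 0.0  # assumed convention: a point on the center shows no skew
    # Assumed rounding rule for M; the source only says "predetermined proportion".
    m = max(1, math.ceil(keep_ratio * len(d)))
    return sum(d[:m]) / total


# A deviation concentrated in one dimension gives pr close to 1.
skewed = distance_ratio([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], keep_ratio=0.25)
# A deviation spread evenly over all dimensions gives pr close to keep_ratio.
even = distance_ratio([0.5, 0.5, 0.5, 0.5], [0.0, 0.0, 0.0, 0.0], keep_ratio=0.5)
```

A high pr therefore indicates that a few indicators account for most of the deviation from the class center, which is the pattern the third judgment treats as anomalous.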
Optionally, the application for detecting cluster anomalies according to the present invention further includes a window judging unit, adapted to add the performance data point to be detected to a sliding window. The sliding window holds a predetermined number of the most recently obtained performance data points. When the third judging unit determines that the performance data point to be detected is an anomalous point, the window judging unit judges whether the proportion of anomalous points in the sliding window exceeds a window threshold.
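One way to sketch the sliding window is a fixed-length queue of anomaly flags. The class name `AnomalyWindow` is hypothetical, and the proportion is computed over the points currently in the window, which is an assumption for the period before the window has filled up.

```python
from collections import deque


class AnomalyWindow:
    """Keeps the anomaly flags of the most recent `size` data points."""

    def __init__(self, size, window_threshold):
        self.flags = deque(maxlen=size)  # oldest flag is dropped automatically
        self.window_threshold = window_threshold

    def add(self, is_anomaly):
        """Record a point; return True if it is anomalous and the window share is too high."""
        self.flags.append(bool(is_anomaly))
        share = sum(self.flags) / len(self.flags)
        # The share is only acted on when the new point itself is anomalous.
        return bool(is_anomaly) and share > self.window_threshold


window = AnomalyWindow(size=4, window_threshold=0.5)
results = [window.add(flag) for flag in (False, True, True, True, False)]
```

Requiring a sustained share of anomalous points, rather than reacting to a single one, is what lets the window suppress isolated spikes.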
Optionally, the application for detecting cluster anomalies according to the present invention further includes an alarm unit. The alarm unit is adapted, when the window judging unit determines that the window threshold is exceeded, to determine the anomalous performance indicators in the performance data point to be detected according to the distance of each dimension's performance indicator.
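The text does not spell out here how per-dimension distances are turned into a list of anomalous indicators. One plausible reading, shown purely as an assumption, is to report the indicators whose distance to the class center is largest. The function name, the `top_k` parameter and the indicator names are all hypothetical.

```python
def anomalous_indicators(pt, cr, names, top_k=1):
    """Return the names of the top_k indicators farthest from the class center."""
    ranked = sorted(
        zip(names, pt, cr),
        key=lambda item: abs(item[1] - item[2]),  # per-dimension distance |n_i - c_i|
        reverse=True,
    )
    return [name for name, _, _ in ranked[:top_k]]


suspects = anomalous_indicators(
    pt=[0.9, 0.2, 0.3],
    cr=[0.1, 0.2, 0.25],
    names=["cpu_utilization", "memory_utilization", "gc_frequency"],
    top_k=2,
)
```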
Optionally, in the application for detecting cluster anomalies according to the present invention, when the first judging unit determines that the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class, the aggregation unit is further adapted to generate a new performance data class from the performance data point to be detected and to add this class to the existing performance data classes. The aggregation unit is further adapted to judge whether the total number of current performance data classes exceeds a class-count threshold and, when it does, to merge the two closest performance data classes into one. The aggregation unit merges the two closest performance data classes in the following manner. The distances between the centroids of every pair of performance data classes are calculated, and the two closest classes cl_1 and cl_2 are determined. cl_1 and cl_2 are merged into class cl_3. The centroid, similarity threshold and total number of data points of cl_3 are determined according to the following formulas:
$$cr_3 = cr_1 \cdot np_1 + cr_2 \cdot np_2$$
$$th_3 = \frac{np_1 \cdot th_1 + np_2 \cdot th_2}{np_1 + np_2}$$
$$np_3 = np_1 + np_2$$
where cr_3 is the centroid of cl_3, cr_2 is the centroid of cl_2, cr_1 is the centroid of cl_1, np_1 is the total number of data points of cl_1, np_2 is the total number of data points of cl_2, th_1 is the similarity threshold of cl_1, th_2 is the similarity threshold of cl_2, th_3 is the similarity threshold of cl_3, and np_3 is the total number of data points of cl_3.
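The merge step can be sketched as follows. Each class is modelled as a plain dictionary with keys `cr`, `np` and `th`, which is a hypothetical representation. One deliberate deviation is labelled in the code: the formula for cr3 as printed is a weighted sum, while this sketch also divides by np1 + np2, on the assumption that the merged center should remain a per-dimension mean like th3.

```python
import math
from itertools import combinations


def merge_closest(classes):
    """Merge the two classes whose centers are closest; return a new class list."""
    if len(classes) < 2:
        raise ValueError("need at least two classes to merge")
    # Find the pair (cl1, cl2) with the smallest center-to-center distance.
    i, j = min(
        combinations(range(len(classes)), 2),
        key=lambda ij: math.dist(classes[ij[0]]["cr"], classes[ij[1]]["cr"]),
    )
    a, b = classes[i], classes[j]
    n = a["np"] + b["np"]  # np3 = np1 + np2
    merged = {
        # Assumption: normalized by np1 + np2 so cr3 stays a weighted mean.
        "cr": [(x * a["np"] + y * b["np"]) / n for x, y in zip(a["cr"], b["cr"])],
        # th3 = (np1 * th1 + np2 * th2) / (np1 + np2)
        "th": (a["np"] * a["th"] + b["np"] * b["th"]) / n,
        "np": n,
    }
    return [c for k, c in enumerate(classes) if k not in (i, j)] + [merged]


classes = [
    {"cr": [0.0, 0.0], "np": 1, "th": 0.5},
    {"cr": [0.5, 0.0], "np": 3, "th": 0.75},
    {"cr": [1.0, 1.0], "np": 2, "th": 0.6},
]
merged_classes = merge_closest(classes)
```

Calling this whenever the class count exceeds the class-count threshold keeps the model size bounded.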
Optionally, in the application for detecting cluster anomalies according to the present invention, the second judging unit is further adapted to determine that the performance data point to be detected is not an anomalous point when the anomalous-class threshold is exceeded. The third judging unit is further adapted to determine that the performance data point to be detected is not an anomalous point when the distance distribution threshold is not exceeded.
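Taken together, the second and third judgments form a short decision cascade for a point that has just been aggregated into a class. The sketch below takes the class share inputs and the ratio pr as already computed; the function and parameter names are hypothetical.

```python
def is_anomalous_after_aggregation(class_size, total_points, pr,
                                   anomaly_class_threshold, distance_threshold):
    """Apply the second and third judgments to a freshly aggregated point."""
    share = class_size / total_points
    # Second judgment: a class holding a large share of all points is treated
    # as normal, so its new member is not an anomalous point.
    if share > anomaly_class_threshold:
        return False
    # Third judgment: in a small class, the point is anomalous only when its
    # deviation is concentrated in a few dimensions (pr above the threshold).
    return pr > distance_threshold


verdicts = [
    is_anomalous_after_aggregation(90, 100, 0.9, 0.1, 0.6),  # big class
    is_anomalous_after_aggregation(5, 100, 0.9, 0.1, 0.6),   # small class, skewed
    is_anomalous_after_aggregation(5, 100, 0.4, 0.1, 0.6),   # small class, even
]
```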
Optionally, the application for detecting cluster anomalies according to the present invention further includes a class detection unit. The class detection unit is adapted, before the similarity calculation unit determines the performance data class with the highest similarity to the performance data point to be detected, to judge whether the current total number of existing performance data classes is non-zero, and/or to judge whether the dimensions of the performance data point to be detected are consistent with those of the existing performance data classes.
Optionally, in the application for detecting cluster anomalies according to the present invention, the class detection unit is further adapted, when it determines that the current total number of existing performance data classes is zero or that the dimensions are inconsistent with the existing performance data classes, to instruct the aggregation unit to generate a performance data class from the performance data point to be detected.
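The pre-check performed by the class detection unit amounts to two simple tests. In this sketch, classes are represented only by their centers, and the function name `needs_new_class` is hypothetical.

```python
def needs_new_class(pt, centers):
    """True when pt must seed a new class instead of being compared to existing ones."""
    if not centers:
        return True  # no performance data class exists yet
    # A change in the number of indicators makes existing centers incomparable.
    return any(len(cr) != len(pt) for cr in centers)


first_point = needs_new_class([0.2, 0.4], [])
same_shape = needs_new_class([0.2, 0.4], [[0.1, 0.1], [0.9, 0.8]])
new_dimension = needs_new_class([0.2, 0.4, 0.6], [[0.1, 0.1], [0.9, 0.8]])
```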
According to a further aspect of the present invention, a system for managing a cluster is provided, including a performance collector, an application for detecting cluster anomalies, and a resource management application. The performance collector is adapted to collect performance indicators of the cluster. The resource management application is adapted to adjust the resource configuration of the cluster according to alarm messages generated by the application for detecting cluster anomalies.
The anomaly detection scheme according to the present invention can perform incremental clustering, in real time, on obtained performance data points that include multi-dimensional performance indicators, and uses adaptive thresholds during clustering to judge whether the class that a performance data point joins is an anomalous class. As a result, the classes formed by the scheme, and the accuracy of the outlier detection it performs, are robust. Further, by statistically evaluating the per-dimension distances between a performance data point and the class centroid, the scheme can better distinguish points of high similarity from points of low similarity within a class, which lowers the false alarm rate. In addition, by judging the proportion of anomalous points within a sliding window, the scheme can further improve the accuracy of anomaly alarms. The scheme can also control the total number of classes in the clustering model, and re-create the clustering model promptly when the data dimensions change, thereby ensuring the stability of anomaly detection.
Brief description of the drawings
To achieve the above and related purposes, certain illustrative aspects are described herein in conjunction with the following description and the accompanying drawings. These aspects indicate various ways in which the principles disclosed herein can be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout the disclosure, the same reference numerals generally refer to the same parts or elements.
Fig. 1 shows a schematic diagram of a cluster 100 according to some embodiments of the present invention;
Fig. 2 shows a schematic diagram of an application 200 for detecting cluster anomalies according to some embodiments of the present invention;
Fig. 3 shows a schematic diagram of an application 300 for detecting cluster anomalies according to some embodiments of the present invention;
Fig. 4 shows a flow chart of a method 400 for detecting cluster anomalies according to some embodiments of the present invention; and
Fig. 5 shows a flow chart of a method 500 for detecting cluster anomalies according to some embodiments of the present invention.
Detailed description of the invention
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a schematic diagram of a cluster 100 according to some embodiments of the present invention.
As shown in Fig. 1, the cluster 100 includes multiple computing devices. Each computing device is a device node in the cluster. The cluster 100 includes application servers 110 and 120, database servers 130 and 140, a management server 150 and a monitoring server 160, but is not limited to these. A resource management application 151 resides in the management server 150. A performance collector 161 and an application 162 for detecting cluster anomalies reside in the monitoring server 160.
The resource management application 151 is adapted to perform resource scheduling and management for the device nodes in the cluster 100, for example, instructing a device node to create a server instance, isolating a device node, or adding a new device node to the cluster. Depending on the framework of the cluster 100 (such as Hadoop or Spark), the resource management application 151 can be any of a variety of known cluster management applications, which are not described further here.
The performance collector 161 is adapted to collect performance indicator data for at least part of the cluster 100. The performance indicator data can cover many kinds of indicators, such as those of device node hardware, operating systems and applications. The types of performance indicator data include, for example, memory utilization, CPU utilization, disk usage, task throughput, task response time and garbage collection frequency, but are not limited to these. Task throughput can be the number of tasks (for example, access requests or computing tasks) that a device node can process per unit of time. In an embodiment according to the present invention, the performance collector 161 can collect performance data groups periodically. Each performance data group includes performance indicators of multiple dimensions. Here, the performance indicators of the dimensions can all be of the same type, for example the memory utilization of multiple device nodes. Each performance data group can also include performance indicators of multiple kinds. For example, a performance data group can include multiple performance indicator values of one device node. As another example, a performance data group can include multiple performance indicators of each of multiple device nodes. In addition, the performance collector 161 can collect data using any of a variety of known techniques. For example, a probe (agent) that gathers performance indicator data is deployed in each device node, and the probes forward the collected performance indicator data to the performance collector 161. From the performance indicator data of the multiple device nodes, the performance collector 161 is configured to generate performance data groups that include multiple dimensions. In general, the performance indicators of all dimensions in a performance data group are collected at the same time, although some timing error may exist. To simplify the description, further known implementations of the performance collector are not described here; any of them can be used with the present invention.
The application 162 for detecting cluster anomalies is adapted to perform anomaly detection based on clustering learning, using the performance data groups collected by the performance collector. When the application 162 determines that the cluster 100 is abnormal, it can also generate a corresponding alarm message and transmit it to the resource management application 151. The resource management application 151 can then carry out operations such as cluster resource scheduling according to this alarm message.
It should be noted that although Fig. 1 shows the application 162 for detecting cluster anomalies and the performance collector 161 residing in the monitoring server 160, the present invention is not limited in this respect. In one embodiment, the performance collector 161 and the application 162 are distributed over different device nodes. For example, the performance collector 161 can be configured to reside in the management server 150. In addition, the applications 151, 161 and 162 according to the present invention are not limited to residing in a single device node. In another embodiment of the present invention, each application can be a distributed application. For example, the application 162 for detecting cluster anomalies is distributed over multiple device nodes, so that it can detect performance data groups with high real-time performance. The application for detecting cluster anomalies according to the present invention is described in more detail below with reference to Fig. 2.
Fig. 2 shows a schematic diagram of an application 200 for detecting cluster anomalies according to some embodiments of the present invention. It should be noted that the application 200 can reside in a single computing device or can be a distributed application; to simplify the description, this is not elaborated further below.
As shown in Fig. 2, the application 200 includes a data acquisition unit 210, a similarity calculation unit 220, a first judging unit 230, an aggregation unit 240, a second judging unit 250 and a third judging unit 260.
The data acquisition unit 210 is adapted to obtain a performance data point to be detected, which indicates the performance of the cluster (100). Because of the subsequent similarity calculation, the performance data point here includes normalized multi-dimensional performance indicators.
In an embodiment according to the present invention, the data acquisition unit 210 can obtain performance data groups from the performance collector (161). A performance data group includes multi-dimensional performance indicators. The performance indicators in the performance data group are already on a normalized scale (each indicator ranges from 0 to 1). The data acquisition unit 210 can therefore use each performance data group from the performance collector directly as a performance data point with multi-dimensional attribute values.
In yet another embodiment, at least some of the performance indicators in a performance data group from the performance collector are not on a normalized scale. In other words, the range of at least some of the performance indicators is not limited to the interval from 0 to 1. For this case, the data acquisition unit 210 can be configured to include a receiving module (not shown) and a normalization module (not shown). The receiving module is adapted to receive performance data groups, collected by the performance collector, that indicate cluster performance. The normalization module is adapted to normalize each performance data group into a performance data point. For example,
$$pt = \{n_1, \ldots, n_i\}$$
where pt is a performance data point that includes performance indicators of i dimensions. Each performance indicator takes a value in the interval [0, 1].
The similarity calculation unit 220 is adapted to calculate and determine the existing performance data class with the highest similarity to the performance data point currently to be detected. Before this performance data point arrives, the aggregation unit 240 has usually generated at least one performance data class. Each performance data class includes one or more performance data points. To distinguish them from the performance data point currently to be detected, the data points in each performance data class are referred to in the present invention as detected performance data points. Here, the existing performance data classes are the incremental clustering model that the application 200 has built for the performance data points. Specifically, the similarity calculation unit 220 can calculate the similarity between the performance data point to be detected and each existing performance data class, and then determine the performance data class with the highest similarity. In one embodiment, the similarity calculation unit 220 first calculates the distance between the performance data point to be detected and the centroid of a performance data class. The centroid has the same dimensions as each performance data point in the class. The value of each dimension of the centroid is the mean, in that dimension, of all performance data points in the class. In other words, the centroid is the center of mass of the class. The distance here can be the Euclidean distance, or it can be determined by another known distance calculation. In addition, the present invention can also use other known similarity calculations, such as cosine similarity, to determine the similarity between the performance data point to be detected and a performance data class; these are not described further here.
After determining performance number strong point to be detected and the distance of the center particle of a performance data class, phase Performance number strong point to be detected and this performance number can be calculated according to this distance like degree computing unit 220 Similarity according to class.In an embodiment of the invention, similarity calculated calculates according to following formula Performance number strong point and the similarity of performance data class.
$$sim = \frac{1}{1 + d}$$
Here, d is the computed distance between the performance data point to be detected and the center particle of the performance data class, and sim is the similarity between the point and that class.
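The distance-to-similarity mapping above can be sketched as follows. This is a minimal illustration that assumes the Euclidean distance option; the function names (`euclidean_distance`, `similarity`, `most_similar_class`) are hypothetical and do not come from the patent.

```python
import math

def euclidean_distance(point, center):
    """Euclidean distance between a data point and a class center particle."""
    if len(point) != len(center):
        raise ValueError("point and center must have the same dimension")
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))

def similarity(point, center):
    """sim = 1 / (1 + d): equals 1 at zero distance and decays toward 0."""
    return 1.0 / (1.0 + euclidean_distance(point, center))

def most_similar_class(point, centers):
    """Return (index, similarity) of the center particle most similar to the point."""
    sims = [similarity(point, c) for c in centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]
```

Because d is non-negative, sim always lies in the interval (0, 1], which makes it directly comparable with a similarity threshold such as the initial value 0.5 mentioned in the text.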
After the similarity calculation unit 220 determines the performance data class with the highest similarity to the performance data point to be detected, the first judging unit 230 is adapted to judge whether this highest similarity exceeds the current similarity threshold of that class. The similarity threshold may be a fixed threshold, or may be configured as an adaptively adjusted threshold. When a performance data class is created, it is configured with an initial similarity threshold, for example 0.5. Each time a performance data point is added to the class, the similarity threshold is adjusted once. The similarity threshold is described in more detail below.
When the first judging unit 230 determines that the highest similarity exceeds the current similarity threshold, the aggregation unit 240 is adapted to aggregate the performance data point to be detected into that performance data class.
The second judging unit 250 is adapted to compute, after aggregation, whether the ratio of the number of data points in that performance data class to the total number of data points in all current performance data classes exceeds the abnormal-class threshold. In general, normal data points make up the larger share of the performance data points. The higher the computed ratio, the lower the probability that the performance data class is an abnormal class.
When the second judging unit 250 determines that the ratio does not exceed the abnormal-class threshold, the third judging unit 260 sorts the per-dimension distances between the performance data point to be detected and the center particle of the performance data class it has joined. The third judging unit 260 extracts the largest distances in a predetermined proportion (for example 30%), and judges whether the ratio of the sum of the extracted largest distances to the sum of the distances over all dimensions exceeds the distance-distribution threshold. According to one embodiment of the invention, the third judging unit 260 makes this judgment as follows.
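The ratio check performed by the second judging unit can be sketched as below. The function names and the threshold value used in the usage note are illustrative assumptions; the patent does not state a concrete abnormal-class threshold.

```python
def class_share(class_sizes, class_index):
    """Share of all clustered data points that belong to one class."""
    total = sum(class_sizes)
    if total == 0:
        raise ValueError("no data points have been clustered yet")
    return class_sizes[class_index] / total

def exceeds_abnormal_class_threshold(class_sizes, class_index, threshold):
    """True when the class is large enough to be treated as a normal class."""
    return class_share(class_sizes, class_index) > threshold
```

For example, with class sizes `[95, 5]` and an assumed threshold of 0.1, the first class exceeds the threshold, so a point joining it is treated as normal, while a point joining the second class proceeds to the distance-distribution check.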
$$pt = \{n_1, \ldots, n_N\}, \qquad cr = \{c_1, \ldots, c_N\}, \qquad d_i = |n_i - c_i|$$
Here, pt is a performance data point, cr is a center particle, $n_i$ is the i-th dimension performance indicator of the performance data point pt to be detected, $c_i$ is the i-th dimension value of the center particle cr, and $d_i$ is the distance between the i-th dimension of pt and the i-th dimension of cr.
The third judging unit 260 sorts the $d_i$ of all dimensions and computes
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
Here, N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, $d_{(j)}$ denotes the j-th largest of the N distances, so that the numerator is the sum of the M largest values among the N per-dimension distances, and the denominator is the sum of all N distances. Finally, the third judging unit 260 judges whether pr exceeds the distance-distribution threshold. A value below the distance-distribution threshold indicates that the deviations are spread fairly evenly across the dimensions of the performance data point, so the point is more likely to be a normal data point. A normal data point may indicate that the cluster has no anomaly. When pr exceeds the distance-distribution threshold, the third judging unit 260 determines that the performance data point to be detected is an abnormal point. The application 200 may also generate an alarm message from the abnormal point and notify the resource management application (151). In this way, by judging whether the distance-distribution threshold is exceeded, the application 200 for detecting cluster anomalies according to the present invention can improve the accuracy of abnormal-point detection.
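The computation of pr can be sketched as follows. How M is rounded from the predetermined proportion is not specified in the text, so this sketch assumes rounding up with a minimum of one dimension; returning 0 when the point coincides with the center particle is likewise an assumption.

```python
import math

def distance_distribution_ratio(point, center, top_fraction=0.3):
    """pr = (sum of the M largest per-dimension distances) / (sum of all N)."""
    d = sorted((abs(n - c) for n, c in zip(point, center)), reverse=True)
    total = sum(d)
    if total == 0:
        return 0.0  # assumption: a point equal to the center is not concentrated
    m = max(1, math.ceil(top_fraction * len(d)))  # assumption: round M up
    return sum(d[:m]) / total
```

A point that deviates in a single dimension yields pr close to 1, whereas a point that deviates equally in every dimension yields pr close to M/N, which is what allows one threshold to separate concentrated deviations from evenly spread ones.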
Fig. 3 shows a schematic diagram of an application 300 for detecting cluster anomalies according to some embodiments of the invention. As shown in Fig. 3, the application 300 includes a data acquisition unit 310, a similarity calculation unit 320, a first judging unit 330, an aggregation unit 340, a second judging unit 350, a third judging unit 360, a class detection unit 370, a window judging unit 380 and an alarm unit 390.
The data acquisition unit 310 works in the same way as the data acquisition unit 210 in Fig. 2, and is not described again here.
In one embodiment, when the data acquisition unit 310 obtains a performance data point to be detected, the class detection unit 370 may judge whether the total number of existing performance data classes is non-zero. If it is zero (meaning that no incremental clustering model has been built yet), the class detection unit 370 is adapted to instruct the aggregation unit 340 to generate a performance data class from the point to be detected (that is, to build a new clustering model). The class generated in this way then serves as an existing performance data class, on the basis of which subsequently obtained performance data points are aggregated and detected.
In another embodiment, the class detection unit 370 is adapted to judge whether the dimension of the performance data point to be detected is consistent with that of the existing performance data classes. If the dimensions are inconsistent, the point to be detected is not suitable for clustering with the existing performance data classes, and the similarity calculation unit 320 does not need to operate on it. Instead, the application 300 regenerates the performance data classes. In other words, the application 300 clears the existing performance data classes (that is, abandons the clustering model already built). For example, the class detection unit 370 may delete the existing performance data classes and instruct the aggregation unit 340 to generate a performance data class from the point to be detected.
In yet another embodiment, the class detection unit 370 may judge both whether the total number of performance data classes is non-zero and whether the dimensions are consistent. When the class detection unit 370 determines that the total number of existing performance data classes is non-zero and the dimensions are consistent, the similarity calculation unit 320 may perform on the point to be detected the same operations as the similarity calculation unit 220, which are not described again here.
The first judging unit 330, the aggregation unit 340, the second judging unit 350 and the third judging unit 360 may implement the same functions as the first judging unit 230, the aggregation unit 240, the second judging unit 250 and the third judging unit 260, which are not described again here.
In addition, after the performance data point to be detected has joined a performance data class, the second judging unit 350 determines that the point is a normal data point (not an abnormal point) when it determines that the ratio of the number of data points in that class to the total number of data points in all current performance data classes exceeds the abnormal-class threshold. The third judging unit 360 determines that the point is not an abnormal point when it determines that pr (see the description of the third judging unit 260 above) does not exceed the distance-distribution threshold.
In addition, when the first judging unit 330 determines that the similarity between the performance data point to be detected and every performance data class does not exceed the current similarity threshold, the aggregation unit 340 is further adapted to generate a new performance data class from the point. The aggregation unit 340 then judges whether the total number of performance data classes, after the newly generated class has been added, exceeds the class-count threshold. If it does, the aggregation unit 340 is adapted to merge the two closest performance data classes among all performance data classes into one. In this way, the application 300 of the present invention can control the total number of classes and prevent it from growing too large. According to one embodiment of the invention, the aggregation unit 340 first computes the pairwise distances between the center particles of all performance data classes, then determines the two closest classes $cl_1$ and $cl_2$, and merges them into a class $cl_3$. The aggregation unit 340 may determine the center particle, similarity threshold and total number of data points of $cl_3$ according to the following formulas.
cr3=cr1*np1+cr2*np2
th3=(np1*th1+np2*th2)/(np1+np2)
np3=np1+np2
Here, cr3 is the center particle of cl3, cr2 is the center particle of cl2, cr1 is the center particle of cl1, np1 is the total number of data points in cl1, np2 is the total number of data points in cl2, th1 is the similarity threshold of cl1, th2 is the similarity threshold of cl2, th3 is the similarity threshold of cl3, and np3 is the total number of data points in cl3.
In addition, the aggregation unit 340 is further adapted to update the center particle and the similarity threshold of a performance data class after the performance data point to be detected has joined it. In one embodiment, the aggregation unit 340 updates the center particle and the similarity threshold according to the following formulas.
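The merge step can be sketched as follows. One point needs to be flagged: the formula for cr3 as printed has no denominator, whereas this sketch divides by np1 + np2 on the assumption that a count-weighted centroid is intended, which matches the form of the th3 formula and of the single-point center update. The dictionary layout and function names are hypothetical.

```python
import math
from itertools import combinations

def closest_pair(centers):
    """Indices (i, j) of the two center particles with the smallest distance."""
    return min(combinations(range(len(centers)), 2),
               key=lambda ij: math.dist(centers[ij[0]], centers[ij[1]]))

def merge_classes(cl1, cl2):
    """Merge two classes given as dicts with 'center', 'threshold', 'count'."""
    np1, np2 = cl1["count"], cl2["count"]
    np3 = np1 + np2
    # Assumption: normalise by np3 so that cr3 is a centroid; the printed
    # formula gives only the weighted sum cr1*np1 + cr2*np2.
    cr3 = [(a * np1 + b * np2) / np3
           for a, b in zip(cl1["center"], cl2["center"])]
    th3 = (np1 * cl1["threshold"] + np2 * cl2["threshold"]) / np3
    return {"center": cr3, "threshold": th3, "count": np3}
```

Weighting by the data point counts means a large, well-established class dominates both the merged center and the merged threshold, so merging a one-point class into it changes the model only slightly.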
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
$$th = th + \frac{lr}{np} \cdot (sim - th)$$
Here, cr is the center particle, np is the total number of data points in the class, pt is the performance data point being added, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th. When sim is greater than th before the update, the similarity threshold th increases after the update, which raises the bar for adding data points. Conversely, when sim is less than th before the update, th decreases after the update. By adaptively adjusting the similarity threshold in this way, the application 300 according to the present invention is robust when detecting performance data points.
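The two update rules can be sketched as below. The sketch assumes that np in both formulas is the number of points in the class before the new point is added (which is what the np + 1 denominator of the center update suggests), and the default learning rate is an illustrative value only.

```python
def add_point(center, threshold, count, point, sim, learning_rate=0.1):
    """Return the updated (center, threshold, count) after adding one point.

    `count` is the class size before the addition and must be at least 1.
    """
    new_center = [(p + c * count) / (count + 1)
                  for p, c in zip(point, center)]
    # The threshold moves toward sim, with a step that shrinks as the class grows.
    new_threshold = threshold + (learning_rate / count) * (sim - threshold)
    return new_center, new_threshold, count + 1
```

Because the step size is lr/np, early points move the threshold noticeably, while a class that already holds many points keeps an almost fixed threshold.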
In summary, each time the data acquisition unit 310 obtains a performance data point to be detected, the class detection unit 370 is adapted to judge whether the number of existing performance data classes is zero.
If it is zero, the aggregation unit 340 generates a performance data class from this performance data point. In other words, the application 300 starts training a new clustering model based on this point.
If it is not zero, the class detection unit 370 may also check whether the dimension of the performance data point is consistent with that of the performance data classes. If not, the class detection unit 370 clears the existing performance data classes. In other words, the application 300 abandons the existing clustering model and starts training a new clustering model based on this point.
If the number of existing performance data classes is not zero and the dimension of the point to be detected is consistent with that of the existing classes, the application 300 uses the similarity calculation unit 320, the first judging unit 330, the aggregation unit 340, the second judging unit 350 and the third judging unit 360 to judge whether the point is an abnormal point.
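The three cases summarized above can be sketched as a small routing function. The dictionary layout, the function names and the returned action labels are hypothetical; the initial threshold 0.5 is the example value given in the text.

```python
def new_class(point, initial_threshold=0.5):
    """Create a one-point class whose center particle is the point itself."""
    return {"center": list(point), "threshold": initial_threshold, "count": 1}

def route_point(point, classes):
    """Decide how an incoming point is handled; returns (classes, action)."""
    if not classes:
        # No clustering model yet: the point seeds the first class.
        return [new_class(point)], "created_first_class"
    if len(point) != len(classes[0]["center"]):
        # Dimension changed: abandon the old model and start a new one.
        return [new_class(point)], "reset_model"
    # Model exists and dimensions agree: proceed to similarity-based detection.
    return classes, "cluster"
```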
In addition, the window judging unit 380 maintains a sliding window. Each time the data acquisition unit 310 obtains a performance data point to be detected, the window judging unit 380 adds the point to the sliding window. The sliding window thus holds the most recent predetermined number of performance data points obtained by the application 300. After the window judging unit 380 adds a performance data point to the sliding window, if the third judging unit 360 determines that the point is an abnormal point, the window judging unit 380 is adapted to judge whether the proportion of abnormal points in the current sliding window exceeds the window threshold (for example 0.5, but not limited to this). If the window threshold is exceeded, the alarm unit 390 may determine the abnormal performance indicators of the newly added abnormal point from its $d_i$ values (see above). In other words, the alarm unit 390 may determine which one or more of the multidimensional performance indicators are abnormal. On this basis, the alarm unit 390 may generate an alarm message for the abnormal performance indicators. The resource management application according to the present invention can then use the alarm message to pinpoint the device node where the anomaly occurred, and take a corresponding resource management action. For example, suppose a performance data point includes 10 CPU usage indicators and the alarm unit 390 determines that the 5th indicator value is abnormal. After receiving the message that the 5th indicator value is abnormal, the resource management application can determine that the device node corresponding to the 5th indicator is abnormal.
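The sliding-window check and the selection of abnormal indicators can be sketched as follows. Two details are assumptions rather than statements of the patent: the proportion is computed over the points currently in the window, and the abnormal indicators are taken to be the `top_k` dimensions with the largest $d_i$ (the text says only that they are determined from $d_i$). The class and function names are hypothetical.

```python
from collections import deque

class AnomalyWindow:
    """Keeps abnormal/normal flags for the most recent `size` points."""

    def __init__(self, size, window_threshold=0.5):
        self.flags = deque(maxlen=size)  # oldest flag is dropped automatically
        self.window_threshold = window_threshold

    def add(self, is_abnormal):
        """Record one point; return True when an alarm should be raised."""
        self.flags.append(bool(is_abnormal))
        ratio = sum(self.flags) / len(self.flags)
        return bool(is_abnormal) and ratio > self.window_threshold

def abnormal_indicators(point, center, top_k=1):
    """Indices of the top_k dimensions that deviate most from the center."""
    d = [abs(n - c) for n, c in zip(point, center)]
    ranked = sorted(range(len(d)), key=lambda i: d[i], reverse=True)
    return sorted(ranked[:top_k])
```

Requiring the window proportion to exceed the threshold before alarming means that an isolated abnormal point does not trigger an alarm by itself; only a sustained run of abnormal points does.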
Fig. 4 shows a flow chart of a method 400 for detecting cluster anomalies according to some embodiments of the invention. The method 400 is adapted to be executed in the monitoring server according to the present invention.
As shown in Fig. 4, the method 400 starts at step S410. In step S410, a performance data point to be detected that indicates the performance of the cluster is obtained; the point includes normalized multidimensional performance indicators. According to one embodiment of the invention, in step S410 a performance data group may be obtained from the performance collector (161). The performance data group includes multidimensional performance indicators, which include at least one of memory usage, CPU utilization, task throughput, task response time and garbage collection frequency in the cluster according to the present invention. In this embodiment, the indicators in the performance data group are themselves measured on a normalized scale (each indicator ranges from 0 to 1), so the method 400 can treat each performance data group from the performance collector as a performance data point with multidimensional values. In another embodiment, at least some of the performance indicators in the performance data group from the performance collector are not normalized. In other words, the value range of at least some indicators is not limited to the interval from 0 to 1. In that case, step S410 also needs to normalize the performance data group into a performance data point to be detected.
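The text does not say how normalization is carried out. As one possibility, the sketch below assumes min-max scaling against known lower and upper bounds for each indicator, with clipping to the interval from 0 to 1; the function name and the `bounds` parameter are hypothetical.

```python
def normalize_group(raw_values, bounds):
    """Scale each raw indicator into [0, 1] using assumed (low, high) bounds."""
    if len(raw_values) != len(bounds):
        raise ValueError("one (low, high) pair is required per indicator")
    point = []
    for value, (low, high) in zip(raw_values, bounds):
        if high <= low:
            raise ValueError("each upper bound must exceed its lower bound")
        scaled = (value - low) / (high - low)
        point.append(min(1.0, max(0.0, scaled)))  # clip out-of-range readings
    return point
```

For instance, a group holding 512 MB of memory used out of 1024 MB, 50% CPU utilization and a 120 ms response time bounded at 200 ms would map to a three-dimensional point of roughly (0.5, 0.5, 0.6).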
After a performance data point to be detected is obtained in step S410, the method 400 may perform step S420. In step S420, the performance data class with the highest similarity to the point to be detected is determined from among the existing performance data classes generated by aggregating the performance data points already obtained. The existing performance data classes are, in effect, an already established clustering model.
According to one embodiment of the invention, step S420 is implemented as follows. First, the distance between the performance data point to be detected and the center particle of each existing performance data class is computed. Then, from the distance to the center particle of each performance data class, the similarity between the point and that class is computed. Finally, the performance data class with the highest similarity to the point is determined. The distance computed is, for example, the Euclidean distance, but is not limited to this. The similarity may be computed as follows.
$$sim = \frac{1}{1 + d}$$
Here, d is the computed distance between the performance data point to be detected and the center particle of the performance data class, and sim is the similarity between the point and that class. Alternatively, step S420 may use a known similarity measure, such as cosine similarity, to determine the similarity between the point and a performance data class; details are omitted here.
After step S420 determines the performance data class with the highest similarity to the performance data point to be detected, the method 400 proceeds to step S430. In step S430, it is judged whether the similarity between the point and the determined performance data class exceeds the current similarity threshold of that class.
When step S430 determines that the current similarity threshold is exceeded, the method 400 performs step S440: the point to be detected is aggregated into the determined performance data class, and it is computed whether, after aggregation, the ratio of the number of data points in that class to the total number of data points in all current performance data classes exceeds the abnormal-class threshold.
When step S440 determines that the abnormal-class threshold is not exceeded, the method 400 proceeds to step S450. In step S450, the per-dimension distances between the performance indicators of the point to be detected and the center particle of the performance data class are sorted, and it is judged whether the ratio of the sum of the largest distances in a predetermined proportion to the sum of the distances over all dimensions exceeds the distance-distribution threshold. The operations in step S450 are illustrated more specifically below with formulas.
$$pt = \{n_1, \ldots, n_N\}, \qquad cr = \{c_1, \ldots, c_N\}$$
First, $d_i = |n_i - c_i|$ is computed, where $n_i$ is the i-th dimension performance indicator of the performance data point pt to be detected, $c_i$ is the i-th dimension value of the center particle cr, and $d_i$ is the distance between the i-th dimension of pt and the i-th dimension of cr.
Then, the $d_i$ of all dimensions are sorted, and the following ratio is computed:
$$pr = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
Here, N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, $d_{(j)}$ denotes the j-th largest of the N distances, so that the numerator is the sum of the M largest values among the N per-dimension distances, and the denominator is the sum of all N distances. Finally, it is judged whether pr exceeds the distance-distribution threshold.
When step S450 determines that the distance-distribution threshold is exceeded, the method 400 proceeds to step S460, in which the performance data point to be detected is determined to be an abnormal point. Further implementation details of the method 400 are the same as for the application 200, and are not described again here.
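Steps S420 through S460 can be strung together for a single point as in the sketch below. Several details are assumptions: the threshold defaults are illustrative values, aggregation is reduced to incrementing the class count (method 400 as described does not include the center update), M is rounded up, and the "no_match" result stands in for the case that method 400 leaves open when the similarity threshold is not exceeded.

```python
import math

def detect_point(point, classes, abnormal_class_threshold=0.1,
                 distribution_threshold=0.8, top_fraction=0.3):
    """Classify one point against classes given as dicts with
    'center', 'threshold' and 'count'. Mutates the matched class count."""
    def sim(cls):
        return 1.0 / (1.0 + math.dist(point, cls["center"]))

    best = max(classes, key=sim)                      # S420
    if sim(best) <= best["threshold"]:                # S430
        return "no_match"
    best["count"] += 1                                # S440: aggregate
    share = best["count"] / sum(c["count"] for c in classes)
    if share > abnormal_class_threshold:
        return "normal"
    d = sorted((abs(n - c) for n, c in zip(point, best["center"])),
               reverse=True)                          # S450
    total = sum(d)
    m = max(1, math.ceil(top_fraction * len(d)))
    pr = sum(d[:m]) / total if total else 0.0
    return "abnormal" if pr > distribution_threshold else "normal"  # S460
```

The ordering of the checks matters: the cheap class-share test clears most points as normal, and the per-dimension analysis runs only for points that land in a small class.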
Fig. 5 shows a flow chart of a method 500 for detecting cluster anomalies according to some embodiments of the invention. The method 500 is adapted to be executed in the monitoring server according to the present invention.
As shown in Fig. 5, the method 500 starts at step S501. Step S501 is performed in the same way as step S410, and is not described again here.
The method then proceeds to step S502. In step S502, it is judged whether the total number of existing performance data classes is non-zero.
When step S502 determines that the total number of existing performance data classes is non-zero, the method 500 may optionally perform step S503. In step S503, it is judged whether the dimension of the performance data point to be detected is consistent with that of the existing performance data classes.
When step S503 determines that the dimensions are inconsistent, the method 500 performs step S504: the existing performance data classes are abandoned, and a performance data class is generated from the point to be detected. In other words, step S504 abandons the existing clustering model and starts a new clustering learning process.
When step S502 determines that the total number of existing performance data classes is zero (there is no clustering model yet), the method performs step S505. In step S505, a performance data class is generated from the point to be detected, and a new clustering learning process is started.
When step S503 determines that the dimensions are consistent, the method 500 performs step S506. Step S506 is implemented in the same way as step S420, and is not described again here. Note that in embodiments of the invention where the dimension of the performance data points remains stable, the method 500 may skip step S503. That is, when step S502 determines that the total number of existing performance data classes is non-zero, step S506 is performed directly.
After step S506 determines the performance data class with the highest similarity to the performance data point to be detected, the method 500 proceeds to step S507. Step S507 is implemented in the same way as step S430, and is not described again here.
When step S507 determines that the similarity does not exceed the current similarity threshold of the performance data class, the method 500 performs step S508. In step S508, a new performance data class is generated from the point to be detected and added to the existing performance data classes. To control the total number of classes in the clustering model, the method 500 also performs step S509: it is judged whether the current total number of classes (after the new class has been added) exceeds the class-count threshold, and if so, the two closest performance data classes are merged into one. According to one embodiment of the invention, step S509 is implemented as follows, but is not limited to this.
First, the pairwise distances between the center particles of all performance data classes are computed, and the two closest classes $cl_1$ and $cl_2$ are determined.
Then, $cl_1$ and $cl_2$ are merged into a class $cl_3$.
Finally, the center particle, similarity threshold and total number of data points of $cl_3$ are determined according to the following formulas:
cr3=cr1*np1+cr2*np2
th3=(np1*th1+np2*th2)/(np1+np2)
np3=np1+np2
Here, cr3 is the center particle of cl3, cr2 is the center particle of cl2, cr1 is the center particle of cl1, np1 is the total number of data points in cl1, np2 is the total number of data points in cl2, th1 is the similarity threshold of cl1, th2 is the similarity threshold of cl2, th3 is the similarity threshold of cl3, and np3 is the total number of data points in cl3.
When step S507 determines that the current similarity threshold is exceeded, the method 500 performs step S510. Step S510 is implemented in the same way as step S440, and is not described again here.
After step S510, the method 500 also performs step S518. In step S518, the center particle and the similarity threshold of the performance data class are updated. According to one embodiment of the invention, in step S518 the center particle and the similarity threshold of the performance data class that the point to be detected has joined are updated according to the following formulas:
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
$$th = th + \frac{lr}{np} \cdot (sim - th)$$
Here, cr is the center particle, np is the total number of data points in the class, pt is the performance data point being added, sim is the similarity between pt and the performance data class, th is the similarity threshold, and lr is the learning rate used to adjust th.
When step S510 determines that the ratio exceeds the abnormal-class threshold, the method performs step S511 and determines that the performance data point to be detected is not an abnormal point.
When step S510 determines that the ratio does not exceed the abnormal-class threshold, the method 500 proceeds to step S512. Step S512 is implemented in the same way as step S450, and is not described again here.
When step S512 determines that the distance-distribution threshold is not exceeded, the method 500 performs step S511.
When step S512 determines that the distance-distribution threshold is exceeded, the method 500 proceeds to step S513. Step S513 is implemented in the same way as step S460, and is not described again here.
In summary, in steps S511 and S513 the method 500 determines whether the performance data point to be detected is an abnormal point. Through steps S504 and S505, the method 500 creates a new performance data class, which is the first class of a clustering model. Through steps S508 and S509, the method 500 can generate a new performance data class from a point to be detected while keeping the total number of classes in the clustering model within the class-count threshold.
Optionally, the method 500 also includes step S514. In step S514, the performance data point to be detected is added to a sliding window. The sliding window holds the most recent predetermined number (the preset width of the window) of performance data points obtained while the method 500 runs (that is, the points obtained in step S501). Note that when the method 500 performs step S504 or S505, step S514 removes from the sliding window the data points that precede the performance data point in the newly created performance data class.
In addition, when the performance data point added in step S514 is an abnormal point, the method 500 also performs step S515. In step S515, it is judged whether the proportion of abnormal points in the sliding window exceeds the window threshold.
When the window threshold is exceeded in step S515, the method 500 performs step S516: the abnormal performance indicators in the point to be detected are determined from the distances between the abnormal point and the center particle in each dimension.
The method 500 also includes step S517, in which an alarm message for the abnormal performance indicators is generated. Further implementation details of the method 500 are the same as for the application 300, and are not described again here.
A9. The method according to any one of A1-A8, further comprising: adding the performance data point to be detected to a sliding window, the sliding window holding the most recent predetermined number of performance data points obtained; and, when the performance data point to be detected is determined to be an abnormal point, judging whether the proportion of abnormal points in the sliding window exceeds a window threshold.
A10. The method according to A9, further comprising: when the window threshold is exceeded, determining the abnormal performance indicators in the performance data point to be detected according to the distances of the performance indicators in each dimension.
A11. The method according to any one of A1-A10, wherein, when the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that class, the method further comprises: generating a new performance data class from the performance data point to be detected and adding it to the existing performance data classes; and judging whether the total number of current performance data classes exceeds a class-count threshold, and if so, merging the two closest performance data classes into one.
A12. The method according to A11, wherein the operation of merging the two closest performance data classes into one comprises: computing the pairwise distances between the center particles of all performance data classes and determining the two closest classes $cl_1$ and $cl_2$,
merging $cl_1$ and $cl_2$ into a class $cl_3$,
and determining the center particle, similarity threshold and total number of data points of $cl_3$ according to the following formulas:
cr3=cr1*np1+cr2*np2
th3=(np1*th1+np2*th2)/(np1+np2)
np3=np1+np2
Here, cr3 is the center particle of cl3, cr2 is the center particle of cl2, cr1 is the center particle of cl1, np1 is the total number of data points in cl1, np2 is the total number of data points in cl2, th1 is the similarity threshold of cl1, th2 is the similarity threshold of cl2, th3 is the similarity threshold of cl3, and np3 is the total number of data points in cl3.
A13. The method according to any one of A1-A12, further comprising: when the abnormal-class threshold is exceeded, determining that the performance data point to be detected is not an abnormal point; and, when the distance-distribution threshold is not exceeded, determining that the performance data point to be detected is not an abnormal point.
A14. The method according to any one of A1-A13, wherein, before determining the performance data class with the highest similarity to the performance data point to be detected, the method further comprises: judging whether the total number of existing performance data classes is non-zero; and/or judging whether the dimension of the performance data point to be detected is consistent with that of the existing performance data classes.
A15. The method according to A14, further comprising: when it is determined that the total number of existing performance data classes is zero, or that the dimension is inconsistent with that of the existing performance data classes, generating a performance data class from the performance data point to be detected.
A17. The application according to A16, wherein the data acquisition unit further comprises: a receiving module, adapted to receive from a performance collector a performance data group that indicates cluster performance, the performance data group including multidimensional performance indicators; and a normalization module, adapted to normalize the performance data group into the performance data point.
A18. The application according to A16 or A17, wherein the multidimensional performance indicators include at least one of memory usage, CPU utilization, task throughput, task response time and garbage collection frequency in the cluster.
A19. The application according to any one of A16-A18, wherein the similarity calculation unit is adapted to determine the performance data class with the highest similarity to the performance data point to be detected as follows: computing the distance between the performance data point to be detected and the center particle of each existing performance data class; computing, from the distance to the center particle of each performance data class, the similarity between the performance data point to be detected and that class; and determining the performance data class with the highest similarity to the performance data point to be detected.
A20. The application according to A19, wherein the similarity calculation unit is adapted to compute the distance between the performance data point to be detected and the center particle of each existing performance data class as follows: computing the Euclidean distance between the performance data point to be detected and the center particle of each performance data class.
A21. The application according to A19 or A20, wherein the similarity calculation unit is adapted to compute the similarity between the performance data point to be detected and the performance data class according to the following formula:
$$sim = \frac{1}{1 + d}$$
Here, d is the computed distance between the performance data point to be detected and the center particle of the performance data class, and sim is the similarity between the point and that class.
A22. The application according to any one of A16-A21, wherein the aggregation unit is further adapted to:
update, according to the following formulas, the center particle and the similarity threshold of the performance data class that the performance data point to be detected has joined:
$$cr = \frac{pt + cr \cdot np}{np + 1}$$
$$th = th + \frac{lr}{np} \cdot (sim - th)$$
Wherein, particle centered by cr, np is data point sum in class, and pt is the performance data added Point, sim is the similarity of pt and performance data class, and th is exception class threshold value, and lr is for regulating th's Learning rate threshold value.
A23. The application as described in any one of A16-A22, wherein the third judging unit is adapted to perform, as follows, the sorting of the per-dimension distances between the performance indicators of the performance data point to be detected and those of the centroid of the performance data class, and the calculation of whether the ratio of the sum of the largest distances of a predetermined proportion to the sum of the distances of all dimensions exceeds the distance distribution threshold:
$$\mathit{pt} = \{n_1, \ldots, n_i\}, \quad \mathit{cr} = \{c_1, \ldots, c_i\}, \quad d_i = |n_i - c_i|$$
wherein n_i is the i-th dimension performance indicator of the performance data point pt to be detected, c_i is the i-th dimension value of the centroid cr, and d_i is the distance between the i-th dimension of pt and the i-th dimension of cr;
sorting the d_i of all dimensions and calculating
$$\mathit{pr} = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
wherein N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, the numerator is the sum of the M largest of the N per-dimension distances (d_(j) denoting the j-th largest distance), and the denominator is the sum of all N distances; and
judging whether pr exceeds the distance distribution threshold.
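The ratio pr measures how much of the total deviation is concentrated in a few dimensions. A minimal Python sketch follows; the function name `distance_ratio` is hypothetical, and two details are assumptions not fixed by the text: M is obtained by rounding the proportion of N up (with a minimum of 1), and pr is taken as 0 when the point coincides with the centroid.

```python
import math
from typing import Sequence


def distance_ratio(pt: Sequence[float], cr: Sequence[float],
                   proportion: float) -> float:
    """Return pr: sum of the M largest per-dimension distances
    divided by the sum of all N per-dimension distances."""
    if len(pt) != len(cr):
        raise ValueError("point and centroid must have the same dimension")
    # d_i = |n_i - c_i|, sorted from largest to smallest
    d = sorted((abs(n - c) for n, c in zip(pt, cr)), reverse=True)
    total = sum(d)
    if total == 0:
        return 0.0  # assumption: identical to the centroid, nothing stands out
    m = max(1, math.ceil(proportion * len(d)))  # assumption: round up
    return sum(d[:m]) / total


# One dimension (the first) carries almost all of the deviation.
pr = distance_ratio([0.9, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.5], 0.25)
```

A value of pr close to 1 means the deviation is dominated by a small number of indicators, which is the situation the distance distribution threshold is meant to catch.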
A24. The application as described in any one of A16-A23, further including a window judging unit, adapted to:
add the performance data point to be detected to a sliding window, the sliding window holding a predetermined number of the most recently acquired performance data points; and
when the third judging unit determines that the performance data point to be detected is an anomalous point, judge whether the proportion of anomalous points in the sliding window exceeds a window threshold.
A25. The application as described in A24, further including an alarm unit, adapted to determine, when the window judging unit determines that the window threshold is exceeded, the anomalous performance indicators in the performance data point to be detected according to the per-dimension distances of the performance indicators.
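The window check and the alarm step can be sketched as follows. This is an illustration only: `AnomalyWindow` and `anomalous_indicators` are hypothetical names, the proportion is computed over the points currently held (an assumption for the period before the window fills), and picking the indicators with the largest per-dimension distances is one plausible reading of "according to the per-dimension distances", not a rule stated in the text.

```python
from collections import deque
from typing import List, Sequence


class AnomalyWindow:
    """Sliding window over the most recent anomaly flags."""

    def __init__(self, size: int, window_threshold: float) -> None:
        self.flags = deque(maxlen=size)  # oldest flag is dropped automatically
        self.window_threshold = window_threshold

    def add(self, is_anomaly: bool) -> bool:
        """Record one point; return True if an alarm should be raised."""
        self.flags.append(is_anomaly)
        if not is_anomaly:
            return False  # the proportion is only checked for anomalous points
        return sum(self.flags) / len(self.flags) > self.window_threshold


def anomalous_indicators(pt: Sequence[float], cr: Sequence[float],
                         names: Sequence[str], top: int = 1) -> List[str]:
    """Assumed rule: report the indicators farthest from the centroid."""
    ranked = sorted(zip(names, (abs(n - c) for n, c in zip(pt, cr))),
                    key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top]]
```

Requiring a proportion of anomalous points, rather than a single one, keeps an isolated outlier from triggering an alarm on its own.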
A26. The application as described in any one of A16-A25, wherein the aggregation unit is further adapted to, when the first judging unit determines that the similarity between the performance data point to be detected and the determined performance data class does not exceed the current similarity threshold of that performance data class:
generate a new performance data class from the performance data point to be detected and add the new class to the existing performance data classes; and
judge whether the total number of current performance data classes exceeds a class-count threshold, and if so, merge the two closest performance data classes into one.
A27. The application as described in A26, wherein the aggregation unit is adapted to merge the two closest performance data classes into one as follows:
calculating the pairwise distances between the centroids of all performance data classes and determining the two closest classes cl_1 and cl_2;
merging cl_1 and cl_2 into a class cl_3; and
determining the centroid, similarity threshold, and total number of data points of cl_3 according to the following formulas:
$$\mathit{cr}_3 = \mathit{cr}_1 \cdot \mathit{np}_1 + \mathit{cr}_2 \cdot \mathit{np}_2$$
$$\mathit{th}_3 = \frac{\mathit{np}_1 \cdot \mathit{th}_1 + \mathit{np}_2 \cdot \mathit{th}_2}{\mathit{np}_1 + \mathit{np}_2}$$
$$\mathit{np}_3 = \mathit{np}_1 + \mathit{np}_2$$
wherein cr_3, cr_2, and cr_1 are the centroids of cl_3, cl_2, and cl_1 respectively; np_1 and np_2 are the total numbers of data points of cl_1 and cl_2; th_1 and th_2 are the similarity thresholds of cl_1 and cl_2; th_3 is the similarity threshold of cl_3; and np_3 is the total number of data points of cl_3.
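The merge step in A27 can be sketched as below. This is a minimal illustration, not the patented implementation: the dictionary keys `cr`, `np`, and `th` mirror the symbols used in the formulas, the function name `merge_closest` is hypothetical, and the sketch assumes the merged centroid is the count-weighted mean (the weighted sum divided by np1 + np2), a normalization that the cr3 formula as printed does not show.

```python
import math
from itertools import combinations
from typing import Dict, List


def merge_closest(classes: List[Dict]) -> List[Dict]:
    """Merge the two classes whose centroids are closest; return a new list."""
    if len(classes) < 2:
        return list(classes)
    # Pair of indices with the smallest Euclidean centroid distance.
    i, j = min(combinations(range(len(classes)), 2),
               key=lambda ij: math.dist(classes[ij[0]]["cr"],
                                        classes[ij[1]]["cr"]))
    a, b = classes[i], classes[j]
    n = a["np"] + b["np"]  # np3 = np1 + np2
    merged = {
        # Assumption: weighted sum divided by np1 + np2 (weighted mean).
        "cr": [(x * a["np"] + y * b["np"]) / n
               for x, y in zip(a["cr"], b["cr"])],
        # th3 = (np1 * th1 + np2 * th2) / (np1 + np2)
        "th": (a["np"] * a["th"] + b["np"] * b["th"]) / n,
        "np": n,
    }
    return [c for k, c in enumerate(classes) if k not in (i, j)] + [merged]
```

Comparing every pair of centroids is quadratic in the number of classes, which stays cheap as long as the class-count threshold keeps that number small.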
A28. The application as described in any one of A16-A27, wherein:
the second judging unit is further adapted to determine, when the anomalous-class threshold is exceeded, that the performance data point to be detected is not an anomalous point; and
the third judging unit is further adapted to determine, when the distance distribution threshold is not exceeded, that the performance data point to be detected is not an anomalous point.
A29. The application as described in any one of A16-A28, further including a class detection unit, adapted to, before the similarity calculation unit determines the performance data class with the highest similarity to the performance data point to be detected:
judge whether the total number of currently existing performance data classes is non-zero; and/or
judge whether the dimensionality of the performance data point to be detected is consistent with that of the existing performance data classes.
A30. The application as described in A29, wherein the class detection unit is further adapted to, on determining that the total number of currently existing performance data classes is zero, or that the dimensionality is inconsistent with that of the existing performance data classes, instruct the aggregation unit to generate a new performance data class from the performance data point to be detected.
Numerous specific details are set forth in the description provided herein. It should be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be appreciated that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules, units, or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiment, or alternatively may be located in one or more devices different from the device in the example. The modules in the foregoing examples may be combined into one module or may further be divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the devices of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may further be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Furthermore, some of the embodiments are described herein as methods, or combinations of method elements, that can be implemented by a processor of a computer system or by other means of carrying out the function. A processor with the necessary instructions for carrying out such a method or method element thus forms a means for carrying out the method or method element. Likewise, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinal adjectives "first", "second", "third", and so on to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner.
Although the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be devised within the scope of the invention as thus described. It should also be noted that the language used in this specification has been selected principally for readability and instructional purposes, and not to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the invention is intended to be illustrative, not limiting, of the scope of the invention, which is defined by the appended claims.

Claims (10)

1. A method for detecting cluster anomalies, including:
acquiring a performance data point to be detected that indicates the performance of the cluster, the performance data point including normalized multidimensional performance indicators;
determining, from existing performance data classes generated by aggregating previously acquired performance data points, the performance data class with the highest similarity to the performance data point to be detected;
judging whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that performance data class;
when the current similarity threshold is exceeded, aggregating the performance data point to be detected into the determined performance data class, and calculating whether, after aggregation, the ratio of the total number of data points in that performance data class to the total number of data points in all current performance data classes exceeds an anomalous-class threshold;
when the anomalous-class threshold is not exceeded, sorting the per-dimension distances between the performance indicators of the performance data point to be detected and those of the centroid of the performance data class, and calculating whether the ratio of the sum of the largest distances of a predetermined proportion to the sum of the distances of all dimensions exceeds a distance distribution threshold; and
when the distance distribution threshold is exceeded, determining that the performance data point to be detected is an anomalous point.
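The sequence of threshold checks in claim 1 can be summarized as a small decision function. This is a sketch for orientation only: `Outcome` and `decide` are hypothetical names, the inputs are assumed to have been computed beforehand (with the class share measured after aggregation), and the two non-anomalous branches follow the behaviour described in embodiments A26 and A28 rather than claim 1 itself.

```python
from enum import Enum


class Outcome(Enum):
    NEW_CLASS = "new_class"  # point does not fit its nearest class (A26)
    NORMAL = "normal"        # point is not an anomalous point (A28)
    ANOMALY = "anomaly"      # point is an anomalous point


def decide(sim: float, similarity_threshold: float,
           class_share: float, anomaly_class_threshold: float,
           pr: float, distance_threshold: float) -> Outcome:
    """Apply the three threshold checks in the order given by the method."""
    if sim <= similarity_threshold:
        return Outcome.NEW_CLASS
    # The point has been aggregated; a large class is treated as normal.
    if class_share > anomaly_class_threshold:
        return Outcome.NORMAL
    # Small class: anomalous only if the deviation sits in a few dimensions.
    if pr > distance_threshold:
        return Outcome.ANOMALY
    return Outcome.NORMAL
```

The ordering matters: the distance distribution check is reached only for points that joined a class holding a small share of all observed points.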
2. The method of claim 1, wherein the step of acquiring a performance data point to be detected that indicates the performance of the cluster includes:
receiving a performance data group that indicates cluster performance and is collected from a performance collector, the performance data group including multidimensional performance indicators; and
normalizing the performance data group into the performance data point.
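The claim does not say how the performance data group is normalized. As one possibility, the sketch below uses min-max scaling to the range [0, 1]; the scaling method, the function name `normalize`, the per-indicator `bounds` table, and the alphabetical ordering of dimensions are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def normalize(raw: Dict[str, float],
              bounds: Dict[str, Tuple[float, float]]) -> List[float]:
    """Min-max scale each indicator to [0, 1] (assumed normalization).

    raw    maps indicator name to its measured value
    bounds maps indicator name to an assumed (low, high) range with high > low
    """
    point = []
    for name in sorted(bounds):  # fixed order so dimensions line up
        low, high = bounds[name]
        value = min(max(raw[name], low), high)  # clamp out-of-range readings
        point.append((value - low) / (high - low))
    return point


# Example with two hypothetical indicators.
pt = normalize({"cpu": 50.0, "mem": 2.0},
               {"cpu": (0.0, 100.0), "mem": (0.0, 8.0)})
```

Putting every indicator on the same scale keeps a single large-valued metric from dominating the Euclidean distance used in the later steps.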
3. The method of claim 1 or 2, wherein the multidimensional performance indicators include at least one of memory usage, CPU utilization, task throughput, task response time, and garbage collection frequency in the cluster.
4. The method of any one of claims 1-3, wherein the step of determining the performance data class with the highest similarity to the performance data point to be detected includes:
calculating the distance between the performance data point to be detected and the centroid of each existing performance data class;
calculating, from the distance to the centroid of each performance data class, the similarity between the performance data point to be detected and that performance data class; and
determining the performance data class with the highest similarity to the performance data point to be detected.
5. The method of claim 4, wherein the step of calculating the distance between the performance data point to be detected and the centroid of each existing performance data class includes:
calculating the Euclidean distance between the performance data point to be detected and the centroid of each performance data class.
6. The method of claim 4 or 5, wherein the similarity between the performance data point to be detected and a performance data class is calculated according to:
$$\mathit{sim} = \frac{1}{1 + d}$$
wherein d is the calculated distance between the performance data point to be detected and the centroid of the performance data class, and sim is the similarity to that performance data class.
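Claims 4 to 6 together describe a nearest-centroid lookup. A minimal Python sketch follows; the function names `similarity` and `most_similar` are hypothetical, and classes are represented here simply by their centroid vectors.

```python
import math
from typing import List, Sequence, Tuple


def similarity(pt: Sequence[float], cr: Sequence[float]) -> float:
    """sim = 1 / (1 + d), with d the Euclidean distance to the centroid."""
    return 1.0 / (1.0 + math.dist(pt, cr))


def most_similar(pt: Sequence[float],
                 centroids: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, similarity) of the class most similar to pt."""
    if not centroids:
        raise ValueError("no existing performance data classes")
    sims = [similarity(pt, cr) for cr in centroids]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]


# The second centroid is at distance 1, so its similarity is 1 / (1 + 1).
index, sim = most_similar([0.0, 0.0], [[3.0, 4.0], [0.0, 1.0]])
```

The mapping 1 / (1 + d) turns a distance in [0, ∞) into a similarity in (0, 1], with 1 reached only when the point coincides with the centroid.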
7. The method of any one of claims 1-6, wherein, after the step of aggregating the performance data point to be detected into the determined performance data class and calculating whether, after aggregation, the ratio of the total number of data points in that performance data class to the total number of data points in all current performance data classes exceeds the anomalous-class threshold, the method further includes:
updating, according to the following formulas, the centroid and the similarity threshold of the performance data class after the performance data point to be detected has been added:
$$\mathit{cr} = \frac{\mathit{pt} + \mathit{cr} \cdot \mathit{np}}{\mathit{np} + 1}$$
$$\mathit{th} = \mathit{th} + \frac{\mathit{lr}}{\mathit{np}} \cdot (\mathit{sim} - \mathit{th})$$
wherein cr is the centroid, np is the total number of data points in the class, pt is the added performance data point, sim is the similarity between pt and the performance data class, th is the similarity threshold of the class, and lr is the learning rate used to adjust th.
8. The method of any one of claims 1-7, wherein the step of sorting the per-dimension distances between the performance indicators of the performance data point to be detected and those of the centroid of the performance data class, and calculating whether the ratio of the sum of the largest distances of a predetermined proportion to the sum of the distances of all dimensions exceeds the distance distribution threshold, includes:
$$\mathit{pt} = \{n_1, \ldots, n_i\}, \quad \mathit{cr} = \{c_1, \ldots, c_i\}, \quad d_i = |n_i - c_i|$$
wherein n_i is the i-th dimension performance indicator of the performance data point pt to be detected, c_i is the i-th dimension value of the centroid cr, and d_i is the distance between the i-th dimension of pt and the i-th dimension of cr;
sorting the d_i of all dimensions and calculating
$$\mathit{pr} = \frac{\sum_{j=1}^{M} d_{(j)}}{\sum_{i=1}^{N} d_i}$$
wherein N is the total number of dimensions, M is the number of dimensions corresponding to the predetermined proportion of N, the numerator is the sum of the M largest of the N per-dimension distances (d_(j) denoting the j-th largest distance), and the denominator is the sum of all N distances; and
judging whether pr exceeds the distance distribution threshold.
9. An application for detecting cluster anomalies, including:
a data acquisition unit, adapted to acquire a performance data point to be detected that indicates the performance of the cluster, the performance data point including normalized multidimensional performance indicators;
a similarity calculation unit, adapted to determine, from existing performance data classes generated by aggregating previously acquired performance data points, the performance data class with the highest similarity to the performance data point to be detected;
a first judging unit, adapted to judge whether the similarity between the performance data point to be detected and the determined performance data class exceeds the current similarity threshold of that performance data class;
an aggregation unit, adapted to aggregate the performance data point to be detected into the determined performance data class when the first judging unit determines that the current similarity threshold is exceeded;
a second judging unit, adapted to calculate whether, after aggregation, the ratio of the total number of data points in that performance data class to the total number of data points in all current performance data classes exceeds an anomalous-class threshold; and
a third judging unit, adapted to, when the anomalous-class threshold is not exceeded, sort the per-dimension distances between the performance indicators of the performance data point to be detected and those of the centroid of the performance data class, and calculate whether the ratio of the sum of the largest distances of a predetermined proportion to the sum of the distances of all dimensions exceeds a distance distribution threshold,
and, when the distance distribution threshold is exceeded, determine that the performance data point to be detected is an anomalous point.
10. A system for managing a cluster, including:
a performance collector, adapted to collect performance indicators of the cluster;
the application for detecting cluster anomalies as claimed in claim 9; and
a resource management application, adapted to adjust the resource allocation of the cluster according to alarm messages generated by the application for detecting cluster anomalies.
CN201610380755.8A 2016-06-01 2016-06-01 Method and application for detecting cluster anomalies and cluster managing system Active CN105871634B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610380755.8A CN105871634B (en) 2016-06-01 2016-06-01 Method and application for detecting cluster anomalies and cluster managing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610380755.8A CN105871634B (en) 2016-06-01 2016-06-01 Method and application for detecting cluster anomalies and cluster managing system

Publications (2)

Publication Number Publication Date
CN105871634A true CN105871634A (en) 2016-08-17
CN105871634B CN105871634B (en) 2019-02-15

Family

ID=56675631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610380755.8A Active CN105871634B (en) 2016-06-01 2016-06-01 Method and application for detecting cluster anomalies and cluster managing system

Country Status (1)

Country Link
CN (1) CN105871634B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102104611A (en) * 2011-03-31 2011-06-22 中国人民解放军信息工程大学 Promiscuous mode-based DDoS (Distributed Denial of Service) attack detection method and device
CN102547715A (en) * 2012-02-07 2012-07-04 上海交通大学 Method for detecting wireless mesh network attack
CN103001825A (en) * 2012-11-15 2013-03-27 中国科学院计算机网络信息中心 Method and system for detecting DNS (domain name system) traffic abnormality
US20150219530A1 (en) * 2013-12-23 2015-08-06 Exxonmobil Research And Engineering Company Systems and methods for event detection and diagnosis
CN104536996A (en) * 2014-12-12 2015-04-22 南京理工大学 Computational node anomaly detection method in isomorphic environments

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228442B (en) * 2016-12-14 2020-10-27 华为技术有限公司 Abnormal node detection method and device
CN108228442A (en) * 2016-12-14 2018-06-29 华为技术有限公司 A kind of detection method and device of abnormal nodes
CN108206813A (en) * 2016-12-19 2018-06-26 中国移动通信集团山西有限公司 Method for auditing safely, device and server based on k means clustering algorithms
CN108206813B (en) * 2016-12-19 2021-08-06 中国移动通信集团山西有限公司 Security audit method and device based on k-means clustering algorithm and server
CN107238407B (en) * 2017-05-03 2019-10-08 华北水利水电大学 Project of South-to-North water diversion secure data abnormal patterns find method and system
CN107238407A (en) * 2017-05-03 2017-10-10 华北水利水电大学 Project of South-to-North water diversion secure data abnormal patterns find method and system
CN109271289A (en) * 2017-07-18 2019-01-25 车伯乐(北京)信息科技有限公司 A kind of application interface monitoring method, device, equipment and computer-readable medium
CN109271289B (en) * 2017-07-18 2022-05-03 车伯乐(北京)信息科技有限公司 Application interface monitoring method, device, equipment and computer readable medium
CN107528904A (en) * 2017-09-01 2017-12-29 星环信息科技(上海)有限公司 Method and apparatus for data distribution formula abnormality detection
CN107835098A (en) * 2017-11-28 2018-03-23 车智互联(北京)科技有限公司 A kind of network fault detecting method and system
CN107835098B (en) * 2017-11-28 2021-01-29 车智互联(北京)科技有限公司 Network fault detection method and system
CN107995030A (en) * 2017-11-28 2018-05-04 车智互联(北京)科技有限公司 A kind of network detection method, network fault detecting method and system
CN107995030B (en) * 2017-11-28 2021-09-14 车智互联(北京)科技有限公司 Network detection method, network fault detection method and system
CN109374063A (en) * 2018-12-04 2019-02-22 广东电网有限责任公司 A kind of transformer exception detection method, device and equipment based on cluster management
US10977112B2 (en) 2019-01-22 2021-04-13 International Business Machines Corporation Performance anomaly detection
US11269714B2 (en) 2019-01-22 2022-03-08 International Business Machines Corporation Performance anomaly detection
CN110502346A (en) * 2019-08-28 2019-11-26 高瑶 Resource information management system and method under a kind of cluster environment
CN111612038A (en) * 2020-04-24 2020-09-01 平安直通咨询有限公司上海分公司 Abnormal user detection method and device, storage medium and electronic equipment
CN111612038B (en) * 2020-04-24 2024-04-26 平安直通咨询有限公司上海分公司 Abnormal user detection method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN105871634B (en) 2019-02-15

Similar Documents

Publication Publication Date Title
CN105871634A (en) Method and application for detecting cluster anomalies and cluster managing system
WO2020259421A1 (en) Method and apparatus for monitoring service system
Charrad et al. NbClust: an R package for determining the relevant number of clusters in a data set
CN105825298B (en) Power grid metering early warning system and method based on load characteristic estimation
CN106600115A (en) Intelligent operation and maintenance analysis method for enterprise information system
US20210014102A1 (en) Reinforced machine learning tool for anomaly detection
CN105184084A (en) Method and system for predicting fault type of electric power metering automation terminal
CN111176953B (en) Abnormality detection and model training method, computer equipment and storage medium
CN111367747B (en) Index abnormal detection early warning device based on time annotation
US20200293945A1 (en) Apparatus and method of high dimensional data analysis in real-time
CN112906738B (en) Water quality detection and treatment method
CN115996249B (en) Data transmission method and device based on grading
US20150205856A1 (en) Dynamic brownian motion with density superposition for abnormality detection
CN111294841A (en) Method and device for processing wireless network problem and storage medium
CN111160959A (en) User click conversion estimation method and device
KR101960755B1 (en) Method and apparatus of generating unacquired power data
Maksimović et al. Comparative analysis of data mining techniques applied to wireless sensor network data for fire detection
CN113810792A (en) Edge data acquisition and analysis system based on cloud computing
CN115114124A (en) Host risk assessment method and device
CN109976986A (en) The detection method and device of warping apparatus
CN116228312A (en) Processing method and device for large-amount point exchange behavior
CN104714205B (en) Electricity meter misplacement detection system and method thereof
CN115564410A (en) State monitoring method and device for relay protection equipment
CN110796377B (en) Power grid service system monitoring method supporting fuzzy theory
CN117808157B (en) Intelligent identification-based unreported outage behavior prediction analysis system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220720

Address after: 100193 room 101-216, 2nd floor, building 4, East District, yard 10, northwest Wangdong Road, Haidian District, Beijing

Patentee after: Beijing Ruixiang Technology Co.,Ltd.

Address before: 100191 floors 3 and 4, building a-5, Dongsheng Science Park, Zhongguancun, No. 66, xixiaokou Road, Haidian District, Beijing

Patentee before: BEIJING ONEAPM Co.,Ltd.
