CN104050070B - High-dimensional flow data changing point detection method in distributed system - Google Patents

Publication number
CN104050070B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410243426.XA
Other languages
Chinese (zh)
Other versions
CN104050070A (en)
Inventor
赵丽
刘欣然
曹玮
付戈
刘谦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN201410243426.XA
Publication of CN104050070A
Application granted
Publication of CN104050070B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)

Abstract

The invention provides a method for detecting change points in high-dimensional traffic data in a distributed system. The method comprises the following steps: obtaining standardized high-dimensional raw traffic data from the distributed system; reducing the dimensionality of the high-dimensional raw traffic data; clustering the ordered sample data characterized by the principal components and determining the non-trivial points of the principal-component data; and judging whether the raw traffic data of each dimension changes significantly at the corresponding non-trivial point. The method detects change points in the high-dimensional traffic data of a distributed system and thereby helps administrators monitor and analyze that traffic data.

Description

Method for detecting change points in high-dimensional traffic data in a distributed system
Technical field
The present invention relates to a detection method in the field of data mining, and in particular to a method for detecting change points in high-dimensional traffic data in a distributed system.
Background art
Monitoring and analyzing the traffic data in a distributed system helps administrators quickly grasp the load on the different applications in the system, assess whether the software architecture is reasonable, and detect abnormal conditions in real time. Analyzing the traffic data of a distributed system also helps in investigating information such as website popularity, hot content, and user access habits.
However, a distributed system contains a large number of servers, and the applications deployed on each server continuously produce large amounts of traffic data. The resulting traffic data is high-dimensional (here, high-dimensional means two or more dimensions) and periodic, so administrators find it difficult to observe and analyze the data directly. For example, the page clicks on an HTTP server are usually periodic: the volume of data during the day is much larger than at night. If the daytime volume on a certain day drops markedly but is still larger than the night-time volume, administrators may well fail to notice the change. A change that differs from the periodic fluctuation of the data is called a non-trivial change, and a data point at which a non-trivial change occurs is called a non-trivial change point, that is, a data change point. Moreover, because the servers are numerous, the traffic data is voluminous, and administrators are relatively few, observing the data directly is very costly or even infeasible. The prior art does not propose a method for detecting changes in high-dimensional traffic data, so an effective method for detecting traffic-data change points is needed.
The techniques involved in the present invention include principal component analysis (PCA), the ordered sample clustering method, and the F-test.
Principal component analysis describes the original high-dimensional traffic data with a small number of principal-component features, reducing the dimensionality of the feature space while retaining the most important information in the samples. Its principle is to project a high-dimensional vector x, whose components may be correlated, through an eigenvector matrix into a new orthogonal space characterized by the principal components. The order of the principal components is determined by the size of the variance of the original data projected onto each component. A low-dimensional vector y formed from the leading principal components characterizes the original high-dimensional data, and only some secondary information is lost. Conversely, the corresponding original high-dimensional vector can be approximately reconstructed from the low-dimensional principal-component vector and the eigenvector matrix.
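The projection and approximate reconstruction described above can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes a small in-memory data set, finds only the leading principal component by power iteration, and uses hypothetical helper names (`leading_component`, `project`, `reconstruct`).

```python
import math

def leading_component(rows, iters=100):
    """Return (column means, unit eigenvector, variance) for the largest
    eigenvalue of the sample covariance of `rows`, via power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[a] * cov[a][b] * v[b] for a in range(p) for b in range(p))
    return means, v, variance

def project(row, means, v):
    """Low-dimensional score y of one sample on the component v."""
    return sum((x - m) * a for x, m, a in zip(row, means, v))

def reconstruct(score, means, v):
    """Approximate the original high-dimensional sample from its score."""
    return [m + score * a for m, a in zip(means, v)]

# Two perfectly correlated dimensions: one component carries all the variance.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, v, variance = leading_component(rows)
score = project(rows[2], means, v)
approx = reconstruct(score, means, v)  # close to [3.0, 6.0]
```

In this toy data the second dimension is exactly twice the first, so a single component reconstructs every sample; with real traffic data the reconstruction is only approximate.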
The optimal segmentation algorithm (also called the "ordered sample clustering method") optimally partitions an ordered sample sequence. Its basic idea is: given a sample sequence and a number of classes, search all possible partitions and take the one that minimizes the total within-segment sum of squared deviations as the final partition. Because the total sum of squared deviations of a data sequence equals the within-segment sum of squared deviations plus the between-segment sum of squared deviations, minimizing the within-segment sum maximizes the between-segment sum; each segment is then as homogeneous as possible and the segments differ as much as possible, so the partition is optimal. The algorithm was first proposed by Fisher in 1958; it has complexity O(n²) and classifies ordered samples on the principle of minimizing the differences among the samples within each class.
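As an illustration of the search described above, the following sketch partitions a one-dimensional ordered sequence by dynamic programming. It is a plain, unoptimized version written for readability (the segment cost is recomputed each time), and the function names `sse` and `optimal_segmentation` are assumptions of this example, not names from the patent.

```python
def sse(xs, i, j):
    """Sum of squared deviations of xs[i:j] from its own mean."""
    seg = xs[i:j]
    mean = sum(seg) / len(seg)
    return sum((x - mean) ** 2 for x in seg)

def optimal_segmentation(xs, m):
    """Split the ordered sequence xs into m contiguous segments with minimal
    total within-segment SSE. Returns (loss, start index of each segment)."""
    n = len(xs)
    inf = float("inf")
    # best[k][j]: minimal loss when xs[:j] is split into k segments
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    cut = [[0] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # last segment is xs[i:j]
                cand = best[k - 1][i] + sse(xs, i, j)
                if cand < best[k][j]:
                    best[k][j] = cand
                    cut[k][j] = i
    starts, j = [], n
    for k in range(m, 0, -1):  # walk the stored cuts backwards
        j = cut[k][j]
        starts.append(j)
    return best[m][n], starts[::-1]

loss, starts = optimal_segmentation([1, 1, 1, 5, 5, 5, 9, 9], 3)
# starts == [0, 3, 6]: the segments are [1, 1, 1], [5, 5, 5] and [9, 9]
```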
The F-test, also called the "test of homogeneity of variance", is a statistical method that judges whether two groups of samples differ significantly by testing whether their variances differ significantly, that is, whether the variances are homogeneous. The F statistic is obtained by comparing the between-group variance with the within-group variance of the two groups of data. If the ratio is greater than the critical value of the F distribution, the groups are considered significantly different; if it is less than the critical value, they are not. The critical value depends on the degrees of freedom and the confidence level and can be looked up in a table of F-distribution critical values.
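A small sketch of the comparison described above, computing the ratio of between-group to within-group variance for two groups. The function name `f_statistic` is hypothetical, and the critical value is supplied by the caller from an F table rather than computed.

```python
def f_statistic(groups):
    """Ratio of between-group to within-group mean squares.
    Returns (F, between-group df, within-group df)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (between / df_between) / (within / df_within), df_between, df_within

f_value, df1, df2 = f_statistic([[1, 2, 3], [4, 5, 6]])
critical = 7.71  # F-table value for alpha = 0.05 with (1, 4) degrees of freedom
significant = f_value > critical  # 13.5 > 7.71, so the groups differ
```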
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a method for detecting change points in high-dimensional traffic data in a distributed system.
The solution adopted to achieve the above purpose is:
A method for detecting change points in high-dimensional traffic data in a distributed system, the improvement being that the method comprises the following steps:
I. obtaining standardized high-dimensional raw traffic data from the distributed system;
II. reducing the dimensionality of the high-dimensional raw traffic data;
III. clustering the ordered sample data characterized by the principal components, and determining the non-trivial points of the principal-component data;
IV. judging whether the raw traffic data of each dimension undergoes a non-trivial change at the non-trivial points.
Further, step I comprises:
S101: the servers of the distributed system are provided with traffic collectors, which obtain the raw traffic data of the applications per unit time;
S102: the raw traffic data obtained from the different servers at the same time is expressed as a high-dimensional vector, and the raw traffic data of the different time points forms the raw traffic matrix
$$X = \begin{bmatrix} x_1(1) & x_2(1) & \cdots & x_p(1) \\ x_1(2) & x_2(2) & \cdots & x_p(2) \\ \vdots & \vdots & & \vdots \\ x_1(n) & x_2(n) & \cdots & x_p(n) \end{bmatrix}$$
where $x_j(t)$ is the amount of data produced by the j-th application at the t-th sampling time point; row t of the matrix gives the data produced by all applications at the t-th time point, and column j gives the data produced by the j-th application at all time points;
S103: the raw traffic matrix X is standardized to obtain the standardized raw traffic matrix $X' = [x'_j(t)]_{n \times p}$,
where $x'_j(t)$ is the standardized value of $x_j(t)$.
Further, in step II, principal component analysis is applied to the raw traffic data, and the principal components of the standardized raw traffic are determined.
Further, in step III, the principal components obtained in step II are clustered as the features of the high-dimensional traffic data; the optimal split points of the ordered data samples, determined with the periodic ordered sample clustering method, are the non-trivial points of the traffic data.
Further, in step IV, for each dimension of the raw traffic data, a periodic test of homogeneity of variance is applied to the traffic data on the two sides of each non-trivial point to judge whether a non-trivial change exists at that point: if the F statistic exceeds the F-test critical value, a non-trivial change exists; otherwise it does not.
Further, step III comprises the following steps:
S301: the principal-component data η(t) comprises the principal components $y'_k(t)$ of one or more dimensions; b(n, m) denotes a partition of n ordered samples into m classes, $b(n,m): G_1=\{i_1, i_1+1, \ldots, i_2-1\},\ G_2=\{i_2, i_2+1, \ldots, i_3-1\},\ \ldots,\ G_m=\{i_m, i_m+1, \ldots, n\}$, with split points $1 = i_1 < i_2 < \cdots < i_m < i_{m+1}-1$ and $i_{m+1} = n+1$;
S302: the principal-component data is periodic; let the offset within the period be s and the period be $t_p$. The sample mean of class $G_k$ at offset s is:
$$\bar{\eta}_{G_k}(s) = \frac{1}{n_{G_k}(s)} \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} \eta(t)$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$, and $n_{G_k}(s)$ is the number of samples of $G_k$ that satisfy it;
S303: the periodic within-class sum of squared deviations is obtained as:
$$D(i_k, i_{k+1}-1) = \sum_{s=0}^{t_p-1}\ \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} \left(\eta(t)-\bar{\eta}_{G_k}(s)\right)\left(\eta(t)-\bar{\eta}_{G_k}(s)\right)^{T}$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$, and T denotes the transpose of a vector. The loss function is defined as:
$$L[b(n,m)] = \sum_{k=1}^{m} D(i_k, i_{k+1}-1)$$
S304: the non-trivial points are determined by dynamic programming.
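Further, step IV comprises the following steps:
S401: let $H_0$ denote that the traffic data produced by an application has no non-trivial change at the split time point, and $H_1$ denote that the traffic data produced by the application has a non-trivial change at the split time point;
S402: the F statistic is determined from the between-class variation SSA and the within-class variation SSE;
S403: given a significance level α, determine the critical value $F_\alpha$ at confidence level α; if $F > F_\alpha$, $x'_j(t)$ is considered to have a non-trivial change at time point t; otherwise $x'_j(t)$ has no non-trivial change at time point t.
Further, step S402 comprises the following steps:
S4021: the between-class variation SSA is determined by:
$$SSA_{\{G_{k-1},G_k\}} = \sum_{s=0}^{t_p-1}\left[ n_{G_{k-1}}(s)\left(\overline{x'_j}_{G_{k-1}}(s) - \overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2 + n_{G_k}(s)\left(\overline{x'_j}_{G_k}(s) - \overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2 \right]$$
where $n_{G}(s)$ is the number of samples of class G at offset s, $\overline{x'_j}_{G}(s)$ is their mean, and $\overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)$ is the mean of $x'_j(t)$ at offset s over both classes;
S4022: the within-class variation SSE is determined by:
$$SSE_{\{G_{k-1},G_k\}} = \sum_{s=0}^{t_p-1}\left[\ \sum_{t=i_{k-1},\ \mathrm{s.t.}\ t_p \mid t-s}^{i_k-1}\left(x'_j(t)-\overline{x'_j}_{G_{k-1}}(s)\right)^2 + \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1}\left(x'_j(t)-\overline{x'_j}_{G_k}(s)\right)^2 \right]$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$;
S4023: the total sum of squared deviations is $SST = SSA + SSE$;
S4024: the F statistic is determined by $F = \dfrac{SSA / f_{SSA}}{SSE / f_{SSE}}$, where $f_{SSA}$ and $f_{SSE}$ are the degrees of freedom of SSA and SSE respectively.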
Compared with the prior art, the present invention has the following beneficial effects:
1. The method provided by the invention reduces the dimensionality of the raw traffic data with principal component analysis. PCA can represent the correlation among the dimensions of the raw data and characterizes the main variation of the sample data with a small number of principal components. When a non-trivial change occurs, the raw data no longer obeys the correlation pattern and the principal components change correspondingly, so changes in the raw data can be found by observing a few principal components.
2. Traffic data in a distributed system is periodic. The method provided by the invention extends the ordered sample clustering method to periodic samples and uses dynamic programming to compute the optimal split points of the ordered data, that is, the non-trivial change points of the traffic data. The method redefines the loss function for dividing n ordered sample points into k classes: when computing the within-class distance of the sample points in a class, the within-class distances are computed separately for the sampling points that share the same offset in different periods and are then summed.
3. For traffic data characterized by ordered principal components, the method provided by the invention partitions the sample sequence with the ordered sample clustering method. This avoids the problem that arises with methods that ignore order, such as K-means and DBSCAN, where sample points within the same period that have undergone no non-trivial change are assigned to different classes.
4. For periodic traffic data, the method provided by the invention proposes an F-test extended to periodic data and redefines the F statistic on the basis of the traditional F-test, so as to find precisely which raw traffic produces the non-trivial change. At each non-trivial change point obtained, the extended F-test checks the raw traffic data dimension by dimension for a significant difference between the two sides of the change point, thereby identifying exactly which raw traffic changed.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention for detecting change points in high-dimensional traffic data in a distributed system;
Fig. 2 is an example application structure diagram of the method of the present invention.
Detailed description of the embodiments
The specific embodiments of the present invention are described in further detail below with reference to the drawings.
As shown in Fig. 1, the flow chart of the method of the present invention, the method for detecting change points in high-dimensional traffic data in a distributed system comprises the following steps:
Step 1: obtain the standardized raw traffic data in the distributed system;
Step 2: reduce the dimensionality of the high-dimensional raw traffic data;
Step 3: cluster the ordered sample data characterized by the principal components, and determine the non-trivial points of the raw traffic data;
Step 4: obtain the traffic data at the non-trivial points.
In step 1, the raw traffic data in the distributed system is obtained and standardized, specifically:
S101: in the distributed system, traffic collection clients deployed on the monitored servers obtain, at a fixed time interval, the amount of data produced by the applications;
S102: the data produced by the different applications at the same sampling time is expressed as a high-dimensional vector, and the data of the different time points forms the raw traffic data matrix, as follows:
$$X = \begin{bmatrix} x_1(1) & x_2(1) & \cdots & x_p(1) \\ x_1(2) & x_2(2) & \cdots & x_p(2) \\ \vdots & \vdots & & \vdots \\ x_1(n) & x_2(n) & \cdots & x_p(n) \end{bmatrix}$$
where $x_j(t)$ is the amount of data produced by the j-th application at the t-th sampling time point; row t of the matrix gives the data produced by all applications at the t-th time point, and column j gives the data produced by the j-th application at all time points;
S103: the raw traffic data matrix X is standardized to obtain the standardized raw traffic data matrix, as follows:
$$X' = [x'_j(t)]_{n \times p}$$
where $x'_j(t)$ is the standardized value of $x_j(t)$.
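The text at hand does not reproduce the standardization formula itself. As one plausible reading only, the sketch below assumes column-wise z-score standardization (subtract each application's mean and divide by its sample standard deviation); the function name `standardize` and this choice of formula are assumptions of the example, not statements about the patented method.

```python
import math

def standardize(matrix):
    """Column-wise z-score of an n x p matrix given as a list of rows.
    ASSUMPTION: z-score standardization; the source formula is not shown."""
    n, p = len(matrix), len(matrix[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        column = [matrix[t][j] for t in range(n)]
        mean = sum(column) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
        for t in range(n):
            # a constant column carries no information; map it to zeros
            out[t][j] = (column[t] - mean) / std if std > 0 else 0.0
    return out

# Rows are sampling time points, columns are applications.
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
standardized = standardize(raw)  # both columns become [-1.0, 0.0, 1.0]
```

Standardizing per column puts applications with very different traffic volumes on a comparable scale before the dimensionality reduction of step 2.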
In step 2, principal component analysis is applied to reduce the dimensionality of the standardized raw traffic matrix, yielding a coefficient matrix and a principal-component matrix, and the principal components of the standardized raw traffic matrix are determined.
Principal component analysis maps the original, possibly correlated, multidimensional data into a new orthogonal space characterized by the principal components. The order of the principal components is determined by the size of the variance of the original data projected onto each component; the first few principal components usually have the largest variances and represent the most important information.
Step 2 specifically comprises the following steps:
The standardized raw traffic data X′ is multiplied by the eigenvector matrix A to obtain the principal-component matrix Y′, i.e. Y′ = X′A, where A is a p × p matrix.
Each row $x'(t) = [x'_1(t), x'_2(t), \ldots, x'_p(t)]$ of the raw traffic data matrix is converted into a new vector $y'(t) = [y'_1(t), y'_2(t), \ldots, y'_p(t)]$ composed of principal components, i.e. $y'(t) = x'(t)A$;
where $y'_k(t) = x'_1(t)a_{1k} + x'_2(t)a_{2k} + \cdots + x'_p(t)a_{pk}$; Y′ is an n × p matrix, and λ denotes the variances of its columns, i.e. $\lambda_k = \mathrm{var}(y'_k)$. The first column $y'_1$ of Y′ has the largest variance $\lambda_1$, and the last column $y'_p$ has the smallest variance $\lambda_p$.
The parameters Y′, A and λ can be obtained by algorithms such as covariance analysis.
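One common way to obtain these parameters, given here as a hedged illustration rather than as the procedure the patent prescribes, is the eigendecomposition of the sample covariance matrix of X′ (assuming the columns of X′ have zero mean after standardization; the symbol Σ is introduced only for this example):

$$\Sigma = \frac{1}{n-1}\,X'^{\mathsf T}X', \qquad \Sigma A = A\,\mathrm{diag}(\lambda_1,\ldots,\lambda_p), \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0 .$$

With the columns $a_k$ of A orthonormal, $\mathrm{var}(y'_k) = a_k^{\mathsf T}\Sigma\,a_k = \lambda_k$, which agrees with the definition $\lambda_k = \mathrm{var}(y'_k)$ and with the ordering of the column variances of Y′ = X′A.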
In step 3, the leading principal components obtained in step 2 are taken as the new features of the high-dimensional traffic data. Taking the periodicity of the traffic data into account, a new ordered sample clustering method is provided on the basis of the existing ordered sample clustering method, and dynamic programming is used to determine the optimal split points of the ordered data samples, which are the non-trivial points of the traffic data.
Step 3 specifically comprises the following steps:
S301: the selected leading principal components are denoted η(1), η(2), ..., η(n), the ordered data of n time points, where η(t) consists of the principal components $y'_k(t)$ of one or more dimensions, for example $\{y'_1(t), y'_2(t)\}$;
A partition of n ordered samples into m classes is denoted b(n, m) and written:
$b(n,m): G_1=\{i_1, i_1+1, \ldots, i_2-1\},\ G_2=\{i_2, i_2+1, \ldots, i_3-1\},\ \ldots,\ G_m=\{i_m, i_m+1, \ldots, n\}$, with split points $1 = i_1 < i_2 < \cdots < i_m < n = i_{m+1}-1$ and $i_{m+1} = n+1$;
S302: the principal-component data is periodic; let the offset within the period be s and the period be $t_p$. The sample mean of class $G_k$ at offset s is:
$$\bar{\eta}_{G_k}(s) = \frac{1}{n_{G_k}(s)} \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} \eta(t)$$
where a | b denotes that b is divisible by a; s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$; and $n_{G_k}(s)$ is the number of samples of $G_k$ that satisfy it.
S303: the periodic within-class sum of squared deviations is:
$$D(i_k, i_{k+1}-1) = \sum_{s=0}^{t_p-1}\ \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} \left(\eta(t)-\bar{\eta}_{G_k}(s)\right)\left(\eta(t)-\bar{\eta}_{G_k}(s)\right)^{T}$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$, and T denotes the transpose of a vector. The loss function is defined as:
$$L[b(n,m)] = \sum_{k=1}^{m} D(i_k, i_{k+1}-1)$$
When n and m are fixed, a smaller L[b(n, m)] means smaller within-class sums of squared deviations and a more reasonable classification; the optimal classification is denoted $P(n, m) = \{G_1, G_2, \ldots, G_m\}$;
S304: the non-trivial points are determined using the dynamic programming method proposed by Fisher in 1958 as the search algorithm.
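The effect of the periodic within-class deviation in step 3 can be seen in a short sketch for one-dimensional η(t). It uses 0-based time indices, takes the offset of sample t as `t % period`, and, to stay short, searches only a single split (m = 2) by exhaustive scan instead of dynamic programming; `periodic_deviation` and `best_single_split` are hypothetical names.

```python
def periodic_deviation(eta, start, end, period):
    """Periodic within-class deviation of eta[start:end]: samples sharing the
    same offset (t % period) are compared with the mean of that offset only."""
    total = 0.0
    for s in range(period):
        values = [eta[t] for t in range(start, end) if t % period == s]
        if not values:
            continue
        mean = sum(values) / len(values)
        total += sum((v - mean) ** 2 for v in values)
    return total

def best_single_split(eta, period):
    """Index of the first sample of the second class for the split that
    minimizes the loss D(first class) + D(second class)."""
    n = len(eta)
    return min(range(1, n),
               key=lambda i: periodic_deviation(eta, 0, i, period)
                             + periodic_deviation(eta, i, n, period))

# A purely periodic signal has zero periodic deviation ...
steady = [0, 10, 0, 10, 0, 10]
no_change = periodic_deviation(steady, 0, len(steady), 2)

# ... while a level shift at index 4 is found as the best split.
shifted = [0, 10, 0, 10, 5, 15, 5, 15]
split = best_single_split(shifted, 2)
```

An ordinary sum of squared deviations would treat the day/night swing in `steady` as large within-class variation; comparing each sample only with the mean of its own offset removes that periodic component, so only the level shift remains.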
In step 4, for each dimension of the raw traffic data, a periodic test of homogeneity of variance is applied to the traffic data on the two sides of each non-trivial point. The present invention takes the periodicity of the traffic data into account and further improves the test of homogeneity of variance, using it to judge whether a significant change occurs at the non-trivial point.
Step 4 specifically comprises the following steps:
S401: for the traffic produced by each application at split time point $i_k$, two cases are possible: "no non-trivial change exists" and "a non-trivial change exists". The hypotheses are:
$H_0$: the traffic data vector $x_j$ produced by application j has no non-trivial change at time point $i_k$; $H_1$: the traffic data vector $x_j$ produced by application j has a non-trivial change at time point $i_k$;
For a change time point $i_k$, test whether the sample data in $G_{k-1}$ and $G_k$ differ significantly. Compute the F statistic; if it exceeds the F-test critical value, a non-trivial change is judged to exist; otherwise no non-trivial change is considered to exist.
S402: the F statistic is computed from the "between-class variation" SSA and the "within-class variation" SSE.
S4021: the periodic SSA is computed as:
$$SSA_{\{G_{k-1},G_k\}} = \sum_{s=0}^{t_p-1}\left[ n_{G_{k-1}}(s)\left(\overline{x'_j}_{G_{k-1}}(s) - \overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2 + n_{G_k}(s)\left(\overline{x'_j}_{G_k}(s) - \overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2 \right]$$
where $n_{G}(s)$ and $\overline{x'_j}_{G}(s)$ are the number and the mean of the samples of class G that satisfy the constraint s.t. $t_p \mid t-s$, that is, t − s is divisible by $t_p$;
For the data at offset s in class $G_{k-1}$ and class $G_k$, $\overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)$ is the mean of $x'_j(t)$, that is:
$$\overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s) = \frac{1}{n_{G_{k-1}}(s)+n_{G_k}(s)} \sum_{t=i_{k-1},\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} x'_j(t)$$
The further the samples of the different classes with the same data offset deviate from the common center, the larger the value of SSA and the more likely the classes are different; conversely, the smaller SSA is, the more likely they are the same class.
S4022: for comparison with the effect of random error, the within-class deviation SSE is computed as:
$$SSE_{\{G_{k-1},G_k\}} = \sum_{s=0}^{t_p-1}\left[\ \sum_{t=i_{k-1},\ \mathrm{s.t.}\ t_p \mid t-s}^{i_k-1}\left(x'_j(t)-\overline{x'_j}_{G_{k-1}}(s)\right)^2 + \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1}\left(x'_j(t)-\overline{x'_j}_{G_k}(s)\right)^2 \right]$$
S4023: with SST denoting the total sum of squared deviations, it is computed as:
$$SST = \sum_{s=0}^{t_p-1}\ \sum_{t=i_{k-1},\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1}\left(x'_j(t)-\overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2$$
It follows that $SST = SSA + SSE$.
S4024: the F statistic is computed as:
$$F = \frac{SSA / f_{SSA}}{SSE / f_{SSE}}$$
where $f_{SSA}$ and $f_{SSE}$ are the degrees of freedom of SSA and SSE; in this embodiment $f_{SSA} = (I-1) \times t_p$ and $f_{SSE} = N - I \times t_p$.
I denotes that a factor has I levels; in the present invention the data on the two sides of a change point constitute 2 levels, and N is the number of samples tested. Therefore, in this embodiment, $N = n_{G_{k-1}} + n_{G_k}$, where $n_{G_{k-1}}$ and $n_{G_k}$ are the numbers of samples in class $G_{k-1}$ and class $G_k$ respectively, and I = 2.
S4025: given a significance level α, the value of $F_{(I-1,N-I)}$ corresponding to confidence level α, that is, $F_{(I-1,N-I),\alpha}$, can be found by looking it up in an F table.
If $F > F_{(I-1,N-I),\alpha}$, $G_{k-1}$ and $G_k$ are considered significantly different and $x'_j(t)$ has a non-trivial change at time t; otherwise $x'_j(t)$ is considered to have no non-trivial change at time t.
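Step 4 can be sketched for a single traffic dimension as follows. The sketch follows the SSA and degrees-of-freedom definitions given above and assumes the usual mean-square ratio for F and a per-offset within-class sum of squares for SSE, since those formulas are not reproduced in the text. The name `periodic_f_statistic`, the 0-based time index, and the `start` parameter are assumptions of this example; the critical value is supplied from an F table.

```python
def periodic_f_statistic(before, after, period, start=0):
    """F statistic for the samples on the two sides of a candidate change point.
    `start` is the time index of before[0]; a sample's offset is t % period."""
    groups = [before, after]
    first_index = [start, start + len(before)]
    ssa = sse = 0.0
    for s in range(period):
        # samples of each class that share offset s
        parts = [[x for i, x in enumerate(g) if (t0 + i) % period == s]
                 for g, t0 in zip(groups, first_index)]
        if any(not part for part in parts):
            raise ValueError("each class needs at least one sample per offset")
        pooled_mean = sum(map(sum, parts)) / sum(map(len, parts))
        for part in parts:
            mean = sum(part) / len(part)
            ssa += len(part) * (mean - pooled_mean) ** 2   # between classes
            sse += sum((x - mean) ** 2 for x in part)      # within classes
    levels = len(groups)                        # I = 2
    n_total = len(before) + len(after)          # N
    df_a = (levels - 1) * period                # f_SSA = (I - 1) * t_p
    df_e = n_total - levels * period            # f_SSE = N - I * t_p
    if df_e <= 0 or sse == 0.0:
        raise ValueError("not enough within-class variation to form F")
    return (ssa / df_a) / (sse / df_e), df_a, df_e

# Day/night pattern with period 2 and a level shift of +5 after four samples.
f_value, df_a, df_e = periodic_f_statistic([0, 10, 1, 11], [5, 15, 6, 16], period=2)
critical = 6.94  # F-table value for alpha = 0.05 with (2, 4) degrees of freedom
has_change = f_value > critical  # 50.0 > 6.94
```

Running the test once per dimension at each non-trivial point found in step 3 would indicate which applications' traffic actually changed there.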
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present application and do not limit its scope of protection. Although the application has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that, after reading the application, they may still make various changes, modifications, or equivalent substitutions to the specific embodiments of the application, and that all such changes, modifications, or equivalent substitutions fall within the scope of the pending claims.

Claims (7)

1. A method for detecting change points in high-dimensional traffic data in a distributed system, characterized in that the method comprises the following steps:
I. obtaining standardized high-dimensional raw traffic data from the distributed system;
II. reducing the dimensionality of the high-dimensional raw traffic data;
III. clustering the ordered sample data characterized by the principal components, and determining the non-trivial points of the principal-component data;
IV. judging whether the raw traffic data of each dimension undergoes a non-trivial change at the non-trivial points;
in step III, the principal components obtained in step II are clustered as the features of the high-dimensional traffic data; the optimal split points of the ordered data samples, determined with the periodic ordered sample clustering method, are the non-trivial points of the traffic data.
2. The method of claim 1, characterized in that step I comprises:
S101: the servers of the distributed system are provided with traffic collectors, which obtain the raw traffic data of the applications per unit time;
S102: the raw traffic data obtained from the different servers at the same time is expressed as a high-dimensional vector, and the raw traffic data of the different time points forms the raw traffic matrix
$$X = \begin{bmatrix} x_1(1) & x_2(1) & \cdots & x_p(1) \\ x_1(2) & x_2(2) & \cdots & x_p(2) \\ \vdots & \vdots & & \vdots \\ x_1(n) & x_2(n) & \cdots & x_p(n) \end{bmatrix}$$
where $x_j(t)$ is the amount of data produced by the j-th application at the t-th sampling time point; row t of the matrix gives the data produced by all applications at the t-th time point, and column j gives the data produced by the j-th application at all time points;
S103: the raw traffic matrix X is standardized to obtain the standardized raw traffic matrix $X' = [x'_j(t)]_{n \times p}$,
where $x'_j(t)$ is the standardized value of $x_j(t)$,
p denotes the total number of applications, and n denotes the total number of sampling time points.
3. The method of claim 1, characterized in that in step II, principal component analysis is applied to the raw traffic data, and the principal components of the standardized raw traffic are determined.
4. The method of claim 1, characterized in that in step IV, for each dimension of the raw traffic data, a periodic test of homogeneity of variance is applied to the traffic data on the two sides of each non-trivial point to judge whether a non-trivial change exists at that point: if the F statistic exceeds the F-test critical value, a non-trivial change exists; otherwise it does not.
5. The method of claim 1, characterized in that step III comprises the following steps:
S301: the principal-component data η(t) comprises the principal components $y'_k(t)$ of one or more dimensions; b(n, m) denotes a partition of n ordered samples into m classes, $b(n,m): G_1=\{i_1, i_1+1, \ldots, i_2-1\},\ G_2=\{i_2, i_2+1, \ldots, i_3-1\},\ \ldots,\ G_m=\{i_m, i_m+1, \ldots, n\}$, with split points $1 = i_1 < i_2 < \cdots < i_m < i_{m+1}-1$ and $i_{m+1} = n+1$, where $i_m$ denotes the first sample of the m-th class;
S302: the principal-component data is periodic; let the offset within the period be s and the period be $t_p$. The sample mean of class $G_k$ at offset s is:
$$\bar{\eta}_{G_k}(s) = \frac{1}{n_{G_k}(s)} \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} \eta(t)$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$, and $n_{G_k}(s)$ is the number of samples of $G_k$ that satisfy it;
S303: the periodic within-class sum of squared deviations is obtained as:
$$D(i_k, i_{k+1}-1) = \sum_{s=0}^{t_p-1}\ \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1} \left(\eta(t)-\bar{\eta}_{G_k}(s)\right)\left(\eta(t)-\bar{\eta}_{G_k}(s)\right)^{T};$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$, and T denotes the transpose of a vector; the loss function is defined as $L[b(n,m)] = \sum_{k=1}^{m} D(i_k, i_{k+1}-1)$;
S304: the non-trivial points are determined by dynamic programming.
6. The method of claim 4, characterized in that step IV comprises the following steps:
S401: let $H_0$ denote that the traffic data produced by an application has no non-trivial change at the split time point, and $H_1$ denote that the traffic data produced by the application has a non-trivial change at the split time point;
S402: the F statistic is determined from the between-class variation SSA and the within-class variation SSE;
S403: given a significance level α, determine the critical value $F_\alpha$ at confidence level α; if $F > F_\alpha$, $x'_j(t)$ is considered to have a non-trivial change at time point t; otherwise $x'_j(t)$ has no non-trivial change at time point t;
the between-class variation SSA is the between-class sum of squares of the two classes of samples on the two sides of the non-trivial change point, and the within-class variation SSE is the within-class sum of squares of the samples.
7. The method of claim 5, characterized in that step S402 comprises the following steps:
S4021: the between-class variation SSA is determined by:
$$SSA_{\{G_{k-1},G_k\}} = \sum_{s=0}^{t_p-1}\left[ n_{G_{k-1}}(s)\left(\overline{x'_j}_{G_{k-1}}(s) - \overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2 + n_{G_k}(s)\left(\overline{x'_j}_{G_k}(s) - \overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)\right)^2 \right],$$
where $n_{G}(s)$ is the number of samples of class G at offset s, $\overline{x'_j}_{G}(s)$ is their mean, and $\overline{\overline{x'_j}}_{\{G_{k-1},G_k\}}(s)$ is the mean of $x'_j(t)$ at offset s over both classes;
S4022: the within-class variation SSE is determined by:
$$SSE_{\{G_{k-1},G_k\}} = \sum_{s=0}^{t_p-1}\left[\ \sum_{t=i_{k-1},\ \mathrm{s.t.}\ t_p \mid t-s}^{i_k-1}\left(x'_j(t)-\overline{x'_j}_{G_{k-1}}(s)\right)^2 + \sum_{t=i_k,\ \mathrm{s.t.}\ t_p \mid t-s}^{i_{k+1}-1}\left(x'_j(t)-\overline{x'_j}_{G_k}(s)\right)^2 \right]$$
where s.t. $t_p \mid t-s$ denotes the constraint that t − s is divisible by $t_p$;
S4023: the total sum of squared deviations is $SST = SSA + SSE$;
S4024: the F statistic is determined by $F = \dfrac{SSA / f_{SSA}}{SSE / f_{SSE}}$, where $f_{SSA}$ and $f_{SSE}$ are the degrees of freedom of SSA and SSE respectively.
CN201410243426.XA 2014-03-28 2014-03-28 High-dimensional flow data changing point detection method in distributed system Expired - Fee Related CN104050070B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410243426.XA CN104050070B (en) 2014-03-28 2014-03-28 High-dimensional flow data changing point detection method in distributed system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410243426.XA CN104050070B (en) 2014-03-28 2014-03-28 High-dimensional flow data changing point detection method in distributed system

Publications (2)

Publication Number Publication Date
CN104050070A CN104050070A (en) 2014-09-17
CN104050070B true CN104050070B (en) 2017-02-22

Family

ID=51502959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410243426.XA Expired - Fee Related CN104050070B (en) 2014-03-28 2014-03-28 High-dimensional flow data changing point detection method in distributed system

Country Status (1)

Country Link
CN (1) CN104050070B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113393673A (en) * 2021-08-17 2021-09-14 深圳市城市交通规划设计研究中心股份有限公司 Traffic signal scheduling plan and time interval optimization method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999633A (en) * 2012-12-18 2013-03-27 北京师范大学珠海分校 Cloud cluster extraction method of network information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7529181B2 (en) * 2004-12-07 2009-05-05 Emc Corporation Method and apparatus for adaptive monitoring and management of distributed systems

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999633A (en) * 2012-12-18 2013-03-27 北京师范大学珠海分校 Cloud cluster extraction method of network information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于分布式计算的网络流量异常检测系统;陈伟;《中国优秀硕士学位论文全文数据库》;20091231;第32-48页 *

Also Published As

Publication number Publication date
CN104050070A (en) 2014-09-17


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170222
