CN106203474A - A stream data clustering method based on dynamic changes in density values - Google Patents

Info

Publication number
CN106203474A
Authority
CN
China
Prior art keywords
clucell
data structure
data
stream data
outlier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610486506.7A
Other languages
Chinese (zh)
Inventor
巩树凤
张岩峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-06-27
Filing date
2016-06-27
Publication date
2016-12-07
Application filed by Northeastern University China
Priority to CN201610486506.7A
Publication of CN106203474A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23211 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with adaptive number of clusters

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a stream data clustering method based on dynamic changes in density values. The method uses the distances between all pairs of points in a historical stream data set D to determine the radius r of the CluCell data structure. For newly arriving stream data, it uses a density-center clustering algorithm to build a stream data clustering model based on dynamic changes in density values. It then updates the clustering model according to the distance between the new stream data and the CluCells and the distance between the new data and the outliers, thereby processing the newly arriving stream data. The method can cluster stream data of arbitrary shape and can also detect the emergence, merging, and splitting of clusters during stream clustering, so users can follow the detected changes in the clustering result. The algorithm also detects outliers while clustering; outliers are often erroneous data produced by a system, so the detected outliers can be used to judge whether the system has failed.

Description

A stream data clustering method based on dynamic changes in density values
Technical field
The invention belongs to the technical field of stream data cluster analysis, and specifically relates to a stream data clustering method based on dynamic changes in density values.
Background art
Many existing stream data clustering methods are one-sided and have shortcomings. Although these algorithms can perform cluster analysis on stream data, they do not meet all of its requirements. Stream cluster analysis should be able to do the following: cluster data of arbitrary shape, identify outliers (points that belong to no cluster), and detect cluster changes (merging and splitting). For example, the CluStream algorithm can cluster stream data, but it is suitable only for linearly separable data and performs poorly on data that are not linearly separable, such as concave shapes. CluStream also has two phases: (1) an online data-extraction phase and (2) an offline clustering phase. An offline operation must be triggered every time the clustering result is inspected, so clustering efficiency drops when results are inspected frequently. The D-Stream and DenStream algorithms use density-based methods to extract and cluster data. Although both can cluster stream data sets of arbitrary shape, both still need an offline operation to find out whether the cluster state has changed. E-Stream keeps statistics along the coordinate axes and changes the cluster state according to the current clusters and the newly arrived data; it detects changes in the clustering result, such as cluster splitting and merging, by comparing the cluster state before and after the change. Although E-Stream can detect cluster state changes without an offline operation, it relies on per-coordinate statistics, so it can cluster only linearly separable data sets and gives unsatisfactory results on data of arbitrary shape.
Summary of the invention
To address the deficiencies of the prior art, the present invention proposes a stream data clustering method based on dynamic changes in density values.
The technical solution of the present invention is as follows:
A stream data clustering method based on dynamic changes in density values, comprising the following steps:
Step 1: Use the distances between all pairs of points in the historical stream data set D to determine the radius r of the CluCell data structure;
Step 1.1: Form the historical stream data set D from K stream data points held in a history buffer;
Step 1.2: Compute the distance between every pair of points in D;
Step 1.3: Sort the pairwise distances in D in ascending order and take the value at the first A% position as the CluCell radius r, where 1 < A < 2;
Step 2: For newly arriving stream data, use a density-center clustering algorithm to build a stream data clustering model based on dynamic changes in density values;
Step 2.1: Set the CluCell count threshold M;
Step 2.2: Receive the currently arriving stream data point p and check whether any CluCell currently exists; if so, go to step 2.3; otherwise, go to step 2.5;
Step 2.3: Apply density decay to all current CluCells according to the current time, find the CluCell c_k nearest to stream data point p, and determine their distance d_{pk};
Step 2.4: Compare d_{pk} with the CluCell radius r: if d_{pk} ≤ r, go to step 2.6; if d_{pk} > r, go to step 2.5;
Step 2.5: Create a CluCell c_p centered on stream data point p, delete p, and go to step 2.7;
Step 2.6: Add 1 to the density value of c_k, the CluCell nearest to p, delete p, and return to step 2.2;
Step 2.7: Count the current number N of CluCells and check whether it has reached the threshold M; if so, go to step 2.8; otherwise, return to step 2.2;
Step 2.8: Compute the repulsion value of every current CluCell, and draw the decision graph of the data structures from the density values and repulsion values of all current CluCells;
Step 2.9: Determine the density center points from the decision graph, and take the smallest repulsion value among all density center points as the minimum repulsion value δ_min;
Step 2.10: From the repulsion-value dependency relations among all current CluCells, obtain trees rooted at the density center points, i.e. cluster trees; all cluster trees together form a stream data clustering model;
Step 3: Update the stream data clustering model according to the distance between the newly arriving stream data and the CluCells and the distance between the new data and the outliers, thereby processing the newly arriving stream data;
Step 3.1: Set the time threshold Δt;
Step 3.2: Receive the currently arriving stream data point p';
Step 3.3: Apply density decay to all current CluCells according to the current time, and delete any CluCell into which no stream data has been inserted within the time threshold Δt;
Step 3.4: Find the CluCell c_{k'} nearest to stream data point p' and determine their distance d_{p'k'};
Step 3.5: Compare d_{p'k'} with the CluCell radius r and the minimum repulsion value δ_min: if d_{p'k'} ≤ r, go to step 3.6; if r < d_{p'k'} ≤ δ_min, go to step 3.7; if d_{p'k'} > δ_min, go to step 3.8;
Step 3.6: Add 1 to the density value of c_{k'}, the CluCell nearest to p', delete p', and go to step 3.11;
Step 3.7: Create a CluCell c_{p'} centered on stream data point p', delete p', and go to step 3.9;
Step 3.8: Insert stream data point p' into the outlier pool, which temporarily holds outliers, and go to step 3.12;
Step 3.9: Compute the distance d_{p'o} from each outlier in the outlier pool to CluCell c_{p'}. If there are Z outliers with d_{p'o} ≤ r, add the freshness of these Z outliers to the density value of c_{p'} and delete these Z outliers; if there are Z' outliers with r < d_{p'o} ≤ δ_min, create a CluCell centered on each of these Z' outliers and delete these Z' outliers;
Step 3.10: Update the repulsion value of each CluCell, update the cluster trees according to the updated repulsion values, and return to step 3.2;
Step 3.11: Find the outlier o' nearest to stream data point p' and determine their distance d_{p'o'};
Step 3.12: Compare d_{p'o'} with the CluCell radius r and the minimum repulsion value δ_min: if d_{p'o'} ≤ r, go to step 3.13; if r < d_{p'o'} ≤ δ_min, go to step 3.14; if d_{p'o'} > δ_min, return to step 3.2;
Step 3.13: Create a CluCell c_{p'} centered on stream data point p', add the freshness of outlier o' to the density value of c_{p'}, delete o' and p', and return to step 3.9;
Step 3.14: Create a CluCell c_{p'} centered on stream data point p' and a CluCell c_{o'} centered on outlier o', delete o' and p', and return to step 3.9.
The density value decay formula of the CluCell data structure is as follows:
$$\rho^{t} = \sum_{i=0}^{n} f_i^{t} = \sum_{i=0}^{n} 2^{-\lambda (t - t_i)} = 2^{-\lambda (t - t_l)} \rho^{t_l};$$
where ρ^t is the density value of the CluCell at time t; f_i^t is the freshness of the i-th stream data point p_i at time t, 0 < i < n; n is the number of stream data points in the CluCell; t_l is the time of the CluCell's most recent density decay; t_i is the generation time of stream data point p_i; and λ is the freshness decay coefficient.
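The last equality in the decay formula is what allows a CluCell to store a single density value instead of the timestamp of every point. A short derivation, under the assumption that no point is added to the CluCell between t_l and t:

```latex
\begin{aligned}
\rho^{t} &= \sum_{i} 2^{-\lambda (t - t_i)}
          = \sum_{i} 2^{-\lambda (t - t_l)} \, 2^{-\lambda (t_l - t_i)} \\
         &= 2^{-\lambda (t - t_l)} \sum_{i} 2^{-\lambda (t_l - t_i)}
          = 2^{-\lambda (t - t_l)} \, \rho^{t_l}.
\end{aligned}
```

A point that arrives at time t has freshness 2^0 = 1 at that moment, which is consistent with the steps that add 1 to the density value of the nearest CluCell once it has been decayed to the current time.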
The formula for the repulsion value of a CluCell is as follows:
$$\delta_v^{t} = \min_{l:\, \rho_l^{t} > \rho_v^{t}} \left( |c_l, c_v| \right);$$
where δ_v^t is the repulsion value of CluCell c_v at time t, |c_l, c_v| is the distance between CluCell c_v and CluCell c_l, ρ_l^t is the density value of CluCell c_l at time t, and ρ_v^t is the density value of CluCell c_v at time t.
The specific method of updating the cluster trees according to the repulsion value of each CluCell is:
If the repulsion value of the current CluCell c_μ satisfies δ_μ^t > δ_min, the subtree rooted at c_μ is split off from its original cluster tree to form a new cluster tree. If the repulsion value of the root CluCell c_m of a current cluster tree satisfies δ_m^t < δ_min, the cluster tree rooted at c_m is merged into the cluster tree containing the CluCell on which the repulsion value of c_m depends, and that CluCell becomes the parent node of c_m.
Drawing the decision graph of the data structures from the density values and repulsion values of all current CluCells specifically means: take the density values of all current CluCells as the abscissa and their repulsion values as the ordinate, and draw the decision graph of the data structures.
Beneficial effects of the present invention:
The present invention proposes a stream data clustering method based on dynamic changes in density values. The method can cluster stream data of arbitrary shape and can also detect the emergence, merging, and splitting of clusters during stream clustering, so users can follow the detected changes in the clustering result. The algorithm can also detect outliers while clustering; outliers are often erroneous data produced by a system, so the detected outliers can be used to judge whether the system has failed.
Brief description of the drawings
Fig. 1 is a flow chart of the stream data clustering method based on dynamic changes in density values in an embodiment of the present invention;
Fig. 2 is a flow chart of building the stream data clustering model based on dynamic changes in density values with the density-center clustering algorithm in an embodiment of the present invention;
Fig. 3 is the decision graph of the data structures drawn in an embodiment of the present invention;
Fig. 4 is a flow chart of updating the stream data clustering model according to the distance between the newly arriving stream data and the CluCells and the distance between the new data and the outliers in an embodiment of the present invention.
Detailed description of the embodiments
Specific embodiments of the invention are described in detail below with reference to the accompanying drawings.
A stream data clustering method based on dynamic changes in density values. In this embodiment, 17k two-dimensional data points are clustered. As shown in Fig. 1, the method comprises the following steps:
Step 1: Use the distances between all pairs of points in the historical stream data set D to determine the radius r of the CluCell data structure.
In this embodiment, a CluCell is a high-dimensional sphere in space formed from n stream data points {p1, p2, ... pn} and has four attributes {s, r, ρ^t, δ^t}, where s is the seed point, or center point, of the CluCell; r is the radius of the CluCell; ρ^t is the density value of the CluCell at time t; and δ^t is the repulsion value of the CluCell at time t.
Step 1.1: Form the historical stream data set D from K stream data points held in a history buffer.
Step 1.2: Compute the distance between every pair of points in D.
Step 1.3: Sort the pairwise distances in D in ascending order and take the value at the first A% position as the CluCell radius r, where 1 < A < 2.
In this embodiment, the pairwise distances in D are sorted in ascending order and the value at the 1.5% position is taken as the CluCell radius r. The resulting CluCell radius r is 3.
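Steps 1.1 to 1.3 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `clucell_radius`, the use of Euclidean distance, and the rule for turning "the value at the first A%" into a list index (round up, then convert to a zero-based index) are all assumptions, since the text does not fix them.

```python
import math
from itertools import combinations


def clucell_radius(points, a_percent=1.5):
    """Return the pairwise distance found at the a_percent position.

    points    : list of coordinate tuples (the history set D)
    a_percent : the percentage A from step 1.3 (the text requires 1 < A < 2)
    """
    # Step 1.2: distance between every pair of points (Euclidean is assumed).
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    if not dists:
        raise ValueError("need at least two points")
    # Step 1.3: value at the first A% of the ascending list.
    # The rounding rule below is an assumption.
    idx = max(0, math.ceil(len(dists) * a_percent / 100.0) - 1)
    return dists[idx]
```

Computing all pairs costs on the order of K² distance evaluations, which is one reason to run this only once on a bounded history buffer.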
Step 2: For newly arriving stream data, use a density-center clustering algorithm to build a stream data clustering model based on dynamic changes in density values, as shown in Fig. 2.
Step 2.1: Set the CluCell count threshold M.
In this embodiment, the CluCell count threshold M is set to 25.
Step 2.2: Receive the currently arriving stream data point p and check whether any CluCell currently exists; if so, go to step 2.3; otherwise, go to step 2.5.
Step 2.3: Apply density decay to all current CluCells according to the current time, find the CluCell c_k nearest to stream data point p, and determine their distance d_{pk}.
In this embodiment, the density value decay formula of a CluCell is given by formula (1):
$$\rho^{t} = \sum_{i=0}^{n} f_i^{t} = \sum_{i=0}^{n} 2^{-\lambda (t - t_i)} = 2^{-\lambda (t - t_l)} \rho^{t_l} \tag{1}$$
where ρ^t is the density value of the CluCell at time t; f_i^t is the freshness of the i-th stream data point p_i at time t, 0 < i < n; n is the number of stream data points in the CluCell; t_l is the time of the CluCell's most recent density decay; t_i is the generation time of stream data point p_i; and λ is the freshness decay coefficient.
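Formula (1) means a CluCell only needs its last density value and the time t_l at which it was computed. The sketch below illustrates that lazy update; the class layout and the names `decay`, `absorb`, and `last_insert` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CluCell:
    seed: tuple         # center point s
    density: float      # rho, valid at time last_decay
    last_decay: float   # t_l in formula (1)
    last_insert: float  # hypothetical field: time of the last absorbed point


def decay(cell, now, lam):
    """Bring cell.density forward to time `now` with formula (1)."""
    cell.density *= 2.0 ** (-lam * (now - cell.last_decay))
    cell.last_decay = now


def absorb(cell, now, lam):
    """Decay the cell to `now`, then add the freshness (2**0 = 1) of a new point."""
    decay(cell, now, lam)
    cell.density += 1.0
    cell.last_insert = now
```

For example, with λ = 0.5, a cell that held one point at time 0 and absorbs another at time 4 ends with density 2^(-2) + 1 = 1.25, the same value the explicit sum in formula (1) gives.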
Step 2.4: Compare d_{pk} with the CluCell radius r: if d_{pk} ≤ r, go to step 2.6; if d_{pk} > r, go to step 2.5.
Step 2.5: Create a CluCell c_p centered on stream data point p, delete p, and go to step 2.7.
Step 2.6: Add 1 to the density value of c_k, the CluCell nearest to p, delete p, and return to step 2.2.
Step 2.7: Count the current number N of CluCells and check whether it has reached the threshold M; if so, go to step 2.8; otherwise, return to step 2.2.
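Steps 2.2 to 2.7 form a single loop over the incoming points. A minimal sketch follows, assuming the stream is an iterable of (time, point) pairs, Euclidean distance, and a dict layout for cells with keys `seed`, `density`, and `t_l`; all of these are illustrative assumptions. A new cell starts with density 1, the freshness of its own seed point at arrival.

```python
import math


def build_initial_cells(stream, r, m, lam):
    """Consume (time, point) pairs until M CluCells exist; return the cells."""
    cells = []
    for now, p in stream:
        # Step 2.3: decay every existing cell to the current time.
        for c in cells:
            c["density"] *= 2.0 ** (-lam * (now - c["t_l"]))
            c["t_l"] = now
        nearest = min(cells, key=lambda c: math.dist(c["seed"], p), default=None)
        # Step 2.4: compare the nearest distance with the radius r.
        if nearest is not None and math.dist(nearest["seed"], p) <= r:
            nearest["density"] += 1.0   # step 2.6, then back to step 2.2
            continue
        # Step 2.5: the point seeds a new cell.
        cells.append({"seed": p, "density": 1.0, "t_l": now})
        # Step 2.7: stop once the count threshold M is reached.
        if len(cells) >= m:
            break
    return cells
```

With r = 3 and M = 25 as in this embodiment, the loop ends as soon as the 25th cell has been created.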
Step 2.8: Compute the repulsion value of every current CluCell, and draw the decision graph of the data structures from the density values and repulsion values of all current CluCells.
In this embodiment, the formula for the repulsion value of a CluCell is given by formula (2):
$$\delta_v^{t} = \min_{l:\, \rho_l^{t} > \rho_v^{t}} \left( |c_l, c_v| \right) \tag{2}$$
where δ_v^t is the repulsion value of CluCell c_v at time t, |c_l, c_v| is the distance between CluCell c_v and CluCell c_l, ρ_l^t is the density value of CluCell c_l at time t, and ρ_v^t is the density value of CluCell c_v at time t.
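Formula (2) assigns each CluCell the distance to its nearest neighbour of strictly higher density. The sketch below computes δ for every cell by brute force and also records which cell each value depends on, which is the dependency later used to build trees. The function name is hypothetical, Euclidean distance is assumed, and giving the densest cell δ = infinity is an assumption because formula (2) leaves that case undefined.

```python
import math


def repulsion_values(seeds, densities):
    """Return (delta, parent) lists following formula (2).

    parent[v] is the index of the nearest cell with higher density than v,
    or None for the densest cell, whose delta is set to infinity here.
    """
    n = len(seeds)
    delta = [math.inf] * n
    parent = [None] * n
    for v in range(n):
        for l in range(n):
            if densities[l] > densities[v]:
                d = math.dist(seeds[l], seeds[v])
                if d < delta[v]:
                    delta[v], parent[v] = d, l
    return delta, parent
```

Because the number of cells is bounded by how many CluCells are alive, this quadratic scan runs over cells rather than over raw stream points.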
In this embodiment, the density values of all current CluCells are taken as the abscissa and their repulsion values as the ordinate to draw the decision graph of the data structures, as shown in Fig. 3.
Step 2.9: Determine the density center points from the decision graph of the data structures, and take the smallest repulsion value among all density center points as the minimum repulsion value δ_min.
In this embodiment, according to Fig. 3, the three data points in the upper-right corner of the decision graph are selected as the density center points.
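The embodiment picks the centers by eye from the upper-right corner of Fig. 3. One way to mimic that choice in code is to apply two cut-offs to the decision-graph coordinates; the thresholds `rho_thr` and `delta_thr` below are hypothetical parameters that do not appear in the source, so this is only a sketch of step 2.9 under that assumption.

```python
def pick_centers(densities, deltas, rho_thr, delta_thr):
    """Return (center indices, delta_min) from decision-graph coordinates.

    A cell counts as a density center when both its density and its
    repulsion value are at least the given (assumed) thresholds.
    """
    centers = [i for i, (rho, dlt) in enumerate(zip(densities, deltas))
               if rho >= rho_thr and dlt >= delta_thr]
    if not centers:
        raise ValueError("no cell lies in the upper-right region")
    # Step 2.9: the smallest repulsion value among the centers is delta_min.
    delta_min = min(deltas[i] for i in centers)
    return centers, delta_min
```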
Step 2.10: From the repulsion-value dependency relations among all current CluCells, obtain trees rooted at the density center points, i.e. cluster trees; all cluster trees together form a stream data clustering model.
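Step 2.10 can be read as follows: every CluCell points to the higher-density cell that determines its repulsion value, and following those links upward ends at a density center, which is the root of the cluster tree. A minimal sketch, assuming the dependency is given as a `parent` list of indices (None for a cell with no higher-density neighbour); the function name and this representation are assumptions.

```python
def cluster_roots(parent, centers):
    """Map each cell index to the density center that roots its cluster tree."""
    center_set = set(centers)
    root = {}

    def find(v):
        if v in root:
            return root[v]
        if v in center_set or parent[v] is None:
            root[v] = v                  # v is itself a root
        else:
            root[v] = find(parent[v])    # inherit the root of the parent
        return root[v]

    for v in range(len(parent)):
        find(v)
    return root
```

The links cannot form a cycle because each parent has strictly higher density than its child, so the recursion always terminates. Cells that share a root belong to the same cluster.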
Step 3: Update the stream data clustering model according to the distance between the newly arriving stream data and the CluCells and the distance between the new data and the outliers, thereby processing the newly arriving stream data, as shown in Fig. 4.
Step 3.1: Set the time threshold Δt.
In this embodiment, the time threshold Δt is set to 5 s.
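The time threshold is used in step 3.3 to discard CluCells that have not received any stream data recently. A minimal sketch of that filter, assuming each cell is a dict that records the time of its last insertion under the hypothetical key `last_insert`, and assuming a cell exactly Δt old is still kept:

```python
def drop_stale_cells(cells, now, dt):
    """Keep only the cells that absorbed a point within the last dt seconds."""
    return [c for c in cells if now - c["last_insert"] <= dt]
```

With Δt = 5 s, a cell last updated at t = 0 is removed when a point arrives at t = 10, while a cell last updated at t = 8 survives.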
Step 3.2: Receive the currently arriving stream data point p'.
Step 3.3: Apply density decay to all current CluCells according to the current time, and delete any CluCell into which no stream data has been inserted within the time threshold Δt.
Step 3.4: Find the CluCell c_{k'} nearest to stream data point p' and determine their distance d_{p'k'}.
Step 3.5: Compare d_{p'k'} with the CluCell radius r and the minimum repulsion value δ_min: if d_{p'k'} ≤ r, go to step 3.6; if r < d_{p'k'} ≤ δ_min, go to step 3.7; if d_{p'k'} > δ_min, go to step 3.8.
Step 3.6: Add 1 to the density value of c_{k'}, the CluCell nearest to p', delete p', and go to step 3.10.
Step 3.7: Create a CluCell c_{p'} centered on stream data point p', delete p', and go to step 3.9.
Step 3.8: Insert stream data point p' into the outlier pool, which temporarily holds outliers, and go to step 3.11.
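Steps 3.4 to 3.8 amount to a three-way decision on the distance to the nearest CluCell. The sketch below returns a label for the branch taken; the function name, the string labels, and the use of Euclidean distance are assumptions made for illustration, and at least one cell is assumed to exist.

```python
import math


def route_point(p, seeds, r, delta_min):
    """Return ('absorb', k), ('new_cell', None) or ('outlier', None) for p."""
    # Step 3.4: index of the nearest cell and the distance to it.
    k = min(range(len(seeds)), key=lambda i: math.dist(seeds[i], p))
    d = math.dist(seeds[k], p)
    if d <= r:
        return ("absorb", k)         # step 3.6: add 1 to the density of cell k
    if d <= delta_min:
        return ("new_cell", None)    # step 3.7: p seeds a new cell
    return ("outlier", None)         # step 3.8: p goes to the outlier pool
```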
Step 3.9: Compute the distance d_{p'o} from each outlier in the outlier pool to CluCell c_{p'}. If there are Z outliers with d_{p'o} ≤ r, add the freshness of these Z outliers to the density value of c_{p'} and delete these Z outliers; if there are Z' outliers with r < d_{p'o} ≤ δ_min, create a CluCell centered on each of these Z' outliers and delete these Z' outliers.
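Step 3.9 sweeps the outlier pool once for each newly created CluCell. A minimal sketch, assuming the pool is a list of (point, arrival_time) pairs and that an outlier's freshness is 2^(-λ(t - t_i)) as in formula (1); the function name and the return layout are hypothetical.

```python
import math


def sweep_outlier_pool(new_seed, pool, now, r, delta_min, lam):
    """Return (density gained, seeds of promoted cells, remaining pool)."""
    gained, promoted, remaining = 0.0, [], []
    for point, t_i in pool:
        d = math.dist(point, new_seed)
        if d <= r:
            # Absorbed: contributes its current freshness to the new cell.
            gained += 2.0 ** (-lam * (now - t_i))
        elif d <= delta_min:
            # Promoted: becomes the center of its own new CluCell.
            promoted.append(point)
        else:
            # Still an outlier: stays in the pool.
            remaining.append((point, t_i))
    return gained, promoted, remaining
```

The caller would add `gained` to the density of the new cell and create one CluCell per entry in `promoted`.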
Step 3.10: Update the repulsion value of each CluCell, update the cluster trees according to the updated repulsion values, and return to step 3.2.
In this embodiment, if the repulsion value of the current CluCell c_μ satisfies δ_μ^t > δ_min, the subtree rooted at c_μ is split off from its original cluster tree to form a new cluster tree. If the repulsion value of the root CluCell c_m of a current cluster tree satisfies δ_m^t < δ_min, the cluster tree rooted at c_m is merged into the cluster tree containing the CluCell on which the repulsion value of c_m depends, and that CluCell becomes the parent node of c_m.
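The split and merge rule can be expressed as a single pass over the cells that updates the set of tree roots. This is a sketch under assumptions: trees are represented only by their set of root indices plus a `parent` list, the function name is hypothetical, and a root with no higher-density neighbour (parent None) is never merged.

```python
def update_trees(delta, parent, roots, delta_min):
    """Return the new set of cluster-tree roots after one split/merge pass."""
    new_roots = set(roots)
    for v, d in enumerate(delta):
        if v not in new_roots and d > delta_min:
            # Split: the subtree rooted at v becomes its own cluster tree.
            new_roots.add(v)
        elif v in new_roots and d < delta_min and parent[v] is not None:
            # Merge: v stops being a root and hangs under parent[v].
            new_roots.discard(v)
    return new_roots
```

Comparing the returned set with the previous one shows directly which clusters were split and which were merged, without an offline pass.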
Step 3.11: Find the outlier o' nearest to stream data point p' and determine their distance d_{p'o'}.
Step 3.12: Compare d_{p'o'} with the CluCell radius r and the minimum repulsion value δ_min: if d_{p'o'} ≤ r, go to step 3.13; if r < d_{p'o'} ≤ δ_min, go to step 3.14; if d_{p'o'} > δ_min, return to step 3.2.
Step 3.13: Create a CluCell c_{p'} centered on stream data point p', add the freshness of outlier o' to the density value of c_{p'}, delete o' and p', and return to step 3.9.
Step 3.14: Create a CluCell c_{p'} centered on stream data point p' and a CluCell c_{o'} centered on outlier o', delete o' and p', and return to step 3.9.

Claims (5)

1. A stream data clustering method based on dynamic changes in density values, characterized by comprising the following steps:
Step 1: Use the distances between all pairs of points in the historical stream data set D to determine the radius r of the CluCell data structure;
Step 1.1: Form the historical stream data set D from K stream data points held in a history buffer;
Step 1.2: Compute the distance between every pair of points in D;
Step 1.3: Sort the pairwise distances in D in ascending order and take the value at the first A% position as the CluCell radius r, where 1 < A < 2;
Step 2: For newly arriving stream data, use a density-center clustering algorithm to build a stream data clustering model based on dynamic changes in density values;
Step 2.1: Set the CluCell count threshold M;
Step 2.2: Receive the currently arriving stream data point p and check whether any CluCell currently exists; if so, go to step 2.3; otherwise, go to step 2.5;
Step 2.3: Apply density decay to all current CluCells according to the current time, find the CluCell c_k nearest to stream data point p, and determine their distance d_{pk};
Step 2.4: Compare d_{pk} with the CluCell radius r: if d_{pk} ≤ r, go to step 2.6; if d_{pk} > r, go to step 2.5;
Step 2.5: Create a CluCell c_p centered on stream data point p, delete p, and go to step 2.7;
Step 2.6: Add 1 to the density value of c_k, the CluCell nearest to p, delete p, and return to step 2.2;
Step 2.7: Count the current number N of CluCells and check whether it has reached the threshold M; if so, go to step 2.8; otherwise, return to step 2.2;
Step 2.8: Compute the repulsion value of every current CluCell, and draw the decision graph of the data structures from the density values and repulsion values of all current CluCells;
Step 2.9: Determine the density center points from the decision graph, and take the smallest repulsion value among all density center points as the minimum repulsion value δ_min;
Step 2.10: From the repulsion-value dependency relations among all current CluCells, obtain trees rooted at the density center points, i.e. cluster trees; all cluster trees together form a stream data clustering model;
Step 3: Update the stream data clustering model according to the distance between the newly arriving stream data and the CluCells and the distance between the new data and the outliers, thereby processing the newly arriving stream data;
Step 3.1: Set the time threshold Δt;
Step 3.2: Receive the currently arriving stream data point p';
Step 3.3: Apply density decay to all current CluCells according to the current time, and delete any CluCell into which no stream data has been inserted within the time threshold Δt;
Step 3.4: Find the CluCell c_{k'} nearest to stream data point p' and determine their distance d_{p'k'};
Step 3.5: Compare d_{p'k'} with the CluCell radius r and the minimum repulsion value δ_min: if d_{p'k'} ≤ r, go to step 3.6; if r < d_{p'k'} ≤ δ_min, go to step 3.7; if d_{p'k'} > δ_min, go to step 3.8;
Step 3.6: Add 1 to the density value of c_{k'}, the CluCell nearest to p', delete p', and go to step 3.11;
Step 3.7: Create a CluCell c_{p'} centered on stream data point p', delete p', and go to step 3.9;
Step 3.8: Insert stream data point p' into the outlier pool, which temporarily holds outliers, and go to step 3.12;
Step 3.9: Compute the distance d_{p'o} from each outlier in the outlier pool to CluCell c_{p'}. If there are Z outliers with d_{p'o} ≤ r, add the freshness of these Z outliers to the density value of c_{p'} and delete these Z outliers; if there are Z' outliers with r < d_{p'o} ≤ δ_min, create a CluCell centered on each of these Z' outliers and delete these Z' outliers;
Step 3.10: Update the repulsion value of each CluCell, update the cluster trees according to the updated repulsion values, and return to step 3.2;
Step 3.11: Find the outlier o' nearest to stream data point p' and determine their distance d_{p'o'};
Step 3.12: Compare d_{p'o'} with the CluCell radius r and the minimum repulsion value δ_min: if d_{p'o'} ≤ r, go to step 3.13; if r < d_{p'o'} ≤ δ_min, go to step 3.14; if d_{p'o'} > δ_min, return to step 3.2;
Step 3.13: Create a CluCell c_{p'} centered on stream data point p', add the freshness of outlier o' to the density value of c_{p'}, delete o' and p', and return to step 3.9;
Step 3.14: Create a CluCell c_{p'} centered on stream data point p' and a CluCell c_{o'} centered on outlier o', delete o' and p', and return to step 3.9.
2. The stream data clustering method based on dynamic changes in density values according to claim 1, characterized in that the density value decay formula of the CluCell is as follows:
$$\rho^{t} = \sum_{i=0}^{n} f_i^{t} = \sum_{i=0}^{n} 2^{-\lambda (t - t_i)} = 2^{-\lambda (t - t_l)} \rho^{t_l};$$
where ρ^t is the density value of the CluCell at time t; f_i^t is the freshness of the i-th stream data point p_i at time t, 0 < i < n; n is the number of stream data points in the CluCell; t_l is the time of the CluCell's most recent density decay; t_i is the generation time of stream data point p_i; and λ is the freshness decay coefficient.
3. The stream data clustering method based on dynamic changes in density values according to claim 1, characterized in that the formula for the repulsion value of the CluCell is as follows:
$$\delta_v^{t} = \min_{l:\, \rho_l^{t} > \rho_v^{t}} \left( |c_l, c_v| \right);$$
where δ_v^t is the repulsion value of CluCell c_v at time t, |c_l, c_v| is the distance between CluCell c_v and CluCell c_l, ρ_l^t is the density value of CluCell c_l at time t, and ρ_v^t is the density value of CluCell c_v at time t.
4. The stream data clustering method based on dynamic changes in density values according to claim 1, characterized in that the specific method of updating the cluster trees according to the repulsion value of each CluCell is:
If the repulsion value of the current CluCell c_μ satisfies δ_μ^t > δ_min, the subtree rooted at c_μ is split off from its original cluster tree to form a new cluster tree. If the repulsion value of the root CluCell c_m of a current cluster tree satisfies δ_m^t < δ_min, the cluster tree rooted at c_m is merged into the cluster tree containing the CluCell on which the repulsion value of c_m depends, and that CluCell becomes the parent node of c_m.
5. The stream data clustering method based on dynamic changes in density values according to claim 1, characterized in that drawing the decision graph of the data structures from the density values and repulsion values of all current CluCells specifically means: take the density values of all current CluCells as the abscissa and their repulsion values as the ordinate, and draw the decision graph of the data structures.
CN201610486506.7A 2016-06-27 2016-06-27 A stream data clustering method based on dynamic changes in density values Pending CN106203474A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610486506.7A CN106203474A (en) 2016-06-27 2016-06-27 A kind of flow data clustering method dynamically changed based on density value

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610486506.7A CN106203474A (en) 2016-06-27 2016-06-27 A kind of flow data clustering method dynamically changed based on density value

Publications (1)

Publication Number Publication Date
CN106203474A true CN106203474A (en) 2016-12-07

Family

ID=57460984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610486506.7A Pending CN106203474A (en) 2016-06-27 2016-06-27 A kind of flow data clustering method dynamically changed based on density value

Country Status (1)

Country Link
CN (1) CN106203474A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038593A (en) * 2017-04-06 2017-08-11 广东顺德中山大学卡内基梅隆大学国际联合研究院 A kind of method for processing abnormal data and system based on anti-fake traceability system
CN108449306A (en) * 2017-02-16 2018-08-24 上海行邑信息科技有限公司 One kind degree of peeling off detection method
CN109933040A (en) * 2017-12-18 2019-06-25 中国科学院沈阳自动化研究所 Fault monitoring method based on level density peaks cluster and most like mode

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1855097A (en) * 2005-04-20 2006-11-01 国际商业机器公司 Method and device for processing data flow
CN101394345A (en) * 2008-10-22 2009-03-25 南京邮电大学 Co-evolutionary clustering method oriented to data stream sensing by general computation
CN103353883A (en) * 2013-06-19 2013-10-16 华南师范大学 Big data stream type cluster processing system and method for on-demand clustering
CN104244035A (en) * 2014-08-27 2014-12-24 南京邮电大学 Network video flow classification method based on multilayer clustering
US20150154457A1 (en) * 2012-06-28 2015-06-04 International Business Machines Corporation Object retrieval in video data using complementary detectors

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1855097A (en) * 2005-04-20 2006-11-01 国际商业机器公司 Method and device for processing data flow
CN101394345A (en) * 2008-10-22 2009-03-25 南京邮电大学 Co-evolutionary clustering method oriented to data stream sensing by general computation
US20150154457A1 (en) * 2012-06-28 2015-06-04 International Business Machines Corporation Object retrieval in video data using complementary detectors
CN103353883A (en) * 2013-06-19 2013-10-16 华南师范大学 Big data stream type cluster processing system and method for on-demand clustering
CN104244035A (en) * 2014-08-27 2014-12-24 南京邮电大学 Network video flow classification method based on multilayer clustering

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108449306A (en) * 2017-02-16 2018-08-24 上海行邑信息科技有限公司 One kind degree of peeling off detection method
CN107038593A (en) * 2017-04-06 2017-08-11 广东顺德中山大学卡内基梅隆大学国际联合研究院 A kind of method for processing abnormal data and system based on anti-fake traceability system
CN107038593B (en) * 2017-04-06 2020-07-21 广东顺德中山大学卡内基梅隆大学国际联合研究院 Abnormal data processing method and system based on anti-counterfeiting traceability system
CN109933040A (en) * 2017-12-18 2019-06-25 中国科学院沈阳自动化研究所 Fault monitoring method based on level density peaks cluster and most like mode
CN109933040B (en) * 2017-12-18 2020-08-07 中国科学院沈阳自动化研究所 Fault monitoring method based on hierarchical density peak clustering and most similar mode

Similar Documents

Publication Publication Date Title
CN106250905B (en) Real-time energy consumption abnormity detection method combined with building structure characteristics of colleges and universities
CN105550426B (en) A kind of multiple dimensioned binary tree blast furnace method for diagnosing faults based on sample decomposition
CN106203474A (en) A kind of flow data clustering method dynamically changed based on density value
CN109543765A (en) A kind of industrial data denoising method based on improvement IForest
CN107169557A (en) A kind of method being improved to cuckoo optimized algorithm
CN107579846B (en) Cloud computing fault data detection method and system
CN103886396A (en) Method for determining mixing optimizing of artificial fish stock and particle swarm
CN107831024A (en) Fan vibration malfunction diagnostic method based on multiple spot vibration signal characteristics value
CN108897936A (en) A kind of sewage source heat pump unit method for diagnosing faults based on PSO-BP model
CN108388113B (en) Least square method supporting vector machine soft-measuring modeling method based on distribution estimation local optimum
CN108388745A (en) Least square method supporting vector machine flexible measurement method based on distributed parallel local optimum parameter
CN106127595A (en) A kind of community structure detection method based on positive and negative side information
CN106952267A (en) Threedimensional model collection is divided into segmentation method and device
CN105049286A (en) Cloud platform speed measurement data judging method based on hierarchical clustering
CN109669030A (en) A kind of industrial injecting products defect diagnostic method based on decision tree
CN110399917A (en) A kind of image classification method based on hyperparameter optimization CNN
CN112231775A (en) Hardware Trojan horse detection method based on Adaboost algorithm
CN105590167A (en) Method and device for analyzing electric field multivariate operating data
CN107609982B (en) Method for carrying out community discovery by considering community structure stability and increment related nodes
CN105787296B (en) A kind of comparative approach of macro genome and macro transcript profile sample distinctiveness ratio
CN108665002A (en) A kind of two classification task label noises tolerance grader learning method
CN109299402A (en) Based on the pre-staged address matching method of element
CN105183612B (en) The appraisal procedure of server free memory abnormal growth and operation conditions
CN107562778A (en) A kind of outlier excavation method based on deviation feature
CN116739147A (en) BIM-based intelligent energy consumption management and dynamic carbon emission calculation combined method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161207