CN105930457A - Distributed architecture-based data flow frequent item mining method - Google Patents
- Publication number
- CN105930457A (application CN201610254621.1A)
- Authority
- CN
- China
- Prior art keywords
- data item
- entry
- frequency
- leaf node
- min-heap
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a method for mining frequent items in data streams on a distributed architecture. The method uses a two-layer tree-shaped communication structure comprising m leaf nodes and 1 root node. Each leaf node processes the data items in its data stream and sends a frequency increment to the root node when the increment of a data item's frequency exceeds a threshold; the root node collects the updates sent by the leaf nodes. The method has low communication overhead and can respond in real time to frequent-item queries initiated by users.
Description
Technical field
The invention belongs to the field of data mining and relates to a frequent-item mining method, in particular to a frequent-item mining method suited to distributed architectures.
Background technology
Frequent-item mining is a basic and critical step in data mining problems such as association rule mining, sequential pattern mining, correlation mining, and multi-level pattern mining. Much research already addresses frequent-item mining over a data stream on a single node. This work falls roughly into two classes: counter-based methods and sketch-based methods.
Counter-based frequent-item mining methods maintain a set of counters that record the frequencies of data items. Each counter holds two fields: the data item name and the data item frequency. When a data item arrives in the stream, if one of the maintained counters already tracks it, the frequency recorded by that counter is increased accordingly; otherwise a new counter is added to store the new data item, or an existing counter is replaced. Counter-based methods usually process a single data item at very low cost, but they need to keep all counters sorted. Counter-based methods include Frequent, Space Saving, and Lossy Counting.
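As an illustration of the counter-based approach, the following Python sketch follows the general Space Saving idea with unit weights: when no free counter is left, the counter with the smallest count is reassigned to the new item and inherits that count. The function name `space_saving` and the parameter `k` are assumptions chosen for this example, and the linear scan for the minimum stands in for the sorted structure a real implementation would keep.

```python
def space_saving(stream, k):
    """Track at most k counters and return {item: estimated count}."""
    counters = {}
    for item in stream:
        if item in counters:
            # The item is already tracked: increase its counter.
            counters[item] += 1
        elif len(counters) < k:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: replace the smallest one and inherit its count,
            # so estimates can overestimate but never underestimate.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters


estimates = space_saving(["a", "b", "a", "c", "a", "a"], k=2)
```

In this run the item "c" evicts "b" and inherits its count, so its estimate is one higher than its true count, which is the kind of bounded overestimate such methods accept in exchange for fixed memory.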
Sketch-based methods use a hash table made up of a one- or two-dimensional array of counters to estimate the frequency of each data item in the data stream. These methods typically hash each data item to several counters, and a single counter may be shared by many data items: data items with the same hash value share the same counter. When a data item arrives, only the values of the corresponding counters need to be modified. When a user submits a query, the values of the corresponding counters are used to estimate the frequency, and the error can be kept within a certain range with high confidence. In general, hash-based methods need an additional data structure, such as a heap that tracks the candidate frequent items. Sketch-based frequent-item mining methods include CountSketch, CountMin Sketch, and hCount.
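The sketch-based idea can be illustrated with a small Count-Min style structure in Python. This is a minimal sketch under stated assumptions: the class name `CountMinSketch`, the default `depth` and `width`, and the use of SHA-256 to derive one hash function per row are choices made for this example only.

```python
import hashlib


class CountMinSketch:
    """Two-dimensional counter array; each row uses a different hash function."""

    def __init__(self, depth=4, width=64):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Derive a per-row hash by prefixing the row number (an assumption).
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, weight=1):
        # Only the counters the item hashes to are modified.
        for row in range(self.depth):
            self.rows[row][self._index(row, item)] += weight

    def estimate(self, item):
        # Shared counters can only inflate a value, so the minimum is the best guess.
        return min(self.rows[row][self._index(row, item)]
                   for row in range(self.depth))


sketch = CountMinSketch()
for name, weight in [("x", 3), ("y", 1), ("x", 2)]:
    sketch.add(name, weight)
x_estimate = sketch.estimate("x")
```

Because counters are shared, `x_estimate` is at least the true total weight of "x" and at most the total weight of the whole stream.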
Traditional frequent-item mining methods generally consider only a data stream processed on a single node. In many practical applications, however, the data to be mined is large-scale and distributed, for example in flow detection for network monitoring and in DDoS attack detection. Current research on frequent-item mining in distributed stream environments mainly includes the Tributary-Delta method proposed by A. Manjhi, which applies to tree or multi-path graph topologies; an approximation method proposed by G. Cormode that continuously tracks frequent items in high-speed distributed streams; and the distributed top-k monitoring method proposed by B. Babcock and C. Olston. The main shortcomings of these methods are excessive communication overhead and no support for real-time queries.
Summary of the invention
The purpose of the invention is to solve the problems of excessive communication overhead and missing real-time query support in existing methods. It provides a data stream frequent-item mining method based on a distributed architecture, which can improve the processing capability of traditional frequent-item mining methods on a distributed architecture.
The invention provides a data stream frequent-item mining method based on a distributed architecture. The method is an ε-approximate method for mining weighted frequent items in distributed data streams. It uses a 2-layer tree-shaped communication structure comprising m leaf nodes and 1 root node. Each leaf node processes the data items in its data stream, stores the data item frequencies in the leaf node's min-heap, and sends a data item's frequency increment to the root node when that increment exceeds a threshold. The root node computes the estimated frequency of each data item across the whole architecture and stores these estimates in the root node's min-heap. Each entry stored in a leaf node's min-heap comprises a data item name, a data item frequency, and a data item frequency increment; each entry stored in the root node's min-heap comprises a data item name and a data item frequency estimate.
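The two entry layouts described above can be written down as simple Python records. This is only an illustrative sketch: the class names `LeafEntry` and `RootEntry`, the field names, and the sample values are assumptions, not identifiers from the invention.

```python
from dataclasses import dataclass


@dataclass
class LeafEntry:
    """Entry kept in a leaf node's min-heap."""
    name: str          # data item name
    frequency: float   # data item frequency seen at this leaf
    increment: float   # frequency increment not yet sent to the root


@dataclass
class RootEntry:
    """Entry kept in the root node's min-heap."""
    name: str          # data item name
    estimate: float    # estimated frequency across all leaf nodes


# Hypothetical values: the leaf has seen 1500 units, 300 of them still unreported.
leaf_entry = LeafEntry(name="flow-1", frequency=1500.0, increment=300.0)
root_entry = RootEntry(name="flow-1", estimate=1200.0)
```

The leaf entry carries one field more than the root entry because only the leaf needs to remember how much of an item's frequency it has not yet reported.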
Technical solution of the invention:
Step 1) Each leaf node i takes data items one at a time from the data stream it receives. Each data item comprises a data item name $v_t$ and a data item frequency $c_{v,t}$.
Step 2) Update the data item frequency sum of the leaf node, $N_i = N_i + c_{v,t}$, and the increment of the data item frequency sum, $\Delta_i = \Delta_i + c_{v,t}$. Here and below, the equals sign denotes assignment.
Step 3) Using the data item name $v_t$ and the data item frequency $c_{v,t}$ of the data item taken in step 1), find the appropriate entry in the leaf node's min-heap $H_i$ and assign the data item name, data item frequency, and data item frequency increment of that entry. This step comprises:
Step 3-1) Determine whether the leaf node's min-heap contains an entry whose data item name is $v_t$. If no such entry exists, perform the next step; otherwise, perform step 3-5).
Step 3-2) Determine whether the leaf node's min-heap $H_i$ is full. If it is full, perform the next step; otherwise, perform step 3-4).
Step 3-3) Take the entry $item_{min}$ with the smallest data item frequency from the leaf node's min-heap $H_i$, reassign it, and then perform step 4). The reassignment is:
$v = v_t$, $c_v = c_v + c_{v,t}$, $\Delta_v = c_{v,t}$;
where $v$ is the data item name of the entry taken, $v_t$ is the data item name of the data item taken, $c_v$ is the data item frequency of the entry taken, $c_{v,t}$ is the data item frequency of the data item taken, and $\Delta_v$ is the data item frequency increment of the entry taken.
Step 3-4) Create a new entry, assign its fields, insert it into the leaf node's min-heap $H_i$, and then perform step 4). The assignment is:
$v = v_t$, $c_v = c_{v,t}$, $\Delta_v = c_{v,t}$;
Step 3-5) Take the existing entry whose data item name is $v_t$ from the leaf node's min-heap $H_i$, update it, and then perform step 4). The update is:
$c_v = c_v + c_{v,t}$, $\Delta_v = \Delta_v + c_{v,t}$;
Step 4) Determine whether the increment of the data item frequency sum and the data item frequency increment of the entry exceed the threshold; if so, send an update to the root node. This step comprises:
Step 4-1) Determine whether the increment $\Delta_i$ of the data item frequency sum satisfies $\Delta_i > \beta_i N_i$. If so, perform the next step; otherwise, perform step 4-3). Here $\beta_i$ is the user-defined update delay coefficient of the leaf node and $N_i$ is the data item frequency sum of the leaf node.
Step 4-2) The leaf node sends a 0-msg update to the root node and then sets $\Delta_i$ to 0. The 0-msg update carries the increment $\Delta_i$ of the data item frequency sum.
Step 4-3) Determine whether the data item frequency increment $\Delta_v$ of the entry satisfies $\Delta_v > \beta_i N_i$. If so, perform the next step; otherwise, perform step 5). $\beta_i$ and $N_i$ are as defined in step 4-1).
Step 4-4) The leaf node sends a data item update to the root node and then sets $\Delta_v$ to 0. The data item update carries the data item name of the entry and its data item frequency increment $\Delta_v$.
Step 5) The root node takes updates one at a time from those sent by the leaf nodes and maintains the corresponding data according to each update. This step comprises:
Step 5-1) Determine the type of the update taken by the root node. If it is a 0-msg update, perform the next step; if it is a data item update, perform step 5-3).
Step 5-2) Update the root node's estimate of the data item frequency sum, $N = N + \Delta_i$, and then perform step 6). Here $N$ is the root node's estimate of the data item frequency sum and $\Delta_i$ is the frequency carried by the 0-msg update sent by the leaf node.
Step 5-3) Update the root node's min-heap $H_0$. This step comprises:
Step 5-3-1) Take the data item name $v_t$ and the data item frequency increment $\Delta_{v,t}$ from the update sent by the leaf node.
Step 5-3-2) Determine whether the root node's min-heap contains an entry $item_v$ whose data item name is $v_t$. If it does, perform the next step; otherwise, perform step 5-3-4).
Step 5-3-3) Take the entry $item_v$, update it, and then perform step 6). The update is:
$c_v = c_v + \Delta_{v,t}$;
where $v$ is the data item name of the entry taken, $c_v$ is the data item frequency of the entry taken, and $\Delta_{v,t}$ is the data item frequency increment carried by the data item update.
Step 5-3-4) Determine whether the root node's min-heap $H_0$ is full. If it is full, perform the next step; otherwise, perform step 5-3-6).
Step 5-3-5) Take the entry $item_{min}$ with the smallest data item frequency from the root node's min-heap $H_0$, reassign it, and then perform step 6). The reassignment is:
$v = v_t$, $c_v = c_v + \Delta_{v,t}$;
where $v_t$ is the data item name carried by the data item update.
Step 5-3-6) Create a new entry, assign its fields, insert it into the min-heap maintained by the root node, and then perform step 6). The assignment is:
$v = v_t$, $c_v = \Delta_{v,t}$;
Step 6) At the user's request, the root node traverses the min-heap $H_0$ and returns all entries whose data item frequency satisfies the output condition [formula not reproduced]; these entries are the frequent items to be mined.
Between step 4) and step 5), the invention further includes an operation that sorts the leaf node's min-heap $H_i$ by data item frequency.
Likewise, between step 5) and step 6), it further includes an operation that sorts the root node's min-heap $H_0$ by data item frequency.
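The three cases that step 3 distinguishes at a leaf node (the item is already tracked, the heap is full, or the heap still has room) can be sketched in Python as follows. This is an illustrative sketch rather than the patented implementation: the function name `update_leaf_heap`, the `capacity` parameter, and the plain dictionary used in place of a true min-heap are assumptions made for brevity.

```python
def update_leaf_heap(heap, capacity, name, freq):
    """Apply one data item (name, freq) to a leaf heap.

    heap maps a data item name to [frequency, increment].
    A dict stands in for the min-heap, so finding the minimum is a linear scan.
    """
    if name in heap:
        # Existing entry: add to both the frequency and the pending increment.
        heap[name][0] += freq
        heap[name][1] += freq
    elif len(heap) >= capacity:
        # Heap full: reuse the entry with the smallest frequency; the new item
        # inherits that frequency and starts a fresh increment.
        victim = min(heap, key=lambda key: heap[key][0])
        inherited = heap.pop(victim)[0]
        heap[name] = [inherited + freq, freq]
    else:
        # Room left: create a new entry.
        heap[name] = [freq, freq]
    return heap[name]


leaf_heap = {}
for item_name, item_freq in [("a", 5), ("b", 2), ("a", 1), ("c", 4)]:
    update_leaf_heap(leaf_heap, 2, item_name, item_freq)
```

With a capacity of 2, the arrival of "c" replaces "b", the entry with the smallest frequency, and "c" inherits its frequency of 2 on top of its own 4.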
Advantages and beneficial effects of the invention:
The error between the estimated and the true data item frequency of each frequent item output by the method is at most εN, the maximum communication overhead on a single link is bounded by [formula not reproduced], and real-time frequent-item queries from users are supported.
Brief description of the drawings
Fig. 1 shows the communication structure of the data stream frequent-item mining method based on a distributed architecture.
Fig. 2 shows the average relative error of the data stream frequent-item mining method based on a distributed architecture.
Fig. 3 shows the single-link communication overhead of the data stream frequent-item mining method based on a distributed architecture.
Fig. 4 shows the initialization time of the data stream frequent-item mining method based on a distributed architecture.
Detailed description of the invention
To express the method of the invention more clearly and intuitively, the details of the data stream frequent-item mining method based on a distributed architecture are described below:
1. Determine the parameters
The parameters that the distributed frequent-item mining method needs are:
(1) the item support [symbol not reproduced] and the error bound ε;
(2) the number of leaf nodes m;
(3) the sizes of the min-heaps $H_i$ and $H_0$ of each leaf node i and of the root node, which are $1/\alpha_i$ and $1/\alpha_0$ respectively;
(4) the update delay coefficient $\beta_i$ of each leaf node i ($0 < \beta_i < \varepsilon$).
In this embodiment, the number of leaf nodes is m = 8, and the min-heaps of the leaf nodes and of the root node each hold 10000 entries, i.e. $\alpha_0 = \alpha_i = 0.0001$, $\beta_i \in [0.001, 0.005]$, $\varepsilon \in [0.0001, 0.0005]$.
2. Initialization
Determine the initialization time of each leaf node i, [formula not reproduced], where $B_i > 0$ is the bandwidth of the link between leaf node i and the root node, in packets/second. From the moment the method starts running, each leaf node i can process the data item updates it receives from data stream $S_i$, but it passes no message to the root node until the initialization state ends.
3. Leaf nodes process the data items in the data stream
When a new data item $(v, c_{v,t})$ from data stream $S_i$ arrives at leaf node i, the node first updates the data item frequency sum of $S_i$, $N_i = N_i + c_{v,t}$ (here and below, the equals sign denotes assignment), and the increment of the data item frequency sum, $\Delta_i = \Delta_i + c_{v,t}$. Next, it updates the corresponding data item frequency: if $v \in H_i$, it increases the frequency estimate of that entry, $c_v = c_v + c_{v,t}$, and the frequency estimate increment of data item v, $\Delta_v = \Delta_v + c_{v,t}$; otherwise it finds the entry $item_{min}$ with the smallest frequency estimate in $H_i$, replaces its data item name with v, and updates its data item frequency, $c_v = c_v + c_{v,t}$, and its frequency estimate increment, $\Delta_v = c_{v,t}$. Finally, it checks whether the conditions for sending updates to the root node hold: if $\Delta_i > \beta_i N_i$, it sends the update $(0, \Delta_i)$ to the root node and resets $\Delta_i = 0$; if $\Delta_v > \beta_i N_i$, it sends the update $(v, \Delta_v)$ to the root node and resets $\Delta_v = 0$.
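The final check in this procedure, which decides what a leaf node reports to the root node, might look like the following Python sketch. The function name `pending_updates` and its argument names are assumptions; messages are modelled as plain tuples, with the tag 0 marking the update of the frequency sum as in the text, which presumes that no data item is itself named 0.

```python
def pending_updates(total, total_inc, name, item_inc, beta):
    """Decide which updates a leaf sends after processing one data item.

    total     -- data item frequency sum N_i at this leaf
    total_inc -- unreported increment of that sum
    name      -- name of the entry that was just updated
    item_inc  -- unreported increment of that entry
    beta      -- update delay coefficient of this leaf
    Returns (messages, new_total_inc, new_item_inc).
    """
    threshold = beta * total
    messages = []
    if total_inc > threshold:
        # Report the growth of the frequency sum, then reset its increment.
        messages.append((0, total_inc))
        total_inc = 0
    if item_inc > threshold:
        # Report the growth of this data item, then reset its increment.
        messages.append((name, item_inc))
        item_inc = 0
    return messages, total_inc, item_inc


msgs, d_total, d_item = pending_updates(
    total=1000, total_inc=60, name="v1", item_inc=30, beta=0.05)
```

In this example the threshold is 50, so only the frequency sum is reported; the increment of "v1" stays at 30 and keeps accumulating until it crosses the threshold. A larger `beta` therefore means fewer messages but staler estimates at the root.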
4. The root node processes the updates sent by leaf nodes
When the root node receives a data item update $(v, \Delta_v)$ from a leaf node: if $v \in H_0$, it increases the frequency estimate of that entry, $c_v = c_v + \Delta_v$; otherwise it finds the entry $item_{min}$ with the smallest frequency estimate in $H_0$, replaces its data item name with v, and updates its data item frequency, $c_v = c_v + \Delta_v$. When the root node receives a 0-msg update $(0, \Delta_i)$ from a leaf node, it updates its estimate of the data item frequency sum, $N_0 = N_0 + \Delta_i$.
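A possible shape for the root node's update handling is sketched below in Python. The class name `RootNode`, the method name `receive`, and the dictionary that stands in for the min-heap are assumptions for illustration; the branch that inserts a new entry while the heap still has room follows the step-by-step description of the technical solution.

```python
class RootNode:
    """Collects leaf updates; a dict stands in for the root min-heap."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.total = 0    # estimate of the global data item frequency sum
        self.heap = {}    # data item name -> frequency estimate

    def receive(self, key, delta):
        if key == 0:
            # 0-msg update: only the global frequency sum grows.
            self.total += delta
        elif key in self.heap:
            # Known data item: add the reported increment.
            self.heap[key] += delta
        elif len(self.heap) >= self.capacity:
            # Heap full: reuse the entry with the smallest estimate.
            victim = min(self.heap, key=self.heap.get)
            self.heap[key] = self.heap.pop(victim) + delta
        else:
            # Room left: start a new entry.
            self.heap[key] = delta


root = RootNode(capacity=2)
for update in [(0, 10), ("a", 6), ("b", 3), ("a", 1), ("c", 2)]:
    root.receive(*update)
```

After these five updates the root tracks "a" and "c"; "c" replaced "b" and inherited its estimate of 3 before adding its own increment of 2.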
5. The root node processes user-initiated frequent-item queries
When a user submits a frequent-item query to the root node, the root node scans every data item entry $(v, c_v)$ maintained in the min-heap $H_0$. If $c_v \ge \omega N_0$, v is considered a frequent item and is output, where ω is the output threshold, given by [formula not reproduced] in terms of the user-defined item support, $\alpha_{max} = \max(\alpha_i)$, and $\beta_{max} = \max(\beta_i)$.
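The query itself reduces to a single scan of the root heap. In the Python sketch below the output threshold `omega` is passed in directly as a parameter, because its defining formula is not reproduced in this text; the function name `frequent_items` and the sample numbers are likewise assumptions.

```python
def frequent_items(heap, total, omega):
    """Return the names whose estimated frequency reaches omega * total.

    heap  -- data item name -> frequency estimate kept at the root
    total -- root estimate of the data item frequency sum
    omega -- output threshold, supplied by the caller in this sketch
    """
    return sorted(name for name, estimate in heap.items()
                  if estimate >= omega * total)


result = frequent_items({"a": 70, "b": 20, "c": 10}, total=100, omega=0.5)
```

Because the scan only reads state that the root already maintains, a query can be answered at any time without contacting the leaf nodes.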
The invention is implemented using real data and computer simulation.
Three network traffic data sets collected in real network environments serve as the data sources in the embodiment: the CERNET data set, a bidirectional TCP data set collected on an OC-48 link of CERNET (China Education and Research Network); the CAIDA48 data set, an anonymized data set collected on an OC-48 west coast peering link; and the CAIDA192 data set, a unidirectional anonymized data set collected on an OC-192 link. The invention defines the five-tuple of each IP packet in the network traffic data sets (source IP address, destination IP address, source port, destination port, transport-layer protocol) as the data item name, and the length of the packet payload as the data item frequency.
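The mapping from packets to weighted data items can be illustrated with a short Python sketch. The packet dictionaries, their field names, the `FlowKey` tuple, and the addresses (taken from documentation address ranges) are all hypothetical and only show the shape of the data.

```python
from collections import Counter, namedtuple

# The five-tuple serves as the data item name.
FlowKey = namedtuple(
    "FlowKey", ["src_ip", "dst_ip", "src_port", "dst_port", "protocol"])


def to_data_item(packet):
    """Turn one packet record into (data item name, data item frequency)."""
    key = FlowKey(packet["src_ip"], packet["dst_ip"],
                  packet["src_port"], packet["dst_port"], packet["protocol"])
    return key, packet["payload_len"]


packets = [
    {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7", "src_port": 40000,
     "dst_port": 80, "protocol": "TCP", "payload_len": 1200},
    {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7", "src_port": 40000,
     "dst_port": 80, "protocol": "TCP", "payload_len": 300},
    {"src_ip": "192.0.2.9", "dst_ip": "198.51.100.7", "src_port": 5353,
     "dst_port": 53, "protocol": "UDP", "payload_len": 64},
]

# Exact per-flow totals, useful as ground truth when checking an estimate.
flow_totals = Counter()
for packet in packets:
    flow_name, weight = to_data_item(packet)
    flow_totals[flow_name] += weight
```

Using the payload length as the weight means that a frequent item here is a flow carrying a large share of the bytes, not simply a flow with many packets.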
Define the relative value of the update delay coefficient β as [formula not reproduced]. Fig. 2 shows the average relative error of the method on the 3 network data sets. When ε ∈ [0.0001, 0.0005], the average relative error of the method is always smaller than the product of the current ε and N.
Fig. 3 shows the single-link overhead of the method on the 3 network data sets. Each subplot of Fig. 3 contains 4 curves which, from top to bottom, represent the theoretical maximum, the actual maximum, the actual mean, and the actual minimum of the single-link communication overhead for the current network data set. The actual maximum single-link communication overhead does not exceed [formula not reproduced].
Fig. 4 shows the initialization time of the method. The larger the relative value of the update delay coefficient β, the less initialization time the method needs.
Claims (3)
1. A data stream frequent-item mining method based on a distributed architecture, wherein the method uses a 2-layer tree-shaped communication structure comprising m leaf nodes and 1 root node; each leaf node processes the data items in its data stream, stores the data item frequencies in the leaf node's min-heap, and sends a data item frequency increment to the root node when that increment exceeds a threshold; the root node computes the estimated frequency of each data item across the whole architecture and stores the estimates in the root node's min-heap; each entry stored in a leaf node's min-heap comprises a data item name, a data item frequency, and a data item frequency increment; each entry stored in the root node's min-heap comprises a data item name and a data item frequency estimate; the method comprises:
Step 1) Each leaf node i takes data items one at a time from the data stream it receives; each data item comprises a data item name $v_t$ and a data item frequency $c_{v,t}$;
Step 2) Update the data item frequency sum of the leaf node, $N_i = N_i + c_{v,t}$, and the increment of the data item frequency sum, $\Delta_i = \Delta_i + c_{v,t}$; here and below, the equals sign denotes assignment;
Step 3) Using the data item name $v_t$ and the data item frequency $c_{v,t}$ of the data item taken in step 1), find the appropriate entry in the leaf node's min-heap $H_i$ and assign the data item name, data item frequency, and data item frequency increment of that entry; this step comprises:
Step 3-1) Determine whether the leaf node's min-heap contains an entry whose data item name is $v_t$; if no such entry exists, perform the next step; otherwise, perform step 3-5);
Step 3-2) Determine whether the leaf node's min-heap $H_i$ is full; if it is full, perform the next step; otherwise, perform step 3-4);
Step 3-3) Take the entry $item_{min}$ with the smallest data item frequency from the leaf node's min-heap $H_i$, reassign it, and then perform step 4); the reassignment is:
$v = v_t$, $c_v = c_v + c_{v,t}$, $\Delta_v = c_{v,t}$;
where $v$ is the data item name of the entry taken, $v_t$ is the data item name of the data item taken, $c_v$ is the data item frequency of the entry taken, $c_{v,t}$ is the data item frequency of the data item taken, and $\Delta_v$ is the data item frequency increment of the entry taken;
Step 3-4) Create a new entry, assign its fields, insert it into the leaf node's min-heap $H_i$, and then perform step 4); the assignment is:
$v = v_t$, $c_v = c_{v,t}$, $\Delta_v = c_{v,t}$;
Step 3-5) Take the existing entry whose data item name is $v_t$ from the leaf node's min-heap $H_i$, update it, and then perform step 4); the update is:
$c_v = c_v + c_{v,t}$, $\Delta_v = \Delta_v + c_{v,t}$;
Step 4) Determine whether the increment of the data item frequency sum and the data item frequency increment of the entry exceed the threshold; if so, send an update to the root node; this step comprises:
Step 4-1) Determine whether the increment $\Delta_i$ of the data item frequency sum satisfies $\Delta_i > \beta_i N_i$; if so, perform the next step; otherwise, perform step 4-3); where $\beta_i$ is the user-defined update delay coefficient of the leaf node and $N_i$ is the data item frequency sum of the leaf node;
Step 4-2) The leaf node sends a 0-msg update to the root node and then sets $\Delta_i$ to 0; the 0-msg update carries the increment $\Delta_i$ of the data item frequency sum;
Step 4-3) Determine whether the data item frequency increment $\Delta_v$ of the entry satisfies $\Delta_v > \beta_i N_i$; if so, perform the next step; otherwise, perform step 5); where $\beta_i$ and $N_i$ are as defined in step 4-1);
Step 4-4) The leaf node sends a data item update to the root node and then sets $\Delta_v$ to 0; the data item update carries the data item name of the entry and its data item frequency increment $\Delta_v$;
Step 5) The root node takes updates one at a time from those sent by the leaf nodes and maintains the corresponding data according to each update; this step comprises:
Step 5-1) Determine the type of the update taken by the root node; if it is a 0-msg update, perform the next step; if it is a data item update, perform step 5-3);
Step 5-2) Update the root node's estimate of the data item frequency sum, $N = N + \Delta_i$, and then perform step 6); where $N$ is the root node's estimate of the data item frequency sum and $\Delta_i$ is the frequency carried by the 0-msg update sent by the leaf node;
Step 5-3) Update the root node's min-heap $H_0$; this step comprises:
Step 5-3-1) Take the data item name $v_t$ and the data item frequency increment $\Delta_{v,t}$ from the update sent by the leaf node;
Step 5-3-2) Determine whether the root node's min-heap contains an entry $item_v$ whose data item name is $v_t$; if it does, perform the next step; otherwise, perform step 5-3-4);
Step 5-3-3) Take the entry $item_v$, update it, and then perform step 6); the update is:
$c_v = c_v + \Delta_{v,t}$;
where $v$ is the data item name of the entry taken, $c_v$ is the data item frequency of the entry taken, and $\Delta_{v,t}$ is the data item frequency increment carried by the data item update;
Step 5-3-4) Determine whether the root node's min-heap $H_0$ is full; if it is full, perform the next step; otherwise, perform step 5-3-6);
Step 5-3-5) Take the entry $item_{min}$ with the smallest data item frequency from the root node's min-heap $H_0$, reassign it, and then perform step 6); the reassignment is:
$v = v_t$, $c_v = c_v + \Delta_{v,t}$;
where $v_t$ is the data item name carried by the data item update;
Step 5-3-6) Create a new entry, assign its fields, insert it into the min-heap maintained by the root node, and then perform step 6); the assignment is:
$v = v_t$, $c_v = \Delta_{v,t}$;
Step 6) At the user's request, the root node traverses the min-heap $H_0$ and returns all entries whose data item frequency satisfies the output condition [formula not reproduced]; these entries are the frequent items to be mined.
2. The data stream frequent-item mining method based on a distributed architecture according to claim 1, characterized in that, between step 4) and step 5), the method further comprises an operation that sorts the leaf node's min-heap $H_i$ by data item frequency.
3. The data stream frequent-item mining method based on a distributed architecture according to claim 1, characterized in that, between step 5) and step 6), the method further comprises an operation that sorts the root node's min-heap $H_0$ by data item frequency.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610254621.1A CN105930457A (en) | 2016-04-21 | 2016-04-21 | Distributed architecture-based data flow frequent item mining method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610254621.1A CN105930457A (en) | 2016-04-21 | 2016-04-21 | Distributed architecture-based data flow frequent item mining method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105930457A true CN105930457A (en) | 2016-09-07 |
Family
ID=56838834
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610254621.1A Pending CN105930457A (en) | 2016-04-21 | 2016-04-21 | Distributed architecture-based data flow frequent item mining method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105930457A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115473933A (en) * | 2022-10-10 | 2022-12-13 | 国网江苏省电力有限公司南通供电分公司 | Network system associated service discovery method based on frequent subgraph mining |
CN116881338A (en) * | 2023-09-07 | 2023-10-13 | 北京傲星科技有限公司 | Data mining method and related equipment for data stream based on large model |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030028531A1 (en) * | 2000-01-03 | 2003-02-06 | Jiawei Han | Methods and system for mining frequent patterns |
US20090112863A1 (en) * | 2007-10-26 | 2009-04-30 | Industry-Academic Cooperation Foundation, Yonsei University | Method and apparatus for finding maximal frequent itmesets over data streams |
CN101650730A (en) * | 2009-09-08 | 2010-02-17 | 中国科学院计算技术研究所 | Method and system for discovering weighted-value frequent-item in data flow |
CN103258049A (en) * | 2013-05-27 | 2013-08-21 | 重庆邮电大学 | Association rule mining method based on mass data |
CN103731738A (en) * | 2014-01-23 | 2014-04-16 | 哈尔滨理工大学 | Video recommendation method and device based on user group behavioral analysis |
CN104376365A (en) * | 2014-11-28 | 2015-02-25 | 国家电网公司 | Method for constructing information system running rule libraries on basis of association rule mining |
Non-Patent Citations (2)
Title |
---|
Y ZHANG 等: "Parallel Optimization of Frequent Algorithm on Multi-core Processors", 《 INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND COMMUNICATION TECHNOLOGY》 * |
YU ZHANG 等: "An efficient framework for parallel and continuous frequent item monitoring", 《 CONCURRENCY & COMPUTATION PRACTICE & EXPERIENCE》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115473933A (en) * | 2022-10-10 | 2022-12-13 | 国网江苏省电力有限公司南通供电分公司 | Network system associated service discovery method based on frequent subgraph mining |
CN116881338A (en) * | 2023-09-07 | 2023-10-13 | 北京傲星科技有限公司 | Data mining method and related equipment for data stream based on large model |
CN116881338B (en) * | 2023-09-07 | 2024-01-26 | 北京傲星科技有限公司 | Data mining method and related equipment for data stream based on large model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7710884B2 (en) | Methods and system for dynamic reallocation of data processing resources for efficient processing of sensor data in a distributed network | |
US20200167369A1 (en) | Scalable spine nodes with partial replication of routing information in a network environment | |
US10284383B2 (en) | Aggregation protocol | |
US10230639B1 (en) | Enhanced prefix matching | |
Van Renesse et al. | Willow: DHT, aggregation, and publish/subscribe in one protocol | |
JP2005353039A5 (en) | ||
Guo et al. | Exploiting efficient and scalable shuffle transfers in future data center networks | |
CN107579923A (en) | The balancing link load method and SDN controllers of a kind of SDN | |
CN102970242B (en) | Method for achieving load balancing | |
Xiao et al. | Using parallel bloom filters for multiattribute representation on network services | |
CN103281211B (en) | Large-scale network node system for managing in groups and management method | |
CN106850547A (en) | A kind of data restoration method and system based on http protocol | |
CN101650730A (en) | Method and system for discovering weighted-value frequent-item in data flow | |
CN103729427B (en) | A kind of flow table conversion method based on self-defined multilevel flow table incremental update | |
CN105681438A (en) | Centralized caching decision strategy in content-centric networking | |
Dolev et al. | Hypertree for self-stabilizing peer-to-peer systems | |
CN107346270A (en) | Method and system based on the sets cardinal calculated in real time | |
CN106685745A (en) | Network topology construction method and device | |
CN105471893B (en) | A kind of distributed equivalent data flow connection method | |
CN104780101A (en) | FIB (Forward Information Base) table structure in named data networking forwarding plane and retrieval method thereof | |
CN105930457A (en) | Distributed architecture-based data flow frequent item mining method | |
Yu et al. | Hardware accelerator to speed up packet processing in NDN router | |
CN114567634A (en) | Method, system, storage medium and electronic device for calculating E-level graph facing backward | |
Wu et al. | N-DISE: NDN-based data distribution for large-scale data-intensive science | |
Yang et al. | Approaching optimal compression with fast update for large scale routing tables |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20160907 |