CN102737126B - Classification rule mining method under cloud computing environment - Google Patents
- Publication number
- CN102737126B (application CN201210203816.5A)
- Authority
- CN
- China
- Prior art keywords
- key
- value
- control center
- cloud computing
- computing environment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a classification rule mining method under a cloud computing environment. The classification rule mining method is characterized in that a master-slave type organization structure consisting of a control center and a plurality of slave servers is adopted, and comprises the following steps of: dividing, by the control center, a data set to be classified into a training sample and a test sample, and distributing each of data blocks of the same size which are formed by uniformly dividing the training sample to one processing unit; training, by the processing units, the data blocks, by using a genetic algorithm to obtain an atomic rule for classification; and finally, reducing the atomic rule by a classifier, and selecting a reduction result which meets classification accuracy requirement as a final result of the classification rule mining. The classification rule mining method is suitable for data classification on distributed data storage under the cloud computing environment; distributed parallel processing of data classification tasks under the cloud computing environment can be performed; and the classification rule mining method has a positive effect on the classification processing problem of mass data under the cloud computing environment.
Description
Technical field
The invention belongs to the field of data analysis under cloud computing environments, and specifically relates to a classification rule mining method under a cloud computing environment.
Background technology
Classification is an important research area in data analysis and management under cloud computing environments. On the one hand, classification is a major task type in data mining. Data in a cloud computing environment is massive, distributed, and dynamic, and these characteristics make data management challenging; analyzing such data through classification rule mining helps improve the efficiency of mass data analysis and management. On the other hand, a cloud environment offers very large-scale storage and computing capacity, its resources and structure scale elastically, and it delivers services on demand through virtualization technology and a large resource pool, all of which make efficient data classification possible. Classification rule mining is a key task in data analysis and management: it helps in understanding the mass data held in a cloud computing environment and supports operational decisions there. At the same time, the high-performance computing and storage capacity of cloud computing underpins the efficient execution of classification rule mining. Classification rule mining is therefore an important technique for data analysis and processing under cloud computing environments, and research into its theory and application is significant.
Researchers in China and abroad have proposed many approaches to classification rule mining, including statistical methods typified by Bayesian methods, machine learning methods typified by decision trees and rule induction, and neural network methods. Applied to small, static data sets, these methods achieve high classification accuracy. However, they still have bottlenecks: they need to scan and sort the data set repeatedly, which makes the algorithms inefficient; they are sensitive to noise and to particular data, and are prone to overfitting; and they do not scale well to large training sets. In a cloud computing environment in particular, the scale and dynamic nature of distributed mass data sets sharply enlarge the search space and dimensionality of the classification process, which increases the computational complexity of classification and reduces the efficiency of traditional classification methods. Existing classification rule mining methods therefore cannot be applied directly in a cloud computing environment.
Summary of the invention
To address the problems above and overcome the limitations of the prior art, the invention provides a classification rule mining method under a cloud computing environment. The invention is suited to classifying data held in distributed storage under a cloud computing environment, supports distributed parallel processing of data classification tasks, and helps with the problem of classifying mass data in such an environment. It exploits the scale effect of the large number of computing nodes in a cloud computing environment to improve the efficiency of classification rule mining on mass data, and it uses a master-slave organizational structure together with a rule training process based on a genetic algorithm to give classification rule mining a distributed implementation under a cloud computing environment.
To solve the technical problem, the invention adopts the following technical solution:
The classification rule mining method under a cloud computing environment according to the invention is characterized as follows:
The cloud computing environment consists of a plurality of distributed servers. Classification rule mining in the cloud computing environment uses a master-slave organizational structure in which one server is set up as the control center and the other servers are slave servers. The control center arranges the execution of the whole mining task, handles scheduling and management, and coordinates the operation of the slave servers; each slave server is a concrete execution unit of the task. The classification rule mining method proceeds as follows:
a. The control center divides the data set to be classified into training samples and test samples, divides the training samples evenly into data blocks of the same size, designates for each data block one slave server that performs the classification mining task as a processing unit, and distributes each data block to its corresponding processing unit;
b. Each processing unit uses a genetic algorithm to train atomic rules for classification on the data block it has been assigned, and stores the atomic rules obtained by training in a buffer;
c. The control center partitions the atomic rules in the buffer, selects idle slave servers as classifiers, and schedules the classifiers to perform redundancy reduction on the atomic rules and to measure the classification accuracy of the reduction results; the reduction results that meet the classification accuracy requirement are selected as the final result of classification rule mining.
The classification rule mining method under a cloud computing environment according to the invention is further characterized as follows:
Step a is carried out as follows:
After receiving a classification mining task request, the control center divides the data set to be classified into training samples and test samples. Based on the user's requirements for the classification mining task and the characteristics of the training samples, it searches for suitable slave servers to act as processing units and divides the training samples evenly into data blocks of the same size. If the number of processing units that satisfy the conditions is N and the training sample size is M, each data block has size M/N;
A suitable slave server satisfies the following conditions: its storage space is not less than M/N, and its response time is not greater than the latest time required by the user;
The control center accesses the training samples, copies each data block obtained from the division to its corresponding processing unit, and sends the processing unit the operation instruction to train atomic rules using the genetic algorithm.
The training process for atomic rules in step b is:
The processing unit applies genetic coding to each data record in the data block, generates atomic rules through loop iteration of genetic operations, and stores the atomic rules in the buffer as <key, value> key-value pairs, where the key is a class label and the value is an atomic rule under that class label;
The control center periodically reads the <key, value> key-value pairs in the buffer, generates <key, value list> key-value pair lists, and stores them in the buffer, where the key is a class label and the value list is the list of atomic rules under that class label;
After the processing unit has finished operating on all data records in the data block, it sends the control center a message that the processing unit's operation is complete.
Step c is carried out as follows:
The control center searches for idle slave servers to act as classifiers. The number of classifiers equals the number of key values in the <key, value list> key-value pair list, and each classifier corresponds to one key value. The control center sends the atomic rule list in each <key, value list> key-value pair, together with the test sample records that have the same class label, to the corresponding classifier, and sends the classifier the operation instructions for redundancy reduction and classification accuracy measurement;
The classifier records identical atomic rules under the same class label only once, deleting the redundant atomic rules to obtain the reduced atomic rules;
The classifier uses the reduced atomic rules to classify the test samples and checks whether each classification result is consistent with the class label of the test sample. Suppose that, among the test samples classified by atomic rule a, Y records have a class label consistent with the classification result and N records have a class label inconsistent with it; the classification accuracy of atomic rule a is then Y/(Y+N). Suppose the classification accuracy required in the user's classification mining task is α; all atomic rules whose classification accuracy is not less than α are sent to the control center as the final result of classification rule mining;
The control center aggregates the final results generated by all classifiers and returns the aggregated result to the requester of the classification rule mining task.
Compared with existing data classification methods, the beneficial effects of the invention are:
1. The invention divides the classification rule mining task for mass data into multiple subtasks and assigns them to a large-scale server cluster in the cloud computing environment for processing. This reduces the computational complexity of each individual task and exploits the scale effect of the cloud computing server cluster, significantly improving the efficiency of the overall classification rule mining task;
2. The master-slave organizational structure of the invention handles the distribution, scheduling, and management of tasks under the cloud computing environment and provides a distributed implementation mechanism for classification rule mining. In addition, the genetic algorithm used for rule training is inherently well suited to parallel execution, which addresses the poor parallelism of conventional classification techniques in a distributed environment.
Brief description of the drawings
Fig. 1 is a schematic diagram of the classification rule mining method under a cloud computing environment according to the invention.
Fig. 2 is a flowchart of atomic rule generation by loop iteration of genetic operations in the invention.
Embodiment
In the classification rule mining method under a cloud computing environment of this embodiment:
The cloud computing environment consists of a plurality of distributed servers. Classification rule mining in the cloud computing environment uses a master-slave organizational structure in which one server is set up as the control center and the other servers are slave servers. The control center arranges the execution of the whole mining task, handles scheduling and management, and coordinates the operation of the slave servers; each slave server is a concrete execution unit of the task. The classification rule mining method, shown in Fig. 1, proceeds as follows:
1. The control center divides the data set to be classified into training samples and test samples, divides the training samples evenly into data blocks of the same size, designates for each data block one slave server that performs the classification mining task as a processing unit, and distributes each data block to its corresponding processing unit;
2. Each processing unit uses a genetic algorithm to train atomic rules for classification on the data block it has been assigned, and stores the atomic rules obtained by training in a buffer;
3. The control center partitions the atomic rules in the buffer, selects idle slave servers as classifiers, and schedules the classifiers to perform redundancy reduction on the atomic rules and to measure the classification accuracy of the reduction results; the reduction results that meet the classification accuracy requirement are selected as the final result of classification rule mining.
Suppose the data set is represented by the relation schema $R(a_1, a_2, \ldots, a_k)$, where $a_i$ ($i = 1, 2, \ldots, k$) are attributes. The attribute set $\langle a_1, a_2, \ldots, a_k \rangle$ is divided into two parts, comprising $k-1$ condition attributes and 1 class label: attributes $a_1, a_2, \ldots, a_{k-1}$ are the condition attributes and attribute $a_k$ is the class label. Each data record in the data set is a $k$-dimensional vector $[c_1, c_2, \ldots, c_k]$, where $c_i$ is the value of attribute $a_i$ in that record.
An atomic rule has the form: If $(a_1 = c_1) \wedge (a_2 = c_2) \wedge \ldots \wedge (a_{k-1} = c_{k-1})$, Then $a_k = c_k$.
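As an illustration of the rule form above, the following minimal sketch represents an atomic rule as a pair of condition values and a class label. The names `AtomicRule`, `rule_from_record`, and `matches` are hypothetical and do not come from the patent; records are assumed to be sequences whose last element is the class label.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AtomicRule:
    """If (a_1 = c_1) and ... and (a_{k-1} = c_{k-1}), Then a_k = c_k."""
    conditions: Tuple[str, ...]  # c_1 .. c_{k-1}
    label: str                   # c_k

    def matches(self, record: Sequence[str]) -> bool:
        # A record is covered when its condition attributes equal the rule's.
        return tuple(record[:-1]) == self.conditions


def rule_from_record(record: Sequence[str]) -> AtomicRule:
    # Assumption: the last field of a record is the class label a_k.
    return AtomicRule(tuple(record[:-1]), record[-1])


rule = rule_from_record(["sunny", "hot", "no"])
covered = rule.matches(["sunny", "hot", "yes"])  # same conditions, other label
```

Making the rule immutable and hashable is a design choice of this sketch; it lets identical rules be detected by simple equality later on.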
Step 1 is carried out as follows:
After receiving a classification mining task request, the control center divides the data set to be classified into training samples and test samples. Based on the user's requirements for the classification mining task and the characteristics of the training samples, it searches for suitable slave servers to act as processing units and divides the training samples evenly into data blocks of the same size. If the number of processing units that satisfy the conditions is N and the training sample size is M, each data block has size M/N;
A suitable slave server must satisfy the following conditions: its storage space is not less than M/N, and its response time is not greater than the latest time required by the user.
The control center accesses the training samples, copies each data block obtained from the division to its corresponding processing unit, and sends the processing unit the operation instruction to train atomic rules using the genetic algorithm.
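The server selection and even partitioning of step 1 might be sketched as follows. This is only one possible reading: the patent does not say how N is determined when the storage condition itself depends on M/N, so the sketch assumes the largest N for which every chosen server can hold M/N records. `Server`, `select_units`, and `split_evenly` are hypothetical names, and storage is measured in records for simplicity.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Server:
    name: str
    free_storage: int      # capacity in records (assumed unit)
    response_time: float   # seconds


def select_units(servers: Sequence[Server], m: int, deadline: float) -> List[Server]:
    """Pick the largest set of N servers with storage >= M/N and
    response time <= the user's deadline (assumed selection policy)."""
    fast = sorted((s for s in servers if s.response_time <= deadline),
                  key=lambda s: s.free_storage, reverse=True)
    for n in range(len(fast), 0, -1):
        # fast[n - 1] has the smallest storage among the n largest servers.
        if fast[n - 1].free_storage >= m / n:
            return fast[:n]
    return []


def split_evenly(samples: Sequence, n: int) -> List[list]:
    """Divide samples into n blocks of size M/N; when M is not a multiple
    of N the first blocks get one extra record (assumption)."""
    base, extra = divmod(len(samples), n)
    blocks, start = [], 0
    for i in range(n):
        end = start + base + (1 if i < extra else 0)
        blocks.append(list(samples[start:end]))
        start = end
    return blocks


servers = [Server("s1", 100, 1.0), Server("s2", 40, 1.0), Server("s3", 500, 9.0)]
units = select_units(servers, m=80, deadline=2.0)
blocks = split_evenly(list(range(80)), len(units))
```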
The training process for atomic rules in step 2 is:
The processing unit applies genetic coding to each data record in the data block, generates atomic rules through loop iteration of genetic operations, and stores the atomic rules in the buffer as <key, value> key-value pairs, where the key is a class label and the value is an atomic rule under that class label;
The genetic coding of data records works as follows: each data record is represented as one chromosome, whose gene values are the binary codes of the record's $k$ attribute values. If an attribute value is discrete, it can be binary-coded directly; if it is continuous, the continuous attribute is first discretized and then binary-coded. The binary coding is: suppose attribute $a_i$ has $s$ categorical values $v_1, v_2, \ldots, v_s$; the value of $a_i$ is then represented by a binary number with $s$ code bits. If the value of $a_i$ is $v_j$, the $j$-th bit of the attribute's binary code is 1 and all other bits are 0. For example, the attribute "sex" has two values, "male" and "female"; if the attribute value is "male", the binary code of the attribute is "01", and if it is "female", the code is "10". In the genetic algorithm, each chromosome is one genetic individual and all genetic individuals together form the population; the population size is the number of genetic individuals, denoted $n$. The population produced by the genetic coding process is the initial population, and each binary code bit corresponds to one gene position on the chromosome.
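The binary coding described above can be sketched as a one-hot encoding per attribute. The "male"/"female" example suggests that bit positions are counted from the right, which this sketch assumes; `encode_value` and `encode_record` are hypothetical helper names.

```python
from typing import List, Sequence


def encode_value(value: str, domain: Sequence[str]) -> str:
    """Return an s-bit code with the j-th bit (counted from the right,
    an assumption drawn from the male/female example) set for v_j."""
    j = domain.index(value) + 1          # 1-based index of the value v_j
    bits = ["0"] * len(domain)
    bits[len(domain) - j] = "1"
    return "".join(bits)


def encode_record(record: Sequence[str], domains: Sequence[Sequence[str]]) -> str:
    """Concatenate the codes of all k attribute values into one chromosome."""
    return "".join(encode_value(v, d) for v, d in zip(record, domains))


sex_domain: List[str] = ["male", "female"]
chromosome = encode_record(["male", "yes"], [sex_domain, ["yes", "no"]])
```

Continuous attributes would need to be discretized into such a finite domain before this encoding applies, as the text notes.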
The loop iteration of genetic operations is shown in Fig. 2. In the $t$-th iteration, the fitness of the $t$-th generation population $P(t)$ is evaluated first, and then the stopping condition is checked. If the iteration stopping condition is met, iteration stops and $P(t)$ is output as the atomic rules; otherwise, the selection, crossover, and mutation operations are applied to generate the $(t+1)$-th generation population $P(t+1)$, $t$ is set to $t+1$, and the next iteration begins.
The iteration stopping condition is either of the following two conditions:
1. the iteration count $t > 100$;
2. the fitness $\mathit{fitness} > 0.75$.
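The iteration loop and its two stopping conditions can be outlined as below. The text does not say whether the fitness threshold applies to the best individual or to some aggregate, so this sketch assumes the best individual; `evolve`, `fitness_fn`, and `next_generation` are hypothetical names, and the selection, crossover, and mutation steps are passed in as a single callable.

```python
from typing import Callable, List, Sequence, Tuple


def evolve(population: List,
           fitness_fn: Callable[[object], float],
           next_generation: Callable[[List, Sequence[float]], List],
           max_iter: int = 100,
           threshold: float = 0.75) -> Tuple[List, int]:
    """Iterate until t > max_iter or the best fitness exceeds threshold.

    Returns the final population P(t) and the iteration index t.
    """
    t = 1
    while True:
        scores = [fitness_fn(ind) for ind in population]   # evaluate P(t)
        if t > max_iter or max(scores) > threshold:        # stop conditions
            return population, t
        # Selection, crossover, and mutation produce P(t+1).
        population = next_generation(population, scores)
        t += 1


# Toy run: individuals are integers, fitness is x / 10, each generation adds 1.
final_pop, final_t = evolve([0, 1], lambda x: x / 10,
                            lambda pop, scores: [x + 1 for x in pop])
```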
Fitness is evaluated as follows. Let the data record corresponding to a genetic individual be $[c_1, c_2, \ldots, c_k]$. The fitness of this individual is $\mathit{fitness} = TT/n + TT/(TT + TF)$, where $n$ is the number of genetic individuals defined above and:
TT is the number of data records in the data block that satisfy $(a_1 = c_1) \wedge (a_2 = c_2) \wedge \ldots \wedge (a_k = c_k)$;
TF is the number of data records in the data block that satisfy $(a_1 = c_1) \wedge (a_2 = c_2) \wedge \ldots \wedge (a_{k-1} = c_{k-1}) \wedge (a_k \neq c_k)$;
FT is the number of data records in the data block that satisfy $\neg[(a_1 = c_1) \wedge (a_2 = c_2) \wedge \ldots \wedge (a_{k-1} = c_{k-1})] \wedge (a_k = c_k)$;
FF is the number of data records in the data block that satisfy $\neg[(a_1 = c_1) \wedge (a_2 = c_2) \wedge \ldots \wedge (a_{k-1} = c_{k-1})] \wedge (a_k \neq c_k)$.
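A minimal sketch of the fitness formula follows. It assumes that $n$ equals the number of records in the data block (true for the initial population, where each record is one individual) and returns 0 when no record matches the conditions, a case the text does not address. The function name `fitness` and the tuple layout of records are illustrative choices.

```python
from typing import Sequence, Tuple


def fitness(individual: Sequence[str], block: Sequence[Tuple[str, ...]]) -> float:
    """fitness = TT/n + TT/(TT + TF) for the rule encoded by `individual`.

    The last field of each record is taken to be the class label a_k.
    """
    n = len(block)  # assumption: population size equals block size
    cond, label = tuple(individual[:-1]), individual[-1]
    tt = sum(1 for r in block if tuple(r[:-1]) == cond and r[-1] == label)
    tf = sum(1 for r in block if tuple(r[:-1]) == cond and r[-1] != label)
    if tt + tf == 0:
        return 0.0  # no record satisfies the conditions (assumed handling)
    return tt / n + tt / (tt + tf)


block = [("a", "x", "yes"), ("a", "x", "yes"), ("a", "x", "no"), ("b", "y", "no")]
score = fitness(("a", "x", "yes"), block)  # TT = 2, TF = 1, n = 4
```

The first term measures how much of the block the rule covers correctly and the second how often the rule is right when its conditions hold, so the value can exceed 1.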
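The selection operation is: genetic individual $X_i$ has selection probability $p_i$, computed from $\mathit{fitness}(X_i)$, the fitness value of $X_i$ (the expression for $p_i$ did not survive in this text); $p_i \cdot n$ copies of $X_i$ are made as individuals of the population for the next round of genetic operations.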
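The selection step can be sketched under an explicit assumption: since the expression for the selection probability is not available here, the code uses fitness-proportional selection, $p_i = \mathit{fitness}(X_i) / \sum_j \mathit{fitness}(X_j)$, which is a common choice and fits the "$p_i \cdot n$ copies" wording but is not confirmed by the source. Rounding the copy count to the nearest integer is also an assumption.

```python
from typing import List, Sequence


def select(population: Sequence, scores: Sequence[float]) -> List:
    """Copy each individual about p_i * n times into the next population.

    Assumption: p_i is the individual's share of the total fitness.
    """
    n = len(population)
    total = sum(scores)
    if total == 0:
        return list(population)  # nothing to prefer; keep the population
    next_population: List = []
    for individual, score in zip(population, scores):
        copies = round(score / total * n)  # p_i * n, rounded (assumption)
        next_population.extend([individual] * copies)
    return next_population


chosen = select(["A", "B", "C", "D"], [2.0, 1.0, 1.0, 0.0])
```

Because of rounding, the resulting population is not guaranteed to have exactly $n$ members in general.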
The crossover operation is: according to the crossover probability $p_c$, two genetic individuals $X_i$ and $X_j$ are selected at random, a gene position $w$ on the chromosome is selected at random, and the gene segments of $X_i$ and $X_j$ after gene position $w$ are exchanged to form two new individuals, which become individuals of the population for the next round of genetic operations. The crossover probability $p_c$ is a value in $[0.4, 0.9]$; an adaptive crossover probability may also be used.
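The single-point crossover described above might look like the following sketch, with chromosomes held as bit strings. `crossover` and `crossover_step` are hypothetical names, and applying the probability $p_c$ as one random draw per step is an assumption about how "according to the crossover probability" is meant.

```python
import random
from typing import List, Sequence, Tuple


def crossover(x_i: str, x_j: str, w: int) -> Tuple[str, str]:
    """Exchange the gene segments after position w of two chromosomes."""
    return x_i[:w] + x_j[w:], x_j[:w] + x_i[w:]


def crossover_step(population: Sequence[str], p_c: float,
                   rng: random.Random) -> List[str]:
    """With probability p_c, cross two randomly chosen individuals at a
    random position and return the two new individuals."""
    if len(population) < 2 or rng.random() >= p_c:
        return []
    x_i, x_j = rng.sample(list(population), 2)
    w = rng.randrange(1, len(x_i))  # keep at least one gene on each side
    return list(crossover(x_i, x_j, w))


children = crossover("0101", "1010", 2)
```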
The mutation operation is: according to the mutation probability $p_m$, a genetic individual $X_i$ is selected at random, a gene position on $X_i$ is selected at random, and the binary code at that gene position is inverted. The mutation probability $p_m$ is a value in $[0.01, 0.1]$; an adaptive mutation probability may also be used.
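A bit-flip mutation along the lines described above can be sketched as follows. As with crossover, treating $p_m$ as a single random draw per step is an assumption, and `flip_bit` and `mutation_step` are hypothetical names.

```python
import random
from typing import List, Sequence


def flip_bit(chromosome: str, pos: int) -> str:
    """Invert the binary code at gene position pos."""
    flipped = "1" if chromosome[pos] == "0" else "0"
    return chromosome[:pos] + flipped + chromosome[pos + 1:]


def mutation_step(population: Sequence[str], p_m: float,
                  rng: random.Random) -> List[str]:
    """With probability p_m, flip one random bit of one random individual."""
    result = list(population)
    if result and rng.random() < p_m:
        i = rng.randrange(len(result))
        pos = rng.randrange(len(result[i]))
        result[i] = flip_bit(result[i], pos)
    return result


mutated = mutation_step(["0101", "1111"], 1.0, random.Random(0))
```

Note that flipping a single bit can break the one-bit-per-attribute coding, so a fuller implementation would need to decide how such chromosomes are interpreted; the text does not cover this.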
The control center periodically reads the <key, value> key-value pairs in the buffer, generates <key, value list> key-value pair lists, and stores them in the buffer, where the key is a class label and the value list is the list of atomic rules under that class label;
After the processing unit has finished operating on all data records in the data block, it sends the control center a message that the processing unit's operation is complete.
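The grouping of <key, value> pairs into <key, value list> pairs described for step 2 resembles the shuffle phase of a MapReduce-style job. A minimal in-memory sketch, with the buffer modeled as a plain list and `group_rules` as a hypothetical name, is:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Rule = Tuple[str, ...]  # condition values of an atomic rule


def group_rules(pairs: Iterable[Tuple[str, Rule]]) -> Dict[str, List[Rule]]:
    """Turn <class label, rule> pairs into <class label, rule list> pairs."""
    grouped: Dict[str, List[Rule]] = defaultdict(list)
    for label, rule in pairs:
        grouped[label].append(rule)
    return dict(grouped)


buffer_pairs = [("yes", ("a", "x")), ("no", ("b", "y")),
                ("yes", ("a", "x")), ("yes", ("c", "z"))]
rule_lists = group_rules(buffer_pairs)
```

Duplicates are kept at this stage; removing them is left to the classifiers in step 3.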
Step 3 is carried out as follows:
The control center searches for idle slave servers to act as classifiers. The number of classifiers equals the number of key values in the <key, value list> key-value pair list, and each classifier corresponds to one key value. The control center sends the atomic rule list in each <key, value list> key-value pair, together with the test sample records that have the same class label, to the corresponding classifier, and sends the classifier the operation instructions for redundancy reduction and classification accuracy measurement;
The classifier records identical atomic rules under the same class label only once, deleting the redundant atomic rules to obtain the reduced atomic rules;
The classifier uses the reduced atomic rules to classify the test samples and checks whether each classification result is consistent with the class label of the test sample. Suppose that, among the test samples classified by atomic rule a, Y records have a class label consistent with the classification result and N records have a class label inconsistent with it; the classification accuracy of atomic rule a is then Y/(Y+N). If the classification accuracy required in the user's classification mining task is α, all atomic rules whose classification accuracy is not less than α are sent to the control center as the final result of classification rule mining;
The control center aggregates the final results generated by all classifiers and returns the aggregated result to the requester of the classification rule mining task.
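The redundancy reduction and accuracy filtering of step 3 can be sketched as below. The sketch treats a test record as "classified by" a rule when its condition attributes equal the rule's conditions, and skips rules that cover no test record; both are assumptions, since the text does not define these cases. `reduce_rules`, `rule_accuracy`, and `final_rules` are hypothetical names.

```python
from typing import List, Optional, Sequence, Tuple

Rule = Tuple[Tuple[str, ...], str]  # (condition values, class label)


def reduce_rules(rules: Sequence[Rule]) -> List[Rule]:
    """Record identical rules only once, keeping first-seen order."""
    return list(dict.fromkeys(rules))


def rule_accuracy(rule: Rule, test_records: Sequence[Sequence[str]]) -> Optional[float]:
    """Return Y / (Y + N) over the test records covered by the rule."""
    conditions, label = rule
    covered = [r for r in test_records if tuple(r[:-1]) == conditions]
    if not covered:
        return None  # rule classifies no test record (assumed handling)
    y = sum(1 for r in covered if r[-1] == label)
    return y / len(covered)


def final_rules(rules: Sequence[Rule], test_records: Sequence[Sequence[str]],
                alpha: float) -> List[Rule]:
    """Keep reduced rules whose accuracy is not less than alpha."""
    kept = []
    for rule in reduce_rules(rules):
        acc = rule_accuracy(rule, test_records)
        if acc is not None and acc >= alpha:
            kept.append(rule)
    return kept


rules = [(("a", "x"), "yes"), (("a", "x"), "yes"), (("b", "y"), "yes")]
tests = [("a", "x", "yes"), ("a", "x", "yes"), ("a", "x", "no"), ("b", "y", "no")]
result = final_rules(rules, tests, alpha=0.6)
```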
Claims (3)
1. A classification rule mining method under a cloud computing environment, characterized in that:
The cloud computing environment consists of a plurality of distributed servers; classification rule mining in the cloud computing environment uses a master-slave organizational structure in which one server is set up as the control center and the other servers are slave servers; the control center arranges the execution of the whole mining task, handles scheduling and management, and coordinates the operation of the slave servers; each slave server is a concrete execution unit of the task; and the classification rule mining method proceeds as follows:
a. The control center divides the data set to be classified into training samples and test samples, divides the training samples evenly into data blocks of the same size, designates for each data block one slave server that performs the classification mining task as a processing unit, and distributes each data block to its corresponding processing unit;
b. Each processing unit uses a genetic algorithm to train atomic rules for classification on the data block it has been assigned, and stores the atomic rules obtained by training in a buffer; the training process for the atomic rules is:
The processing unit applies genetic coding to each data record in the data block, generates atomic rules through loop iteration of genetic operations, and stores the atomic rules in the buffer as <key, value> key-value pairs, where the key is a class label and the value is an atomic rule under that class label;
The control center periodically reads the <key, value> key-value pairs in the buffer, generates <key, value list> key-value pair lists, and stores them in the buffer, where the key is a class label and the value list is the list of atomic rules under that class label;
After the processing unit has finished operating on all data records in the data block, it sends the control center a message that the processing unit's operation is complete;
c. The control center partitions the atomic rules in the buffer, selects idle slave servers as classifiers, and schedules the classifiers to perform redundancy reduction on the atomic rules and to measure the classification accuracy of the reduction results; the reduction results that meet the classification accuracy requirement are selected as the final result of classification rule mining.
2. The classification rule mining method under a cloud computing environment according to claim 1, characterized in that step a is carried out as follows:
After receiving a classification mining task request, the control center divides the data set to be classified into training samples and test samples; based on the user's requirements for the classification mining task and the characteristics of the training samples, it searches for suitable slave servers to act as processing units and divides the training samples evenly into data blocks of the same size; if the number of processing units that satisfy the conditions is N and the training sample size is M, each data block has size M/N;
A suitable slave server satisfies the following conditions: its storage space is not less than M/N, and its response time is not greater than the latest time required by the user;
The control center accesses the training samples, copies each data block obtained from the division to its corresponding processing unit, and sends the processing unit the operation instruction to train atomic rules using the genetic algorithm.
3. The classification rule mining method under a cloud computing environment according to claim 1, characterized in that step c is carried out as follows:
The control center searches for idle slave servers to act as classifiers; the number of classifiers equals the number of key values in the <key, value list> key-value pair list, and each classifier corresponds to one key value; the control center sends the atomic rule list in each <key, value list> key-value pair, together with the test sample records that have the same class label, to the corresponding classifier, and sends the classifier the operation instructions for redundancy reduction and classification accuracy measurement;
The classifier records identical atomic rules under the same class label only once, deleting the redundant atomic rules to obtain the reduced atomic rules;
The classifier uses the reduced atomic rules to classify the test samples and checks whether each classification result is consistent with the class label of the test sample; suppose that, among the test samples classified by atomic rule a, Y records have a class label consistent with the classification result and N records have a class label inconsistent with it; the classification accuracy of atomic rule a is then Y/(Y+N); suppose the classification accuracy required in the user's classification mining task is α; all atomic rules whose classification accuracy is not less than α are sent to the control center as the final result of classification rule mining;
The control center aggregates the final results generated by all classifiers and returns the aggregated result to the requester of the classification rule mining task.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210203816.5A | 2012-06-19 | 2012-06-19 | Classification rule mining method under cloud computing environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102737126A (en) | 2012-10-17 |
CN102737126B (en) | 2014-03-12 |
Family
ID=46992627
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210203816.5A (Expired - Fee Related) | Classification rule mining method under cloud computing environment | 2012-06-19 | 2012-06-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102737126B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN102737126A (en) | 2012-10-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20140312; Termination date: 20210619 |