CN102737126A - Classification rule mining method under cloud computing environment - Google Patents

Publication number
CN102737126A
Authority
CN
China
Prior art keywords
key
value
cloud computing
atomic rule
computing environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012102038165A
Other languages
Chinese (zh)
Other versions
CN102737126B (en)
Inventor
杨善林
丁静
罗贺
丁帅
徐达宇
范雯娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN201210203816.5A
Publication of CN102737126A
Application granted
Publication of CN102737126B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a classification rule mining method under a cloud computing environment. The classification rule mining method is characterized in that a master-slave type organization structure consisting of a control center and a plurality of slave servers is adopted, and comprises the following steps of: dividing, by the control center, a data set to be classified into a training sample and a test sample, and distributing each of data blocks of the same size which are formed by uniformly dividing the training sample to one processing unit; training, by the processing units, the data blocks, by using a genetic algorithm to obtain an atomic rule for classification; and finally, reducing the atomic rule by a classifier, and selecting a reduction result which meets classification accuracy requirement as a final result of the classification rule mining. The classification rule mining method is suitable for data classification on distributed data storage under the cloud computing environment; distributed parallel processing of data classification tasks under the cloud computing environment can be performed; and the classification rule mining method has a positive effect on the classification processing problem of mass data under the cloud computing environment.

Description

Classification rule mining method under a cloud computing environment
Technical field
The invention belongs to the technical field of data analysis under cloud computing environments, and specifically relates to a classification rule mining method under a cloud computing environment.
Background
Classification is an important research area in data analysis and management under cloud computing environments. On the one hand, classification is an important type of data mining task. Data in a cloud computing environment is massive, distributed, and dynamic, and these characteristics pose challenges for data management. Analyzing such data through classification rule mining helps improve the efficiency of analyzing and managing massive data in the cloud. On the other hand, a cloud environment has very large-scale storage and computing capacity, its resources and structure scale elastically, and it provides services on demand through virtualization technology and a large resource pool, all of which makes efficient data classification possible. Classification rule mining is a key task in data analysis and management: it helps users better understand the massive data in a cloud computing environment and supports operational decisions there. At the same time, the high-performance computing and storage capacity of cloud computing ensures that classification rule mining can run efficiently. Classification rule mining is therefore an important technique for data analysis and processing under cloud computing environments, and research on its theory and application is of great significance.
Researchers in China and abroad have proposed many approaches to classification rule mining, including statistical methods represented by the Bayesian method, learning methods represented by decision trees and rule induction, and neural network methods. Applied to small, static data sets, these methods achieve high classification accuracy. They still have bottlenecks, however: the data set must be scanned and sorted repeatedly, which makes the algorithms inefficient; they are sensitive to noise in the data and prone to overfitting; and they scale poorly to large training sets. Under a cloud computing environment in particular, the scale and dynamics of distributed massive data sets cause the search space and dimensionality of the classification process to grow sharply, which raises the computational complexity of classification and lowers the efficiency of traditional classification methods. Existing classification rule mining methods therefore cannot be applied directly in a cloud computing environment.
Summary of the invention
To address the problems above and overcome the limitations of the prior art, the invention provides a classification rule mining method under a cloud computing environment. The method is suitable for classifying data held in distributed storage under a cloud computing environment, supports distributed parallel processing of data classification tasks, and contributes to solving the classification of massive data in the cloud. It exploits the scale effect of the large number of computing nodes in a cloud computing environment to improve the efficiency of classification rule mining over massive data, and it uses a master-slave organization together with a rule training process based on a genetic algorithm to realize classification rule mining in a distributed manner.
The invention adopts the following technical scheme to solve the technical problem:
The classification rule mining method under a cloud computing environment of the invention is characterized as follows:
Said cloud computing environment consists of a plurality of distributed servers. When classification rule mining is carried out in said cloud computing environment, a master-slave organization is adopted: one server is designated as the control center and the other servers are slave servers. Said control center arranges the execution of the whole mining task, manages its scheduling, and coordinates the operation of the slave servers; each slave server is a concrete execution unit of the task. Said classification rule mining method proceeds as follows:
a. The control center divides the data set to be classified into a training sample and a test sample. Said training sample is divided evenly into data blocks of identical size, a slave server that will carry out the classification mining task is designated as the processing unit for each data block, and each data block is distributed to its corresponding processing unit;
b. Each processing unit uses a genetic algorithm to train, on the data block distributed to it, atomic rules for classification, and stores the trained atomic rules in a buffer;
c. Said control center partitions the atomic rules in the buffer and selects idle slave servers as classifiers. The classifiers perform redundancy reduction on the atomic rules, measure the classification accuracy of the reduction results, and select the reduction results that satisfy the classification accuracy requirement as the final result of classification rule mining.
The classification rule mining method under a cloud computing environment of the invention is further characterized as follows:
Said step a is implemented as follows:
After receiving a request for a classification mining task, the control center divides the data set to be classified into a training sample and a test sample. Based on the user's requirements for the classification mining task and the characteristics of the training sample, it searches for suitable slave servers to act as processing units and divides the training sample evenly into data blocks of identical size. If the number of processing units that satisfy the conditions is N and the size of the training sample is M, the size of each data block is M/N;
A suitable slave server satisfies the following conditions: its storage space is not less than M/N, and its response time does not exceed the latest time required by the user;
The control center accesses said training sample, copies each data block obtained by the division to the corresponding processing unit, and sends the processing unit the operation instruction to train atomic rules with the genetic algorithm.
The training process of the atomic rules in said step b is as follows:
The processing unit genetically encodes each data record in the data block and generates atomic rules through loop iteration of genetic operations. It stores said atomic rules in the buffer as <key, value> key-value pairs, where the key of a <key, value> pair is a class label and the value is an atomic rule under that class label;
The control center periodically reads the <key, value> key-value pairs in the buffer, generates a list of <key, value list> key-value pairs, and stores it in the buffer. In said <key, value list> list, the key is a class label and the value list is the list of atomic rules under that class label;
After the processing unit has finished operating on all data records in its data block, it sends the control center a message that the processing unit has finished.
Said step c is carried out as follows:
The control center searches for idle slave servers to act as classifiers. The number of classifiers equals the number of key values in the <key, value list> list, and each classifier corresponds to one key value. The control center sends to each classifier the list of atomic rules from the <key, value list> list and the test sample records that carry the same class label, together with the operation instructions for redundancy reduction and classification accuracy measurement;
For multiple identical atomic rules under the same class label, the classifier records the rule only once and deletes the redundant atomic rules, obtaining the reduced atomic rules;
The classifier classifies the test sample using the reduced atomic rules and checks whether each classification result is consistent with the class label of the test sample. Suppose that, among the test sample records classified by atomic rule a, Y records have a class label consistent with the classification result and N records have a class label inconsistent with it; the classification accuracy of atomic rule a is then Y/(Y+N). Suppose the classification accuracy requirement in the user's classification mining task is α; all atomic rules whose classification accuracy is not less than α are sent to the control center as the final result of classification rule mining;
The control center aggregates the final results generated by all classifiers and returns the aggregated result to the requester of the classification rule mining task.
Compared with existing data classification methods, the beneficial effects of the invention are:
1. The invention divides the classification rule mining task over massive data into multiple subtasks and assigns them to a large-scale server cluster in the cloud computing environment. This reduces the computational complexity of each individual task, exploits the scale effect of the cloud computing server cluster, and significantly improves the efficiency of the whole classification rule mining task;
2. The master-slave organization of the invention handles the distribution, scheduling, and management of tasks in the cloud computing environment and provides a distributed implementation mechanism for classification rule mining. In addition, the genetic algorithm used for rule training is inherently well suited to parallel execution, which addresses the poor parallelism of conventional classification techniques in a distributed environment.
Description of the drawings
Fig. 1 is a schematic diagram of the classification rule mining method under a cloud computing environment of the invention.
Fig. 2 is a flow chart of atomic rule generation based on loop iteration of genetic operations in the invention.
Embodiment
In this embodiment, the classification rule mining method under a cloud computing environment is as follows:
The cloud computing environment consists of a plurality of distributed servers. When classification rule mining is carried out in the cloud computing environment, a master-slave organization is adopted: one server is designated as the control center and the other servers are slave servers. The control center arranges the execution of the whole mining task, manages its scheduling, and coordinates the operation of the slave servers; each slave server is a concrete execution unit of the task. The classification rule mining method is shown in Fig. 1 and proceeds as follows:
1. The control center divides the data set to be classified into a training sample and a test sample. The training sample is divided evenly into data blocks of identical size, a slave server that will carry out the classification mining task is designated as the processing unit for each data block, and each data block is distributed to its corresponding processing unit;
2. Each processing unit uses a genetic algorithm to train, on the data block distributed to it, atomic rules for classification, and stores the trained atomic rules in a buffer;
3. The control center partitions the atomic rules in the buffer and selects idle slave servers as classifiers. The classifiers perform redundancy reduction on the atomic rules, measure the classification accuracy of the reduction results, and select the reduction results that satisfy the classification accuracy requirement as the final result of classification rule mining.
Suppose the data set is represented by the relational schema R(a_1, a_2, ..., a_k), where a_i (i = 1, 2, ..., k) is an attribute. The attribute set <a_1, a_2, ..., a_k> is divided into two parts, k-1 condition attributes and 1 class label: attributes a_1, a_2, ..., a_{k-1} are the condition attributes and attribute a_k is the class label. Each data record in the data set is a k-dimensional vector [c_1, c_2, ..., c_k], where c_i is the value of attribute a_i in that record.
An atomic rule has the form: If (a_1 = c_1) ∧ (a_2 = c_2) ∧ ... ∧ (a_{k-1} = c_{k-1}), Then a_k = c_k
Step 1 is implemented as follows:
After receiving a request for a classification mining task, the control center divides the data set to be classified into a training sample and a test sample. Based on the user's requirements for the classification mining task and the characteristics of the training sample, it searches for suitable slave servers to act as processing units and divides the training sample evenly into data blocks of identical size. If the number of processing units that satisfy the conditions is N and the size of the training sample is M, the size of each data block is M/N;
A suitable slave server must satisfy the following conditions: its storage space is not less than M/N, and its response time does not exceed the latest time required by the user.
The control center accesses the training sample, copies each data block obtained by the division to the corresponding processing unit, and sends the processing unit the operation instruction to train atomic rules with the genetic algorithm.
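The choice of processing units and the even split in step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: `SlaveServer`, `select_processing_units`, and `split_evenly` are hypothetical names, storage is measured in records, and, because the block size M/N itself depends on how many servers qualify, the sketch assumes the control center re-checks the storage condition until the set of servers stops shrinking. The text does not specify how that dependency is resolved.

```python
from dataclasses import dataclass


@dataclass
class SlaveServer:
    # Hypothetical description of a slave server as seen by the control center.
    name: str
    free_storage: int     # capacity in records (same unit as the sample size M)
    response_time: float  # seconds


def select_processing_units(servers, m, deadline):
    """Return servers whose storage is >= M/N and whose response time is <= deadline.

    Assumption: N is the number of qualifying servers, so the check is repeated
    until no further server is dropped.
    """
    candidates = [s for s in servers if s.response_time <= deadline]
    while candidates:
        block_size = m / len(candidates)
        kept = [s for s in candidates if s.free_storage >= block_size]
        if len(kept) == len(candidates):
            return kept
        candidates = kept
    return []


def split_evenly(records, n):
    """Split the training sample into n blocks of identical size M/N."""
    if n <= 0 or len(records) % n != 0:
        # Assumption: the sketch only handles the case where M is divisible by N.
        raise ValueError("training sample size must be divisible by n")
    size = len(records) // n
    return [records[i * size:(i + 1) * size] for i in range(n)]


servers = [
    SlaveServer("A", free_storage=10, response_time=1.0),
    SlaveServer("B", free_storage=3, response_time=1.0),
    SlaveServer("C", free_storage=10, response_time=9.0),
    SlaveServer("D", free_storage=6, response_time=2.0),
]
units = select_processing_units(servers, m=12, deadline=5.0)
blocks = split_evenly(list(range(12)), len(units))
```

With these example values, server C is excluded by the response-time condition and server B by the storage condition, so two processing units each receive a block of six records.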
The training process of the atomic rules in step 2 is as follows:
The processing unit genetically encodes each data record in the data block and generates atomic rules through loop iteration of genetic operations. It stores the atomic rules in the buffer as <key, value> key-value pairs, where the key of a <key, value> pair is a class label and the value is an atomic rule under that class label;
The genetic encoding of a data record is as follows: each data record is represented as one chromosome, whose gene values consist of the binary codes of the k attribute values. If an attribute value is discrete, it can be binary-encoded directly; if it is continuous, the continuous attribute value is first discretized and then binary-encoded. Binary encoding works as follows: suppose attribute a_i has s discrete attribute values v_1, v_2, ..., v_s; the value of a_i is then represented by a binary number with s bit positions. If the value of a_i is v_j, the j-th bit of the binary code for this attribute value is 1 and the remaining bits are 0. For example, the sex attribute has the two values "male" and "female": if the attribute value is "male", the binary code of the attribute is "01"; if it is "female", the code is "10". In the genetic algorithm, each chromosome is one genetic individual, and all genetic individuals form a population whose size is expressed by the number n of genetic individuals. The population produced by the genetic encoding process is the initial population, and each binary bit position corresponds to one gene position of the chromosome.
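The binary encoding described above can be illustrated with a short Python sketch. The function names and the `domains` argument are hypothetical. To reproduce the "01" and "10" codes of the sex example, the sketch assumes that the j-th bit is counted from the right-hand end of the code and that the values are listed in the order v_1, v_2, ..., v_s. Discretization of continuous attributes is not covered.

```python
def encode_value(value, domain):
    """Encode one discrete attribute value as an s-bit string with a single 1."""
    s = len(domain)
    j = domain.index(value) + 1  # 1-based index of the value v_j
    bits = ["0"] * s
    bits[s - j] = "1"            # assumption: bit j is counted from the right
    return "".join(bits)


def encode_record(record, domains):
    """Concatenate the codes of all k attribute values into one chromosome."""
    return "".join(encode_value(v, d) for v, d in zip(record, domains))


sex = ["male", "female"]
answer = ["no", "yes"]
chromosome = encode_record(("male", "yes"), [sex, answer])
```

Here `encode_value("male", sex)` gives "01" and `encode_value("female", sex)` gives "10", matching the example in the text, and the two-attribute record is encoded as the four-bit chromosome "0110".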
The loop iteration of genetic operations is shown in Fig. 2. In the t-th iteration, the fitness of the t-th generation population P(t) is evaluated first, and then it is decided whether to stop iterating. If the stopping condition is satisfied, iteration stops and P(t) is output as the atomic rules. If it is not satisfied, selection, crossover, and mutation are carried out to generate the (t+1)-th generation population P(t+1), t is set to t+1, and the next iteration is carried out.
Iteration stops when either of the following two conditions holds:
1. the iteration count t > 100;
2. the fitness fitness > 0.75.
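The control flow of this loop can be sketched as below. It is a skeleton only: `evolve`, `evaluate`, and `next_generation` are hypothetical names, the counter is assumed to start at t = 1, and, since the text does not say whose fitness is compared with 0.75, the sketch assumes the test applies to the best individual of a non-empty population.

```python
def evolve(population, evaluate, next_generation,
           max_iter=100, fitness_threshold=0.75):
    """Iterate genetic operations until t > max_iter or fitness > threshold.

    evaluate(individual) returns a fitness value; next_generation(population,
    scores) applies selection, crossover, and mutation and returns P(t+1).
    """
    t = 1
    while True:
        scores = [evaluate(ind) for ind in population]
        # Assumption: the fitness condition refers to the best individual.
        if t > max_iter or max(scores) > fitness_threshold:
            return population, t  # P(t) is output as the atomic rules
        population = next_generation(population, scores)
        t += 1


# Usage with placeholder callables: fitness never improves, so the loop
# stops through the iteration limit.
final_pop, stopped_at = evolve(["0110"], lambda ind: 0.0, lambda pop, sc: pop)
```

Passing the operators in as callables keeps the stopping logic separate from the fitness, selection, crossover, and mutation steps that the text defines individually.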
Fitness evaluation: let the data record corresponding to a genetic individual be [c_1, c_2, ..., c_k]. The fitness of this genetic individual is fitness = TT/n + TT/(TT+TF), where TT is the number of data records in the data block that satisfy "(a_1 = c_1) ∧ (a_2 = c_2) ∧ ... ∧ (a_k = c_k)", TF is the number of data records in the data block that satisfy "(a_1 = c_1) ∧ (a_2 = c_2) ∧ ... ∧ (a_{k-1} = c_{k-1}) ∧ (a_k ≠ c_k)", FT is the number of data records in the data block that satisfy "¬[(a_1 = c_1) ∧ (a_2 = c_2) ∧ ... ∧ (a_{k-1} = c_{k-1})] ∧ (a_k = c_k)", and FF is the number of data records in the data block that satisfy "¬[(a_1 = c_1) ∧ (a_2 = c_2) ∧ ... ∧ (a_{k-1} = c_{k-1})] ∧ (a_k ≠ c_k)".
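As a worked illustration of the fitness formula, the following Python sketch counts TT and TF over a small data block. Records are written as plain tuples rather than bit strings for readability. Two points are assumptions rather than statements from the text: n is taken to be the number of records in the data block (the size of the initial population), and an individual that matches no record at all (TT + TF = 0) is given fitness 0.

```python
def fitness(individual, block):
    """fitness = TT/n + TT/(TT+TF) for one individual [c_1, ..., c_k]."""
    conditions, label = tuple(individual[:-1]), individual[-1]
    # Records whose k-1 condition attributes equal those of the individual.
    matching = [r for r in block if tuple(r[:-1]) == conditions]
    tt = sum(1 for r in matching if r[-1] == label)  # class label also equal
    tf = len(matching) - tt                          # class label differs
    if tt + tf == 0:
        return 0.0  # assumption: the text does not define this case
    n = len(block)  # assumption: n is the number of records in the block
    return tt / n + tt / (tt + tf)


block = [
    ("sunny", "hot", "no"),
    ("sunny", "hot", "no"),
    ("sunny", "hot", "yes"),
    ("rain", "cool", "yes"),
]
f1 = fitness(("sunny", "hot", "no"), block)   # TT = 2, TF = 1, n = 4
f2 = fitness(("rain", "cool", "yes"), block)  # TT = 1, TF = 0, n = 4
```

For the first individual the two terms are 2/4 and 2/3; for the second they are 1/4 and 1/1. FT and FF are defined in the text but do not appear in the formula, so the sketch does not compute them.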
Selection: the selection probability p_i of genetic individual X_i is given by the following formula, which is available only as an image in the source:
Figure BDA00001786606100051
p_i × n copies of X_i are replicated as individuals of the population for the next genetic operation, where fitness(X_i) is the fitness value of X_i.
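A possible reading of the selection step is sketched below. The formula for p_i survives only as an image reference, so the sketch assumes the common fitness-proportionate form, p_i = fitness(X_i) divided by the sum of all fitness values. It also assumes a rounding scheme, since p_i × n is generally not an integer: each individual receives the integer part of p_i × n copies, and remaining places go to the largest fractional parts so that the population size stays n. The function name `select` is hypothetical.

```python
def select(population, scores):
    """Replicate about p_i * n copies of each individual (assumed proportional)."""
    n = len(population)
    total = sum(scores)
    if total == 0:
        return list(population)  # assumption: keep the population unchanged
    expected = [s / total * n for s in scores]  # p_i * n
    counts = [int(e) for e in expected]         # integer part of p_i * n
    leftover = n - sum(counts)
    # Assumption: hand remaining places to the largest fractional parts.
    by_fraction = sorted(range(n), key=lambda i: expected[i] - counts[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return [ind for ind, c in zip(population, counts) for _ in range(c)]


next_pop = select(["a", "b", "c", "d"], [2.0, 1.0, 1.0, 0.0])
```

With fitness values 2, 1, 1, and 0, the expected copy counts are exactly 2, 1, 1, and 0, so the fittest individual is duplicated and the unfit one disappears.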
Crossover: according to the crossover probability p_c, two genetic individuals X_i and X_j are selected at random, a gene position w on the chromosome is selected at random, and the gene segments of X_i and X_j after gene position w are exchanged to form two new individuals, which become individuals of the population for the next genetic operation. The crossover probability p_c is a value in [0.4, 0.9]; an adaptive crossover probability may also be used.
Mutation: according to the mutation probability p_m, a genetic individual X_i is selected at random, a gene position on X_i is selected at random, and the binary code at that gene position is inverted. The mutation probability p_m is a value in [0.01, 0.1]; an adaptive mutation probability may also be used.
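The crossover and mutation operators can be sketched on bit-string chromosomes as follows. The names `crossover`, `mutate`, and `vary` are hypothetical, chromosomes are assumed to have equal length of at least two bits, and the wrapper assumes that at most one crossover and one mutation are attempted per call, which the text does not specify. The default probabilities 0.6 and 0.05 are simply values inside the stated ranges.

```python
import random


def crossover(x_i, x_j, w):
    """Exchange the gene segments of two chromosomes after gene position w."""
    return x_i[:w] + x_j[w:], x_j[:w] + x_i[w:]


def mutate(x, pos):
    """Invert the binary code at one gene position."""
    flipped = "1" if x[pos] == "0" else "0"
    return x[:pos] + flipped + x[pos + 1:]


def vary(population, p_c=0.6, p_m=0.05, rng=None):
    """Apply one crossover with probability p_c and one mutation with p_m."""
    rng = rng or random.Random()
    pop = list(population)
    if len(pop) >= 2 and rng.random() < p_c:
        i, j = rng.sample(range(len(pop)), 2)
        w = rng.randrange(1, len(pop[i]))  # cut point inside the chromosome
        pop[i], pop[j] = crossover(pop[i], pop[j], w)
    if pop and rng.random() < p_m:
        i = rng.randrange(len(pop))
        pop[i] = mutate(pop[i], rng.randrange(len(pop[i])))
    return pop


children = crossover("0110", "1001", 2)
mutant = mutate("0110", 0)
varied = vary(["0110", "1001", "0101"], p_c=1.0, p_m=1.0, rng=random.Random(0))
```

Flipping a single bit can produce a code that no longer has exactly one 1 per attribute. The text does not say how such individuals are treated, and this sketch does not repair them.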
The control center periodically reads the <key, value> key-value pairs in the buffer, generates a list of <key, value list> key-value pairs, and stores it in the buffer. In the <key, value list> list, the key is a class label and the value list is the list of atomic rules under that class label;
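Turning <key, value> pairs into <key, value list> pairs is a grouping by class label, which might look like this in Python. The function name and the rule strings are illustrative only, and the buffer is modeled as a plain list of pairs.

```python
from collections import defaultdict


def group_by_label(pairs):
    """Group (class label, atomic rule) pairs into {class label: [rules]}."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


buffer_pairs = [
    ("yes", "rule-1"),
    ("no", "rule-2"),
    ("yes", "rule-3"),
    ("yes", "rule-1"),  # duplicates are kept here and removed by the classifier
]
rule_lists = group_by_label(buffer_pairs)
```

The number of keys in `rule_lists` is the number of classifiers needed in step 3, one per class label.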
After the processing unit has finished operating on all data records in its data block, it sends the control center a message that the processing unit has finished.
Step 3 is carried out as follows:
The control center searches for idle slave servers to act as classifiers. The number of classifiers equals the number of key values in the <key, value list> list, and each classifier corresponds to one key value. The control center sends to each classifier the list of atomic rules from the <key, value list> list and the test sample records that carry the same class label, together with the operation instructions for redundancy reduction and classification accuracy measurement;
For multiple identical atomic rules under the same class label, the classifier records the rule only once and deletes the redundant atomic rules, obtaining the reduced atomic rules;
The classifier classifies the test sample using the reduced atomic rules and checks whether each classification result is consistent with the class label of the test sample. Suppose that, among the test sample records classified by atomic rule a, Y records have a class label consistent with the classification result and N records have a class label inconsistent with it; the classification accuracy of atomic rule a is then Y/(Y+N). If the classification accuracy requirement in the user's classification mining task is α, all atomic rules whose classification accuracy is not less than α are sent to the control center as the final result of classification rule mining;
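The reduction and accuracy check performed by a classifier can be sketched as follows. A rule is modeled as a pair (conditions, label) and a test record as a tuple whose last element is the class label; these representations and the function names are hypothetical. The sketch assumes that a rule "classifies" a record when its conditions equal the record's condition attributes, evaluates each rule against whatever test records it is handed, and drops a rule that matches no test record, a case the text leaves open.

```python
def reduce_rules(rules):
    """Keep each identical atomic rule only once, preserving first occurrence."""
    return list(dict.fromkeys(rules))


def accuracy(rule, test_records):
    """Return Y / (Y + N) for one rule, or None if it classifies no record."""
    conditions, label = rule
    classified = [r for r in test_records if tuple(r[:-1]) == conditions]
    if not classified:
        return None
    y = sum(1 for r in classified if r[-1] == label)  # consistent records
    return y / len(classified)                        # len = Y + N


def final_rules(rules, test_records, alpha):
    """Reduce the rules, then keep those with accuracy not less than alpha."""
    kept = []
    for rule in reduce_rules(rules):
        acc = accuracy(rule, test_records)
        if acc is not None and acc >= alpha:
            kept.append(rule)
    return kept


rules = [
    (("sunny", "hot"), "no"),
    (("sunny", "hot"), "no"),   # redundant copy
    (("rain", "cool"), "yes"),
]
test_records = [
    ("sunny", "hot", "no"),
    ("sunny", "hot", "yes"),
    ("rain", "cool", "yes"),
]
result = final_rules(rules, test_records, alpha=0.8)
```

In this example the first rule has Y = 1 and N = 1, giving accuracy 0.5, and the second has Y = 1 and N = 0, giving accuracy 1.0, so only the second rule meets α = 0.8.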
The control center aggregates the final results generated by all classifiers and returns the aggregated result to the requester of the classification rule mining task.

Claims (4)

1. A classification rule mining method under a cloud computing environment, characterized in that:
Said cloud computing environment consists of a plurality of distributed servers; when classification rule mining is carried out in said cloud computing environment, a master-slave organization is adopted, in which one server is designated as the control center and the other servers are slave servers; said control center arranges the execution of the whole mining task, manages its scheduling, and coordinates the operation of the slave servers; each slave server is a concrete execution unit of the task; and said classification rule mining method proceeds as follows:
a. The control center divides the data set to be classified into a training sample and a test sample; said training sample is divided evenly into data blocks of identical size; a slave server that will carry out the classification mining task is designated as the processing unit for each data block, and each data block is distributed to its corresponding processing unit;
b. Each processing unit uses a genetic algorithm to train, on the data block distributed to it, atomic rules for classification, and stores the trained atomic rules in a buffer;
c. Said control center partitions the atomic rules in the buffer and selects idle slave servers as classifiers; the classifiers perform redundancy reduction on the atomic rules, measure the classification accuracy of the reduction results, and select the reduction results that satisfy the classification accuracy requirement as the final result of classification rule mining.
2. The classification rule mining method under a cloud computing environment according to claim 1, characterized in that said step a is implemented as follows:
After receiving a request for a classification mining task, the control center divides the data set to be classified into a training sample and a test sample; based on the user's requirements for the classification mining task and the characteristics of the training sample, it searches for suitable slave servers to act as processing units and divides the training sample evenly into data blocks of identical size; if the number of processing units that satisfy the conditions is N and the size of the training sample is M, the size of each data block is M/N;
A suitable slave server satisfies the following conditions: its storage space is not less than M/N, and its response time does not exceed the latest time required by the user;
The control center accesses said training sample, copies each data block obtained by the division to the corresponding processing unit, and sends the processing unit the operation instruction to train atomic rules with the genetic algorithm.
3. The classification rule mining method under a cloud computing environment according to claim 1, characterized in that the training process of the atomic rules in said step b is as follows:
The processing unit genetically encodes each data record in the data block and generates atomic rules through loop iteration of genetic operations; it stores said atomic rules in the buffer as <key, value> key-value pairs, where the key of a <key, value> pair is a class label and the value is an atomic rule under that class label;
The control center periodically reads the <key, value> key-value pairs in the buffer, generates a list of <key, value list> key-value pairs, and stores it in the buffer; in said <key, value list> list, the key is a class label and the value list is the list of atomic rules under that class label;
After the processing unit has finished operating on all data records in its data block, it sends the control center a message that the processing unit has finished.
4. The classification rule mining method under a cloud computing environment according to claim 1, characterized in that said step c is carried out as follows:
The control center searches for idle slave servers to act as classifiers; the number of classifiers equals the number of key values in the <key, value list> list, and each classifier corresponds to one key value; the control center sends to each classifier the list of atomic rules from the <key, value list> list and the test sample records that carry the same class label, together with the operation instructions for redundancy reduction and classification accuracy measurement;
For multiple identical atomic rules under the same class label, the classifier records the rule only once and deletes the redundant atomic rules, obtaining the reduced atomic rules;
The classifier classifies the test sample using the reduced atomic rules and checks whether each classification result is consistent with the class label of the test sample; suppose that, among the test sample records classified by atomic rule a, Y records have a class label consistent with the classification result and N records have a class label inconsistent with it; the classification accuracy of atomic rule a is then Y/(Y+N); suppose the classification accuracy requirement in the user's classification mining task is α; all atomic rules whose classification accuracy is not less than α are sent to the control center as the final result of classification rule mining;
The control center aggregates the final results generated by all classifiers and returns the aggregated result to the requester of the classification rule mining task.
CN201210203816.5A 2012-06-19 2012-06-19 Classification rule mining method under cloud computing environment Expired - Fee Related CN102737126B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210203816.5A CN102737126B (en) 2012-06-19 2012-06-19 Classification rule mining method under cloud computing environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210203816.5A CN102737126B (en) 2012-06-19 2012-06-19 Classification rule mining method under cloud computing environment

Publications (2)

Publication Number Publication Date
CN102737126A true CN102737126A (en) 2012-10-17
CN102737126B CN102737126B (en) 2014-03-12

Family

ID=46992627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210203816.5A Expired - Fee Related CN102737126B (en) 2012-06-19 2012-06-19 Classification rule mining method under cloud computing environment

Country Status (1)

Country Link
CN (1) CN102737126B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1614630A (en) * 2004-11-29 2005-05-11 南京大学 Rapid study classifying method
CN101556650A (en) * 2009-04-01 2009-10-14 东北大学 Distributed self-adapting pulmonary nodule computer detection method and system thereof
CN101939727A (en) * 2007-11-08 2011-01-05 遗传学金融(巴巴多斯)有限公司 Distributed network for performing complex algorithms
CN102143022A (en) * 2011-03-16 2011-08-03 北京邮电大学 Cloud measurement device and method for IP network
US20110191322A1 (en) * 2009-09-09 2011-08-04 Tapicu, Inc. Stochastic optimization techniques of evolutionary computation search strategies for an information sharing system

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104113516A (en) * 2013-04-19 2014-10-22 中国移动通信集团设计院有限公司 Method and terminal for recognizing rule conflicts of firewalls
US9984336B2 (en) 2014-06-27 2018-05-29 Huawei Technologies Co., Ltd. Classification rule sets creation and application to decision making
WO2015196476A1 (en) * 2014-06-27 2015-12-30 华为技术有限公司 Data processing method and computer system
WO2016078008A1 (en) * 2014-11-19 2016-05-26 华为技术有限公司 Method and apparatus for scheduling data flow task
CN105900064A (en) * 2014-11-19 2016-08-24 华为技术有限公司 Method and apparatus for scheduling data flow task
US10558498B2 (en) 2014-11-19 2020-02-11 Huawei Technologies Co., Ltd. Method for scheduling data flow task and apparatus
CN105900064B (en) * 2014-11-19 2019-05-03 华为技术有限公司 Method and apparatus for scheduling data flow task
CN104468239A (en) * 2014-12-22 2015-03-25 上海大唐移动通信设备有限公司 Data processing method and device based on rules
CN104468239B (en) * 2014-12-22 2018-08-21 上海大唐移动通信设备有限公司 Data processing method and device based on rules
US10430429B2 (en) 2015-09-01 2019-10-01 Cognizant Technology Solutions U.S. Corporation Data mining management server
US11151147B1 (en) 2015-09-01 2021-10-19 Cognizant Technology Solutions U.S. Corporation Data mining management server
US11257005B2 (en) 2016-03-04 2022-02-22 Alibaba Group Holding Limited Training method and training system for machine learning system
WO2017148314A1 (en) * 2016-03-04 2017-09-08 阿里巴巴集团控股有限公司 Method of training machine learning system, and training system
TWI740896B (en) * 2016-03-04 2021-10-01 香港商阿里巴巴集團服務有限公司 Training method and training system of machine learning system
CN105912674A (en) * 2016-04-13 2016-08-31 精硕世纪科技(北京)有限公司 Method, device and system for noise reduction and classification of data
CN105975331A (en) * 2016-04-26 2016-09-28 浪潮(北京)电子信息产业有限公司 Data parallel processing method and apparatus
CN106372185A (en) * 2016-08-31 2017-02-01 广东京奥信息科技有限公司 Data preprocessing method for heterogeneous data sources
CN107169513A (en) * 2017-05-05 2017-09-15 第四范式(北京)技术有限公司 Distributed machine learning system and method for controlling the order of data use
CN107590263B (en) * 2017-09-22 2020-07-07 辽宁工程技术大学 Distributed big data classification method based on multivariate decision tree model
CN107590263A (en) * 2017-09-22 2018-01-16 辽宁工程技术大学 Distributed big data classification method based on multivariate decision tree model
CN108763952A (en) * 2018-05-03 2018-11-06 阿里巴巴集团控股有限公司 Data classification method, device and electronic equipment
CN110109753A (en) * 2019-04-25 2019-08-09 成都信息工程大学 Resource scheduling method and system based on multi-dimensional constraint genetic algorithm
US11762376B2 (en) 2019-12-03 2023-09-19 Industrial Technology Research Institute Quick dispatching rule screening method and apparatus
CN111814882A (en) * 2020-07-10 2020-10-23 辽东学院 Data classification system based on computer big data
CN111814882B (en) * 2020-07-10 2021-06-22 辽东学院 Data classification system based on computer big data

Also Published As

Publication number Publication date
CN102737126B (en) 2014-03-12

Similar Documents

Publication Publication Date Title
CN102737126B (en) Classification rule mining method under cloud computing environment
Bifet et al. Extremely fast decision tree mining for evolving data streams
Panda et al. Planet: massively parallel learning of tree ensembles with mapreduce
Luo et al. A parallel dbscan algorithm based on spark
Jia et al. A grid and density-based clustering algorithm for processing data stream
Liu et al. SP-Partitioner: A novel partition method to handle intermediate data skew in spark streaming
CN104809244B (en) Data mining method and device in a big data environment
CN104834557B (en) Data analysis method based on Hadoop
Lin et al. A K-means clustering with optimized initial center based on Hadoop platform
Elagib et al. Big data analysis solutions using MapReduce framework
Han et al. Efficient mining of high-speed uncertain data streams
Jiang et al. SKCompress: compressing sparse and nonuniform gradient in distributed machine learning
Xu Research and implementation of improved random forest algorithm based on Spark
Han et al. SlimML: Removing non-critical input data in large-scale iterative machine learning
Wu et al. Mining of high-utility patterns in big IoT-based databases
Mao et al. A MapReduce-based K-means clustering algorithm
Wang et al. Association rules mining in parallel conditional tree based on grid computing inspired partition algorithm
Sun et al. Key nodes discovery in large-scale logistics network based on MapReduce
Xu et al. Applying an improved elephant herding optimization algorithm with spark-based parallelization to feature selection for intrusion detection
Huang et al. Optimization of K-means Algorithm Base on MapReduce
Zhu et al. Efficient Gaussian Kernel Microcluster Real-Time Clustering Method for Industrial Internet of Things (IIoT) Streams
CN113157915A (en) Naive Bayes text classification method based on cluster environment
Zeng A clustering method of combining grid and genetic algorithm in wireless sensor networks
Hao et al. An association-oriented partitioning approach for streaming graph query
Zhang et al. Emotion-based social computing platform for streaming big-data: Architecture and application

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 2014-03-12

Termination date: 2021-06-19
