CN103279332A - Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm - Google Patents

Publication number
CN103279332A
Authority
CN
China
Prior art keywords
data
frequent item
item set
frequent
subwindow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013102290983A
Other languages
Chinese (zh)
Inventor
卢晓伟
周勇
韩君
张清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN2013102290983A priority Critical patent/CN103279332A/en
Publication of CN103279332A publication Critical patent/CN103279332A/en
Pending legal-status Critical Current

Abstract

The invention provides a data stream parallel processing method based on the GPU-CUDA platform and a genetic algorithm. The method comprises the following steps: dynamically mining the frequent itemsets of the newest data with a genetic algorithm, the search starting from an initial population in which each individual is a candidate frequent pattern; mining the stream data with a sliding window, in keeping with the characteristics of a data stream, and adopting a nested subwindow model built on the sliding window, in keeping with the characteristics of frequent itemset mining; performing frequent itemset mining with GPU-CUDA parallel processing, because a data stream is large in volume and must be processed in real time; and finally obtaining the frequent itemsets of the data in the current sliding window by jointly processing the frequent itemsets of the nested subwindows within it. Compared with the prior art, the method processes the frequent itemsets of stream data using the strong floating-point capability of the GPU and CUDA acceleration for GPU programming, can be modeled using the parallel mode of the genetic algorithm, and improves the user's operating experience.

Description

A data stream parallel processing method based on the GPU-CUDA platform and a genetic algorithm
Technical field
The present invention relates to the field of computer applications, and specifically to a data stream parallel processing method based on the GPU-CUDA platform and a genetic algorithm.
Background technology
A data stream is a continuously flowing sequence of elements, each of which is a set of related data. Let $t$ denote any timestamp and $a_t$ the data arriving at that timestamp; the stream can then be written as $\ldots, a_{t-1}, a_t, a_{t+1}, \ldots$. Unlike traditional application models, the stream data model has four common characteristics: (1) data arrives in real time; (2) the arrival order is independent and not controlled by the application system; (3) the data volume is huge and its maximum cannot be predicted; (4) once a data item has been processed, it cannot be retrieved for processing again unless it was deliberately stored, or retrieving it again is costly.
Sliding window model: a sliding window has no explicitly given start or end point; only the window length W is given. The window keeps a fixed length and slides over the data stream $D = \{d_0, d_1, \ldots, d_n\}$. The range of the stream being processed is determined by this window, and results are output continuously as the window slides. The length of a sliding window SW can be defined either by a time interval or by the number of stream elements the window contains;
Nested subwindow model: at some time T, the latest data set $d_n$ in the sliding window SW of length W falls into the nested subwindow S_SW of size W2; the window S_SW is called a nested subwindow.
As shown in Fig. 1, a sliding window is used to describe a dynamically updated data set. The data set in the window is marked in Fig. 1(a). When a new data set arrives, the sliding window moves forward by one unit, as shown in Fig. 1(b).
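The two window models can be illustrated with a short Python sketch. It assumes a count-based window (the length is a number of elements rather than a time interval); the class name SlidingWindow and its methods are hypothetical names introduced only for this illustration.

```python
from collections import deque


class SlidingWindow:
    """Count-based sliding window holding at most w1 elements (illustrative)."""

    def __init__(self, w1, w2):
        self.w2 = w2                    # size of the nested subwindow S_SW
        self.items = deque(maxlen=w1)   # deque drops the oldest element when full

    def push(self, element):
        """Add a newly arrived element; the window slides once it is full."""
        self.items.append(element)

    def contents(self):
        """Current contents of SW, oldest first."""
        return list(self.items)

    def newest_subwindow(self):
        """The latest w2 elements, i.e. the data falling into S_SW."""
        return list(self.items)[-self.w2:]


sw = SlidingWindow(w1=4, w2=2)
for d in ["d0", "d1", "d2", "d3", "d4"]:
    sw.push(d)
```

After the fifth element arrives, d0 has been discarded, so the window holds d1 to d4 and the nested subwindow holds d3 and d4.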
Frequent itemsets of the sliding window: for the data in the current sliding window, let $I = \{i_1, i_2, \ldots, i_n\}$ be the set of items and $S = \{s_0, s_1, \ldots, s_n\}$ the transaction data set, where each transaction $s$ in the data set is a set of items, $s \subseteq I$. If $X \subseteq s$, then X is called an itemset. If X contains k elements, X is called a k-itemset. For an itemset X, if its support is greater than or equal to the minimum support threshold given by the user, X is a frequent itemset.
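A minimal Python sketch of this definition follows. It assumes that support is measured as the fraction of transactions containing the itemset, which matches the ratio used later in step S130; the function names and the sample transactions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for s in transactions if itemset <= set(s))
    return hits / len(transactions)


def is_frequent(itemset, transactions, min_support):
    """X is frequent when its support reaches the user-given threshold."""
    return support(itemset, transactions) >= min_support


# Hypothetical window contents: four transactions over the items a, b, c.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
```

Here the 2-itemset {a, b} appears in two of the four transactions, so it is frequent at a threshold of 0.5 but not at 0.6.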
Genetic algorithm: an optimization algorithm based on random search that has been successfully applied to function optimization, automatic control, production scheduling, robotics, image processing, artificial life, machine learning, and data mining. It starts from a population that represents the potential solution set of the problem, and the population consists of a certain number of gene-encoded individuals. Each individual is in fact an entity carrying chromosome characteristics. The chromosome, as the main carrier of genetic material, is a collection of genes; its internal expression (the genotype) is a particular combination of genes, which determines the external appearance of the individual. For example, the trait of black hair is determined by a particular combination of the genes in the chromosome that control this trait. A mapping from phenotype to genotype, that is, an encoding, is therefore needed at the outset. Because imitating gene encoding faithfully is very complex, it is usually simplified, for example to binary encoding. After the initial population is generated, better and better approximate solutions evolve generation by generation according to the principle of survival of the fittest. In each generation, individuals are selected according to their fitness in the problem domain, and crossover and mutation are applied by means of the genetic operators of natural genetics to produce a population representing a new solution set. As in natural evolution, this process makes later generations better adapted to the environment than earlier ones, and the best individual of the final generation, once decoded, can be taken as an approximate optimal solution of the problem.
The basic procedure of the genetic algorithm is as follows; the flow is shown in Fig. 2:
A) Initialization: set the generation counter t=0, set the maximum number of generations T, and randomly generate M individuals as the initial population P(0).
B) Individual evaluation: compute the fitness of each individual in the population P(t).
C) Selection: apply the selection operator to the population. The purpose of selection is to pass optimized individuals directly to the next generation, or to pass on new individuals produced from them by pairwise crossover. Selection is based on the fitness evaluation of the individuals in the population.
D) Crossover: apply the crossover operator to the population. Crossover exchanges and recombines parts of the structure of two parent individuals to generate new individuals. The crossover operator plays the central role in the genetic algorithm.
E) Mutation: apply the mutation operator to the population, that is, change the gene values at some loci of the individual strings in the population. After selection, crossover, and mutation, the population P(t) yields the next-generation population P(t+1).
F) Termination test: if t=T, output the individual with the maximum fitness obtained during the evolution as the optimal solution and stop the computation.
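Steps A) to F) can be sketched generically in Python as below. This is only an illustration under stated assumptions: fitness values are non-negative, selection is fitness-proportional, and the operators are passed in as functions. All function names and the bit-counting toy problem are hypothetical and are not taken from the patent.

```python
import random


def genetic_algorithm(fitness, new_individual, crossover, mutate,
                      pop_size, max_generations, rng):
    # A) initialization: generate pop_size random individuals as P(0)
    population = [new_individual(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_generations):          # F) stop after T generations
        # B) individual evaluation
        scores = [fitness(ind) for ind in population]
        # C) selection: fitness-proportional; the offset keeps weights positive
        parents = rng.choices(population, weights=[s + 1e-9 for s in scores],
                              k=pop_size)
        # D) crossover on consecutive pairs of parents
        children = []
        for i in range(0, pop_size - 1, 2):
            children.extend(crossover(parents[i], parents[i + 1], rng))
        children.extend(parents[len(children):])   # odd pop_size: carry one over
        # E) mutation gives P(t+1)
        population = [mutate(child, rng) for child in children]
        best = max(population + [best], key=fitness)
    return best


# Toy problem (assumption): maximize the number of 1 bits in an 8-bit string.
def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(8)]


def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return [a[:cut] + b[cut:], b[:cut] + a[cut:]]


def flip(ind, rng):
    return [1 - g if rng.random() < 0.05 else g for g in ind]


best = genetic_algorithm(sum, random_bits, one_point, flip,
                         pop_size=20, max_generations=30, rng=random.Random(0))
```

The sketch keeps track of the best individual seen in any generation, which corresponds to outputting the individual with the maximum fitness obtained during the evolution in step F).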
CUDA is a parallel programming model and software environment that is programmed in standard languages such as C. It encapsulates the hardware details of the GPU. The core of CUDA has three key abstractions: a hierarchy of thread groups, shared memory, and barrier synchronization.
These abstractions provide fine-grained data parallelism and thread parallelism, nested within coarse-grained data parallelism and task parallelism, so that a problem is decomposed into smaller pieces that can be solved cooperatively in parallel. Such a decomposition preserves the expressiveness of the language, allows threads to cooperate when solving each sub-problem, and supports transparent scalability. The technology can therefore use the many-core nature of the GPU to significantly accelerate parallelized applications.
However, there is at present no technique that obtains the frequent itemsets of stream data quickly and effectively and thereby improves the user's operating experience.
Summary of the invention
The object of the present invention is to provide a data stream parallel processing method based on the GPU-CUDA platform and a genetic algorithm.
The object of the invention is achieved in the following manner, comprising the following:
A genetic algorithm is used to dynamically mine the frequent itemsets of the newest data; the search starts from an initial population, in which each individual is a candidate frequent pattern;
In keeping with the characteristics of a data stream, a sliding window is used to mine the stream data, and, in keeping with the characteristics of frequent itemset mining, a nested subwindow model built on the sliding window is adopted;
Because a data stream is large in volume and must be processed in real time, GPU-CUDA parallel processing is used for frequent itemset mining;
The frequent itemsets of the nested subwindows in the sliding window are processed jointly to finally obtain the frequent itemsets of the data in the current sliding window;
The concrete steps are as follows:
One, use the parallelism of the genetic algorithm to search for the frequent itemsets of the newest data in the nested subwindow;
Step S110: set the sizes of the sliding window SW and the subwindow S_SW to $w_1$ and $w_2$ respectively. After the parameters are input, the window sizes are determined according to the properties of the data stream: the size of SW is decided by how many of the most recent transactions the frequent itemsets of interest should cover, and the size of the subwindow is determined by the data processing capacity and the number of old data items discarded each time, which also determines how often statistics are required;
Given a support threshold S, if an individual i has fitness $F_i$ and $F_i \ge S$, then transaction i is a frequent itemset pattern of the data set in the sliding window;
The maximum number of iterations T is determined from the number of attribute kinds in a transaction, the value range of each attribute, and the size of the generated initial population. The method adopts the subwindow model so that, after old data is eliminated, the data remaining in the sliding window SW is not reprocessed repeatedly;
Set the crossover probability P and the individual mutation probability Q. The data in the subwindow is divided into Z segments for parallel computation. This function uses GPU CUDA parallel technology, assigning the data in each subwindow to one thread for parallel processing;
Step S120: obtain the initial population. As the data flows, obtain the most recently arrived data in the subwindow and, at the same time, its frequent 1-itemsets. The frequent 1-itemsets are encoded as real-number strings, and the nonzero entries of the frequent 1-itemsets are randomly combined, each at its original position, to form further encodings; together these make up the initial population in the nested subwindow. The individuals in this population are the candidate frequent itemset patterns to be examined. The detailed procedure is as follows:
1) Count the occurrences of the attribute values V1, V2, and V3 for attributes A, B, and C, which form the first, second, and third columns respectively;
2) Keep the entries whose count is greater than or equal to the threshold N and assign them values according to their columns; assign 0 to the entries whose count is less than N and remove them;
3) Turn each nonzero value into a separate row, keeping its original position in the row and filling the remaining positions with 0;
4) Randomly combine the nonzero entries at their original positions to form encodings, which together make up the initial population;
This function uses the GPU CUDA programming model, with optimizations such as streams and shared memory, to process the computation for each attribute in parallel;
Step S130: computing the support of an individual is the process of matching the candidate frequent patterns in the initial population against the actual transactions. When the support of an individual is greater than S, the pattern of this individual is added to the frequent itemsets of the current subwindow. $F_i = W_i / W_Z$, where $F_i$ is the support of transaction i, $W_i$ is the number of transactions in the current subwindow that have the same attribute values, and $W_Z$ is the total number of transactions in the current subwindow;
The matching is carried out in parallel over Z segments. Although this increases memory overhead, it greatly reduces running time, which is of great significance for frequent itemset mining on data streams;
Step S140: selection. Individuals in the population are chosen by roulette-wheel selection according to their support;
Step S150: crossover. One crossover is performed with crossover probability P;
Step S160: mutation. Individuals undergo basic bit mutation with mutation probability Q;
Step S170: scan to determine the support of the individuals after mutation; new individuals that satisfy the condition are added to the frequent itemsets;
Step S180: judge the termination condition. If the number of iterations is less than T, go to step 3; after T iterations, stop iterating and obtain the frequent itemsets of the data in the current nested subwindow;
Two, obtain the frequent itemsets of the data in the current sliding window
Step S210: the frequent itemset patterns obtained this time, together with those obtained in the previous U times, where $U = w_1 / w_2 - 1$, form the initial population; one search is performed, and the individuals that finally satisfy the condition are the frequent itemsets of the data in the sliding window;
1. For i=1:U+1;
2. combine the frequent patterns obtained from each segment into a frequent pattern group;
3. End;
4. perform one parallel search for the frequent pattern group in the sliding window SW;
5. the patterns whose support is greater than S are finally determined to be frequent patterns;
This step uses the OpenMP shared-memory programming model for multi-threaded parallel processing;
Step S310: as the data stream flows, continue to process the newly received data and discard the earliest data; go to step S102 and continue the above operations until the data stream ends.
In the described method, the step of obtaining the frequent itemset patterns of the data stream using the parallel mode of the genetic algorithm comprises:
The data in the sliding window is divided into Z segments, and the data in each nested subwindow is assigned to one thread for parallel processing: obtaining the initial population; computing the support of each individual, that is, matching the candidate frequent patterns in the initial population against the actual transactions; selection; crossover; mutation; scanning to determine the support of the individuals after mutation; and judging the termination condition.
In the described method, the step of obtaining the frequent itemsets of the data in the current sliding window comprises:
The frequent itemset patterns obtained each time, together with those obtained in the previous U times ($U = w_1 / w_2 - 1$), form the initial population; one search is performed, and the individuals that finally satisfy the condition are the frequent itemsets of the data in the sliding window.
In the described method, the step of obtaining the frequent itemsets of the data in the current sliding window further comprises:
As the data stream flows, the newly received data continues to be processed and the earliest data is discarded.
The beneficial effects of the invention are as follows: the technical problem addressed by the invention is to provide a method that adapts to the flowing nature of stream data and uses the parallel form of the genetic algorithm, yielding a theoretical basis and a solution for parallel processing that obtain the frequent itemsets of stream data quickly and effectively.
Compared with the prior art, the technical solution of the invention processes the frequent itemsets of stream data using the strong floating-point capability of the GPU and CUDA acceleration for GPU programming, can be modeled using the parallel form of the genetic algorithm, and improves the user's operating experience.
Description of drawings
Fig. 1(a) and (b) are schematic diagrams of the update of the data set in the window;
Fig. 2 is a flowchart of the genetic algorithm;
Fig. 3 is a flowchart for obtaining the frequent itemsets in the current subwindow;
Fig. 4 shows the generation of the initial population;
Fig. 5 is a schematic diagram of computing the support of individuals;
Fig. 6 shows the formation of the frequent pattern group;
Fig. 7 shows the scan that obtains the final frequent itemsets of the current window.
Embodiment
The method of the invention is explained below with reference to the accompanying drawings.
Embodiments of the invention are described in detail below with reference to the drawings and examples, so that how the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented accordingly.
First, provided there is no conflict, the embodiments of the invention and the features in the embodiments may be combined with one another, all within the scope of protection of the invention. In addition, the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
A genetic algorithm is used to dynamically mine the frequent itemsets of the newest data; the search starts from an initial population, in which each individual is a candidate frequent pattern. The genetic algorithm is implemented mainly through crossover, mutation, and selection. After several generations of selection, the final frequent itemsets are obtained. Mutation produces new individuals by dynamically and randomly changing certain genes in an individual; it is an important source of global optima and helps increase the diversity of the population. In this algorithm, however, the nonzero genes needed to generate each frequent itemset are all already present, and the genes produced by crossover can essentially cover all frequent itemsets, so a low mutation rate is used.
The technical method adopted by this patent is divided into three parts:
(1) use the parallelism of the genetic algorithm to search for the frequent itemsets of the newest data in the nested subwindow;
(2) jointly process the frequent itemsets of the nested subwindows in the sliding window to finally obtain the frequent itemsets of the data in the current sliding window;
(3) as new data flows in, periodically delete expired stream data and repeat the two parts above.
The concrete implementation steps are as follows:
Use the parallelism of the genetic algorithm to search for the frequent itemsets of the newest data in the nested subwindow;
As shown in Fig. 3, this embodiment mainly comprises the following steps:
Step S110: set the sizes of the sliding window SW and the subwindow S_SW to $w_1$ and $w_2$ respectively. After the parameters are input, the window sizes are determined according to the properties of the data stream: the size of SW is decided by how many of the most recent transactions the frequent itemsets of interest should cover, and the size of the subwindow is determined by the data processing capacity and the number of old data items discarded each time, which also determines how often statistics are required.
Given a support threshold S, if an individual i has fitness $F_i$ and $F_i \ge S$, then transaction i is a frequent itemset pattern of the data set in the sliding window.
The maximum number of iterations T is determined from the number of attribute kinds in a transaction, the value range of each attribute, and the size of the generated initial population. The method adopts the subwindow model so that, after old data is eliminated, the data remaining in the sliding window SW is not reprocessed repeatedly.
Set the crossover probability P and the individual mutation probability Q. The data in the subwindow is divided into Z segments for parallel computation. This function uses GPU CUDA parallel technology, assigning the data in each subwindow to one thread for parallel processing.
Step S120: obtain the initial population. As the data flows, obtain the most recently arrived data in the subwindow and, at the same time, its frequent 1-itemsets. The frequent 1-itemsets are encoded as real-number strings, and the nonzero entries of the frequent 1-itemsets are randomly combined, each at its original position, to form further encodings; together these make up the initial population in the nested subwindow. The individuals in this population are the candidate frequent itemset patterns to be examined. The detailed procedure is as follows:
1. Count the occurrences of the attribute values V1, V2, and V3 for attributes A, B, and C, which form the first, second, and third columns respectively;
2. Keep the entries whose count is greater than or equal to the threshold N and assign them values according to their columns; assign 0 to the entries whose count is less than N and remove them (N is 3 in this example);
3. Turn each nonzero value into a separate row, keeping its original position in the row and filling the remaining positions with 0;
4. Randomly combine the nonzero entries at their original positions to form encodings, which together make up the initial population.
The process is shown in Fig. 4. This step uses the GPU CUDA programming model, with optimizations such as streams and shared memory, to process the computation for each attribute in parallel.
Step S130: computing the support of an individual is the process of matching the candidate frequent patterns in the initial population against the actual transactions. When the support of an individual is greater than S, the pattern of this individual is added to the frequent itemsets of the current subwindow. $F_i = W_i / W_Z$, where $F_i$ is the support of transaction i, $W_i$ is the number of transactions in the current subwindow that have the same attribute values, and $W_Z$ is the total number of transactions in the current subwindow.
The matching is carried out in parallel over Z segments. Although this increases memory overhead, it greatly reduces running time, which is of great significance for frequent itemset mining on data streams. The parallel matching process is shown in Fig. 5.
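The segmented matching of step S130 can be sketched in Python as below. A CPU thread pool stands in for the GPU threads, so the sketch shows only the structure (split into Z segments, count matches per segment, sum the partial counts), not CUDA performance. It assumes that a pattern is a tuple in which 0 means "any value"; the function names and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def matches(pattern, transaction):
    """A transaction matches when it agrees with every nonzero gene of the pattern."""
    return all(p == 0 or p == v for p, v in zip(pattern, transaction))


def count_matches(pattern, segment):
    return sum(1 for t in segment if matches(pattern, t))


def split(transactions, z):
    """Cut the subwindow into at most z contiguous segments."""
    size = -(-len(transactions) // z)  # ceiling division
    return [transactions[i:i + size] for i in range(0, len(transactions), size)]


def support_parallel(pattern, transactions, z):
    """F_i = W_i / W_Z, with W_i summed from per-segment partial counts."""
    segments = split(transactions, z)
    with ThreadPoolExecutor(max_workers=z) as pool:
        w_i = sum(pool.map(lambda seg: count_matches(pattern, seg), segments))
    return w_i / len(transactions)


# Hypothetical subwindow: six transactions over the attributes (A, B, C).
subwindow = [(1, 2, 3), (1, 2, 1), (1, 3, 3), (2, 2, 3), (1, 2, 3), (3, 1, 2)]
f_ab = support_parallel((1, 2, 0), subwindow, z=3)
```

In this sample the pattern (1, 2, 0) matches three of the six transactions, so its support is 0.5. Each segment keeps its own partial count, which reflects the extra memory the text mentions in exchange for shorter running time.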
Step S140: selection. Individuals in the population are chosen by roulette-wheel selection according to their support.
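Roulette-wheel selection can be sketched in Python as follows: each individual occupies a slice of the wheel proportional to its support, and a uniform random number picks a slice. The function name and the fallback to uniform choice when all supports are zero are assumptions made for this illustration.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_select(population, supports, k, rng):
    """Draw k individuals with probability proportional to their support."""
    cumulative = list(accumulate(supports))
    total = cumulative[-1]
    if total == 0:                      # degenerate wheel: fall back to uniform choice
        return [rng.choice(population) for _ in range(k)]
    chosen = []
    for _ in range(k):
        r = rng.random() * total        # spin the wheel
        idx = bisect_right(cumulative, r)
        chosen.append(population[min(idx, len(population) - 1)])
    return chosen


picks = roulette_select(["a", "b", "c"], [0.5, 0.0, 0.5], k=100,
                        rng=random.Random(0))
```

An individual with zero support occupies no part of the wheel and is therefore never selected.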
Step S150: crossover. One crossover is performed with crossover probability P.
Step S160: mutation. Individuals undergo basic bit mutation with mutation probability Q.
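Steps S150 and S160 can be sketched in Python as below. The text does not specify the operators in detail, so the sketch assumes single-point crossover and a per-gene mutation that replaces a gene with 0 or with one of the frequent values allowed at that position; the function names and the `allowed` table are hypothetical.

```python
import random


def crossover(a, b, p, rng):
    """With probability p, swap the tails of two individuals at a random cut point."""
    if len(a) < 2 or rng.random() >= p:
        return a, b
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(individual, q, allowed, rng):
    """With probability q per gene, replace it by 0 or by an allowed value for that position."""
    genes = list(individual)
    for pos in range(len(genes)):
        if rng.random() < q:
            genes[pos] = rng.choice([0] + allowed[pos])
    return tuple(genes)


rng = random.Random(0)
allowed = [[1], [2], [3]]               # frequent values per attribute position (assumed)
child1, child2 = crossover((1, 2, 0), (0, 0, 3), p=1.0, rng=rng)
mutant = mutate(child1, q=1.0, allowed=allowed, rng=rng)
```

Because genes only ever move within their own position, every offspring remains a valid position-coded pattern, which is consistent with the low mutation rate recommended earlier.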
Step S170: scan to determine the support of the individuals after mutation; new individuals that satisfy the condition are added to the frequent itemsets.
Step S180: judge the termination condition. If the number of iterations is less than T, go to step 3; after T iterations, stop iterating and obtain the frequent itemsets of the data in the current nested subwindow;
Obtain the frequent itemsets of the data in the current sliding window
Step S210: the frequent itemset patterns obtained this time, together with those obtained in the previous U times ($U = w_1 / w_2 - 1$), form the initial population; one search is performed, and the individuals that finally satisfy the condition are the frequent itemsets of the data in the sliding window. The process is shown in Fig. 6 and Fig. 7.
1. For i=1:U+1;
2. combine the frequent patterns obtained from each segment into a frequent pattern group;
3. End;
4. perform one parallel search for the frequent pattern group in the sliding window SW;
5. the patterns whose support is greater than S are finally determined to be frequent patterns;
This step uses the OpenMP shared-memory programming model for multi-threaded parallel processing.
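The pseudocode of step S210 can be sketched in Python as below. The sketch is sequential, so the OpenMP multi-threading mentioned above is not reproduced. It assumes the same pattern encoding as before (a tuple in which 0 means "any value"); the function names and the sample data, with $w_1 = 6$ and $w_2 = 3$ so that U = 1, are hypothetical.

```python
def matches(pattern, transaction):
    """A transaction matches when it agrees with every nonzero gene of the pattern."""
    return all(p == 0 or p == v for p, v in zip(pattern, transaction))


def support(pattern, transactions):
    return sum(1 for t in transactions if matches(pattern, t)) / len(transactions)


def window_frequent_patterns(subwindow_patterns, window_transactions, s):
    """Merge the U+1 subwindow results, then rescan the whole sliding window once."""
    candidates = set()
    for patterns in subwindow_patterns:          # For i = 1 : U+1
        candidates.update(patterns)              # build the frequent pattern group
    # One search over SW: keep the patterns whose support is greater than S.
    return sorted(p for p in candidates if support(p, window_transactions) > s)


# Hypothetical sliding window with w1 = 6 and w2 = 3, so U = 6 / 3 - 1 = 1.
window = [(1, 2, 3), (1, 2, 1), (1, 3, 3), (2, 2, 3), (1, 2, 3), (3, 1, 2)]
per_subwindow = [{(1, 0, 0), (1, 2, 0)}, {(0, 2, 3), (0, 0, 3)}]
result = window_frequent_patterns(per_subwindow, window, s=0.5)
```

In this sample the patterns (1, 2, 0) and (0, 2, 3) each have a support of exactly 0.5 over the whole window, so they are dropped by the strict comparison, while (1, 0, 0) and (0, 0, 3) are kept.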
Step S310: as the data stream flows, continue to process the newly received data and discard the earliest data; go to step S102 and continue the above operations until the data stream ends.
Those skilled in the art should understand that the modules or steps of the invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The invention is thus not limited to any specific combination of hardware and software.
Although the embodiments disclosed by the invention are as described above, the content described is only an embodiment adopted to facilitate understanding of the invention and is not intended to limit it. Any person skilled in the art to which the invention belongs may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention remains subject to the scope defined by the appended claims. Other features and advantages of the invention are set forth in the description, and in part become apparent from the description or are understood by practicing the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.
Apart from the technical features described in the specification, everything else is known technology to those skilled in the art.

Claims (4)

1. A data stream parallel processing method based on the GPU-CUDA platform and a genetic algorithm, characterized in that it comprises the following:
A genetic algorithm is used to dynamically mine the frequent itemsets of the newest data; the search starts from an initial population, in which each individual is a candidate frequent pattern;
In keeping with the characteristics of a data stream, a sliding window is used to mine the stream data, and, in keeping with the characteristics of frequent itemset mining, a nested subwindow model built on the sliding window is adopted;
Because a data stream is large in volume and must be processed in real time, GPU-CUDA parallel processing is used for frequent itemset mining;
The frequent itemsets of the nested subwindows in the sliding window are processed jointly to finally obtain the frequent itemsets of the data in the current sliding window;
The concrete steps are as follows:
One, use the parallelism of the genetic algorithm to search for the frequent itemsets of the newest data in the nested subwindow;
Step S110: set the sizes of the sliding window SW and the subwindow S_SW to $w_1$ and $w_2$ respectively. After the parameters are input, the window sizes are determined according to the properties of the data stream: the size of SW is decided by how many of the most recent transactions the frequent itemsets of interest should cover, and the size of the subwindow is determined by the data processing capacity and the number of old data items discarded each time, which also determines how often statistics are required;
Given a support threshold S, if an individual i has fitness $F_i$ and $F_i \ge S$, then transaction i is a frequent itemset pattern of the data set in the sliding window;
The maximum number of iterations T is determined from the number of attribute kinds in a transaction, the value range of each attribute, and the size of the generated initial population; the method adopts the subwindow model so that, after old data is eliminated, the data remaining in the sliding window SW is not reprocessed repeatedly;
Set the crossover probability P and the individual mutation probability Q; the data in the subwindow is divided into Z segments for parallel computation; this function uses GPU CUDA parallel technology, assigning the data in each subwindow to one thread for parallel processing;
Step S120: obtain the initial population. As the data flows, obtain the most recently arrived data in the subwindow and, at the same time, its frequent 1-itemsets; the frequent 1-itemsets are encoded as real-number strings, and the nonzero entries of the frequent 1-itemsets are randomly combined, each at its original position, to form further encodings; together these make up the initial population in the nested subwindow, whose individuals are the candidate frequent itemset patterns to be examined. The detailed procedure is as follows:
1) Count the occurrences of the attribute values V1, V2, and V3 for attributes A, B, and C, which form the first, second, and third columns respectively;
2) Keep the entries whose count is greater than or equal to the threshold N and assign them values according to their columns; assign 0 to the entries whose count is less than N and remove them;
3) Turn each nonzero value into a separate row, keeping its original position in the row and filling the remaining positions with 0;
4) Randomly combine the nonzero entries at their original positions to form encodings, which together make up the initial population;
This function uses the GPU CUDA programming model, with optimizations such as streams and shared memory, to process the computation for each attribute in parallel;
Step S130: compute the support of each individual, that is, match each candidate frequent pattern in the initial population against the actual transactions. When an individual's support is greater than S, add that individual's pattern to the frequent item set of the current subwindow. The support is F_i = W_i / W_Z, where F_i is the support of transaction i, W_i is the number of transactions in the current subwindow having the same attribute values, and W_Z is the total number of transactions in the current subwindow;
Matching is performed in parallel over the Z segments; although this increases memory overhead, it greatly reduces running time, which is significant for frequent item set mining on data streams;
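The support formula F_i = W_i / W_Z and the Z-segment matching can be illustrated with a short Python sketch. Here a thread pool stands in for the CUDA threads named in the claim, a pattern position equal to 0 is assumed to mean "attribute not constrained", and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def matches(pattern, transaction):
    """A pattern matches when every nonzero position equals the transaction."""
    return all(p == 0 or p == t for p, t in zip(pattern, transaction))

def count_in_segment(pattern, segment):
    """Number of matching transactions inside one segment."""
    return sum(1 for t in segment if matches(pattern, t))

def support(pattern, transactions, z):
    """F_i = W_i / W_Z, with W_i accumulated over at most z segments."""
    step = -(-len(transactions) // z)  # ceiling division; assumes non-empty data
    segments = [transactions[i:i + step]
                for i in range(0, len(transactions), step)]
    with ThreadPoolExecutor(max_workers=z) as pool:
        partial = pool.map(lambda seg: count_in_segment(pattern, seg), segments)
        w_i = sum(partial)             # W_i: matching transactions overall
    return w_i / len(transactions)     # W_Z: all transactions in the subwindow
```

Each segment is counted independently and only the partial counts are combined, which is why the segments can be matched in parallel at the cost of holding them separately in memory.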
Step S140: selection. Individuals in the population are chosen by roulette-wheel selection according to their support values;
Step S150: crossover. Single-point crossover is performed with crossover probability P;
Step S160: mutation. Individuals undergo basic bit mutation with mutation probability Q;
Step S170: scan to determine the support of the mutated individuals; newly produced individuals that satisfy the condition are added to the frequent item set;
Step S180: check the termination condition. If the number of iterations is less than T, go to step 3; after T iterations, stop iterating and obtain the frequent item set of the data in the current nested subwindow;
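The three genetic operators of steps S140 to S160 can be sketched as follows. This is a minimal sequential illustration, not the claimed implementation: the function names are hypothetical, and mutation is assumed to replace a position with another value allowed in that column (the per-column lists in `choices`), which is one possible reading of basic bit mutation on a real-coded string.

```python
import random

def roulette_select(population, supports, rng):
    """Pick one individual with probability proportional to its support."""
    total = sum(supports)
    if total == 0:
        return list(rng.choice(population))
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, s in zip(population, supports):
        running += s
        if pick < running:
            return list(individual)
    return list(population[-1])        # guard against rounding at the upper end

def single_point_crossover(a, b, p, rng):
    """With probability p, swap the tails of a and b after a random cut."""
    if len(a) < 2 or rng.random() >= p:
        return list(a), list(b)
    cut = rng.randrange(1, len(a))
    return list(a[:cut]) + list(b[cut:]), list(b[:cut]) + list(a[cut:])

def mutate(individual, q, choices, rng):
    """With probability q per position, redraw that position's value."""
    return [rng.choice(choices[i]) if rng.random() < q else gene
            for i, gene in enumerate(individual)]
```

One iteration would select two parents, cross them, mutate the children and rescan their support (step S170), and the loop would repeat until T iterations have run.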
Two, obtain the frequent item sets of the data in the current sliding window
Step S210: the frequent item set patterns obtained this time, together with those obtained in the previous U rounds, where U = w1/w2 - 1, form the initial population; a single search is performed, and the patterns that finally satisfy the condition are the frequent item sets of the data in the sliding window;
For i = 1 : U+1;
Combine the frequent patterns obtained from each segment into the frequent pattern group;
End;
Perform one parallel search of the frequent pattern group over the sliding window SW;
Patterns whose support is greater than S are finally determined to be frequent patterns;
This step uses the OpenMP shared-memory programming model for multi-threaded parallel processing;
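Step S210 can be illustrated by merging the pattern groups of the U+1 subwindows and rescanning the whole sliding window once. The sketch below is an assumption-laden Python stand-in: a thread pool replaces the OpenMP threads named in the claim, duplicate patterns are dropped during merging, a 0 in a pattern means "attribute not constrained", and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def merge_pattern_groups(groups):
    """Combine the per-subwindow pattern lists, keeping first occurrences."""
    seen, merged = set(), []
    for group in groups:
        for pattern in group:
            key = tuple(pattern)
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return merged

def window_support(pattern, window):
    """Fraction of window transactions matched by the pattern."""
    hits = sum(
        1 for t in window
        if all(p == 0 or p == v for p, v in zip(pattern, t))
    )
    return hits / len(window)          # assumes a non-empty window

def frequent_in_window(groups, window, s_threshold, workers=4):
    """One parallel pass over SW; keep patterns with support above S."""
    candidates = merge_pattern_groups(groups)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        supports = list(pool.map(lambda c: window_support(c, window), candidates))
    return [c for c, sup in zip(candidates, supports) if sup > s_threshold]
```

Because each candidate is scored independently against the same read-only window, the candidates can be distributed over threads without synchronization beyond collecting the results.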
Step S310: as the data stream flows, continue processing the newly received data and discard the oldest data; go to step S102 and repeat the above operations until the data stream ends.
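The window advance in step S310 can be sketched with a bounded queue of subwindows. The sketch assumes that w1 is the sliding-window length and w2 the nested-subwindow length, both counted in transactions (the claim text shown here does not define them), and that w1 is a multiple of w2 so that the window holds exactly U+1 = w1/w2 subwindows. The class name `NestedSlidingWindow` is hypothetical.

```python
from collections import deque

class NestedSlidingWindow:
    """Sliding window made of w1 // w2 nested subwindows (assumed layout)."""

    def __init__(self, w1, w2):
        if w2 <= 0 or w1 % w2 != 0:
            raise ValueError("w1 must be a positive multiple of w2")
        self.w2 = w2
        self.subwindows = deque(maxlen=w1 // w2)  # holds U + 1 subwindows
        self.pending = []                         # subwindow being filled

    def push(self, transaction):
        """Add one transaction; return the subwindow if it just filled up."""
        self.pending.append(transaction)
        if len(self.pending) < self.w2:
            return None
        full, self.pending = self.pending, []
        self.subwindows.append(full)  # the deque discards the oldest subwindow
        return full

    def contents(self):
        """All transactions currently inside the sliding window."""
        return [t for sub in self.subwindows for t in sub]
```

Each subwindow returned by `push` would be mined as in part one, and `contents()` gives the data over which the merged patterns are searched in part two.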
2. The method according to claim 1, characterized in that the step of obtaining the frequent item set patterns of the data stream using the parallel mode of the genetic algorithm comprises:
The data in the sliding window is divided into Z segments, and the data in each nested subwindow is assigned to one thread for parallel processing; the initial population is obtained; the support of each individual is computed, that is, the candidate frequent patterns in the initial population are matched against the actual transactions; selection, crossover and mutation are performed; a scan determines the support of the mutated individuals; and the termination condition is checked.
3. The method according to claim 1, characterized in that the step of obtaining the frequent item sets of the data in the current sliding window comprises:
The frequent item set patterns obtained each time, together with those obtained in the previous U rounds (U = w1/w2 - 1), form the initial population; a single search is performed, and the patterns that finally satisfy the condition are the frequent item sets of the data in the sliding window.
4. The method according to claim 1, characterized in that the step of obtaining the frequent item sets of the data in the current sliding window comprises:
As the data stream flows, continue processing the newly received data and discard the oldest data.
CN2013102290983A 2013-06-09 2013-06-09 Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm Pending CN103279332A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013102290983A CN103279332A (en) 2013-06-09 2013-06-09 Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013102290983A CN103279332A (en) 2013-06-09 2013-06-09 Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm

Publications (1)

Publication Number Publication Date
CN103279332A true CN103279332A (en) 2013-09-04

Family

ID=49061875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013102290983A Pending CN103279332A (en) 2013-06-09 2013-06-09 Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm

Country Status (1)

Country Link
CN (1) CN103279332A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7328270B1 (en) * 1999-02-25 2008-02-05 Advanced Micro Devices, Inc. Communication protocol processor having multiple microprocessor cores connected in series and dynamically reprogrammed during operation via instructions transmitted along the same data paths used to convey communication data
CN101789044A (en) * 2010-01-27 2010-07-28 武汉大学 Method of implementing cooperative work of software and hardware of genetic algorithm
CN102662642A (en) * 2012-04-20 2012-09-12 浪潮电子信息产业股份有限公司 Parallel processing method based on nested sliding window and genetic algorithm

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105706057B (en) * 2013-10-14 2019-06-18 微软技术许可有限责任公司 It is a kind of for concurrently controlling the equipment, method, system of multiple processing units
CN105706057A (en) * 2013-10-14 2016-06-22 微软技术许可有限责任公司 Parallel dynamic programming through rank convergence
CN103995690B (en) * 2014-04-25 2016-08-17 清华大学深圳研究生院 A kind of parallel time sequential mining method based on GPU
CN103995690A (en) * 2014-04-25 2014-08-20 清华大学深圳研究生院 Parallel time sequence mining method based on GPU
CN104102476A (en) * 2014-08-04 2014-10-15 浪潮(北京)电子信息产业有限公司 High-dimensional data stream canonical correlation parallel computation method and high-dimensional data stream canonical correlation parallel computation device in irregular stream
CN105740457A (en) * 2016-02-15 2016-07-06 浪潮电子信息产业股份有限公司 Recent data stream frequent item set mining method based on CPU+MIC (Central Processing Unit+ Many Integrated Core) cooperative computing
CN106371917A (en) * 2016-08-23 2017-02-01 清华大学 Real-time frequent item set mining-oriented acceleration system and method
CN106371917B (en) * 2016-08-23 2019-07-02 清华大学 Acceleration system and method towards real-time frequent item set mining
CN106919694A (en) * 2017-03-07 2017-07-04 郑州云海信息技术有限公司 A kind of Recent data stream item set mining method and apparatus based on KNL clusters
CN108334932A (en) * 2017-11-27 2018-07-27 中科观世(北京)科技有限公司 Frequency separation method based on echo signal feature
CN108334932B (en) * 2017-11-27 2022-03-29 中科观世(北京)科技有限公司 Frequency distinguishing method based on target signal characteristics
CN108520027A (en) * 2018-03-20 2018-09-11 大连理工大学 A kind of Frequent Itemsets Mining Algorithm that the GPU based on CUDA frames accelerates
CN108520027B (en) * 2018-03-20 2020-09-29 大连理工大学 GPU accelerated frequent item set mining method based on CUDA framework
CN108768940A (en) * 2018-04-19 2018-11-06 丙申南京网络技术有限公司 A kind of data digging system and method separating protection parallel based on computer network security
CN108958999A (en) * 2018-06-13 2018-12-07 郑州云海信息技术有限公司 A kind of method and system for testing GPU floating-point operation performance
CN109213793A (en) * 2018-08-07 2019-01-15 泾县麦蓝网络技术服务有限公司 A kind of stream data processing method and system
WO2023043366A3 (en) * 2021-09-16 2023-05-11 脸萌有限公司 Service running method and apparatus, and electronic device
CN113791908A (en) * 2021-09-16 2021-12-14 脸萌有限公司 Service operation method and device and electronic equipment
CN113791908B (en) * 2021-09-16 2024-03-29 脸萌有限公司 Service running method and device and electronic equipment
CN113918356B (en) * 2021-12-13 2022-02-18 广东睿江云计算股份有限公司 Method and device for quickly synchronizing data based on CUDA (compute unified device architecture), computer equipment and storage medium
CN113918356A (en) * 2021-12-13 2022-01-11 广东睿江云计算股份有限公司 Method and device for quickly synchronizing data based on CUDA (compute unified device architecture), computer equipment and storage medium
CN117010991A (en) * 2023-07-31 2023-11-07 江南大学 High-profit commodity combination mining method based on GPU (graphic processing Unit) parallel improved genetic algorithm
CN117010991B (en) * 2023-07-31 2024-05-03 江南大学 High-profit commodity combination mining method based on GPU (graphic processing Unit) parallel improved genetic algorithm
CN117036061A (en) * 2023-10-07 2023-11-10 国任财产保险股份有限公司 Intelligent solution providing method and system for intelligent agricultural insurance
CN117036061B (en) * 2023-10-07 2023-12-12 国任财产保险股份有限公司 Intelligent solution providing method and system for intelligent agricultural insurance

Similar Documents

Publication Publication Date Title
CN103279332A (en) Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm
CN102662642A (en) Parallel processing method based on nested sliding window and genetic algorithm
Frutos et al. A memetic algorithm based on a NSGAII scheme for the flexible job-shop scheduling problem
Truong et al. Chemical reaction optimization with greedy strategy for the 0–1 knapsack problem
CN103235974B (en) A kind of method improving massive spatial data treatment effeciency
CN104866904A (en) Parallelization method of BP neural network optimized by genetic algorithm based on spark
CN104636813A (en) Hybrid genetic simulated annealing algorithm for solving job shop scheduling problem
CN103530702A (en) Large-scale operation workshop scheduling method based on bottleneck equipment decomposition
Yadav et al. An overview of genetic algorithm and modeling
CN103116693B (en) Based on the Method for HW/SW partitioning of artificial bee colony
CN108460463A (en) High-end equipment flow line production dispatching method based on improved adaptive GA-IAGA
CN1450493A (en) Nerve network system for realizing genetic algorithm
CN101256648A (en) Genetic operation operator based on indent structure for producing quening system
CN105740457A (en) Recent data stream frequent item set mining method based on CPU+MIC (Central Processing Unit+ Many Integrated Core) cooperative computing
CN109582985A (en) A kind of NoC mapping method of improved genetic Annealing
Mendez et al. Proposal and comparative study of evolutionary algorithms for optimum design of a gear system
Hu An improved flower pollination algorithm for optimization of intelligent logistics distribution center
Sung et al. An adaptive evolutionary algorithm for traveling salesman problem with precedence constraints
Akkus et al. Automated land reallotment using genetic algorithm
Yang et al. Cultural-based genetic tabu algorithm for multiobjective job shop scheduling
Rangarajan et al. Multi‐objective optimization of root phenotypes for nutrient capture using evolutionary algorithms
Liu et al. NeuroCrossover: An intelligent genetic locus selection scheme for genetic algorithm using reinforcement learning
CN103279796A (en) Method for optimizing genetic algorithm evolution quality
Lu et al. Multi-center variable-scale search algorithm for combinatorial optimization problems with the multimodal property
CN109885401B (en) Structured grid load balancing method based on LPT local optimization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130904
