CN103106253A - Data balance method based on genetic algorithm in MapReduce calculation module - Google Patents
- Publication number: CN103106253A
- Authority: CN (China)
- Prior art keywords: gene, map, task, metadata, data
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Provided is a data balancing method based on a genetic algorithm in the MapReduce computation model. The method includes: obtaining global Map output information and using a genetic algorithm for combinatorial optimization; collecting and encoding the metadata; randomly partitioning the population multiple times, each partition forming a gene and the partitions together forming a genome; calculating the fitness function value over the subsets of each gene; applying a selection operator to the genome on the basis of the fitness evaluation of each gene, using a roulette wheel algorithm to randomly choose several high-quality genes from the genome; performing crossover on the chosen genes; performing mutation; after multiple rounds of evolution, choosing the genes to retain according to an elitism strategy; and decoding the genes to obtain an optimal combination of the metadata, ensuring that the data volume processed by each reducer is approximately equal. The method solves the problem of unbalanced input data in the reduce phase, saves computing resources and reduces computing cost.
Description
Technical field
The invention belongs to the technical field of the MapReduce computation model in computing, and specifically relates to a data balancing method based on a genetic algorithm in the MapReduce computation model.
Background art
Hadoop is an open-source storage and distributed parallel computing platform with high reliability and scalability, developed by the Apache organization. It was first developed as the base platform of the open-source search engine project Nutch, later became independent of the Nutch project, and is now one of the typical open-source cloud computing platforms. The Hadoop core implements the block-based Hadoop Distributed File System (HDFS) and the MapReduce computation model for distributed computing.
The MapReduce computation model is divided into two major processing stages, Map and Reduce. During MapReduce processing, the Map stage converts the input data into <Key, Value> pairs and provides them to the Reduce stage for further processing. Before Reduce accepts and processes the key-value data output by Map, the data must also pass through a Shuffle stage. The Shuffle stage mainly shuffles the output data of each Map task and collects, from the outputs of these Map tasks, the data that must be processed by the same reduce task. Because the collected data may be large, the Shuffle stage can merge the data into the local file system of the node where the reduce task runs, thereby reducing memory usage.
Each Map task divides its output data into as many partitions as there are reduce tasks. A single reduce task collects its corresponding partition data from all Map tasks, so all Map output key-value pairs with the same key are assigned to the same reduce task, which guarantees that the final result of each reduce task is computed over the global data.
The characteristics of the Shuffle stage mean that the data volumes received by the reduce tasks in the Reduce stage may be extremely uneven, which causes computation skew in the Reduce stage.
1) Reduce computation skew caused by user-defined partitioning strategies
When a MapReduce job is submitted, the Map stage divides its output into partitions according to the specified partitioning strategy, establishing the correspondence between Map output and reduce input. A user-defined partitioning strategy, following the needs of the application, places related data in the same partition so that it is processed by the same reduce task. This guarantees the correctness of the final result but may also leave the reduce tasks with unequal data volumes.
When a MapReduce job does not care about the specific partition of the data, hash partitioning is usually adopted to partition the Map output quickly: the hash value of the Key determines the partition number of the whole <Key, Value> pair, i.e. partitionNum = hashCode(Key) % REDUCER_NUM. Because of factors such as hash collisions and the limited number of reduce tasks, a large number of keys may converge on the same partition, leaving the data volumes of the reduce tasks uneven.
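The hash partitioning rule above can be sketched in a few lines. This is only an illustration, not code from the patent: `stable_hash` is a hypothetical stand-in modelled on a Java-style string hash, and the key names and `REDUCER_NUM = 4` are made-up values.

```python
from collections import Counter

REDUCER_NUM = 4  # hypothetical number of reduce tasks


def stable_hash(key: str) -> int:
    """Deterministic stand-in for hashCode(Key), modelled on a Java-style string hash."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h & 0x7FFFFFFF  # keep the value non-negative


def partition_num(key: str, reducer_num: int = REDUCER_NUM) -> int:
    """partitionNum = hashCode(Key) % REDUCER_NUM, as stated in the text."""
    return stable_hash(key) % reducer_num


# Count how many (made-up) keys land on each partition.
keys = [f"user{i}" for i in range(10)]
load = Counter(partition_num(k) for k in keys)
```

Inspecting `load` for a real key set shows how evenly, or unevenly, the keys spread over the partitions. Nothing in the rule itself bounds the imbalance.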
2) Reduce computation skew caused by characteristics of the input data
Partitioning is performed after each Map task outputs its <Key, Value> pairs, and the partition is usually determined from some property of the Key, without global statistics on the volume of Value data associated with each Key. Therefore, even if the partitioning strategy keeps the number of keys in each partition roughly balanced, the Value data volume of certain keys may be far larger than that of other keys because of the characteristics of the Map input data, so some reduce tasks have too much data to process. This phenomenon usually appears when the input data contains hot-spot data. In general, input data skew in the Reduce stage makes some reduce tasks run longer than others, extends the running time of the whole Reduce stage, and ultimately delays the completion of the whole MapReduce job.
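A small made-up example illustrates this second cause: both partitions below hold two keys, yet one hot key dominates the data volume. The key names, sizes and assignment are hypothetical.

```python
def partition_volumes(value_sizes, assignment, reducer_num):
    """Sum the Value data volume that each partition receives."""
    volumes = [0] * reducer_num
    for key, size in value_sizes.items():
        volumes[assignment[key]] += size
    return volumes


# Hypothetical Value volumes per key; "c" plays the role of a hot-spot key.
value_sizes = {"a": 10, "b": 12, "c": 900, "d": 11}
# Two keys per partition, so the key counts are perfectly balanced.
assignment = {"a": 0, "b": 1, "c": 0, "d": 1}

volumes = partition_volumes(value_sizes, assignment, reducer_num=2)
```

Here `volumes` is `[910, 23]`: balancing the number of keys does not balance the amount of data.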
Summary of the invention
To overcome the shortcomings of the prior art described above, the object of the present invention is to provide a data balancing method based on a genetic algorithm in the MapReduce computation model, which reduces the processing time of the reducers and therefore of the whole MapReduce job, saving computing resources and reducing computing cost.
To achieve this object, the technical solution adopted by the present invention is:
A data balancing method based on a genetic algorithm in the MapReduce computation model, comprising the following steps:
1) Obtain the global Map output information, i.e. the metadata of the partitions that the reduce tasks process. The Reduce metadata is acquired as follows:
1.1) After each Map task finishes processing and writes its output to local disk, its TaskTracker sends a task-completion message to the JobTracker via a heartbeat message;
1.2) The JobTracker maintains a queue of Map task completion messages for each MapReduce job. When a TaskTracker running a reduce task requests Map task completion messages, the JobTracker takes the messages from the queue of the job that the reduce task belongs to and passes them to the TaskTracker;
1.3) The reduce tasks of the same job obtain the Map task completion messages from their own TaskTracker and extract the runtime information of the Map tasks, including the Map task number and the execution node. Using this information, a reduce task establishes an HTTP connection to the execution node and requests the metadata of the Map task output;
1.4) According to the requested Map task number, the TaskTracker reads the index file of the corresponding Map task output from the local file system and sends it to the requesting reduce task;
1.5) The reduce task merges the virtual partitions with the same number across the different index files and sums the data volume of all <Key, Value> pairs of the same kind in each virtual partition. In this way each reduce task obtains the metadata of all Map task outputs;
2) Process the Map output data: the reduce task obtains the raw partition data output by each Map task. The aggregated metadata is submitted to the repartitioner, which uses a genetic algorithm to balance the metadata. The genetic algorithm operates on bit strings, and its specific steps are as follows:
2.1) Collect the metadata of the Map output data into a set, which serves as the population, and encode each element of the population. Encoding means representing each element with a code made of 0s and 1s; the encoding adopted uses the 1 bits to indicate the index at which an element is located in the set. The population is randomly divided into N subsets, where N corresponds to the number of reduce tasks. Each division forms one gene, and repeated divisions form a genome;
2.2) The fitness function measures how well an individual is adapted to its environment in a genetic algorithm: individuals with higher fitness obtain more opportunities to reproduce, and vice versa. A fitness function is therefore defined by formula (1) (the formula image is missing from this text), in which the mean term is the average, over all subsets, of the sum of each subset's elements. The objective function in formula (1) describes the mean distance from each subset's sum to that average. Using formula (1), the fitness value of each gene is calculated, forming a new set; the fitness probability of each gene is then obtained as the fitness value of the gene divided by the sum of the fitness values of the whole genome;
2.3) Apply the selection operator to the genome. The selection operator adopted is roulette wheel selection: a random function generates a random number in [0, 1], and its position in the genome's fitness probability sequence is determined; if it is greater than at most m values in the sequence, the m-th gene is selected. The number of genes to select can be specified freely;
2.4) Perform crossover on the selected genes, i.e. replace and recombine parts of the structure of the selected high-quality genes to form new genes. A single-point crossover operator is adopted: a crossover point is set at random and, for the genes chosen by roulette wheel selection, the partial structures of the two genes before and after the crossover point are exchanged, generating two new individuals. To guarantee that the genome after the exchange contains no empty set, a nullGen flag is set and the genome is traversed after crossover; if an empty set is found, the nullGen flag is set to false, marking that gene for deletion;
2.5) Perform mutation on the genes after crossover. Mutation replaces certain gene values in the genome with other values according to the mutation probability, forming a new individual. A fixed-position mutation operator is adopted, with the mutation probability set to 0.1, in order to obtain the optimal solution. The fixed-position mutation operator mutates one or several fixed, specified positions of an individual's gene: a 0 becomes 1 and a 1 becomes 0. After mutation, a non-empty check is performed on the mutated gene to ensure that it still has N subsets;
2.6) The above describes one round of evolution. After multiple rounds of evolution, the genes to retain are chosen according to an elitism strategy. The retention strategy adopted is: after the above steps, calculate the objective function value of each gene and compare it with the objective function values of all genes in the genome; the genes with the smaller values are retained;
2.7) Decode the retained gene to obtain an optimized combination of the metadata, i.e. the metadata is divided into N subsets of roughly equal size. The data of each subset is then assigned to a corresponding reducer, which ensures that the data volume processed by each reducer is approximately equal.
The beneficial effects of the invention are:
A solution is proposed for the Reduce-stage computation skew problem that exists in the MapReduce platform. The method uses a genetic algorithm to repartition the Map-stage output data so that the data volume of each partition is roughly the same. The reduce tasks therefore use system resources more efficiently, and the differences in processing time caused by unbalanced reducer input are avoided, which reduces the processing time of the reducers and, in turn, of the whole MapReduce job. From a business perspective, the new method saves computing resources and reduces computing cost.
Description of drawings
Fig. 1 is a flowchart of Reduce metadata acquisition.
Fig. 2 is a class diagram of the Map output metadata acquisition module.
Fig. 3 is a flowchart of the data balancing method based on a genetic algorithm.
Embodiment
The present invention is described in detail below with reference to the accompanying drawings.
A data balancing method based on a genetic algorithm in the MapReduce computation model, comprising the following steps:
1) Obtain the global Map output information, i.e. the metadata of the partitions that the reduce tasks process. The acquisition of the Reduce metadata is shown in Fig. 1:
1.1) After each Map task finishes processing and writes its output to local disk, its TaskTracker sends a task-completion message to the JobTracker via a heartbeat message;
1.2) The JobTracker maintains a queue of Map task completion messages for each MapReduce job. When a TaskTracker running a reduce task requests Map task completion messages, the JobTracker takes the messages from the queue of the job that the reduce task belongs to and passes them to the TaskTracker;
1.3) The reduce tasks of the same job obtain the Map task completion messages from their own TaskTracker and extract the runtime information of the Map tasks, including the Map task number and the execution node. Using this information, a reduce task establishes an HTTP connection to the execution node and requests the metadata of the Map task output;
1.4) According to the requested Map task number, the TaskTracker reads the index file of the corresponding Map task output from the local file system and sends it to the requesting reduce task;
1.5) The reduce task merges the virtual partitions with the same number across the different index files and sums the data volume of all <Key, Value> pairs of the same kind in each virtual partition. In this way each reduce task obtains the metadata of all Map task outputs. In practice the number of Map tasks is usually large and the tasks are distributed over multiple compute nodes, so an implementation can use multiple threads to speed up metadata acquisition. The main class structure of the Map output metadata acquisition module is shown in Fig. 2;
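The merge in step 1.5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the module of Fig. 2: `INDEX_FILES` is a hypothetical in-memory stand-in for the index files, and `fetch_index` stands in for the HTTP request to the execution node. A thread pool plays the role of the multithreading mentioned in the text.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index files: Map task number -> {virtual partition number: data volume}.
INDEX_FILES = {
    "map_0": {0: 5, 1: 7},
    "map_1": {0: 3, 2: 4},
}


def fetch_index(map_task_id):
    """Stand-in for the HTTP request a reduce task sends to the execution node."""
    return INDEX_FILES[map_task_id]


def collect_metadata(map_task_ids, max_workers=4):
    """Fetch index files concurrently and sum volumes of same-numbered virtual partitions."""
    totals = defaultdict(int)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index in pool.map(fetch_index, map_task_ids):
            for partition, volume in index.items():
                totals[partition] += volume
    return dict(totals)


metadata = collect_metadata(["map_0", "map_1"])
```

The fetches run concurrently while the summation stays in the calling thread, so the totals need no locking.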
2) Process the Map output data: the reduce task obtains the raw partition data output by each Map task. The aggregated metadata is submitted to the repartitioner. So that the input data volume received by each reducer is essentially the same, the present invention uses a genetic algorithm to balance the metadata. The genetic algorithm operates on bit strings rather than on the data itself, and its specific steps are as follows:
2.1) Collect the metadata of the Map output data into a set, which serves as the population, and encode each element of the population. Encoding means representing each element with a code made of 0s and 1s; the encoding adopted by the present invention uses the 1 bits to indicate the index at which an element is located in the set. The population is randomly divided into N subsets, where N corresponds to the number of reduce tasks. Each division forms one gene, and repeated divisions form a genome;
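One possible reading of the encoding in step 2.1 is sketched below. It assumes, and the text does not confirm, that a gene is a list of N bit strings, one per subset, where a 1 at position j means that metadata element j belongs to that subset. The function names and the rule that every subset is seeded with one element are illustrative choices.

```python
import random


def random_gene(num_elements, n_subsets, rng):
    """Randomly split element indices into n non-empty subsets, encoded as bit strings."""
    if num_elements < n_subsets:
        raise ValueError("need at least one element per subset")
    order = list(range(num_elements))
    rng.shuffle(order)
    owner = [0] * num_elements
    # Seed every subset with one element so that no subset starts empty (assumption).
    for subset, element in enumerate(order[:n_subsets]):
        owner[element] = subset
    # Place the remaining elements in uniformly random subsets.
    for element in order[n_subsets:]:
        owner[element] = rng.randrange(n_subsets)
    return [
        "".join("1" if owner[j] == s else "0" for j in range(num_elements))
        for s in range(n_subsets)
    ]


def random_genome(num_elements, n_subsets, genome_size, seed=0):
    """Repeat the random division to build a genome (a list of genes)."""
    rng = random.Random(seed)
    return [random_gene(num_elements, n_subsets, rng) for _ in range(genome_size)]


genome = random_genome(num_elements=6, n_subsets=3, genome_size=4, seed=1)
```

Under this encoding every column of a gene contains exactly one 1, so each metadata element belongs to exactly one reducer's subset.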
2.2) The fitness function measures how well an individual is adapted to its environment in a genetic algorithm: individuals with higher fitness obtain more opportunities to reproduce, and vice versa. The present invention therefore defines a fitness function by formula (1) (the formula image is missing from this text), in which the mean term is the average, over all subsets, of the sum of each subset's elements. The objective function in formula (1) describes the mean distance from each subset's sum to that average. Using formula (1), the fitness value of each gene is calculated, forming a new set; the fitness probability of each gene is then obtained as the fitness value of the gene divided by the sum of the fitness values of the whole genome;
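Because formula (1) is not reproduced in the text, the sketch below assumes one plausible form for it: the mean absolute deviation of the subset sums from their average. The bit-string gene layout is also an assumption (one bit string per subset, a 1 marking membership). The probability is computed exactly as the text states, value divided by the genome total; the text does not say whether the value is inverted so that better-balanced genes receive larger probabilities.

```python
def subset_sums(gene, sizes):
    """Total data volume of each subset; gene is a list of bit strings (assumed encoding)."""
    return [sum(sz for bit, sz in zip(row, sizes) if bit == "1") for row in gene]


def objective(gene, sizes):
    """Assumed form of formula (1): mean absolute distance of subset sums to their mean."""
    sums = subset_sums(gene, sizes)
    mean = sum(sums) / len(sums)
    return sum(abs(s - mean) for s in sums) / len(sums)


def fitness_probabilities(genome, sizes):
    """Value of each gene divided by the sum over the whole genome, as the text states."""
    values = [objective(g, sizes) for g in genome]
    total = sum(values)
    if total == 0:  # every gene is perfectly balanced; fall back to uniform (assumption)
        return [1.0 / len(genome)] * len(genome)
    return [v / total for v in values]


sizes = [4, 4, 2, 2]  # hypothetical data volumes of four metadata elements
genome = [["1100", "0011"], ["1110", "0001"]]
probs = fitness_probabilities(genome, sizes)
```

For these made-up sizes the first gene has subset sums 8 and 4 (objective 2.0) and the second has 10 and 2 (objective 4.0).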
2.3) Apply the selection operator to the genome. The selection operator adopted by the present invention is roulette wheel selection, a commonly used random selection method that resembles the roulette wheel in gambling. Its main idea is to convert individual fitness proportionally into a selection probability: the wheel is divided into sectors in proportion to each individual's share, the wheel is spun, and the individual whose sector the pointer rests on when the wheel stops is the one selected. The benefit of this selection algorithm is that the larger an individual's probability, the larger its share of the wheel and the larger its chance of being selected. Following this idea, the present invention is implemented as follows: a random function generates a random number in [0, 1], and its position in the genome's fitness probability sequence is determined; if it is greater than at most m values in the sequence, the m-th gene is selected. In general, the number of genes to select can be specified freely;
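Roulette wheel selection as described in step 2.3 can be sketched as below. The sketch assumes that the "fitness probability sequence" is the cumulative sum of the per-gene probabilities, which is the usual way to locate a random number on the wheel. The function name and example probabilities are illustrative.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_select(probabilities, k, rng):
    """Return the indices of k genes drawn with the given selection probabilities."""
    cumulative = list(accumulate(probabilities))
    chosen = []
    for _ in range(k):
        r = rng.random()  # random number in [0, 1)
        # m = number of cumulative values that r is greater than or equal to.
        m = bisect_right(cumulative, r)
        # Guard against floating-point round-off in the last cumulative value.
        chosen.append(min(m, len(cumulative) - 1))
    return chosen


rng = random.Random(42)
picked = roulette_select([0.2, 0.5, 0.3], k=5, rng=rng)
```

A gene with probability 0 occupies a zero-width sector and is never selected, while a gene with probability 1 is always selected.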
2.4) Perform crossover on the selected genes, i.e. replace and recombine parts of the structure of the selected high-quality genes to form new genes. Crossover is an important feature that distinguishes genetic algorithms from other evolutionary algorithms. The present invention adopts a single-point crossover operator: a crossover point is set at random and, for the genes chosen by roulette wheel selection, the partial structures of the two genes before and after the crossover point are exchanged, generating two new individuals. To guarantee that the genome after the exchange contains no empty set, a nullGen flag is set and the genome is traversed after crossover; if an empty set is found, the nullGen flag is set to false, marking that gene for deletion;
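A minimal sketch of the single-point crossover and the empty-set check follows. It assumes the bit-string encoding in which each gene is a list of N strings, one per subset, and it applies the same crossover point to every subset's string so that each element still belongs to exactly one subset. The `null_gen` variable mirrors the nullGen flag of the text; the function names are illustrative.

```python
import random


def single_point_crossover(gene_a, gene_b, point):
    """Exchange the parts of two genes that lie after the crossover point."""
    child_a = [ra[:point] + rb[point:] for ra, rb in zip(gene_a, gene_b)]
    child_b = [rb[:point] + ra[point:] for ra, rb in zip(gene_a, gene_b)]
    return child_a, child_b


def crossover_and_filter(gene_a, gene_b, rng):
    """Cross two genes at a random point and drop children that contain an empty subset."""
    point = rng.randrange(1, len(gene_a[0]))
    survivors = []
    for child in single_point_crossover(gene_a, gene_b, point):
        null_gen = all("1" in row for row in child)  # False marks the gene for deletion
        if null_gen:
            survivors.append(child)
    return survivors


child_a, child_b = single_point_crossover(["1010", "0101"], ["1100", "0011"], point=2)
```

With these inputs the children are `["1000", "0111"]` and `["1110", "0001"]`. Crossing `["1100", "0011"]` with `["0011", "1100"]` at the same point would empty one subset in each child, so both would be discarded.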
2.5) Perform mutation on the genes after crossover. Mutation replaces certain gene values in the genome with other values according to the mutation probability, forming a new individual. A genetic algorithm introduces mutation for two purposes. The first is to give the algorithm a local random search ability: when crossover has brought the algorithm close to the neighborhood of the optimal solution, the local random search of the mutation operator can accelerate convergence to the optimal solution. In this case the mutation probability should obviously take a small value, otherwise the building blocks close to the optimal solution would be destroyed by mutation. The second is to let the algorithm maintain population diversity and prevent premature convergence; in this case the convergence probability should take a larger value. Based on these considerations, the present invention adopts a fixed-position mutation operator, with the mutation probability set to 0.1, in order to obtain the optimal solution. The fixed-position mutation operator mutates one or several fixed, specified positions of an individual's gene: a 0 becomes 1 and a 1 becomes 0. After mutation, a non-empty check is performed on the mutated gene to ensure that it still has N subsets;
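The fixed-position mutation can be sketched as follows, with several assumptions that go beyond the text. The gene is assumed to be a list of bit strings, one per subset. Flipping a bit is assumed to move the element between subsets so that it still belongs to exactly one. A gene that fails the non-empty check is assumed to be left unmutated. Only the probability 0.1 comes from the text.

```python
import random

MUTATION_PROBABILITY = 0.1  # value stated in the text


def mutate(gene, position, rng, p=MUTATION_PROBABILITY):
    """With probability p, flip the bit at position = (subset index, element index)."""
    if rng.random() >= p:
        return gene
    subset, element = position
    rows = [list(row) for row in gene]
    if rows[subset][element] == "0":
        # 0 -> 1: the element joins this subset; clear its old owner (assumption).
        for row in rows:
            row[element] = "0"
        rows[subset][element] = "1"
    else:
        # 1 -> 0: the element leaves this subset and moves to the next one (assumption).
        rows[subset][element] = "0"
        rows[(subset + 1) % len(rows)][element] = "1"
    mutated = ["".join(row) for row in rows]
    # Non-empty check: keep the original gene if any subset became empty (assumption).
    return mutated if all("1" in row for row in mutated) else gene


rng = random.Random(0)
moved = mutate(["1100", "0011"], position=(1, 0), rng=rng, p=1.0)
```

With `p=1.0` the mutation always fires, and `moved` is `["0100", "1011"]`: element 0 has moved from the first subset to the second.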
2.6) The above describes one round of evolution. After multiple rounds of evolution, the genes to retain are chosen according to an elitism strategy. The retention strategy adopted by the present invention is: after the above steps, calculate the objective function value of each gene and compare it with the objective function values of all genes in the genome; the genes with the smaller values are retained;
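The retention rule of step 2.6 can be read as keeping the genes with the smallest objective values. The sketch below assumes that reading. The parameter `keep`, the number of genes retained, is a hypothetical knob that the text does not specify, and the gene labels and objective values are made up.

```python
def retain_elite(genome, objective_values, keep):
    """Keep the `keep` genes whose objective values are smallest (best balanced)."""
    ranked = sorted(range(len(genome)), key=lambda i: objective_values[i])
    return [genome[i] for i in ranked[:keep]]


# Hypothetical genome of three genes with made-up objective values.
genome = ["gene_a", "gene_b", "gene_c"]
objective_values = [4.0, 0.5, 2.0]
elite = retain_elite(genome, objective_values, keep=2)
```

Here `elite` is `["gene_b", "gene_c"]`, the two genes whose subset sums deviate least from the mean.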
2.7) Decode the retained gene to obtain an optimized combination of the metadata, i.e. the metadata is divided into N subsets of roughly equal size. The data of each subset is then assigned to a corresponding reducer, which ensures that each reducer processes a comparable data volume and resolves the problem of input data skew in the reduce stage. The flowchart of the data balancing method based on a genetic algorithm in the MapReduce computation model is shown in Fig. 3.
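Decoding a retained gene back into a reducer assignment can be sketched as below. This again assumes the bit-string encoding with one string per subset, and a metadata list of `(key, data volume)` pairs. Both the layout and the sample values are illustrative.

```python
def decode(gene, metadata):
    """Map each bit string of the gene to the keys of the metadata elements it selects."""
    return [
        [metadata[j][0] for j, bit in enumerate(row) if bit == "1"]
        for row in gene
    ]


def reducer_loads(gene, metadata):
    """Total data volume assigned to each reducer under the decoded gene."""
    return [
        sum(metadata[j][1] for j, bit in enumerate(row) if bit == "1")
        for row in gene
    ]


# Hypothetical metadata: (key, data volume) for four virtual partitions.
metadata = [("a", 4), ("b", 4), ("c", 2), ("d", 2)]
best_gene = ["1010", "0101"]

assignment = decode(best_gene, metadata)
loads = reducer_loads(best_gene, metadata)
```

For this made-up input the first reducer receives keys `a` and `c`, the second receives `b` and `d`, and both process a volume of 6.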
Claims (1)
- A data balancing method based on a genetic algorithm in the MapReduce computation model, characterized in that it comprises the following steps:
1) Obtain the global Map output information, i.e. the metadata of the partitions that the reduce tasks process. The Reduce metadata is acquired as follows:
1.1) After each Map task finishes processing and writes its output to local disk, its TaskTracker sends a task-completion message to the JobTracker via a heartbeat message;
1.2) The JobTracker maintains a queue of Map task completion messages for each MapReduce job. When a TaskTracker running a reduce task requests Map task completion messages, the JobTracker takes the messages from the queue of the job that the reduce task belongs to and passes them to the TaskTracker;
1.3) The reduce tasks of the same job obtain the Map task completion messages from their own TaskTracker and extract the runtime information of the Map tasks, including the Map task number and the execution node. Using this information, a reduce task establishes an HTTP connection to the execution node and requests the metadata of the Map task output;
1.4) According to the requested Map task number, the TaskTracker reads the index file of the corresponding Map task output from the local file system and sends it to the requesting reduce task;
1.5) The reduce task merges the virtual partitions with the same number across the different index files and sums the data volume of all <Key, Value> pairs of the same kind in each virtual partition, so that each reduce task obtains the metadata of all Map task outputs;
2) Process the Map output data: the reduce task obtains the raw partition data output by each Map task. The aggregated metadata is submitted to the repartitioner, which uses a genetic algorithm to balance the metadata. The genetic algorithm operates on bit strings, and its specific steps are as follows:
2.1) Collect the metadata of the Map output data into a set, which serves as the population, and encode each element of the population. Encoding means representing each element with a code made of 0s and 1s; the encoding adopted by the present invention uses the 1 bits to indicate the index at which an element is located in the set. The population is randomly divided into N subsets, where N corresponds to the number of reduce tasks. Each division forms one gene, and repeated divisions form a genome;
2.2) The fitness function measures how well an individual is adapted to its environment in a genetic algorithm: individuals with higher fitness obtain more opportunities to reproduce, and vice versa. A fitness function is therefore defined by formula (1) (the formula image is missing from this text), in which the mean term is the average, over all subsets, of the sum of each subset's elements. The objective function in formula (1) describes the mean distance from each subset's sum to that average. Using formula (1), the fitness value of each gene is calculated, forming a new set; the fitness probability of each gene is then obtained as the fitness value of the gene divided by the sum of the fitness values of the whole genome;
2.3) Apply the selection operator to the genome. The selection operator adopted is roulette wheel selection: a random function generates a random number in [0, 1], and its position in the genome's fitness probability sequence is determined; if it is greater than at most m values in the sequence, the m-th gene is selected. The number of genes to select can be specified freely;
2.4) Perform crossover on the selected genes, i.e. replace and recombine parts of the structure of the selected high-quality genes to form new genes. A single-point crossover operator is adopted: a crossover point is set at random and, for the genes chosen by roulette wheel selection, the partial structures of the two genes before and after the crossover point are exchanged, generating two new individuals. To guarantee that the genome after the exchange contains no empty set, a nullGen flag is set and the genome is traversed after crossover; if an empty set is found, the nullGen flag is set to false, marking that gene for deletion;
2.5) Perform mutation on the genes after crossover. Mutation replaces certain gene values in the genome with other values according to the mutation probability, forming a new individual. A fixed-position mutation operator is adopted, with the mutation probability set to 0.1, in order to obtain the optimal solution. The fixed-position mutation operator mutates one or several fixed, specified positions of an individual's gene: a 0 becomes 1 and a 1 becomes 0. After mutation, a non-empty check is performed on the mutated gene to ensure that it still has N subsets;
2.6) The above describes one round of evolution. After multiple rounds of evolution, the genes to retain are chosen according to an elitism strategy. The retention strategy adopted is: after the above steps, calculate the objective function value of each gene and compare it with the objective function values of all genes in the genome; the genes with the smaller values are retained;
2.7) Decode the retained gene to obtain an optimized combination of the metadata, i.e. the metadata is divided into N subsets of roughly equal size. The data of each subset is then assigned to a corresponding reducer, which ensures that each reducer processes a comparable data volume.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310015988.4A CN103106253B (en) | 2013-01-16 | 2013-01-16 | A kind of data balancing method based on genetic algorithm in MapReduce computation model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310015988.4A CN103106253B (en) | 2013-01-16 | 2013-01-16 | A kind of data balancing method based on genetic algorithm in MapReduce computation model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103106253A (en) | 2013-05-15 |
CN103106253B (en) | 2016-05-04 |
Family
ID=48314108
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310015988.4A Expired - Fee Related CN103106253B (en) | 2013-01-16 | 2013-01-16 | A kind of data balancing method based on genetic algorithm in MapReduce computation model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103106253B (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103401626A (en) * | 2013-08-23 | 2013-11-20 | 西安电子科技大学 | Genetic algorithm based cooperative spectrum sensing optimization method |
CN104102707A (en) * | 2014-07-10 | 2014-10-15 | 西安交通大学 | Geographical attribution information inquiry method oriented to MapReduce frame |
CN104239529A (en) * | 2014-09-19 | 2014-12-24 | 浪潮(北京)电子信息产业有限公司 | Method and device for preventing Hive data from being inclined |
CN105260324A (en) * | 2015-10-14 | 2016-01-20 | 北京百度网讯科技有限公司 | Key-value pair data operation method and apparatus for distributed cache system |
CN110032559A (en) * | 2019-04-19 | 2019-07-19 | 成都四方伟业软件股份有限公司 | A kind of data pick-up method and device |
CN110109753A (en) * | 2019-04-25 | 2019-08-09 | 成都信息工程大学 | Resource regulating method and system based on various dimensions constraint genetic algorithm |
CN111104225A (en) * | 2019-12-23 | 2020-05-05 | 杭州安恒信息技术股份有限公司 | Data processing method, device, equipment and medium based on MapReduce |
CN112307008A (en) * | 2020-12-14 | 2021-02-02 | 湖南蚁坊软件股份有限公司 | Druid compaction method |
CN112769522A (en) * | 2021-01-20 | 2021-05-07 | 广西师范大学 | Partition structure-based encoding distributed computing method |
CN113098773A (en) * | 2018-03-05 | 2021-07-09 | 华为技术有限公司 | Data processing method, device and system |
CN113434299A (en) * | 2021-07-05 | 2021-09-24 | 广西师范大学 | Encoding distributed computing method based on MapReduce framework |
CN114461381A (en) * | 2021-12-27 | 2022-05-10 | 天翼云科技有限公司 | Log data processing method and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101183368A (en) * | 2007-12-06 | 2008-05-21 | 华南理工大学 | Method and system for distributed calculating and enquiring magnanimity data in on-line analysis processing |
CN101764835A (en) * | 2008-12-25 | 2010-06-30 | 华为技术有限公司 | Task allocation method and device based on MapReduce programming framework |
US20110208947A1 (en) * | 2010-01-29 | 2011-08-25 | International Business Machines Corporation | System and Method for Simplifying Transmission in Parallel Computing System |
- 2013-01-16: CN201310015988.4A, patent CN103106253B (en), not active, Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
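Li Dong et al.: "Research on a parallel genetic algorithm suitable for large-scale variables" (《一种适用于大规模变量的并行遗传算法研究》), Computer Science (《计算机科学》) * |
Li Zhen et al.: "An improved Map-Reduce model in a cloud computing environment" (《云计算环境下的改进型Map-Reduce模型》), Computer Engineering (《计算机工程》) * |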
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103401626B (en) * | 2013-08-23 | 2016-03-16 | 西安电子科技大学 | Cooperative spectrum sensing optimization method based on genetic algorithm |
CN103401626A (en) * | 2013-08-23 | 2013-11-20 | 西安电子科技大学 | Genetic algorithm based cooperative spectrum sensing optimization method |
CN104102707A (en) * | 2014-07-10 | 2014-10-15 | 西安交通大学 | Geographical attribution information query method oriented to the MapReduce framework |
CN104102707B (en) * | 2014-07-10 | 2016-03-30 | 西安交通大学 | Geographical attribution information query method oriented to the MapReduce framework |
CN104239529A (en) * | 2014-09-19 | 2014-12-24 | 浪潮(北京)电子信息产业有限公司 | Method and device for preventing Hive data skew |
CN105260324A (en) * | 2015-10-14 | 2016-01-20 | 北京百度网讯科技有限公司 | Key-value pair data operation method and apparatus for distributed cache system |
US11522789B2 (en) | 2018-03-05 | 2022-12-06 | Huawei Technologies Co., Ltd. | Data processing method, apparatus, and system for combining data for a distributed calculation task in a data center network |
US11855880B2 (en) | 2018-03-05 | 2023-12-26 | Huawei Technologies Co., Ltd. | Data processing method, apparatus, and system for combining data for a distributed calculation task in a data center network |
CN113098773B (en) * | 2018-03-05 | 2022-12-30 | 华为技术有限公司 | Data processing method, device and system |
CN113098773A (en) * | 2018-03-05 | 2021-07-09 | 华为技术有限公司 | Data processing method, device and system |
CN110032559A (en) * | 2019-04-19 | 2019-07-19 | 成都四方伟业软件股份有限公司 | Data extraction method and device |
CN110109753A (en) * | 2019-04-25 | 2019-08-09 | 成都信息工程大学 | Resource scheduling method and system based on multi-dimensional constraint genetic algorithm |
CN111104225A (en) * | 2019-12-23 | 2020-05-05 | 杭州安恒信息技术股份有限公司 | Data processing method, device, equipment and medium based on MapReduce |
CN112307008B (en) * | 2020-12-14 | 2023-12-08 | 湖南蚁坊软件股份有限公司 | Druid compaction method |
CN112307008A (en) * | 2020-12-14 | 2021-02-02 | 湖南蚁坊软件股份有限公司 | Druid compaction method |
CN112769522A (en) * | 2021-01-20 | 2021-05-07 | 广西师范大学 | Partition structure-based encoding distributed computing method |
CN113434299A (en) * | 2021-07-05 | 2021-09-24 | 广西师范大学 | Encoding distributed computing method based on MapReduce framework |
CN113434299B (en) * | 2021-07-05 | 2024-02-06 | 广西师范大学 | Coding distributed computing method based on MapReduce framework |
CN114461381A (en) * | 2021-12-27 | 2022-05-10 | 天翼云科技有限公司 | Log data processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN103106253B (en) | 2016-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103106253A (en) | Data balance method based on genetic algorithm in MapReduce calculation module | |
CN102063339B (en) | Resource load balancing method and equipment based on cloud computing system | |
CN107169560A (en) | Adaptive reconfigurable deep convolutional neural network computing method and device | |
CN103593452B (en) | Data-intensive cost optimization method based on MapReduce mechanism | |
US20060003823A1 (en) | Dynamic player groups for interest management in multi-character virtual environments | |
CN104952032B (en) | Graphics processing method and device, and rasterized representation and storage method for graphics | |
CN107645403A (en) | Terminal rule engine apparatus, terminal rule operation method | |
CN109543726A (en) | Method and device for training a model | |
CN109522104B (en) | Method for optimizing scheduling of two target tasks of Iaas by using differential evolution algorithm | |
CN103814358A (en) | Virtual machine placement within server farm | |
JP7349178B2 (en) | Optimization system and method for parameter settings of wave energy devices | |
TW201235865A (en) | Data structure for tiling and packetizing a sparse matrix | |
CN103226762A (en) | Logistic distribution method based on cloud computing platform | |
CN106202092A (en) | Data processing method and system | |
CN105184368A (en) | Distributed extreme learning machine optimization integrated framework system and method | |
JP2023546040A (en) | Data processing methods, devices, electronic devices, and computer programs | |
Peng et al. | Modeling and combined application of orthogonal chaotic NSGA-II and improved TOPSIS to optimize a conceptual hydrological model | |
Jiang et al. | Parallel K-Medoids clustering algorithm based on Hadoop | |
Wang et al. | A Task Scheduling Strategy in Edge‐Cloud Collaborative Scenario Based on Deadline | |
CN113469372A (en) | Reinforcement learning training method, device, electronic equipment and storage medium | |
CN104580518A (en) | Load balance control method used for storage system | |
CN106708609B (en) | Feature generation method and system | |
CN111078380A (en) | Multi-target task scheduling method and system | |
CN117155791B (en) | Model deployment method, system, equipment and medium based on cluster topology structure | |
CN104598600B (en) | Optimization method for parallel digital terrain analysis based on distributed memory |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160504; Termination date: 20220116 |