CN103761146A - Method for dynamically setting quantities of slots for MapReduce - Google Patents

Info

Publication number
CN103761146A
Authority
CN
China
Prior art keywords
node
slots
tasktracker
task
slot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410004521.4A
Other languages
Chinese (zh)
Other versions
CN103761146B (en)
Inventor
宗栋瑞
郭美思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410004521.4A
Publication of CN103761146A
Application granted
Publication of CN103761146B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a method for dynamically setting the number of slots for MapReduce. The method comprises: first, setting the number of slots according to the computing power of each node in the cluster; second, adjusting the number of slots according to the memory status of each node. Compared with the prior art, the method improves the running performance of MapReduce programs and optimizes resource utilization; it is practical and easy to popularize.

Description

Method for dynamically setting the number of slots for MapReduce
Technical field
The present invention relates to the field of computer technology, and specifically to a method for dynamically setting the number of slots for MapReduce.
Background
With the development of Internet technology, data is growing explosively and the scale of data on the network is increasing sharply. This mass of unordered data holds great business opportunity, since value can be extracted from it. The accompanying problem is that the data-processing capacity of a single machine cannot meet the processing requirements of today's massive-data applications, so distributed computing on large-scale clusters has become the main route to future gains in data-processing performance. This work studies MapReduce, the core computation model of Hadoop, and addresses the problem that MapReduce by default sets the same number of map and reduce slots on every node by proposing a strategy for dynamically setting the number of slots. Different map and reduce quantities are set according to the differing hardware configurations of the nodes in the cluster.
At present, the number of map and reduce tasks in MapReduce is set as follows. The number of map tasks on a node is given by the parameter mapred.tasktracker.map.tasks.maximum, but how many slots a TaskTracker can be configured with still depends on its physical environment. Each task is executed independently in a newly started JVM, so multiple tasks mean multiple JVMs. Each JVM consumes some memory, and once the memory consumed by the DataNode and the TaskTracker is added, the machine's memory may be insufficient. Besides the memory limit allocated to each newly started JVM, it is therefore necessary to consider how many new JVMs need to be started at all, that is, the number of map slots and reduce slots. This setting is also related to the number of processors in the machine. The concrete configuration has to be derived by observing and analyzing the actual running behavior of the cluster. The size of the input split determines how many map tasks a job has. If the input data volume is huge, however, the default block size can yield tens of thousands or even hundreds of thousands of map tasks; the cluster's network traffic then becomes very large and, most seriously, heavy pressure is placed on the JobTracker's scheduling, queues, and memory. The number of slots should therefore be set to suit the computing power of the machine.
In Hadoop, slots represent the resources on each TaskTracker, and one slot represents a fixed combination of resources. When a MapReduce program runs, the number of map slots and reduce slots on each TaskTracker is configured by mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. Once these two parameters are configured, they cannot be modified dynamically. Because tasks with different roles need different amounts of resources, and the hardware configurations of the nodes in a cluster also differ, a strategy for dynamically setting the number of MapReduce slots is proposed to account for the differences in node resources. The strategy sets the number of slots dynamically according to each node's computing power and improves the performance of MapReduce program execution.
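To make the static configuration concrete, the following minimal sketch reads the two slot parameters from a Hadoop-style configuration fragment. The property names are the ones given in the text; the XML fragment, the values 8 and 4, and the helper name read_static_slots are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapred-site.xml fragment. The two property names come from the
# text; the values 8 and 4 are made up for illustration.
MAPRED_SITE = """
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
"""


def read_static_slots(xml_text):
    """Return (map_slots, reduce_slots) from a Hadoop-style configuration string."""
    props = {}
    for prop in ET.fromstring(xml_text).findall("property"):
        props[prop.findtext("name")] = prop.findtext("value")
    return (
        int(props["mapred.tasktracker.map.tasks.maximum"]),
        int(props["mapred.tasktracker.reduce.tasks.maximum"]),
    )


static_slots = read_static_slots(MAPRED_SITE)
```

Because the values are fixed in the file, every node that receives the same file gets the same slot counts regardless of its hardware, which is the limitation the strategy targets.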
Summary of the invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing a method for dynamically setting the number of slots for MapReduce.
The technical solution of the present invention is realized as follows. The method for dynamically setting the number of slots for MapReduce is carried out as follows:
First, the number of CPUs in each cluster node is determined; then, based on the number of CPU cores in each node, the number of slots is determined dynamically within the master/slave MapReduce framework. The job queue and the resource status of the TaskTracker nodes serve as input, where the resource status of a TaskTracker comprises the number of CPU cores and the memory size of the node, and the number of slots is then set according to the computing power of the node;
The JobTracker runs on the master node of the master/slave MapReduce framework and is responsible for monitoring the cluster and scheduling tasks; a TaskTracker runs on each slave node and is responsible for monitoring task execution and reporting progress;
Each TaskTracker periodically sends a heartbeat message to the JobTracker, carrying the resource usage of its node;
Scheduling on the master node takes place when a heartbeat arrives: if the TaskTracker reports that it has idle resources, the JobTracker uses a scheduling algorithm to select a task and dispatch it to that node to run.
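The heartbeat-driven scheduling described above can be sketched as follows. This is a toy model, not Hadoop code: the Heartbeat fields, the JobTracker class shown here, and the FIFO queue standing in for the scheduling algorithm are all assumptions, and the sketch fills every reported idle slot in one heartbeat, which the text does not specify.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Status report a TaskTracker sends to the JobTracker (illustrative fields)."""
    node: str
    free_map_slots: int
    free_reduce_slots: int


class JobTracker:
    """Toy master that hands out queued tasks when a heartbeat reports idle slots."""

    def __init__(self, map_tasks, reduce_tasks):
        self.pending = {"map": deque(map_tasks), "reduce": deque(reduce_tasks)}

    def on_heartbeat(self, hb):
        """Return the (kind, task) pairs assigned to the reporting node."""
        assigned = []
        free = {"map": hb.free_map_slots, "reduce": hb.free_reduce_slots}
        for kind, slots in free.items():
            # FIFO order stands in for the pluggable scheduling algorithm.
            while slots > 0 and self.pending[kind]:
                assigned.append((kind, self.pending[kind].popleft()))
                slots -= 1
        return assigned


tracker = JobTracker(map_tasks=["m0", "m1", "m2"], reduce_tasks=["r0"])
first = tracker.on_heartbeat(Heartbeat("node-1", free_map_slots=2, free_reduce_slots=1))
second = tracker.on_heartbeat(Heartbeat("node-2", free_map_slots=2, free_reduce_slots=1))
```

In this run the first node receives two map tasks and the only reduce task, and the second node receives the remaining map task, so a node's slot count directly bounds how much work a heartbeat can bring back.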
Setting the number of slots requires two variables, one for map slots and one for reduce slots. First, the code in TaskTracker is modified so that the number of map slots is initially set to the number of CPU cores in the node and the number of reduce slots is initially set to half the number of CPU cores in the node. Then, in a class method, the amount of memory to request is determined from the number of slots: the total memory allocation for tasks equals the number of map slots multiplied by the memory size of a single map slot in the TaskTracker, plus the number of reduce slots multiplied by the memory size of a single reduce slot in the TaskTracker. If the total memory allocation is smaller than the free memory of the corresponding node, the slots are set to these values. If the free memory of the corresponding node is smaller than the total memory allocation, the number of map slots and the number of reduce slots are reduced alternately until the node's memory condition is met.
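The slot-setting rule can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the patented implementation: the function name set_slots, the choice to shrink the map count first, the floor of one slot per kind, and treating a request equal to the free memory as fitting are all assumptions not fixed by the text.

```python
def set_slots(cpu_cores, free_mem_mb, map_slot_mb, reduce_slot_mb):
    """Return (map_slots, reduce_slots) for one node.

    Start from map = cores and reduce = cores // 2, then shrink the two counts
    alternately until the total requested memory fits into free_mem_mb.
    """
    map_slots = cpu_cores
    reduce_slots = cpu_cores // 2

    def total_mb():
        # Total allocation = map slots * map slot size + reduce slots * reduce slot size.
        return map_slots * map_slot_mb + reduce_slots * reduce_slot_mb

    shrink_map = True  # assumption: the first reduction is taken from the map slots
    while total_mb() > free_mem_mb and (map_slots > 1 or reduce_slots > 1):
        if shrink_map and map_slots > 1:
            map_slots -= 1
        elif reduce_slots > 1:
            reduce_slots -= 1
        else:
            map_slots -= 1
        shrink_map = not shrink_map
    return map_slots, reduce_slots


enough_memory = set_slots(8, 32768, 2048, 2048)   # 24576 MB requested, fits
tight_memory = set_slots(8, 16384, 2048, 2048)    # must shrink to fit 16384 MB
```

With 8 cores and 2048 MB per slot, a node with 32768 MB free keeps the CPU-derived setting of 8 map and 4 reduce slots, while a node with 16384 MB free is reduced step by step to 6 map and 2 reduce slots.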
Compared with the prior art, the present invention has the following beneficial effects:
The method analyzes the computing power of the nodes in a Hadoop cluster and uses each node's CPU and memory status to determine the number of slots, from which reasonable map and reduce quantities are obtained. The strategy greatly improves the performance of the whole cluster in processing MapReduce tasks and optimizes resource utilization; it is practical and easy to popularize.
Brief description of the drawings
Figure 1 is a flowchart of running a job according to the present invention.
Figure 2 is a flowchart of setting the number of slots according to the present invention.
Detailed description
The method for dynamically setting the number of slots for MapReduce is described in detail below with reference to the drawings.
The present invention addresses an important problem that urgently needs solving for MapReduce in current big-data Hadoop clusters: setting the map and reduce quantities dynamically according to the differing hardware configurations and computing power of the nodes in the cluster. The proposed strategy for dynamically setting the number of slots effectively solves this problem and greatly improves the performance of the whole cluster in processing MapReduce tasks.
The present invention relies on the master/slave MapReduce framework, which adopts a Master/Slave architecture and consists mainly of the following four parts:
1) Client.
2) JobTracker: The JobTracker is responsible for resource monitoring and job scheduling. It monitors the health of all TaskTrackers and jobs and, once it detects a failure, transfers the affected tasks to other nodes. The JobTracker also tracks information such as task execution progress and resource usage and passes it to the task scheduler, which selects suitable tasks to use resources as they become idle. In Hadoop, the task scheduler is a pluggable module, and users can design a scheduler to suit their own needs.
3) TaskTracker: The TaskTracker periodically reports the resource usage of its node and the progress of its tasks to the JobTracker via heartbeats, and at the same time receives commands sent by the JobTracker and performs the corresponding operations (such as starting a new task or killing a task). The TaskTracker divides the resources on its node into equal units called "slots". A slot represents computing resources (CPU, memory, and so on). A task can run only after it obtains a slot, and the role of the Hadoop scheduler is to allocate the idle slots on each TaskTracker to tasks. Slots come in two kinds, map slots and reduce slots, used by map tasks and reduce tasks respectively. The TaskTracker limits task concurrency through the number of slots (a configurable parameter).
4) Task: Tasks come in two kinds, map tasks and reduce tasks, both started by the TaskTracker. HDFS stores data in fixed-size blocks as its basic unit, whereas the processing unit of MapReduce is the split. A split is a logical concept that contains only metadata, such as the start position of the data, the data length, and the nodes holding the data. How splits are divided is determined entirely by the user. Note, however, that the number of splits determines the number of map tasks, because each split is processed by exactly one map task.
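How a slot count caps task concurrency on a TaskTracker can be illustrated with a counting semaphore. This is only an analogy under stated assumptions: real Hadoop tasks run in separate JVMs rather than threads, and the names SlotPool and run_tasks are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SlotPool:
    """Lets at most `slots` tasks run at once; a task must hold a slot to run."""

    def __init__(self, slots):
        self._slots = threading.Semaphore(slots)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def run(self, task):
        with self._slots:  # blocks until a slot is free
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return task()
            finally:
                with self._lock:
                    self.running -= 1  # the slot is released when the task ends


def run_tasks(slots, tasks):
    """Run all tasks through a pool with the given slot count; return results and peak concurrency."""
    pool = SlotPool(slots)
    with ThreadPoolExecutor(max_workers=len(tasks)) as executor:
        results = list(executor.map(pool.run, tasks))
    return results, pool.peak


results, peak = run_tasks(slots=2, tasks=[lambda i=i: i * i for i in range(6)])
```

Six tasks are submitted at once, but the recorded peak never exceeds the two available slots, which mirrors how the slot count acts as the concurrency limit.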
As shown in Figures 1 and 2, the strategy of the method provided by the invention is mainly to set the number of slots according to the computing power of each cluster node, where a node's computing power is determined by two factors, the number of CPUs and the memory. First the number of CPUs in each cluster node is determined, and then the number of slots is determined from the number of CPU cores in each node. Tasks can thus be processed according to the computing power of each node, so that MapReduce tasks execute more efficiently and performance improves. For the memory factor, the amount of memory to request is determined from the number of slots, and the number of slots is then adjusted according to the node's memory status: if memory proves insufficient during the request, the number of slots is reduced until the memory condition is met; otherwise the number of slots remains the value set from the CPU count. Finally, the map and reduce quantities are determined from the number of slots. The concrete procedure is as follows:
First, the number of CPUs in each cluster node is determined; then, based on the number of CPU cores in each node, the number of slots is determined dynamically within the master/slave MapReduce framework. The job queue and the resource status of the TaskTracker nodes serve as input, where the resource status of a TaskTracker comprises the number of CPU cores and the memory size of the node, and the number of slots is then set according to the computing power of the node;
The JobTracker runs on the master node of the master/slave MapReduce framework and is responsible for monitoring the cluster and scheduling tasks; a TaskTracker runs on each slave node and is responsible for monitoring task execution and reporting progress;
Each TaskTracker periodically sends a heartbeat message to the JobTracker, carrying the resource usage of its node;
Scheduling on the master node takes place when a heartbeat arrives: if the TaskTracker reports that it has idle resources, the JobTracker uses a scheduling algorithm to select a task and dispatch it to that node to run.
Setting the number of slots requires two variables, one for map slots and one for reduce slots. First, the code in TaskTracker is modified so that the number of map slots is initially set to the number of CPU cores in the node and the number of reduce slots is initially set to half the number of CPU cores in the node. Then, in a class method, the amount of memory to request is determined from the number of slots: the total memory allocation for tasks equals the number of map slots multiplied by the memory size of a single map slot in the TaskTracker, plus the number of reduce slots multiplied by the memory size of a single reduce slot in the TaskTracker. If the total memory allocation is smaller than the free memory of the corresponding node, the slots are set to these values. If the free memory of the corresponding node is smaller than the total memory allocation, the number of map slots and the number of reduce slots are reduced alternately until the node's memory condition is met.
The purpose of the invention is to set the number of slots dynamically for a distributed computing framework. The idea of the strategy is to set the number of slots dynamically according to the differing computing power of each node in the Hadoop cluster. The map and reduce quantities are set from the CPU and memory status of each node. The technical problem lies in relating the number of CPUs in a node to the number of slots in a reasonable way and in constraining the number of slots by the memory limit, so that the setting matches the processing power of the nodes in the cluster and tasks run more efficiently.
To relate the number of CPUs in a node to the number of slots, the CPU count of each node is collected and the number of slots is set to the number of CPU cores in the node. Because each core can process one task on its own without waiting, map tasks and reduce tasks can execute quickly.
Under the memory constraint, the amount of memory to request is determined from the number of slots, and the number of slots is then adjusted according to the node's memory status: if memory is insufficient during the request, the number of slots is reduced until the memory limit is satisfied; otherwise the number of slots is set to the value derived from the CPU count.
The invention is described in detail below through a concrete example, with reference to Figures 1 and 2.
First, a distributed cluster environment is deployed: a Hadoop cluster of 11 nodes, one of which serves as master and the remaining ten as slaves. Ten of the nodes use Xeon E5-2620 @ 2.00 GHz CPUs with 24 cores, 96 GB of memory, and 12 × 2 TB hard disks, running CentOS 6.3. The other node uses Xeon E7-8837 @ 2.67 GHz CPUs with 128 cores, 500 GB of memory, and 5 × 2 TB hard disks, also running CentOS 6.3. The Hadoop components are installed on CentOS 6.3 according to the official documentation, and the HDFS and MapReduce services are then started.
The flow of running a job is shown in Figure 1. First, the input file or directory of the MapReduce job must exist on the file system; if MapReduce relies on HDFS, local files must first be uploaded to HDFS. The client requests a job ID from the JobTracker to identify the job. The resource files needed to run the job are then copied to HDFS. Only then does the job submission process proceed, in which the input files are divided into input splits. Splitting determines, before the mappers run, the range of data each one processes; the number of splits determines the number of map tasks, and the two correspond one to one. A split is only a logical slice: it records which block should be accessed, together with the start index within that block and the data length. The job is then initialized, and the JobTracker is responsible for distributing tasks to the TaskTrackers. While running, each TaskTracker periodically sends heartbeat requests to the JobTracker, reporting its own status and the execution state of its tasks and asking for tasks it can execute. The number of map and reduce tasks actually running on a TaskTracker node is determined by its number of map slots and reduce slots. Determining the number of map slots and reduce slots on each node according to that node's computing power is therefore very important and directly affects how efficiently tasks run.
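The relationship between input splits and map tasks can be sketched as follows. This is a simplified illustration, not Hadoop's actual splitting logic: the InputSplit fields, the make_splits helper, the file path, and the 300 MB and 128 MB sizes are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class InputSplit:
    """Logical slice of an input file: metadata only, no data is copied."""
    path: str
    start: int   # byte offset where the split begins
    length: int  # number of bytes covered by the split


def make_splits(path, file_size, split_size):
    """Cut a file of file_size bytes into consecutive splits of at most split_size bytes."""
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append(InputSplit(path, offset, length))
        offset += length
    return splits


MB = 1024 * 1024
splits = make_splits("/input/logs.txt", file_size=300 * MB, split_size=128 * MB)
num_map_tasks = len(splits)  # one map task per split
```

Here a 300 MB input with 128 MB splits yields three splits and therefore three map tasks; with very large inputs the same rule produces the very high task counts that make per-node slot settings matter.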
The flow of setting the number of slots is shown in Figure 2. First, the number of CPU cores of each node in the cluster is obtained; the number of map slots is initially set to the number of CPU cores in the node, and the number of reduce slots to half of that. Next, the free memory size of each node is obtained. In the class method initializeMemoryManagement(), the amount of memory to request is determined from the number of slots: the total memory allocation for tasks equals the number of map slots multiplied by the memory size of a single map slot in the TaskTracker, plus the number of reduce slots multiplied by the memory size of a single reduce slot in the TaskTracker. If the total memory allocation is smaller than the free memory of the corresponding node, the number of map slots is set to the number of CPU cores in the node and the number of reduce slots to half the number of map slots. Otherwise, if the free memory of the node is smaller than the total memory allocation, the number of map slots and the number of reduce slots are reduced alternately until the node's memory condition is met, and the map and reduce slot counts are set to the values that satisfy it.
Then, in the class method TaskTracker.initialize(), two TaskLauncher threads are responsible for starting map tasks and reduce tasks respectively; the corresponding number of slots is passed to each TaskLauncher, which then executes the corresponding tasks. When execution finishes, the occupied resources are released.
The method uses the number of CPU cores and the memory size to determine a node's computing power: nodes with many CPU cores and large memory are given larger map and reduce quantities, and nodes with few CPU cores and relatively small memory are given smaller ones. In this cluster, the 10 nodes with Xeon E5-2620 @ 2.00 GHz CPUs, 24 cores, and 96 GB of memory all have map set to 24 and reduce set to 12. The other node, with Xeon E7-8837 @ 2.67 GHz CPUs, 128 cores, and 500 GB of memory, has map set to 128 and reduce set to 64. With these settings, tasks execute more efficiently than when map and reduce quantities are configured uniformly on every machine node, and resource utilization is optimized at the same time.
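The slot counts reported for the two node types can be checked with a short calculation. The core counts and memory sizes come from the example; the 2 GB per slot figure is a hypothetical value (the text gives no per-slot memory size), and the sketch assumes the whole installed memory is free.

```python
# Node types from the example: (CPU cores, memory in GB).
NODES = {
    "Xeon E5-2620": (24, 96),
    "Xeon E7-8837": (128, 500),
}
SLOT_GB = 2  # hypothetical memory per map or reduce slot; not given in the text


def cpu_derived_slots(cores):
    """Initial setting: one map slot per core, half as many reduce slots."""
    return cores, cores // 2


report = {}
for name, (cores, mem_gb) in NODES.items():
    map_slots, reduce_slots = cpu_derived_slots(cores)
    demand_gb = (map_slots + reduce_slots) * SLOT_GB
    # Record the slot counts, the memory they would request, and whether it fits.
    report[name] = (map_slots, reduce_slots, demand_gb, demand_gb <= mem_gb)
```

Under these assumptions the 24-core nodes request 72 GB of their 96 GB and the 128-core node requests 384 GB of its 500 GB, so neither needs the memory-driven reduction and both keep the CPU-derived settings of 24/12 and 128/64.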
The foregoing is merely an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (2)

1. A method for dynamically setting the number of slots for MapReduce, characterized in that it is carried out as follows:
First, the number of CPUs in each cluster node is determined; then, based on the number of CPU cores in each node, the number of slots is determined dynamically within the master/slave MapReduce framework. The job queue and the resource status of the TaskTracker nodes serve as input, where the resource status of a TaskTracker comprises the number of CPU cores and the memory size of the node, and the number of slots is then set according to the computing power of the node;
The JobTracker runs on the master node of the master/slave MapReduce framework and is responsible for monitoring the cluster and scheduling tasks; a TaskTracker runs on each slave node and is responsible for monitoring task execution and reporting progress;
Each TaskTracker periodically sends a heartbeat message to the JobTracker, carrying the resource usage of its node;
Scheduling on the master node takes place when a heartbeat arrives: if the TaskTracker reports that it has idle resources, the JobTracker uses a scheduling algorithm to select a task and dispatch it to that node to run.
2. The method for dynamically setting the number of slots for MapReduce according to claim 1, characterized in that: setting the number of slots requires two variables, one for map slots and one for reduce slots. First, the code in TaskTracker is modified so that the number of map slots is initially set to the number of CPU cores in the node and the number of reduce slots is initially set to half the number of CPU cores in the node. Then, in a class method, the amount of memory to request is determined from the number of slots: the total memory allocation for tasks equals the number of map slots multiplied by the memory size of a single map slot in the TaskTracker, plus the number of reduce slots multiplied by the memory size of a single reduce slot in the TaskTracker. If the total memory allocation is smaller than the free memory of the corresponding node, the slots are set to these values. If the free memory of the corresponding node is smaller than the total memory allocation, the number of map slots and the number of reduce slots are reduced alternately until the node's memory condition is met.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410004521.4A CN103761146B (en) 2014-01-06 2014-01-06 Method for dynamically setting quantities of slots for MapReduce

Publications (2)

Publication Number Publication Date
CN103761146A (en) 2014-04-30
CN103761146B (en) 2017-10-31

Family

ID=50528389

Country Status (1)

Country Link
CN (1) CN103761146B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073546A (en) * 2010-12-13 2011-05-25 北京航空航天大学 Task-dynamic dispatching method under distributed computation mode in cloud computing environment
US20120304186A1 (en) * 2011-05-26 2012-11-29 International Business Machines Corporation Scheduling Mapreduce Jobs in the Presence of Priority Classes
US20130191843A1 (en) * 2011-08-23 2013-07-25 Infosys Limited System and method for job scheduling optimization
CN102541645A (en) * 2012-01-04 2012-07-04 北京航空航天大学 Dynamic adjustment method for node task slot based on node state feedbacks
CN102609303A (en) * 2012-01-18 2012-07-25 华为技术有限公司 Slow-task dispatching method and slow-task dispatching device of Map Reduce system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant