WO2016145904A1 - Resource management method, apparatus and system - Google Patents

Resource management method, apparatus and system

Info

Publication number
WO2016145904A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
user
queue
run
resource slot
Prior art date
Application number
PCT/CN2015/095196
Other languages
English (en)
French (fr)
Inventor
郑鹏飞
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016145904A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • This application relates to, but is not limited to, the field of computer technology.
  • the Hadoop system is a widely used distributed system for processing large-scale data.
  • a Hadoop cluster consists of a master node and multiple slave nodes, each of which can be a computer or a virtual machine.
  • the master node is used to manage the Hadoop Distributed File System (HDFS) and the processing of each job (ie, the MapReduce computing framework).
  • the slave node is responsible for data storage and processing of job data.
  • Hadoop uses the MapReduce parallel processing framework proposed by Google.
  • the master node is called JobTracker in MapReduce and is responsible for the processing of the job.
  • the slave node is called TaskTracker in the MapReduce framework and is responsible for the execution of the job task.
  • the input data of a Hadoop job is divided into a plurality of data blocks of the same size distributed in a computer cluster, and the input data is processed in parallel by a plurality of nodes to speed up the processing time of the job.
  • a node can simultaneously store and process multiple data blocks by configuration, each data block corresponding to one task.
  • the execution of a job is divided into two phases: the first is the map phase, in which each node processes the job's map tasks distributed across the cluster; the second is the reduce phase, in which reduce tasks aggregate the map task results distributed across the nodes to form the final job result.
  • FIG. 1 is a schematic diagram of a compute node and resource slot.
  • each job includes a map task set and a reduce task set.
  • Each task corresponds to one resource slot (a map task to a map slot, a reduce task to a reduce slot), and the execution of job tasks is subject to two strict restrictions: (1) a reduce task can only truly start after all map tasks have completed; (2) a map task can only run on a map slot, and a reduce task can only run on a reduce slot.
  • as a result of these two restrictions, cluster resource utilization and performance differ considerably under different job loads and resource slot configurations; even with the optimal job submission order and the optimal slot configuration, the utilization of the corresponding resource slots is still seriously affected.
  • because the numbers of map tasks and reduce tasks change constantly over time, the number of resource slots allocated to map (or reduce) tasks may exceed the number of map (or reduce) tasks. Under the dynamic load of a MapReduce cluster, one type of resource slot may therefore be overloaded while the other type sits idle, which wastes resources.
  • This document provides a resource management method, apparatus and system that can improve the resource utilization of a Hadoop system.
  • an embodiment of the invention provides a resource management method, applied to a master node of a Hadoop system, the method including:
  • obtaining idle resource slot information of a slave node;
  • selecting a user from the queue of users waiting for resource allocation and, after a user is selected, selecting a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
  • after a task to be run is successfully selected, assigning the task to be run to the slave node.
  • optionally, selecting a user from the queue of users waiting for resource allocation includes:
  • scanning the user queue starting from the head of the queue of users waiting for resource allocation;
  • each time a user is scanned, determining whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; otherwise the next user is scanned.
  • optionally, the allocation condition includes: the user has a task to be run that meets a data locality requirement.
  • optionally, when the allocation condition includes the data locality requirement, selecting a task to be run from the user's queue of tasks to be run further includes:
  • if no task to be run has been selected after all users have been scanned, removing the data locality requirement from the allocation condition and scanning the user queue again from the head of the queue of users waiting for resource allocation; each time a user is scanned, determining whether the user has a task to be run; if so, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if not, the next user is scanned.
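  • an embodiment of the invention further provides a resource management method, applied to a slave node of a Hadoop system, the method including:
  • after an idle resource slot is detected, sending the master node a notification message carrying idle resource slot information, where the idle resource slot information includes the type information of the idle resource slot of the local node;
  • receiving the task to be run that the master node assigned to the idle resource slot of the local node, and placing the received task in a task startup queue;
  • when the task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the task startup queue and starting it.
  • optionally, receiving the task to be run assigned by the master node and placing it in the task startup queue includes: placing received map tasks in a map task startup queue and received reduce tasks in a reduce task startup queue;
  • taking a task to be run out of the task startup queue and starting it then includes:
  • if the reduce task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the reduce task startup queue and starting it;
  • if the reduce task startup queue is empty, the map task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the map task startup queue and starting it.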
  • an embodiment of the invention further provides a resource management apparatus, applied to a master node of a Hadoop system, including:
  • an information receiving module, configured to: obtain idle resource slot information of a slave node;
  • a task scheduling module, configured to: select a user from the queue of users waiting for resource allocation and, after a user is selected, select a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
  • an information sending module, configured to: after a task to be run is successfully selected, assign the task to be run to the slave node.
  • optionally, the task scheduling module is configured to:
  • scan the user queue starting from the head of the queue of users waiting for resource allocation;
  • each time a user is scanned, determine whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; otherwise the next user is scanned.
  • optionally, the allocation condition includes: the user has a task to be run that meets a data locality requirement.
  • optionally, the task scheduling module is configured to: when the allocation condition includes the data locality requirement, if no task to be run has been selected after all users have been scanned, remove the data locality requirement from the allocation condition and scan the user queue again from the head of the queue of users waiting for resource allocation; each time a user is scanned, determine whether the user has a task to be run; if so, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if not, the next user is scanned.
  • an embodiment of the invention further provides a resource management apparatus, applied to a slave node of a Hadoop system, including:
  • a detecting and reporting module, configured to: send the master node a notification message carrying idle resource slot information, where the idle resource slot information includes the type information of the idle resource slot of the local node;
  • a receiving and processing module, configured to: receive the task to be run that the master node assigned to the idle resource slot of the local node, and place the received task in a task startup queue;
  • a task startup module, configured to: when the task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the task startup queue and start it.
  • optionally, the receiving and processing module is configured to: place received map tasks in a map task startup queue and received reduce tasks in a reduce task startup queue;
  • the task startup module is configured to:
  • if the reduce task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the reduce task startup queue and start it;
  • if the reduce task startup queue is empty, the map task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the map task startup queue and start it.
  • an embodiment of the invention further provides a resource management system, including: a Hadoop system master node having the above resource management apparatus, and a Hadoop system slave node having the above resource management apparatus.
  • a computer-readable storage medium storing computer-executable instructions for performing the method of any of the above.
  • compared with the related art, the resource management method, apparatus and system provided by the embodiments of the invention improve the scheduler on the master node and the task tracker on the slave node, breaking the restriction in the Hadoop system that a map slot can only run map tasks and a reduce slot can only run reduce tasks, and keep all resource slots as busy as possible, thereby improving the resource utilization of the Hadoop system.
  • Figure 1 is a schematic diagram of a compute node and a resource slot.
  • FIG. 2 is a schematic diagram of resource slot borrowing within a user resource pool according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of resource slot borrowing between user resource pools according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a resource management method (master node) according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram (slave node) of a resource management method according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a resource management apparatus (master node) according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a resource management apparatus (slave node) according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a resource management system according to an embodiment of the present invention.
  • as multiple jobs move from the map phase into the reduce phase, one type of resource slot may be idle during different time periods while the other type is overloaded.
  • such idle reduce slots (or map slots) can be lent to the overloaded map (or reduce) tasks, thereby improving the resource utilization of the Hadoop system.
  • the scheduler (on the master node) selects a user at the user level; after a user is selected, it picks a suitable job from that user's job queue, and finally hands the start of the job's tasks to the task tracker (TaskTracker) of the MapReduce framework (on the slave node).
  • the scheduler of the Hadoop system and the MapReduce framework are improved to break the restriction that the MapReduce parallel computing framework places on map resource slots and reduce resource slots; while fairness between users is preserved, resource slots are borrowed both within a user resource pool and between user resource pools. As shown in FIG. 2, borrowing within a user resource pool means lending idle resource slots in the user's resource pool to that user's overloaded slot type. As shown in FIG. 3, borrowing between user resource pools means that a user can borrow idle resource slots from other users' resource pools. Resource slot borrowing reduces slot idleness and keeps all resource slots as busy as possible, thereby improving the resource utilization of the Hadoop cluster.
  • as shown in FIG. 4, an embodiment of the present invention provides a resource management method, applied to a master node of a Hadoop system, the method including:
  • S401: obtaining idle resource slot information of a slave node, which includes:
  • after receiving a heartbeat message sent by the slave node to request task assignment, learning from the idle resource slot information carried in the heartbeat message that the slave node has an idle resource slot;
  • the idle resource slot information includes: resource slot type information;
  • the resource slot type includes: a map resource slot or a reduce resource slot;
  • S402: selecting a user from the queue of users waiting for resource allocation and, after a user is selected, selecting a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
  • selecting a user from the queue of users waiting for resource allocation includes: scanning the user queue starting from its head;
  • each time a user is scanned, determining whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; otherwise the next user is scanned;
  • the allocation condition includes: the user has a task to be run that meets a data locality requirement;
  • the data locality requirement means that the data block to be processed by the task is on the same node or the same rack as the resource slot allocated to the task;
  • when the allocation condition includes the data locality requirement, selecting a task to be run from the user's queue of tasks to be run further includes:
  • if no task to be run has been selected after all users have been scanned, removing the data locality requirement from the allocation condition and scanning the user queue again from its head; each time a user is scanned, determining whether the user has a task to be run; if so, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if not, the next user is scanned;
  • the queue of users waiting for resource allocation sorts users according to a fairness algorithm;
  • when several tasks to be run of the same type exist, the task that has waited longest is selected first;
  • borrowing of resource slots within a user means that, once the user has been allocated a resource slot, four cases are distinguished: (1) if the idle resource slot is a map slot and the user has a map task that satisfies locality, the map slot is allocated to the map task; (2) if the idle resource slot is a reduce slot and the user has a reduce task waiting to run, the reduce slot is allocated to the reduce task; (3) if the idle resource slot is a map slot and the user has a reduce task waiting to run, the map slot is borrowed to run the reduce task;
  • (4) if the idle resource slot is a reduce slot and the user has a map task that satisfies locality, the reduce slot is lent to the map task. Borrowing within a user therefore occurs in the third and fourth cases.
  • borrowing of resource slots between users means the following: when resources are allocated, the user queue is first sorted by priority, and by the priority principle the resource slot should go to the user with the highest priority. That user, however, may well have no eligible task, for example no reduce task and no map task that satisfies data locality, so the resource slot can be lent to another user.
  • borrowing is implemented by scanning the next user in the user queue and determining whether that user has a task that satisfies the condition; if so, the resource slot is lent to that user, otherwise the remaining users in the queue are scanned in turn.
  • S403: assigning the task to be run to the slave node includes: returning to the slave node a response to the heartbeat message that carries the information of the task to be run;
  • the scheduler on the master node is responsible for scheduling and assigning map tasks or reduce tasks to the slave nodes;
  • as shown in FIG. 5, an embodiment of the present invention provides a resource management method, applied to a slave node of a Hadoop system, the method including:
  • sending the master node a notification message carrying the idle resource slot information includes: sending the master node a heartbeat message that requests task assignment and carries the idle resource slot information;
  • the resource slot type includes: a map resource slot or a reduce resource slot;
  • receiving the task to be run assigned by the master node and placing the received task in the task startup queue includes: placing received map tasks in a map task startup queue and received reduce tasks in a reduce task startup queue;
  • optionally, received map tasks and reduce tasks may also be placed in the same task startup queue;
  • taking a task to be run out of the task startup queue and starting it includes: if the reduce task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the reduce task startup queue and starting it;
  • if the reduce task startup queue is empty, the map task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the map task startup queue and starting it;
  • starting reduce tasks first helps the job finish as soon as possible and release the resources it occupies.
  • optionally, other strategies may be used to start tasks; for example, if received map tasks and reduce tasks are placed in the same task startup queue, tasks to be run may be taken out of that queue in order and started;
  • the task tracker (TaskTracker) on the slave node is responsible for starting map tasks or reduce tasks;
  • as shown in FIG. 6, an embodiment of the present invention provides a resource management apparatus, applied to a master node of a Hadoop system, including:
  • the information receiving module 601, configured to: obtain idle resource slot information of a slave node;
  • the task scheduling module 602, configured to: select a user from the queue of users waiting for resource allocation and, after a user is selected, select a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
  • the information sending module 603, configured to: after a task to be run is successfully selected, assign the task to be run to the slave node.
  • the task scheduling module 602 is configured to: scan the user queue starting from the head of the queue of users waiting for resource allocation;
  • each time a user is scanned, determine whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; otherwise the next user is scanned.
  • the allocation condition includes: the user has a task to be run that meets a data locality requirement.
  • the task scheduling module 602 is configured to: when the allocation condition includes the data locality requirement, if no task to be run has been selected after all users have been scanned, remove the data locality requirement from the allocation condition and scan the user queue again from its head; each time a user is scanned, determine whether the user has a task to be run; if so, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if not, the next user is scanned.
  • the information receiving module 601 is configured to:
  • after receiving a heartbeat message sent by the slave node to request task assignment, learn from the idle resource slot information carried in the heartbeat message that the slave node has an idle resource slot;
  • the resource slot type includes: a map resource slot or a reduce resource slot.
  • as shown in FIG. 7, an embodiment of the present invention provides a resource management apparatus, applied to a slave node of a Hadoop system, including:
  • the detecting and reporting module 701, configured to: send the master node a notification message carrying idle resource slot information, where the idle resource slot information includes the type information of the idle resource slot of the local node;
  • the receiving and processing module 702, configured to: receive the task to be run that the master node assigned to the idle resource slot of the local node, and place the received task in the task startup queue;
  • the task startup module 703, configured to: when the task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the task startup queue and start it.
  • the receiving and processing module 702 is configured to: place received map tasks in a map task startup queue and received reduce tasks in a reduce task startup queue;
  • the task startup module 703 is configured to:
  • if the reduce task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the reduce task startup queue and start it; if the reduce task startup queue is empty, the map task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the map task startup queue and start it.
  • the detecting and reporting module 701 is configured to: send the master node a heartbeat message that requests task assignment and carries the idle resource slot information;
  • the resource slot type includes: a map resource slot or a reduce resource slot.
  • as shown in FIG. 8, an embodiment of the present invention provides a resource management system, including: a Hadoop system master node having the above resource management apparatus, and a Hadoop system slave node having the above resource management apparatus.
  • the resource management method, apparatus and system provided by the foregoing embodiments improve the scheduler on the master node and the task tracker on the slave node, breaking the restriction in the Hadoop system that a map slot can only run map tasks and a reduce slot can only run reduce tasks, and keep all resource slots as busy as possible, thereby improving the resource utilization of the Hadoop system.
  • optionally, all or part of the steps of the above embodiments may also be implemented using integrated circuits: the steps may each be fabricated as an individual integrated circuit module, or several of the modules or steps may be fabricated as a single integrated circuit module.
  • the devices/function modules/functional units in the above embodiments may be implemented by general-purpose computing devices, which may be centralized on a single computing device or distributed over a network of multiple computing devices.
  • when the device/function module/functional unit in the above embodiments is implemented in the form of a software function module and sold or used as a stand-alone product, it can be stored in a computer-readable storage medium.
  • the above-mentioned computer-readable storage medium may be a read-only memory, a magnetic disk, an optical disk or the like.
  • the restriction in the Hadoop system that a map slot can only run map tasks and a reduce slot can only run reduce tasks is broken, so that all resource slots are kept as busy as possible, which improves the resource utilization of the Hadoop system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)
  • Multi Processors (AREA)

Abstract

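Disclosed herein are a resource management method, apparatus and system. The method includes: obtaining idle resource slot information of a slave node; selecting a user from a queue of users waiting for resource allocation and, after a user is selected, selecting a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type; and, after a task to be run is successfully selected, assigning the task to be run to the slave node.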

Description

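Resource management method, apparatus and system
Technical Field
This application relates to, but is not limited to, the field of computer technology.
Background
The Hadoop system is a widely used distributed system for processing large-scale data. A Hadoop cluster consists of one master node and multiple slave nodes, each of which can be a computer or a virtual machine. The master node manages the Hadoop Distributed File System (HDFS) and the processing of each job (i.e., the MapReduce computing framework); the slave nodes are responsible for storing data and processing job data. Hadoop uses the MapReduce parallel processing framework proposed by Google. In MapReduce the master node is called the JobTracker and is responsible for job processing; a slave node is called a TaskTracker and is responsible for executing job tasks. The input data of a Hadoop job is divided into many equally sized data blocks distributed across the cluster, and multiple nodes process the input data in parallel, which shortens the job's processing time. Through configuration, a node can store and process multiple data blocks at the same time, each data block corresponding to one task. Job execution is divided into two phases: the first is the map phase, in which each node processes the job's map tasks distributed across the cluster; the second is the reduce phase, in which reduce tasks aggregate the map task results distributed across the nodes to form the final job result.
In a Hadoop cluster, all computing resources are abstracted as slots. Each slot can be occupied exclusively to process one task, and the administrator can configure different numbers of slots according to the hardware configuration of the compute node (i.e., the slave node). Because every job consists of a set of map tasks and a set of reduce tasks, and map tasks and reduce tasks place different demands on cluster resources, slots are divided into two types: map slots and reduce slots. A map slot can only run map tasks, and a reduce slot can only run reduce tasks. The slot is therefore the most basic unit of computation in Hadoop, and the number of slots is configured by the administrator before the cluster starts and cannot be changed while it is running. The resource slot is also the basic unit of resource allocation; each resource slot occupies a certain amount of physical resources on its node, such as CPU, memory, disk and network bandwidth. FIG. 1 is a schematic diagram of a compute node and its resource slots.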
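Summary
The following is an overview of the subject matter described in detail in this document. This overview is not intended to limit the scope of protection of the claims.
In Hadoop, each job includes a set of map tasks and a set of reduce tasks, and each task corresponds to one resource slot (a map task to a map slot, a reduce task to a reduce slot). The execution of job tasks is subject to two strict restrictions: (1) a reduce task can only truly start after all map tasks have completed; (2) a map task can only run on a map slot, and a reduce task can only run on a reduce slot. As a result of these two restrictions, cluster resource utilization and performance differ considerably under different job loads and resource slot configurations; even with the optimal job submission order and the optimal slot configuration, the utilization of the corresponding resource slots is still seriously affected. Because the numbers of map tasks and reduce tasks change constantly over time, the number of resource slots allocated to map (or reduce) tasks may exceed the number of map (or reduce) tasks. Under the dynamic load of a MapReduce cluster, one type of resource slot may therefore be overloaded while the other type sits idle, which wastes resources.
This document provides a resource management method, apparatus and system that can improve the resource utilization of a Hadoop system.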
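An embodiment of the present invention provides a resource management method, applied to a master node of a Hadoop system, the method including:
obtaining idle resource slot information of a slave node;
selecting a user from a queue of users waiting for resource allocation and, after a user is selected, selecting a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
after a task to be run is successfully selected, assigning the task to be run to the slave node.
Optionally, selecting a user from the queue of users waiting for resource allocation includes:
scanning the user queue starting from the head of the queue of users waiting for resource allocation;
each time a user is scanned, determining whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; if the user does not satisfy the allocation condition, the next user is scanned.
Optionally, the allocation condition includes: the user has a task to be run that meets a data locality requirement.
Optionally, when the allocation condition includes the data locality requirement, selecting a task to be run from the user's queue of tasks to be run further includes:
if no task to be run has been selected after all users have been scanned, removing the data locality requirement from the allocation condition and scanning the user queue again from the head of the queue of users waiting for resource allocation; each time a user is scanned, determining whether the user has a task to be run; if the user has a task to be run, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if the user has no task to be run, the next user is scanned.
An embodiment of the present invention further provides a resource management method, applied to a slave node of a Hadoop system, the method including:
after an idle resource slot is detected, sending the master node a notification message carrying idle resource slot information, where the idle resource slot information includes the type information of the idle resource slot of the local node;
receiving the task to be run that the master node assigned to the idle resource slot of the local node, and placing the received task to be run in a task startup queue;
when the task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the task startup queue and starting it.
Optionally, receiving the task to be run assigned by the master node and placing the received task to be run in the task startup queue includes:
placing received map tasks in a map task startup queue and received reduce tasks in a reduce task startup queue;
and taking a task to be run out of the task startup queue and starting it, when the task startup queue is non-empty and an idle resource slot currently exists, includes:
if the reduce task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the reduce task startup queue and starting it;
if the reduce task startup queue is empty, the map task startup queue is non-empty and an idle resource slot currently exists, taking a task to be run out of the map task startup queue and starting it.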
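An embodiment of the present invention further provides a resource management apparatus, applied to a master node of a Hadoop system, including:
an information receiving module, configured to: obtain idle resource slot information of a slave node;
a task scheduling module, configured to: select a user from a queue of users waiting for resource allocation and, after a user is selected, select a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
an information sending module, configured to: after a task to be run is successfully selected, assign the task to be run to the slave node.
Optionally, the task scheduling module is configured to:
scan the user queue starting from the head of the queue of users waiting for resource allocation;
each time a user is scanned, determine whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; if the user does not satisfy the allocation condition, the next user is scanned.
Optionally, the allocation condition includes: the user has a task to be run that meets a data locality requirement.
Optionally, the task scheduling module is configured to: when the allocation condition includes the data locality requirement, if no task to be run has been selected after all users have been scanned, remove the data locality requirement from the allocation condition and scan the user queue again from the head of the queue of users waiting for resource allocation; each time a user is scanned, determine whether the user has a task to be run; if the user has a task to be run, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if the user has no task to be run, the next user is scanned.
An embodiment of the present invention further provides a resource management apparatus, applied to a slave node of a Hadoop system, including:
a detecting and reporting module, configured to: send the master node a notification message carrying idle resource slot information, where the idle resource slot information includes the type information of the idle resource slot of the local node;
a receiving and processing module, configured to: receive the task to be run that the master node assigned to the idle resource slot of the local node, and place the received task to be run in a task startup queue;
a task startup module, configured to: when the task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the task startup queue and start it.
Optionally, the receiving and processing module is configured to:
place received map tasks in a map task startup queue and received reduce tasks in a reduce task startup queue;
and the task startup module is configured to:
if the reduce task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the reduce task startup queue and start it;
if the reduce task startup queue is empty, the map task startup queue is non-empty and an idle resource slot currently exists, take a task to be run out of the map task startup queue and start it.
An embodiment of the present invention further provides a resource management system, including:
a Hadoop system master node having the above resource management apparatus, and a Hadoop system slave node having the above resource management apparatus.
A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the method of any of the above.
Compared with the related art, the resource management method, apparatus and system provided by the embodiments of the present invention improve the scheduler on the master node and the task tracker on the slave node, breaking the restriction in the Hadoop system that a map slot can only run map tasks and a reduce slot can only run reduce tasks, and keep all resource slots as busy as possible, thereby improving the resource utilization of the Hadoop system.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.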
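Brief Description of the Drawings
FIG. 1 is a schematic diagram of a compute node and resource slots.
FIG. 2 is a schematic diagram of resource slot borrowing within a user resource pool according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of resource slot borrowing between user resource pools according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a resource management method according to an embodiment of the present invention (master node).
FIG. 5 is a schematic diagram of a resource management method according to an embodiment of the present invention (slave node).
FIG. 6 is a schematic diagram of a resource management apparatus according to an embodiment of the present invention (master node).
FIG. 7 is a schematic diagram of a resource management apparatus according to an embodiment of the present invention (slave node).
FIG. 8 is a schematic diagram of a resource management system according to an embodiment of the present invention.
Embodiments of the Invention
Embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.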
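As multiple jobs move from the map phase into the reduce phase, one type of resource slot may be idle during different time periods while the other type is overloaded. Such idle reduce slots (or map slots) can be lent to the overloaded map (or reduce) tasks, thereby improving the resource utilization of the Hadoop system.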
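The effect of lending idle slots can be illustrated with a small back-of-the-envelope calculation. The function below is a simplified sketch, not part of the patent: the name `busy_slots` and the example numbers are hypothetical, and it deliberately ignores data locality and the rule that reduce tasks start only after the map tasks finish.

```python
def busy_slots(map_slots: int, reduce_slots: int,
               map_tasks: int, reduce_tasks: int,
               borrowing: bool) -> int:
    """Count how many slots on a node can be kept busy at one instant.

    Without borrowing, each slot type only serves its own task type.
    With borrowing, any idle slot may serve any pending task.
    """
    if borrowing:
        return min(map_slots + reduce_slots, map_tasks + reduce_tasks)
    return min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)


# Hypothetical load: 4 map slots, 4 reduce slots, 6 map tasks, 1 reduce task.
strict = busy_slots(4, 4, 6, 1, borrowing=False)
shared = busy_slots(4, 4, 6, 1, borrowing=True)
```

Under this assumed load, three reduce slots sit idle in the strict configuration while two map tasks wait; with borrowing, those waiting map tasks can occupy two of the idle reduce slots.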
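The scheduler (on the master node) selects a user at the user level; after a user is selected, it picks a suitable job from that user's job queue, and finally hands the start of the job's tasks to the task tracker (TaskTracker) of the MapReduce framework (on the slave node).
The scheduler of the Hadoop system and the MapReduce framework are improved to break the restriction that the MapReduce parallel computing framework places on map resource slots and reduce resource slots; while fairness between users is preserved, resource slots are borrowed both within a user resource pool and between user resource pools. As shown in FIG. 2, borrowing within a user resource pool means lending idle resource slots in the user's resource pool to that user's overloaded slot type. As shown in FIG. 3, borrowing between user resource pools means that a user can borrow idle resource slots from other users' resource pools. Resource slot borrowing reduces slot idleness and keeps all resource slots as busy as possible, thereby improving the resource utilization of the Hadoop cluster.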
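As shown in FIG. 4, an embodiment of the present invention provides a resource management method, applied to a master node of a Hadoop system, the method including:
S401: obtaining idle resource slot information of a slave node;
wherein obtaining the idle resource slot information of the slave node includes:
after receiving a heartbeat message sent by the slave node to request task assignment, learning from the idle resource slot information carried in the heartbeat message that the slave node has an idle resource slot;
wherein the idle resource slot information includes: resource slot type information;
wherein the resource slot type includes: a map resource slot or a reduce resource slot;
S402: selecting a user from a queue of users waiting for resource allocation and, after a user is selected, selecting a task to be run from that user's queue of tasks to be run, including: according to the idle resource slot type information in the idle resource slot information, preferentially selecting from the user's queue a task to be run that matches the idle resource slot type and, when no task to be run matching the idle resource slot type exists, selecting a task to be run whose type differs from the idle resource slot type;
wherein selecting a user from the queue of users waiting for resource allocation includes:
scanning the user queue starting from the head of the queue of users waiting for resource allocation;
each time a user is scanned, determining whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; if the user does not satisfy the allocation condition, the next user is scanned;
wherein the allocation condition includes: the user has a task to be run that meets a data locality requirement;
wherein the data locality requirement means that the data block to be processed by the task is on the same node or the same rack as the resource slot allocated to the task;
wherein, when the allocation condition includes the data locality requirement, selecting a task to be run from the user's queue of tasks to be run further includes:
if no task to be run has been selected after all users have been scanned, removing the data locality requirement from the allocation condition and scanning the user queue again from the head of the queue of users waiting for resource allocation; each time a user is scanned, determining whether the user has a task to be run; if the user has a task to be run, the scan terminates and a task to be run is selected from that user's queue of tasks to be run; if the user has no task to be run, the next user is scanned;
wherein the queue of users waiting for resource allocation sorts users according to a fairness algorithm;
wherein, when several tasks to be run of the same type exist, the task that has waited longest is selected first;
wherein borrowing of resource slots within a user means that, once the user has been allocated a resource slot, four cases are distinguished: (1) if the idle resource slot is a map slot and the user has a map task that satisfies locality, the map slot is allocated to the map task; (2) if the idle resource slot is a reduce slot and the user has a reduce task waiting to run, the reduce slot is allocated to the reduce task; (3) if the idle resource slot is a map slot and the user has a reduce task waiting to run, the map slot is borrowed to run the reduce task; (4) if the idle resource slot is a reduce slot and the user has a map task that satisfies locality, the reduce slot is lent to the map task. Borrowing within a user therefore occurs in the third and fourth cases.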
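The four cases above can be sketched as a single selection function. This is a minimal illustration under stated assumptions, not the patent's implementation: the `Task` class, the `select_task` name and the `local_nodes` field are hypothetical, and locality is reduced to a same-node check (the text also allows same-rack placement).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    kind: str                              # "map" or "reduce"
    wait_time: float                       # how long the task has been pending
    local_nodes: frozenset = frozenset()   # nodes holding the task's data block


def select_task(pending: List[Task], slot_type: str, node: str,
                require_locality: bool = True) -> Optional[Task]:
    """Pick a task for an idle slot of `slot_type` on `node`.

    Tasks matching the slot type are tried first (cases 1 and 2); only if
    none is eligible is the slot lent to the other type (cases 3 and 4).
    Map tasks must satisfy locality while `require_locality` is set.
    """
    def eligible(task: Task) -> bool:
        if task.kind == "map" and require_locality:
            return node in task.local_nodes
        return True

    other_type = "reduce" if slot_type == "map" else "map"
    for kind in (slot_type, other_type):
        candidates = [t for t in pending if t.kind == kind and eligible(t)]
        if candidates:
            # Among tasks of the same type, the longest-waiting one wins.
            return max(candidates, key=lambda t: t.wait_time)
    return None
```

Trying the matching type first keeps borrowing a fallback, so a slot is lent only when it would otherwise stay idle for this user.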
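Borrowing of resource slots between users means the following: when resources are allocated, the user queue is first sorted by priority, and by the priority principle the resource slot should go to the user with the highest priority. That user, however, may well have no eligible task, for example no reduce task and no map task that satisfies data locality, so the resource slot can be lent to another user. Borrowing is implemented by scanning the next user in the user queue and determining whether that user has a task that satisfies the condition; if so, the resource slot is lent to that user, otherwise the remaining users in the queue are scanned in turn.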
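The user-queue scan, including the second pass that drops the data locality requirement, can be sketched as follows. This is an illustrative sketch only: `pick_user_and_task` and the `pick_task` callback are hypothetical names, and the callback stands in for whatever per-user task selection the scheduler uses.

```python
from typing import Callable, Hashable, List, Optional, Tuple


def pick_user_and_task(
    user_queue: List[Hashable],
    pick_task: Callable[[Hashable, bool], Optional[object]],
) -> Optional[Tuple[Hashable, object]]:
    """Scan the priority-sorted user queue from its head.

    First pass: a user qualifies only if `pick_task(user, True)` finds a
    task that satisfies data locality. If no user qualifies, the queue is
    scanned again from the head with the locality requirement removed.
    Returns (user, task), or None when no user has any pending task.
    """
    for require_locality in (True, False):
        for user in user_queue:
            task = pick_task(user, require_locality)
            if task is not None:
                return user, task   # scan terminates at the first match
    return None
```

Because each pass restarts at the head of the queue, the highest-priority user is always considered first; a lower-priority user receives the slot only when every user ahead of it has nothing eligible.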
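S403: after a task to be run is successfully selected, assigning the task to be run to the slave node;
wherein assigning the task to be run to the slave node includes:
returning to the slave node a response message to the heartbeat message, the response carrying the information of the task to be run;
wherein the scheduler on the master node is responsible for scheduling and assigning map tasks or reduce tasks to the slave nodes;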
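As shown in Fig. 5, an embodiment of the present invention provides a resource management method applied to a slave node of a Hadoop system, the method including:
S501: after detecting an idle resource slot, send to the master node a notification message carrying idle resource slot information, the idle resource slot information including the type of the idle resource slot of this node;
Wherein sending to the master node the notification message carrying the idle resource slot information includes:
sending to the master node a heartbeat message that requests task assignment and carries the idle resource slot information;
Wherein the resource slot type is either a map resource slot or a reduce resource slot;
S502: receive the pending task that the master node has assigned to the idle resource slot of this node, and place the received pending task in a task launch queue;
Optionally, receiving the pending task assigned by the master node and placing the received pending task in a task launch queue includes:
placing a received map task in a map task launch queue and a received reduce task in a reduce task launch queue;
Optionally, received map tasks and reduce tasks may instead be placed in one and the same task launch queue;
S503: when the task launch queue is non-empty and an idle resource slot currently exists, take a pending task out of the task launch queue and launch it;
Optionally, taking a pending task out of the task launch queue and launching it when the task launch queue is non-empty and an idle resource slot currently exists includes:
if the reduce task launch queue is non-empty and an idle resource slot currently exists, taking a pending task out of the reduce task launch queue and launching it;
if the reduce task launch queue is empty, the map task launch queue is non-empty and an idle resource slot currently exists, taking a pending task out of the map task launch queue and launching it;
Launching reduce tasks first helps a job finish sooner and release the resources it occupies.
Optionally, other policies may also be used to launch pending tasks; for example, if received map tasks and reduce tasks are placed in the same task launch queue, pending tasks may be taken out of that queue and launched in order;
Wherein the task tracker (TaskTracker) on the slave node is responsible for launching map tasks or reduce tasks;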
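Steps S502 and S503 with two launch queues can be sketched as follows. The class and method names are hypothetical, tasks are represented as plain dictionaries for brevity, and idle slots are modelled as a simple counter because the slave-side steps do not distinguish slot types when launching.

```python
from collections import deque
from typing import Deque, List


class TaskLauncher:
    """Slave-side launch queues; reduce tasks are launched before map tasks."""

    def __init__(self, free_slots: int) -> None:
        self.free_slots = free_slots
        self.map_queue: Deque[dict] = deque()
        self.reduce_queue: Deque[dict] = deque()

    def receive(self, task: dict) -> None:
        # S502: put each received task into the queue for its type.
        if task["kind"] == "reduce":
            self.reduce_queue.append(task)
        else:
            self.map_queue.append(task)

    def launch_ready(self) -> List[dict]:
        # S503: while a queue is non-empty and an idle slot exists, launch.
        started = []
        while self.free_slots > 0 and (self.reduce_queue or self.map_queue):
            # The map queue is used only when the reduce queue is empty.
            queue = self.reduce_queue if self.reduce_queue else self.map_queue
            started.append(queue.popleft())
            self.free_slots -= 1
        return started
```

A real task tracker would increment `free_slots` again when a task finishes; that bookkeeping is omitted here.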
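As shown in Fig. 6, an embodiment of the present invention provides a resource management apparatus applied to a master node of a Hadoop system, including:
an information receiving module 601, configured to obtain idle resource slot information of a slave node;
a task scheduling module 602, configured to select a user from the queue of users waiting for resource allocation and, once a user has been selected, select a pending task from that user's pending-task queue, including: according to the idle slot type given in the idle resource slot information, preferentially selecting from the user's pending-task queue a pending task whose type matches the idle slot type and, when no matching pending task exists, selecting a pending task whose type differs from the idle slot type;
an information sending module 603, configured to assign the pending task to the slave node after a pending task has been successfully selected.
Wherein the task scheduling module 602 is configured to:
scan the user queue starting from the head of the queue of users waiting for resource allocation;
for each user scanned, determine whether the user satisfies the allocation condition; if the user satisfies it, the scan terminates; if not, the next user is scanned.
Wherein the allocation condition includes: the user has a pending task that satisfies the data locality requirement.
Wherein the task scheduling module 602 is configured to: when the allocation condition includes the data locality requirement and no pending task has been selected after all users have been scanned, remove the data locality requirement from the allocation condition and rescan the user queue from the head of the queue of users waiting for resource allocation; for each user scanned, determine whether the user has a pending task; if so, the scan terminates and a pending task is selected from that user's pending-task queue; if not, the next user is scanned.
Wherein the information receiving module 601 is configured to:
after receiving from a slave node a heartbeat message requesting task assignment, learn from the idle resource slot information carried in the heartbeat message that the slave node has an idle resource slot;
Wherein the resource slot type is either a map resource slot or a reduce resource slot.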
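As shown in Fig. 7, an embodiment of the present invention provides a resource management apparatus applied to a slave node of a Hadoop system, including:
a detecting and reporting module 701, configured to send to the master node a notification message carrying idle resource slot information, the idle resource slot information including the type of the idle resource slot of this node;
a receiving and processing module 702, configured to receive the pending task that the master node has assigned to the idle resource slot of this node and place the received pending task in a task launch queue;
a task launching module 703, configured to take a pending task out of the task launch queue and launch it when the task launch queue is non-empty and an idle resource slot currently exists.
Wherein the receiving and processing module 702 is configured to:
place a received map task in a map task launch queue and a received reduce task in a reduce task launch queue;
the task launching module 703 is configured to:
if the reduce task launch queue is non-empty and an idle resource slot currently exists, take a pending task out of the reduce task launch queue and launch it;
if the reduce task launch queue is empty, the map task launch queue is non-empty and an idle resource slot currently exists, take a pending task out of the map task launch queue and launch it.
Wherein the detecting and reporting module 701 is configured to:
send to the master node a heartbeat message that requests task assignment and carries the idle resource slot information;
Wherein the resource slot type is either a map resource slot or a reduce resource slot.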
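As shown in Fig. 8, an embodiment of the present invention provides a resource management system, including: a Hadoop system master node having the master-node resource management apparatus described above, and a Hadoop system slave node having the slave-node resource management apparatus described above.
The resource management method, apparatus and system provided by the above embodiments improve the scheduler on the master node and the task tracker on the slave node. They remove the restriction in the Hadoop system that a map slot can run only map tasks and a reduce slot can run only reduce tasks, keep all resource slots busy as far as possible, and thereby raise the resource utilization of the Hadoop system.
Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented as a computer program flow. The computer program may be stored in a computer-readable storage medium and executed on a corresponding hardware platform (such as a system, equipment, apparatus or device); when executed, it performs one of the steps of the method embodiments or a combination of them.
Optionally, all or some of the steps of the above embodiments may also be implemented with integrated circuits; the steps may each be made into a separate integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module.
The apparatuses, functional modules and functional units of the above embodiments may be implemented with general-purpose computing devices; they may be concentrated on a single computing device or distributed over a network formed by several computing devices.
When the apparatuses, functional modules and functional units of the above embodiments are implemented as software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc or the like.
Industrial Applicability
The embodiments of the present invention improve the scheduler on the master node and the task tracker on the slave node. They remove the restriction in the Hadoop system that a map slot can run only map tasks and a reduce slot can run only reduce tasks, keep all resource slots busy as far as possible, and thereby raise the resource utilization of the Hadoop system.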

Claims (14)

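  1. A resource management method, applied to a master node of a Hadoop system, the method comprising:
    obtaining idle resource slot information of a slave node;
    selecting a user from a queue of users waiting for resource allocation and, once a user has been selected, selecting a pending task from that user's pending-task queue, including: according to the idle slot type given in the idle resource slot information, preferentially selecting from the user's pending-task queue a pending task whose type matches the idle slot type and, when no matching pending task exists, selecting a pending task whose type differs from the idle slot type;
    after a pending task has been successfully selected, assigning the pending task to the slave node.
  2. The method of claim 1, wherein:
    selecting a user from the queue of users waiting for resource allocation includes:
    scanning the user queue starting from the head of the queue of users waiting for resource allocation;
    for each user scanned, determining whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; if not, the next user is scanned.
  3. The method of claim 2, wherein:
    the allocation condition includes: the user has a pending task that satisfies a data locality requirement.
  4. The method of claim 3, wherein:
    when the allocation condition includes the data locality requirement, selecting a pending task from the user's pending-task queue further includes:
    if no pending task has been selected after all users have been scanned, removing the data locality requirement from the allocation condition and rescanning the user queue from the head of the queue of users waiting for resource allocation; for each user scanned, determining whether the user has a pending task; if so, the scan terminates and a pending task is selected from that user's pending-task queue; if not, the next user is scanned.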
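  5. A resource management method, applied to a slave node of a Hadoop system, the method comprising:
    after detecting an idle resource slot, sending to a master node a notification message carrying idle resource slot information, the idle resource slot information including the type of the idle resource slot of this node;
    receiving the pending task that the master node has assigned to the idle resource slot of this node, and placing the received pending task in a task launch queue;
    when the task launch queue is non-empty and an idle resource slot currently exists, taking a pending task out of the task launch queue and launching it.
  6. The method of claim 5, wherein:
    receiving the pending task assigned by the master node and placing the received pending task in a task launch queue includes:
    placing a received map task in a map task launch queue and a received reduce task in a reduce task launch queue;
    taking a pending task out of the task launch queue and launching it when the task launch queue is non-empty and an idle resource slot currently exists includes:
    if the reduce task launch queue is non-empty and an idle resource slot currently exists, taking a pending task out of the reduce task launch queue and launching it;
    if the reduce task launch queue is empty, the map task launch queue is non-empty and an idle resource slot currently exists, taking a pending task out of the map task launch queue and launching it.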
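  7. A resource management apparatus, applied to a master node of a Hadoop system, comprising:
    an information receiving module, configured to obtain idle resource slot information of a slave node;
    a task scheduling module, configured to select a user from a queue of users waiting for resource allocation and, once a user has been selected, select a pending task from that user's pending-task queue, including: according to the idle slot type given in the idle resource slot information, preferentially selecting from the user's pending-task queue a pending task whose type matches the idle slot type and, when no matching pending task exists, selecting a pending task whose type differs from the idle slot type;
    an information sending module, configured to assign the pending task to the slave node after a pending task has been successfully selected.
  8. The apparatus of claim 7, wherein:
    the task scheduling module is configured to:
    scan the user queue starting from the head of the queue of users waiting for resource allocation;
    for each user scanned, determine whether the user satisfies an allocation condition; if the user satisfies the allocation condition, the scan terminates; if not, the next user is scanned.
  9. The apparatus of claim 8, wherein:
    the allocation condition includes: the user has a pending task that satisfies a data locality requirement.
  10. The apparatus of claim 9, wherein:
    the task scheduling module is configured to: when the allocation condition includes the data locality requirement and no pending task has been selected after all users have been scanned, remove the data locality requirement from the allocation condition and rescan the user queue from the head of the queue of users waiting for resource allocation; for each user scanned, determine whether the user has a pending task; if so, the scan terminates and a pending task is selected from that user's pending-task queue; if not, the next user is scanned.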
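  11. A resource management apparatus, applied to a slave node of a Hadoop system, comprising:
    a detecting and reporting module, configured to send to a master node a notification message carrying idle resource slot information, the idle resource slot information including the type of the idle resource slot of this node;
    a receiving and processing module, configured to receive the pending task that the master node has assigned to the idle resource slot of this node and place the received pending task in a task launch queue;
    a task launching module, configured to take a pending task out of the task launch queue and launch it when the task launch queue is non-empty and an idle resource slot currently exists.
  12. The apparatus of claim 11, wherein:
    the receiving and processing module is configured to:
    place a received map task in a map task launch queue and a received reduce task in a reduce task launch queue;
    the task launching module is configured to:
    if the reduce task launch queue is non-empty and an idle resource slot currently exists, take a pending task out of the reduce task launch queue and launch it;
    if the reduce task launch queue is empty, the map task launch queue is non-empty and an idle resource slot currently exists, take a pending task out of the map task launch queue and launch it.
  13. A resource management system, comprising:
    a Hadoop system master node having the resource management apparatus of any one of claims 7 to 10, and a Hadoop system slave node having the resource management apparatus of any one of claims 11 to 12.
  14. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the method of any one of claims 1 to 6.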
PCT/CN2015/095196 2015-09-10 2015-11-20 Resource management method, apparatus and system WO2016145904A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510574287.3A CN106528288A (zh) 2015-09-10 2015-09-10 Resource management method, apparatus and system
CN201510574287.3 2015-09-10

Publications (1)

Publication Number Publication Date
WO2016145904A1 (zh) 2016-09-22

Family

ID=56920369

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/095196 WO2016145904A1 (zh) 2015-09-10 2015-11-20 Resource management method, apparatus and system

Country Status (2)

Country Link
CN (1) CN106528288A (zh)
WO (1) WO2016145904A1 (zh)

Also Published As

Publication number Publication date
CN106528288A (zh) 2017-03-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15885273

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15885273

Country of ref document: EP

Kind code of ref document: A1