WO2016061935A1 - Resource scheduling method, apparatus and computer storage medium - Google Patents

Resource scheduling method, apparatus and computer storage medium

Info

Publication number
WO2016061935A1
Authority
WO
WIPO (PCT)
Prior art keywords
queue
priority
information
resource
resources
Prior art date
Application number
PCT/CN2015/071475
Other languages
English (en)
French (fr)
Inventor
陈福忠
刘新强
梁平
汪邵飞
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016061935A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 - Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 - Support for services or applications

Definitions

  • the present invention relates to communication control technologies, and in particular, to a resource scheduling method, apparatus, and computer storage medium.
  • Hadoop technology is currently the most widely used technology in big data platforms.
  • At present, Hadoop technology schedules resources using a priority- and time-based policy; specifically, all applications are submitted to a default queue.
  • In this default queue, all applications are first queued by priority, and applications of the same priority are queued in chronological order; that is, the application with the highest priority and the earliest queuing time is allocated resources first.
  • The embodiments of the present invention provide a resource scheduling method and device, which can achieve exclusive use of resources in specific service scenarios.
  • An embodiment of the invention provides a resource scheduling method, the method including:
  • pre-configuring queue attribute information, where the queue attribute information includes dedicated server information of the queue and priority information of the queue;
  • performing resource scheduling based on the dedicated server information of the queue and the priority information of the queue.
  • In another embodiment, performing resource scheduling on the jobs in all queues based on the dedicated server information of the queue and the priority information of the queue includes:
  • allocating, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
  • the method further includes: when the priorities of the queues are the same, allocating resources of the dedicated server corresponding to the queue to the queue according to a first-in first-out rule.
  • In another embodiment, when the queue attribute information does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, the method further includes:
  • the resources of all servers are allocated to the queue according to the priority of the queue from high to low.
  • the method further includes allocating resources of all the servers to the queue according to a first-in first-out rule when the priorities of the queues are the same.
  • An embodiment of the present invention further provides a resource scheduling apparatus, where the apparatus includes: a configuration unit and a scheduling unit;
  • the configuration unit is configured to pre-configure queue attribute information;
  • the queue attribute information includes dedicated server information of the queue, and priority information of the queue;
  • the scheduling unit is configured to perform resource scheduling based on the dedicated server information of the queue configured by the configuration unit and the priority information of the queue.
  • The scheduling unit is configured to allocate, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
  • the scheduling unit is further configured to allocate resources of the dedicated server corresponding to the queue to the queue according to a first-in first-out rule when the priorities of the queues are the same.
  • The scheduling unit is further configured to: when the queue attribute information configured by the configuration unit does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, allocate the resources of all servers to the queue in descending order of queue priority.
  • the scheduling unit is further configured to allocate resources of all servers to the queue according to a first-in first-out rule when the priorities of the queues are the same.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the resource scheduling method according to the embodiment of the invention.
  • With the resource scheduling method, device, and computer storage medium provided by the embodiments of the present invention, queue attribute information is pre-configured, where the queue attribute information includes dedicated server information of the queue and priority information of the queue, and resource scheduling is performed based on the dedicated server information of the queue and the priority information of the queue. In this way, exclusive use of resources in specific service scenarios is achieved, and the adverse effects caused by mutual preemption of resources between special services with high security and stability requirements are avoided.
  • FIG. 1 is a schematic flowchart of a resource scheduling method according to Embodiment 1 of the present invention.
  • FIG. 2 is a schematic structural diagram of a resource scheduling apparatus according to Embodiment 1 of the present invention.
  • FIG. 3 is a schematic flowchart of a resource scheduling method according to Embodiment 2 of the present invention.
  • FIG. 4 is a schematic diagram of traversal of a queue by a resource scheduler according to Embodiment 2 of the present invention.
  • FIG. 5 is a schematic diagram of a resource scheduler performing resource scheduling on a queue according to Embodiment 2 of the present invention.
  • FIG. 1 is a schematic flowchart of a resource scheduling method according to Embodiment 1 of the present invention; as shown in FIG. 1 , the method includes:
  • Step 101 Pre-configure queue attribute information; the queue attribute information includes dedicated server information of the queue, as well as priority information of the queue.
  • the resource scheduling method may be applied to a scheduler in a master node in a distributed system in an actual application.
  • The scheduler loads a queue configuration file before resource scheduling; queue attribute information is newly added to the queue configuration file, and the queue attribute information may be configured by a user in advance.
  • The queue attribute information includes the dedicated server information of the queue, which may, for example, take the form yarn.queueA1.hosts=C1,C3, where C1 and C3 are identifiers of the dedicated servers configured for the queue. A dedicated server identifier may be the host name of the dedicated server or the Internet Protocol (IP) address of the host; when two or more dedicated servers are set in the dedicated server information of the queue, the dedicated server identifiers are separated by commas. If the queue attribute information contains no dedicated server information for the queue, or the dedicated server information of the queue is set to be empty, the queue has no configured dedicated server, and all servers may be allocated to the queue for job processing.
  • the queue attribute information further includes the priority information of the queue, and the priority information of the queue is used to configure the queue priority; for example, the queue priority is divided into 5 levels; wherein the queue priority level 1 is the highest level; When the priority information of the queue is not configured, the priority of the queue is the lowest level by default. That is, the priority of the queue is 5 by default.
  • Step 102 Perform resource scheduling based on the dedicated server information of the queue and the priority information of the queue.
  • Here, performing resource scheduling on the jobs in all queues based on the dedicated server information of the queue and the priority information of the queue includes:
  • allocating, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
  • Specifically, taking five queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is 5, and the dedicated server configured in the queue attribute information of both queue A and queue B is server C, the resources of server C are allocated first to the jobs of queue A and then to the jobs of queue B.
  • When the priorities of the queues are the same, the resources of the dedicated server corresponding to the queues are allocated to the queues according to a first-in first-out (FIFO) rule.
  • Specifically, taking five queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is also 3, and the dedicated server configured in the queue attribute information of both queue A and queue B is server C, then, according to the chronological order of the jobs in queue A and queue B, the resources of server C are allocated first to the job in queue A or queue B whose time is earlier.
  • the technical solution of the embodiment of the present invention achieves the exclusive use of resources in a specific service scenario, and avoids the adverse effects caused by the mutual preemption of resources between special services with high security and stability requirements.
  • According to another embodiment, when the queue attribute information does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, the method further includes:
  • the resources of all servers are allocated to the queue according to the priority of the queue from high to low.
  • Allocating the resources of all servers to the queue in descending order of queue priority includes: allocating the idle resources of all servers to the queue in descending order of queue priority.
  • For example, taking five queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is 5, no dedicated server information is configured in the queue attribute information of queue A or queue B, and the servers capable of providing resources for queue A and queue B include servers C1, C2 and C3, then, when only server C1 is currently in an idle state, the resources of server C1 are allocated first to queue A and then to queue B.
  • According to another embodiment, when the priorities of the queues are the same, the resources of all servers are allocated to the queues according to the first-in first-out rule.
  • Specifically, taking five queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is also 3, no dedicated server information is configured in the queue attribute information of queue A or queue B, and the servers capable of providing resources for queue A and queue B include servers C1, C2 and C3, then, when only server C1 is currently in an idle state, the resources of server C1 are allocated, according to the chronological order of the jobs in queue A and queue B, first to the job whose time is earlier.
  • the technical solution of the embodiment not only realizes the exclusive use of resources in a specific service scenario, but also avoids the adverse effects caused by the mutual preemption of resources between special services with high security and stability requirements. Moreover, the resources are fully utilized, and the resource utilization rate is greatly improved.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the resource scheduling method according to the embodiment of the invention.
  • FIG. 2 is a schematic structural diagram of a resource scheduling apparatus according to Embodiment 1 of the present invention; as shown in FIG. 2, the apparatus includes: a configuration unit 21 and a scheduling unit 22; wherein,
  • the configuration unit 21 is configured to pre-configure queue attribute information; the queue attribute information includes dedicated server information of the queue, and priority information of the queue;
  • the scheduling unit 22 is configured to perform resource scheduling based on the dedicated server information of the queue configured by the configuration unit 21 and the priority information of the queue.
  • the resource scheduling apparatus may be implemented by a scheduler in a master node in a distributed file system in an actual application.
  • The scheduling unit 22 is configured to allocate, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
  • The scheduling unit 22 is further configured to, when the priorities of the queues are the same, allocate to the queues, according to the first-in first-out rule, the resources of the dedicated servers corresponding to the queues.
  • The scheduling unit 22 is further configured to: when the queue attribute information configured by the configuration unit 21 does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, allocate the resources of all servers to the queue in descending order of queue priority.
  • the scheduling unit 22 is further configured to allocate resources of all servers to the queue according to a first-in first-out rule when the priorities of the queues are the same.
  • In practical applications, the configuration unit 21 and the scheduling unit 22 in the apparatus may be implemented by a central processing unit (CPU, Central Processing Unit), a digital signal processor (DSP, Digital Signal Processor), or a field-programmable gate array (FPGA, Field-Programmable Gate Array) in the apparatus.
  • FIG. 3 is a schematic flowchart of a resource scheduling method according to Embodiment 2 of the present invention; as shown in FIG. 3, the method includes:
  • Step 301 The resource management node (ResourceManager) sends an initialization message to the resource scheduler (ResourceScheduler) to initialize the resource scheduler.
  • Step 302 The resource scheduler loads a queue configuration file.
  • Queue attribute information is newly added to the queue configuration file; the queue attribute information includes configuration item information, and the configuration item information is configured as the server or server group configured for a queue; the configuration item may, for example, take the form yarn.queueA1.hosts=C1,C3, where C1 and C3 are identifiers of the dedicated servers configured for the queue. A dedicated server identifier may be the host name of the dedicated server or the IP address of the host, and two dedicated server identifiers are separated by a comma; if the configuration item contains no dedicated server identifier configured for a queue, the queue has no configured dedicated server.
  • The queue attribute information further includes priority information of the queue, and the priority information of the queue is configured to set the queue priority; for example, the queue priority is divided into five levels, where priority level 1 is the highest; when no priority information is configured for a queue, the priority of the queue defaults to the lowest level, that is, level 5. Further, the queue configuration file is loaded into the corresponding queue objects (Queue).
  • Step 303 The resource scheduler traverses the queue object to obtain queue attribute information of each queue.
  • Here, the resource scheduler starts from the root queue, traverses the entire hierarchical queue structure from the root queue to the leaf queues, and obtains the queue attribute information of each queue, that is, acquires the dedicated server information of the queue and the priority information of the queue, and saves the dedicated server information of the queue and the priority information of the queue in a memory object.
  • FIG. 4 is a schematic diagram of a resource scheduler traversing queues according to Embodiment 2 of the present invention; as shown in FIG. 4, it is assumed that the system includes three leaf queues A1, A2 and B, and the servers that process jobs include three servers C1, C2 and C3; the resource scheduler starts from the root queue (ROOT) and traverses the hierarchical queues from the root queue to the leaf queues, obtaining the queue attribute information of the leaf queues A1, A2 and B, as shown in FIG. 4.
  • The priority of leaf queue A1 is 3, and the C1 and C3 servers are the dedicated servers of leaf queue A1; the priority of leaf queue A2 is 3, and the C1 and C2 servers are the dedicated servers of leaf queue A2; leaf queue B has neither configuration item information nor priority information set, which indicates that the priority of leaf queue B is 5 and that all servers can process the jobs in leaf queue B, but the jobs in leaf queue B can be processed only when any one of the servers is idle.
  • the queue information processed by each server is as follows:
  • the queues processed by the C1 server are: A1 (priority is 3), A2 (priority is 3), and B (priority is 5);
  • the queues processed by the C2 server are: A2 (priority is 3) and B (priority is 5);
  • the queues processed by the C3 server are: A1 (priority is 3) and B (priority is 5).
  • Step 304 Acquire job slice information from a temporary directory of the distributed computing engine on HDFS, and generate internal job (Task) objects according to the job slice information.
  • Step 305 The job management program (AppMaster) acquires the resource request information of the Task according to the Task object.
  • The resource request information of the Task includes attribute information such as the task priority, the host (Host) on which the desired resources are located, the amount of resources (specifically including memory, CPU, and the like), the number of containers (Container), and whether locality may be relaxed.
  • Step 306 The job management program (AppMaster) sends a heartbeat message to the resource management node (ResourceManager) to request resource allocation.
  • Step 307 The resource management node (ResourceManager) triggers the resource scheduler, and saves the resource allocation request of the job management program (AppMaster) into the memory of the resource management node (ResourceManager).
  • Step 308 The compute node (NodeManager) reports a heartbeat message to the resource management node (ResourceManager) and releases idle Containers, so as to prepare new resource allocations for the resource management node (ResourceManager).
  • Step 309 The resource management node (ResourceManager) triggers the resource scheduler to perform resource allocation.
  • Step 310 The resource scheduler cleans up the internal Container.
  • Step 311 The resource scheduler traverses the queue tree from the root queue; and finds the high priority leaf queue through the binary tree algorithm.
  • FIG. 5 is a schematic diagram of a resource scheduler performing resource scheduling on queues according to Embodiment 2 of the present invention; as shown in FIG. 5, the resource scheduler uses the queue attribute information saved in step 303; when a job is submitted to a specific queue for which a dedicated server is configured, the dedicated server processes only the jobs in that specific queue.
  • Jobs in a higher-priority queue are preferentially allocated the resources of the dedicated server corresponding to that higher-priority queue; jobs in queues of the same priority are allocated resources according to the FIFO algorithm.
  • When the C1 server in the compute nodes releases resources, the data of leaf queue A1 (priority 3), leaf queue A2 (priority 3) and leaf queue B (priority 5) are found; because leaf queue A1 and leaf queue A2 have the same priority, when allocating resources the resource scheduler first allocates resources according to the FIFO algorithm among the waiting jobs of leaf queue A1 and leaf queue A2; if leaf queue A1 and leaf queue A2 have no waiting jobs, resources are allocated according to the FIFO algorithm in leaf queue B.
  • Step 312 The resource scheduler finds the application with the highest priority through the binary tree algorithm and allocates resources to it; if the allocation succeeds, this round of resource allocation ends and the resource allocation result is saved.
  • Here, each application carries priority information, and the resource scheduler can obtain the priority information of each application through the binary tree algorithm and find the application with the highest priority.
  • Step 313 The job management program (AppMaster) sends a heartbeat message to the resource management node (ResourceManager) to request resource allocation; the resource management node (ResourceManager) sends a request message to the resource scheduler, where the request message requests the resource allocation result; the resource scheduler returns a response message to the request message, the response message carrying the resource allocation result; and the resource management node (ResourceManager) carries the resource allocation result in a heartbeat message sent to the job management program (AppMaster).
  • Step 314 The job management program (AppMaster) allocates resources according to the resource allocation result in the following order: jobs with higher priority, localization of source data, the same rack, and then other racks.
  • Specifically, the job management program (AppMaster) first allocates resources in order of priority, preferentially allocating resources to high-priority jobs; if resources remain, resources are preferentially allocated to local jobs (that is, jobs on this server); if resources still remain, resources are preferentially allocated to jobs on other servers in the same rack; and if resources still remain, resources are allocated to jobs on servers in other racks.
  • Step 315 The job management program (AppMaster) sends a heartbeat message to the resource management node (ResourceManager) to notify the resource management node (ResourceManager) to release other resource requests of the Task.
  • Step 316 The job manager (AppMaster) sends a message to the computing node (NodeManager) requesting to start the task.
  • Step 317 The compute node (NodeManager) starts the task.
  • In this embodiment, the resource management node (ResourceManager) and the resource scheduler (ResourceScheduler) are both functional units in a master node of the distributed system; the resource management node (ResourceManager) is mainly configured to control and trigger the resource scheduler (ResourceScheduler), and the resource scheduler (ResourceScheduler) is mainly configured to schedule resources;
  • the compute node (NodeManager) is a functional unit of a slave node of the distributed system;
  • the job management program (AppMaster) is a functional unit in a slave node of the distributed system.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention can take the form of a hardware embodiment, a software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) including computer usable program code.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • In the embodiments of the present invention, queue attribute information is pre-configured, where the queue attribute information includes dedicated server information of the queue and priority information of the queue, and resource scheduling is performed based on the dedicated server information of the queue and the priority information of the queue. In this way, exclusive use of resources in specific service scenarios is achieved, and the adverse effects caused by mutual preemption of resources between special services with high security and stability requirements are avoided.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

Embodiments of the present invention disclose a resource scheduling method, an apparatus and a computer storage medium; the resource scheduling method includes: pre-configuring queue attribute information, where the queue attribute information includes dedicated server information of a queue and priority information of the queue; and performing resource scheduling based on the dedicated server information of the queue and the priority information of the queue.

Description

Resource scheduling method, apparatus and computer storage medium
Technical Field
The present invention relates to communication control technologies, and in particular to a resource scheduling method, an apparatus and a computer storage medium.
Background
Hadoop technology is currently the most widely used technology in big data platforms. At present, Hadoop technology schedules resources using a priority- and time-based policy; specifically, all applications are submitted to a default queue, in which all applications are first queued by priority, and applications of the same priority are queued in chronological order; that is, the application with the highest priority and the earliest queuing time is allocated resources first.
However, with the popularization of Hadoop technology, the number of users and the variety of applications in a single Hadoop cluster keep increasing. The above resource scheduling mechanism can no longer make good use of the resources of the cluster, nor can it meet the quality-of-service requirements of different applications; in particular, in scenarios where a specific high-priority application needs exclusive use of resources, the above resource scheduling mechanism can no longer satisfy the needs of that specific scenario, and a new resource scheduling solution is therefore urgently needed.
Summary of the Invention
To solve the existing technical problems, embodiments of the present invention provide a resource scheduling method and apparatus capable of achieving exclusive use of resources in specific service scenarios.
To achieve the above objective, the technical solutions of the embodiments of the present invention are implemented as follows:
An embodiment of the present invention provides a resource scheduling method, the method including:
pre-configuring queue attribute information, where the queue attribute information includes dedicated server information of a queue and priority information of the queue; and
performing resource scheduling based on the dedicated server information of the queue and the priority information of the queue.
In another embodiment, performing resource scheduling on the jobs in all queues based on the dedicated server information of the queue and the priority information of the queue includes:
allocating, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
In another embodiment, the method further includes: when the priorities of the queues are the same, allocating, to the queues according to a first-in first-out rule, the resources of the dedicated servers corresponding to the queues.
In another embodiment, when the queue attribute information does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, the method further includes:
allocating the resources of all servers to the queue in descending order of queue priority.
In another embodiment, the method further includes: when the priorities of the queues are the same, allocating the resources of all servers to the queues according to the first-in first-out rule.
An embodiment of the present invention further provides a resource scheduling apparatus, the apparatus including a configuration unit and a scheduling unit, where
the configuration unit is configured to pre-configure queue attribute information, the queue attribute information including dedicated server information of a queue and priority information of the queue; and
the scheduling unit is configured to perform resource scheduling based on the dedicated server information of the queue and the priority information of the queue configured by the configuration unit.
In another embodiment, the scheduling unit is configured to allocate, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
In another embodiment, the scheduling unit is further configured to, when the priorities of the queues are the same, allocate, to the queues according to the first-in first-out rule, the resources of the dedicated servers corresponding to the queues.
In another embodiment, the scheduling unit is further configured to, when the queue attribute information configured by the configuration unit does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, allocate the resources of all servers to the queue in descending order of queue priority.
In another embodiment, the scheduling unit is further configured to, when the priorities of the queues are the same, allocate the resources of all servers to the queues according to the first-in first-out rule.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the resource scheduling method described in the embodiments of the present invention.
With the resource scheduling method, apparatus and computer storage medium provided by the embodiments of the present invention, queue attribute information is pre-configured, the queue attribute information including dedicated server information of a queue and priority information of the queue, and resource scheduling is performed based on the dedicated server information of the queue and the priority information of the queue. In this way, exclusive use of resources in specific service scenarios is achieved, and the adverse effects caused by mutual preemption of resources between special services with high security and stability requirements are avoided.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a resource scheduling method according to Embodiment 1 of the present invention;
FIG. 2 is a schematic structural diagram of a resource scheduling apparatus according to Embodiment 1 of the present invention;
FIG. 3 is a schematic flowchart of a resource scheduling method according to Embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of a resource scheduler traversing queues according to Embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of a resource scheduler performing resource scheduling on queues according to Embodiment 2 of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
An embodiment of the present invention provides a resource scheduling method; FIG. 1 is a schematic flowchart of the resource scheduling method according to Embodiment 1 of the present invention; as shown in FIG. 1, the method includes:
Step 101: Pre-configure queue attribute information; the queue attribute information includes dedicated server information of a queue and priority information of the queue.
In this embodiment, in practical applications the resource scheduling method may be applied to a scheduler in a master node of a distributed system. The scheduler loads a queue configuration file before resource scheduling; queue attribute information is newly added to the queue configuration file, and the queue attribute information may be configured by a user in advance.
The queue attribute information includes the dedicated server information of the queue, which may specifically take the following form: yarn.queueA1.hosts=C1,C3, where C1 and C3 are identifiers of the dedicated servers configured for the queue; a dedicated server identifier may be the host name of the dedicated server or the Internet Protocol (IP) address of the host; when two or more dedicated servers are set in the dedicated server information of the queue, the dedicated server identifiers are separated by commas. If the queue attribute information contains no dedicated server information for the queue, or the dedicated server information of the queue is set to be empty, the queue has no configured dedicated server, and all servers may be allocated to the queue for job processing.
The queue attribute information further includes the priority information of the queue, which is used to configure the queue priority; for example, the queue priority is divided into 5 levels, where priority level 1 is the highest; when no priority information is configured for a queue, the priority of the queue defaults to the lowest level, that is, level 5.
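The configuration scheme described above can be pictured with a short sketch. The snippet below is a minimal, illustrative parser only: the yarn.queueA1.hosts property and the comma-separated host list come from the description, while the priority property name (yarn.queueA1.priority) and the parsing details are assumptions made for this example rather than part of the patent.

```python
# Illustrative sketch: parse queue attribute information of the kind described above.
# The "hosts" property format follows the description; the "priority" property name
# is an assumed placeholder, not a documented key.

DEFAULT_PRIORITY = 5  # lowest of the 5 levels; used when no priority is configured


def parse_queue_attributes(properties):
    """Build {queue_name: {"hosts": [...], "priority": int}} from flat properties."""
    queues = {}
    for key, value in properties.items():
        parts = key.split(".")  # e.g. "yarn.queueA1.hosts"
        if len(parts) != 3 or parts[0] != "yarn":
            continue
        _, queue_name, attribute = parts
        entry = queues.setdefault(queue_name, {"hosts": [], "priority": DEFAULT_PRIORITY})
        if attribute == "hosts":
            # Two or more dedicated server identifiers are separated by commas;
            # an empty value means the queue has no dedicated server.
            entry["hosts"] = [h.strip() for h in value.split(",") if h.strip()]
        elif attribute == "priority":
            entry["priority"] = int(value)  # level 1 is the highest of the 5 levels
    return queues


if __name__ == "__main__":
    config = {
        "yarn.queueA1.hosts": "C1,C3",  # dedicated servers C1 and C3, as in the description
        "yarn.queueA1.priority": "3",   # assumed property name for the priority level
        "yarn.queueB.hosts": "",        # empty: queue B has no dedicated server
    }
    print(parse_queue_attributes(config))
    # {'queueA1': {'hosts': ['C1', 'C3'], 'priority': 3},
    #  'queueB': {'hosts': [], 'priority': 5}}
```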
Step 102: Perform resource scheduling based on the dedicated server information of the queue and the priority information of the queue.
Here, performing resource scheduling on the jobs in all queues based on the dedicated server information of the queue and the priority information of the queue includes:
allocating, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
Specifically, taking 5 queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is 5, and the dedicated server configured in the queue attribute information of both queue A and queue B is server C, the resources of server C are allocated first to the jobs of queue A and then to the jobs of queue B.
When the priorities of the queues are the same, the resources of the dedicated server corresponding to the queues are allocated to the queues according to a first-in first-out (FIFO, First Input First Output) rule.
Specifically, taking 5 queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is also 3, and the dedicated server configured in the queue attribute information of both queue A and queue B is server C, then, according to the chronological order of the jobs in queue A and queue B, the resources of server C are allocated first to the job in queue A or queue B whose time is earlier.
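As a rough sketch of the allocation rule in this embodiment, the following snippet orders waiting jobs by queue priority (level 1 highest) and, for equal priorities, by submission time, and lets a freed server serve only queues for which it is a dedicated server (or queues with no dedicated server configured). The in-memory structures are simplified assumptions, not the scheduler's real data model.

```python
# Illustrative sketch: descending queue priority first, first-in first-out among
# queues of equal priority, and a server may only serve a queue if it is that
# queue's dedicated server (or the queue has no dedicated server configured).


def eligible(queue, server):
    """A server may serve a queue if it is one of its dedicated servers, or none are set."""
    return not queue["hosts"] or server in queue["hosts"]


def pick_job(queues, waiting_jobs, freed_server):
    """Return the waiting job that the freed server should serve next, or None."""
    candidates = [job for job in waiting_jobs
                  if eligible(queues[job["queue"]], freed_server)]
    if not candidates:
        return None
    # A smaller priority number means a higher priority; an earlier submit_time wins ties (FIFO).
    return min(candidates,
               key=lambda job: (queues[job["queue"]]["priority"], job["submit_time"]))


if __name__ == "__main__":
    queues = {
        "A": {"hosts": ["C"], "priority": 3},
        "B": {"hosts": ["C"], "priority": 5},
    }
    waiting_jobs = [
        {"name": "job-b1", "queue": "B", "submit_time": 1},
        {"name": "job-a1", "queue": "A", "submit_time": 2},
    ]
    # Queue A outranks queue B, so job-a1 is chosen even though job-b1 arrived earlier.
    print(pick_job(queues, waiting_jobs, "C")["name"])  # job-a1
```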
With the technical solution of the embodiment of the present invention, exclusive use of resources in specific service scenarios is achieved, and the adverse effects caused by mutual preemption of resources between special services with high security and stability requirements are avoided.
According to another embodiment of the present invention, when the queue attribute information does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, the method further includes:
allocating the resources of all servers to the queue in descending order of queue priority.
Specifically, allocating the resources of all servers to the queue in descending order of queue priority includes: allocating the idle resources of all servers to the queue in descending order of queue priority. For example, taking 5 queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is 5, no dedicated server information is configured in the queue attribute information of queue A or queue B, and the servers capable of providing resources for queue A and queue B include servers C1, C2 and C3, then, when only server C1 is currently in an idle state, the resources of server C1 are allocated first to queue A and then to queue B.
According to another embodiment of the present invention, when the priorities of the queues are the same, the resources of all servers are allocated to the queues according to the first-in first-out rule.
Specifically, taking 5 queue priority levels as an example, when the priority of queue A is 3 and the priority of queue B is also 3, no dedicated server information is configured in the queue attribute information of queue A or queue B, and the servers capable of providing resources for queue A and queue B include servers C1, C2 and C3, then, when only server C1 is currently in an idle state, the resources of server C1 are allocated, according to the chronological order of the jobs in queue A and queue B, first to the job whose time is earlier.
With the technical solution of this embodiment, not only is exclusive use of resources in specific service scenarios achieved and the adverse effects caused by mutual preemption of resources between special services with high security and stability requirements avoided, but the resources are also fully utilized and the resource utilization rate is greatly improved.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the resource scheduling method described in the embodiments of the present invention.
An embodiment of the present invention further provides a resource scheduling apparatus; FIG. 2 is a schematic structural diagram of the resource scheduling apparatus according to Embodiment 1 of the present invention; as shown in FIG. 2, the apparatus includes a configuration unit 21 and a scheduling unit 22, where
the configuration unit 21 is configured to pre-configure queue attribute information, the queue attribute information including dedicated server information of a queue and priority information of the queue; and
the scheduling unit 22 is configured to perform resource scheduling based on the dedicated server information of the queue and the priority information of the queue configured by the configuration unit 21.
In this embodiment, in practical applications the resource scheduling apparatus may be implemented by a scheduler in a master node of a distributed file system.
According to another embodiment of the present invention, the scheduling unit 22 is configured to allocate, to the queue in descending order of queue priority, the resources of the dedicated server corresponding to the queue.
According to another embodiment of the present invention, the scheduling unit 22 is further configured to, when the priorities of the queues are the same, allocate, to the queues according to the first-in first-out rule, the resources of the dedicated servers corresponding to the queues.
According to another embodiment of the present invention, the scheduling unit 22 is further configured to, when the queue attribute information configured by the configuration unit 21 does not include the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, allocate the resources of all servers to the queue in descending order of queue priority.
According to another embodiment of the present invention, the scheduling unit 22 is further configured to, when the priorities of the queues are the same, allocate the resources of all servers to the queues according to the first-in first-out rule.
Those skilled in the art should understand that the functions of the processing units in the resource scheduling apparatus of the embodiments of the present invention can be understood with reference to the foregoing description of the resource scheduling method, and that the processing units in the resource scheduling apparatus of the embodiments of the present invention may be implemented by analog circuits that realize the functions described in the embodiments of the present invention, or by running, on an intelligent terminal, software that performs the functions described in the embodiments of the present invention.
In this embodiment, in practical applications the configuration unit 21 and the scheduling unit 22 in the apparatus may be implemented by a central processing unit (CPU, Central Processing Unit), a digital signal processor (DSP, Digital Signal Processor) or a field-programmable gate array (FPGA, Field-Programmable Gate Array) in the apparatus.
An embodiment of the present invention further provides a resource scheduling method; FIG. 3 is a schematic flowchart of the resource scheduling method according to Embodiment 2 of the present invention; as shown in FIG. 3, the method includes:
Step 301: The resource management node (ResourceManager) sends an initialization message to the resource scheduler (ResourceScheduler) to initialize the resource scheduler.
Step 302: The resource scheduler loads a queue configuration file. Queue attribute information is newly added to the queue configuration file; the queue attribute information includes configuration item information, and the configuration item information is configured as the server or server group configured for a queue; the configuration item may specifically take the following form: yarn.queueA1.hosts=C1,C3, where C1 and C3 are identifiers of the dedicated servers configured for the queue; a dedicated server identifier may be the host name of the dedicated server or the IP address of the host, and two dedicated server identifiers are separated by a comma; if the configuration item contains no dedicated server identifier configured for a queue, the queue has no configured dedicated server. The queue attribute information further includes priority information of the queue, which is configured to set the queue priority; for example, the queue priority is divided into 5 levels, where priority level 1 is the highest; when no priority information is configured for a queue, the priority of the queue defaults to the lowest level, that is, level 5. Further, the queue configuration file is loaded into the corresponding queue objects (Queue).
Step 303: The resource scheduler traverses the queue objects to obtain the queue attribute information of each queue.
Here, the resource scheduler starts from the root queue, traverses the entire hierarchical queue structure from the root queue to the leaf queues, obtains the queue attribute information of each queue, that is, acquires the dedicated server information of the queue and the priority information of the queue, and saves the dedicated server information of the queue and the priority information of the queue in a memory object.
Specifically, FIG. 4 is a schematic diagram of a resource scheduler traversing queues according to Embodiment 2 of the present invention; as shown in FIG. 4, it is assumed that the system includes three leaf queues A1, A2 and B, and the servers that process jobs include three servers C1, C2 and C3. The resource scheduler starts from the root queue (ROOT) and traverses the hierarchical queues from the root queue to the leaf queues, obtaining the queue attribute information of the leaf queues A1, A2 and B, as shown in FIG. 4: the priority of leaf queue A1 is 3, and the C1 and C3 servers are the dedicated servers of leaf queue A1; the priority of leaf queue A2 is 3, and the C1 and C2 servers are the dedicated servers of leaf queue A2; leaf queue B has neither configuration item information nor priority information set, which indicates that the priority of leaf queue B is 5 and that all servers can process the jobs in leaf queue B, but the jobs in leaf queue B can be processed only when any one of the servers is idle.
The queue information processed by each server is then as follows:
the queues processed by the C1 server are: A1 (priority 3), A2 (priority 3) and B (priority 5);
the queues processed by the C2 server are: A2 (priority 3) and B (priority 5);
the queues processed by the C3 server are: A1 (priority 3) and B (priority 5).
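The per-server view above can be reproduced with a small traversal sketch. The tree below mirrors the FIG. 4 example (leaf queues A1, A2 and B; servers C1, C2 and C3); the node structure and the traversal code are assumptions made for illustration and are not the scheduler's actual queue objects.

```python
# Illustrative sketch: walk a hierarchical queue tree from the root to the leaves,
# collect each leaf queue's attributes, and invert them into a per-server view.

ALL_SERVERS = ["C1", "C2", "C3"]


def collect_leaves(node, leaves):
    """Depth-first walk from the root; only leaf queues carry attribute information."""
    children = node.get("children", [])
    if not children:
        leaves[node["name"]] = {
            "priority": node.get("priority", 5),  # an unset priority defaults to the lowest level
            "hosts": node.get("hosts", []),       # an empty list means no dedicated server
        }
        return
    for child in children:
        collect_leaves(child, leaves)


def queues_per_server(leaves):
    """Which queues may each server process: its dedicated queues, plus queues with no dedicated server."""
    mapping = {server: [] for server in ALL_SERVERS}
    for name, attrs in leaves.items():
        for server in (attrs["hosts"] or ALL_SERVERS):
            mapping[server].append((name, attrs["priority"]))
    return mapping


if __name__ == "__main__":
    root = {"name": "ROOT", "children": [
        {"name": "A", "children": [
            {"name": "A1", "priority": 3, "hosts": ["C1", "C3"]},
            {"name": "A2", "priority": 3, "hosts": ["C1", "C2"]},
        ]},
        {"name": "B"},  # no configuration item or priority set: any server, priority 5
    ]}
    leaves = {}
    collect_leaves(root, leaves)
    for server, queues in queues_per_server(leaves).items():
        print(server, queues)
    # C1 [('A1', 3), ('A2', 3), ('B', 5)]
    # C2 [('A2', 3), ('B', 5)]
    # C3 [('A1', 3), ('B', 5)]
```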
Step 304: Obtain job slice information from a temporary directory of the distributed computing engine on HDFS, and generate internal job (Task) objects according to the job slice information.
Step 305: The job management program (AppMaster) obtains the resource request information of the Tasks according to the Task objects; the resource request information of a Task includes attribute information such as the task priority, the host (Host) on which the desired resources are located, the amount of resources (specifically including memory, CPU, and the like), the number of containers (Container), and whether locality may be relaxed.
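To make the request fields concrete, here is a small sketch of such a resource request as a plain data class. The field names and defaults are assumptions chosen for readability; they do not mirror the actual YARN protocol records.

```python
# Illustrative sketch: a Task resource request carrying the attributes listed in step 305.
# Field names are assumed for this example only.
from dataclasses import dataclass


@dataclass
class TaskResourceRequest:
    priority: int                # task priority
    host: str                    # host on which the desired resources are located
    memory_mb: int               # amount of memory requested
    vcores: int                  # amount of CPU requested
    num_containers: int = 1      # number of containers (Container) requested
    relax_locality: bool = True  # whether locality may be relaxed


if __name__ == "__main__":
    request = TaskResourceRequest(priority=1, host="C1", memory_mb=2048, vcores=2,
                                  num_containers=4, relax_locality=False)
    print(request)
```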
Step 306: The job management program (AppMaster) sends a heartbeat message to the resource management node (ResourceManager) to request resource allocation.
Step 307: The resource management node (ResourceManager) triggers the resource scheduler, and saves the resource allocation request of the job management program (AppMaster) into the memory of the resource management node (ResourceManager).
Step 308: The compute node (NodeManager) reports a heartbeat message to the resource management node (ResourceManager) and releases idle Containers, so as to prepare new resource allocations for the resource management node (ResourceManager).
Step 309: The resource management node (ResourceManager) triggers the resource scheduler to perform resource allocation.
Step 310: The resource scheduler cleans up internal Containers.
Step 311: The resource scheduler traverses the queue tree starting from the root queue, and finds the high-priority leaf queues through a binary tree algorithm. FIG. 5 is a schematic diagram of a resource scheduler performing resource scheduling on queues according to Embodiment 2 of the present invention; as shown in FIG. 5, the resource scheduler uses the queue attribute information saved in step 303; when a job is submitted to a specific queue for which a dedicated server is configured, the dedicated server processes only the jobs in that specific queue.
Jobs in a higher-priority queue are preferentially allocated the resources of the dedicated server corresponding to that higher-priority queue; jobs in queues of the same priority are allocated resources according to the FIFO algorithm.
As shown in FIG. 5, when the C1 server in the compute nodes (NodeManager) releases resources, the data of leaf queue A1 (priority 3), leaf queue A2 (priority 3) and leaf queue B (priority 5) are found; because leaf queue A1 and leaf queue A2 have the same priority, when allocating resources the resource scheduler first allocates resources according to the FIFO algorithm among the waiting jobs of leaf queue A1 and leaf queue A2; if leaf queue A1 and leaf queue A2 have no waiting jobs, resources are allocated according to the FIFO algorithm in leaf queue B.
Step 312: The resource scheduler finds the application with the highest priority through the binary tree algorithm and allocates resources to it; if the allocation succeeds, this round of resource allocation ends and the resource allocation result is saved.
Here, each application carries priority information, and the resource scheduler can obtain the priority information of each application through the binary tree algorithm and find the application with the highest priority.
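The description only says that a binary tree algorithm is used to locate the highest-priority application; the exact data structure is not specified. As a hedged stand-in, the sketch below keeps applications in a binary heap (Python's heapq), which yields the same "highest priority first, earliest submission first on ties" ordering.

```python
# Illustrative sketch: keep applications in a binary heap so that the one with the
# highest priority (and the earliest submission time on ties) is popped first.
# heapq is a stand-in for the "binary tree algorithm" mentioned in the description.
import heapq


class ApplicationHeap:
    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker preserving insertion (FIFO) order

    def add(self, app_name, queue_priority, submit_time):
        # Smaller tuples pop first: priority level 1 beats level 5, an earlier time beats a later one.
        heapq.heappush(self._heap, (queue_priority, submit_time, self._counter, app_name))
        self._counter += 1

    def pop_highest(self):
        return heapq.heappop(self._heap)[-1] if self._heap else None


if __name__ == "__main__":
    apps = ApplicationHeap()
    apps.add("app-in-B", queue_priority=5, submit_time=1)
    apps.add("app-in-A1", queue_priority=3, submit_time=3)
    apps.add("app-in-A2", queue_priority=3, submit_time=2)
    print(apps.pop_highest())  # app-in-A2 (same priority as app-in-A1, but submitted earlier)
    print(apps.pop_highest())  # app-in-A1
    print(apps.pop_highest())  # app-in-B
```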
Step 313: The job management program (AppMaster) sends a heartbeat message to the resource management node (ResourceManager) to request resource allocation; the resource management node (ResourceManager) sends a request message to the resource scheduler, the request message requesting the resource allocation result; the resource scheduler returns a response message to the request message, the response message carrying the resource allocation result; and the resource management node (ResourceManager) carries the resource allocation result in a heartbeat message sent to the job management program (AppMaster).
Step 314: The job management program (AppMaster) allocates resources according to the resource allocation result in the following order: jobs with higher priority, localization of source data, the same rack, and then other racks.
Specifically, the job management program (AppMaster) first allocates resources in order of priority, preferentially allocating resources to high-priority jobs; if resources remain, resources are preferentially allocated to local jobs (that is, jobs on this server); if resources still remain, resources are preferentially allocated to jobs on other servers in the same rack; and if resources still remain, resources are allocated to jobs on servers in other racks.
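The ordering in step 314 can be expressed as a simple sort key. The sketch below is an illustration under assumed inputs (a pending-job list with explicit data-host tags and a host-to-rack map); it does not reproduce the real AppMaster logic.

```python
# Illustrative sketch of the step-314 ordering: higher-priority jobs first, then jobs
# whose source data is local to the freed server, then jobs on the same rack, and
# finally jobs on other racks. Inputs are simplified assumptions for the example.

LOCALITY_RANK = {"node_local": 0, "rack_local": 1, "off_rack": 2}


def locality(job, server, rack_of):
    """Classify where the job's source data lives relative to the freed server."""
    if job["data_host"] == server:
        return "node_local"
    if rack_of[job["data_host"]] == rack_of[server]:
        return "rack_local"
    return "off_rack"


def allocation_order(jobs, server, rack_of):
    """Sort pending jobs in the order in which resources would be handed out."""
    return sorted(jobs, key=lambda j: (j["priority"], LOCALITY_RANK[locality(j, server, rack_of)]))


if __name__ == "__main__":
    rack_of = {"C1": "rack-1", "C2": "rack-1", "C3": "rack-2"}
    jobs = [
        {"name": "j1", "priority": 2, "data_host": "C3"},  # data on another rack
        {"name": "j2", "priority": 2, "data_host": "C1"},  # data local to the freed server C1
        {"name": "j3", "priority": 1, "data_host": "C2"},  # higher priority wins regardless of locality
    ]
    print([j["name"] for j in allocation_order(jobs, "C1", rack_of)])  # ['j3', 'j2', 'j1']
```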
Step 315: The job management program (AppMaster) sends a heartbeat message to the resource management node (ResourceManager) to notify the resource management node (ResourceManager) to release the other resource requests of the Task.
Step 316: The job management program (AppMaster) sends a message to the compute node (NodeManager) requesting that the Task be started.
Step 317: The compute node (NodeManager) starts the Task.
In this embodiment, the resource management node (ResourceManager) and the resource scheduler (ResourceScheduler) are both functional units in a master node of the distributed system; the resource management node (ResourceManager) is mainly configured to control and trigger the resource scheduler (ResourceScheduler), and the resource scheduler (ResourceScheduler) is mainly configured to schedule resources; the compute node (NodeManager) is a functional unit of a slave node of the distributed system; and the job management program (AppMaster) is a functional unit in a slave node of the distributed system.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.
Industrial Applicability
In the embodiments of the present invention, queue attribute information is pre-configured, where the queue attribute information includes dedicated server information of a queue and priority information of the queue, and resource scheduling is performed based on the dedicated server information of the queue and the priority information of the queue. In this way, exclusive use of resources in specific service scenarios is achieved, and the adverse effects caused by mutual preemption of resources between special services with high security and stability requirements are avoided.

Claims (11)

  1. A resource scheduling method, the method comprising:
    pre-configuring queue attribute information, wherein the queue attribute information comprises dedicated server information of a queue and priority information of the queue; and
    performing resource scheduling based on the dedicated server information of the queue and the priority information of the queue.
  2. The method according to claim 1, wherein performing resource scheduling on jobs in all queues based on the dedicated server information of the queue and the priority information of the queue comprises:
    allocating, to the queue in descending order of queue priority, resources of a dedicated server corresponding to the queue.
  3. The method according to claim 2, wherein the method further comprises: when the priorities of the queues are the same, allocating, to the queues according to a first-in first-out rule, the resources of the dedicated servers corresponding to the queues.
  4. The method according to claim 1, wherein, when the queue attribute information does not comprise the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, the method further comprises:
    allocating resources of all servers to the queue in descending order of queue priority.
  5. The method according to claim 4, wherein the method further comprises: when the priorities of the queues are the same, allocating the resources of all servers to the queues according to the first-in first-out rule.
  6. A resource scheduling apparatus, the apparatus comprising: a configuration unit and a scheduling unit; wherein
    the configuration unit is configured to pre-configure queue attribute information, the queue attribute information comprising dedicated server information of a queue and priority information of the queue; and
    the scheduling unit is configured to perform resource scheduling based on the dedicated server information of the queue and the priority information of the queue configured by the configuration unit.
  7. The apparatus according to claim 6, wherein the scheduling unit is configured to allocate, to the queue in descending order of queue priority, resources of a dedicated server corresponding to the queue.
  8. The apparatus according to claim 7, wherein the scheduling unit is further configured to, when the priorities of the queues are the same, allocate, to the queues according to a first-in first-out rule, the resources of the dedicated servers corresponding to the queues.
  9. The apparatus according to claim 6, wherein the scheduling unit is further configured to, when the queue attribute information configured by the configuration unit does not comprise the dedicated server information of the queue, or the dedicated server information of the queue is configured to be empty, allocate resources of all servers to the queue in descending order of queue priority.
  10. The apparatus according to claim 9, wherein the scheduling unit is further configured to, when the priorities of the queues are the same, allocate the resources of all servers to the queues according to the first-in first-out rule.
  11. A computer storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to execute the resource scheduling method according to any one of claims 1 to 5.
PCT/CN2015/071475 2014-10-20 2015-01-23 Resource scheduling method and apparatus, and computer storage medium WO2016061935A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410558180.5A CN105592110B (zh) 2014-10-20 2014-10-20 Resource scheduling method and apparatus
CN201410558180.5 2014-10-20

Publications (1)

Publication Number Publication Date
WO2016061935A1 true WO2016061935A1 (zh) 2016-04-28

Family

ID=55760127

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/071475 WO2016061935A1 (zh) 2015-01-23 Resource scheduling method and apparatus, and computer storage medium

Country Status (2)

Country Link
CN (1) CN105592110B (zh)
WO (1) WO2016061935A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111580945A (zh) * 2020-04-21 2020-08-25 智业互联(厦门)健康科技有限公司 Microservice task coordination and scheduling method and system
CN112380017A (zh) * 2020-11-30 2021-02-19 成都虚谷伟业科技有限公司 Memory management system based on loose memory release
CN113553361A (zh) * 2021-07-30 2021-10-26 北京东方国信科技股份有限公司 Resource management method and apparatus
CN117234740A (zh) * 2023-11-13 2023-12-15 沐曦集成电路(杭州)有限公司 GPU hardware resource scheduling method, apparatus, device and medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107665143B (zh) * 2016-07-27 2020-10-16 华为技术有限公司 Resource management method, apparatus and system
CN107889155A (zh) * 2016-09-30 2018-04-06 中兴通讯股份有限公司 Network slice management method and apparatus
CN107194608B (zh) * 2017-06-13 2021-09-17 复旦大学 Crowdsourced labeling task allocation method for communities of people with disabilities
CN108667654B (zh) * 2018-04-19 2021-04-20 北京奇艺世纪科技有限公司 Automatic server cluster capacity expansion method and related device
CN110175073B (zh) * 2019-05-31 2022-05-31 杭州数梦工场科技有限公司 Scheduling method, sending method, apparatus and related device for data exchange jobs

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1670707A (zh) * 2004-03-19 2005-09-21 联想(北京)有限公司 Cluster job management method
CN103294531A (zh) * 2012-03-05 2013-09-11 阿里巴巴集团控股有限公司 Task allocation method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101668237A (zh) * 2009-10-20 2010-03-10 国网信息通信有限公司 Service memory configuration method and module
CN103596285A (zh) * 2012-08-16 2014-02-19 华为技术有限公司 Radio resource scheduling method, radio resource scheduler and system
US9400682B2 (en) * 2012-12-06 2016-07-26 Hewlett Packard Enterprise Development Lp Ranking and scheduling of monitoring tasks
CN103873279B (zh) * 2012-12-13 2015-07-15 腾讯科技(深圳)有限公司 Server management method and apparatus

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1670707A (zh) * 2004-03-19 2005-09-21 联想(北京)有限公司 Cluster job management method
CN103294531A (zh) * 2012-03-05 2013-09-11 阿里巴巴集团控股有限公司 Task allocation method and system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111580945A (zh) * 2020-04-21 2020-08-25 智业互联(厦门)健康科技有限公司 Microservice task coordination and scheduling method and system
CN112380017A (zh) * 2020-11-30 2021-02-19 成都虚谷伟业科技有限公司 Memory management system based on loose memory release
CN112380017B (zh) * 2020-11-30 2024-04-09 成都虚谷伟业科技有限公司 Memory management system based on loose memory release
CN113553361A (zh) * 2021-07-30 2021-10-26 北京东方国信科技股份有限公司 Resource management method and apparatus
CN117234740A (zh) * 2023-11-13 2023-12-15 沐曦集成电路(杭州)有限公司 GPU hardware resource scheduling method, apparatus, device and medium
CN117234740B (zh) * 2023-11-13 2024-02-20 沐曦集成电路(杭州)有限公司 GPU hardware resource scheduling method, apparatus, device and medium

Also Published As

Publication number Publication date
CN105592110A (zh) 2016-05-18
CN105592110B (zh) 2020-06-30

Similar Documents

Publication Publication Date Title
WO2016061935A1 (zh) 一种资源调度方法、装置及计算机存储介质
US11314551B2 (en) Resource allocation and scheduling for batch jobs
US9942273B2 (en) Dynamic detection and reconfiguration of a multi-tenant service
US10193977B2 (en) System, device and process for dynamic tenant structure adjustment in a distributed resource management system
US9626209B2 (en) Maintaining virtual machines for cloud-based operators in a streaming application in a ready state
US20170031622A1 (en) Methods for allocating storage cluster hardware resources and devices thereof
US8756599B2 (en) Task prioritization management in a virtualized environment
US20200174838A1 (en) Utilizing accelerators to accelerate data analytic workloads in disaggregated systems
US9563474B2 (en) Methods for managing threads within an application and devices thereof
CN109729106B (zh) 处理计算任务的方法、系统和计算机程序产品
WO2016183799A1 (zh) 一种硬件加速方法以及相关设备
US10489177B2 (en) Resource reconciliation in a virtualized computer system
US20130305245A1 (en) Methods for managing work load bursts and devices thereof
WO2016101799A1 (zh) 一种基于分布式系统的业务分配方法及装置
KR20110083084A (ko) 가상화를 이용한 서버 운영 장치 및 방법
JP2017037492A (ja) 分散処理プログラム、分散処理方法および分散処理装置
US9424083B2 (en) Managing metadata for a distributed processing system with manager agents and worker agents
WO2022271223A1 (en) Dynamic microservices allocation mechanism
JP2016024612A (ja) データ処理制御方法、データ処理制御プログラムおよびデータ処理制御装置
JP2016091555A (ja) データステージング管理システム
WO2017075796A1 (zh) 网络功能虚拟化nfv网络中分配虚拟资源的方法和装置
US10915704B2 (en) Intelligent reporting platform
US11354164B1 (en) Robotic process automation system with quality of service based automation
US9990240B2 (en) Event handling in a cloud data center
CN106657195B (zh) 任务处理方法和中继设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15852770

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15852770

Country of ref document: EP

Kind code of ref document: A1