WO2020000944A1 - Resource sharing method, system and device based on preemptive scheduling - Google Patents

Resource sharing method, system and device based on preemptive scheduling

Info

Publication number
WO2020000944A1
WO2020000944A1 (PCT/CN2018/123464, CN2018123464W)
Authority
WO
WIPO (PCT)
Prior art keywords
task
physical node
target
resources
priority
Prior art date
Application number
PCT/CN2018/123464
Other languages
English (en)
French (fr)
Inventor
孙宏健
Original Assignee
星环信息科技(上海)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 星环信息科技(上海)有限公司
Priority to JP2020573022A (JP7060724B2)
Priority to CA3104806A (CA3104806C)
Priority to EP18924598.8A (EP3799390A4)
Priority to SG11202013049XA
Publication of WO2020000944A1 (zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L 67/61 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources taking into account QoS or priority requirements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L 41/5003 Managing SLA; Interaction between SLA and QoS
    • H04L 41/5019 Ensuring fulfilment of SLA
    • H04L 41/5022 Ensuring fulfilment of SLA by giving priorities, e.g. assigning classes of service
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1012 Server selection for load balancing based on compliance of requirements or conditions with available server resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F 9/00
    • G06F 2209/50 Indexing scheme relating to G06F 9/50
    • G06F 2209/504 Resource capping
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the field of computer technology, for example, to a method and system for sharing resources based on preemptive scheduling.
  • in a resource-sharing distributed system, multiple tenants share the use of resources; at the same time, the resources used by each tenant need certain restrictions to ensure that every tenant has resources available and that no tenant's tasks suffer from resource "starvation".
  • the scheduler in a distributed system effectively schedules the jobs or tasks of multiple tenants to ensure that each tenant's jobs or tasks are executed stably and quickly while the resources in the distributed system are fully utilized.
  • a traditional distributed management system provides multiple scheduling strategies to ensure that the tasks of multiple tenants are evenly distributed to the physical nodes in the distributed system, and the physical nodes then run the assigned tasks. However, this way of handling tasks leaves resources underutilized.
  • the embodiments of the present application provide a method and system for resource sharing and use based on preemptive scheduling, which can improve resource utilization.
  • an embodiment of the present application provides a task creation method, including: an API server obtains a task creation request; and when the API server detects that the quota of the tenant to which the task belongs includes resources whose priority matches the task, and the matching resources meet the creation conditions of the task, the task is created according to the creation request.
  • an embodiment of the present application further provides a task scheduling method, including: a scheduler obtains a current task to be scheduled from a task scheduling queue and obtains, from each physical node, the tasks whose priority is greater than or equal to that of the current task to form a node-task mapping table; the scheduler determines a target physical node that meets a preset filtering condition according to the mapping table and the preset filtering condition; and the scheduler binds the current task to the target physical node and sends the binding information to the API server.
  • an embodiment of the present application further provides a task preemption method, including: when a physical node processes a target task to be run, obtaining the list of tasks running on the physical node; the physical node detecting whether its remaining resources satisfy the resources required to run the target task; if the remaining resources do not satisfy the resources required to run the target task, the physical node moves the tasks in the task list whose priority is lower than that of the target task into a to-be-removed queue in order of priority from low to high, until the remaining resources of the physical node, counted after excluding the tasks moved into the to-be-removed queue, satisfy the resources required to run the target task, and uses the target task to preempt the tasks in the to-be-removed queue; and the physical node invokes an execution environment to run the target task.
  • an embodiment of the present application further provides a resource sharing and usage method based on preemptive scheduling, including: an API server obtains a task creation request; when the API server detects that the quota of the tenant to which the task belongs includes resources whose priority matches the task, and the matching resources meet the creation conditions of the task, the task is created according to the creation request; the scheduler obtains the tasks created by the API server and forms a task scheduling queue; the scheduler obtains a current task to be scheduled from the task scheduling queue and obtains, from each physical node, the tasks whose priority is greater than or equal to that of the current task to form a node-task mapping table; the scheduler determines a target physical node that meets a preset screening condition according to the mapping table and the preset screening condition; the scheduler binds the current task to the target physical node and sends the binding information to the API server; the physical node listens to the binding information between tasks and physical nodes in the API server, obtains the corresponding tasks, and forms a task queue; when the physical node processes a target task to be run in the task queue, it obtains the list of tasks running on the physical node; the physical node detects whether its remaining resources meet the resources required for the target task to run; if the remaining resources do not meet the resources required for the target task to run, the physical node moves the tasks in the task list whose priority is lower than that of the target task into a to-be-removed queue in order of priority from low to high, until the remaining resources of the physical node, counted after excluding the tasks moved into the to-be-removed queue, meet the resources required for the target task to run, and uses the target task to preempt the tasks in the to-be-removed queue; and the physical node invokes an execution environment to run the target task.
  • an embodiment of the present application further provides an API server, including: a request acquisition module configured to acquire a task creation request; and a task creation module configured to create the task according to the creation request when it detects that the quota of the tenant to which the task belongs includes a resource whose priority matches the task and the matching resource satisfies the creation condition of the task.
  • an embodiment of the present application further provides a scheduler, including: a mapping table forming module, configured to obtain a current task to be scheduled from a task scheduling queue and to obtain, from each physical node, the tasks whose priority is greater than or equal to that of the current task to form a node-task mapping table;
  • a screening module, configured to determine a target physical node that meets a preset screening condition according to the mapping table and the preset screening condition;
  • and a binding module, configured to bind the current task to the target physical node and send the binding information to the API server.
  • an embodiment of the present application further provides a task preemption device, including: a task list obtaining module, configured to obtain the list of tasks running on the physical node when a target task to be run is processed; a detection module, configured to detect whether the remaining resources on the physical node satisfy the resources required to run the target task; a preemption module, configured to, when the remaining resources do not satisfy the resources required to run the target task, move the tasks in the task list whose priority is lower than that of the target task into a to-be-removed queue in order of priority from low to high, until the remaining resources of the physical node, counted after excluding the tasks moved into the to-be-removed queue, satisfy the resources required to run the target task, and to use the target task to preempt the tasks in the to-be-removed queue; and a task execution module, configured to call an execution environment to run the target task.
  • an embodiment of the present application further provides a resource sharing and use system based on preemptive scheduling, including an API server, a scheduler, and a task preemption device provided by the embodiments of the present application.
  • an embodiment of the present application provides a device, including: one or more processors; and a storage device, configured to store one or more programs.
  • the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the task creation method provided by the embodiments of the present application, or the task scheduling method provided by the embodiments of the present application, or the task preemption method provided by the embodiments of the present application.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored.
  • when the program is executed by a processor, the task creation method provided by the embodiments of the present application, or the task scheduling method, or the task preemption method provided by the embodiments of the present application is implemented.
  • the technical solution provided in the embodiment of the present application determines the priority of the resource quota of the tenant and matches the priority of the task with the resources of each priority under the owning tenant to determine whether to create a task.
  • the scheduler filters physical nodes based on task priorities and preset filtering conditions, selects the most suitable physical node, and schedules the current task to be scheduled onto it. When resources are tight, only logical resource preemption is performed and resources are not immediately preempted; this delayed preemption scheduling method can logically free up resources for high-priority tasks, and keeping the preempted tasks running while resources are not fully utilized can improve resource utilization.
  • preemption of low-priority tasks enables the physical node to prioritize important tasks and increases resource utilization.
  • FIG. 1 is a flowchart of a task creation method according to an embodiment of the present application.
  • FIG. 2 is a flowchart of a task scheduling method according to an embodiment of the present application.
  • FIG. 3 is a flowchart of a task scheduling method according to an embodiment of the present application.
  • FIG. 4 is a flowchart of a task preemption method according to an embodiment of the present application.
  • FIG. 5 is a flowchart of a task preemption method according to an embodiment of the present application.
  • FIG. 6 is a flowchart of a resource sharing and using method based on preemptive scheduling according to an embodiment of the present application.
  • FIG. 7 is a structural block diagram of an API server according to an embodiment of the present application.
  • FIG. 8a is a structural block diagram of a scheduler according to an embodiment of the present application.
  • FIG. 8b is a schematic structural diagram of a scheduling system according to an embodiment of the present application.
  • FIG. 9 is a structural block diagram of a task preemption device according to an embodiment of the present application.
  • FIG. 10 is a structural block diagram of a resource sharing and utilization system based on preemptive scheduling according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a device according to an embodiment of the present application.
  • FIG. 1 is a flowchart of a task creation method according to an embodiment of the present application.
  • the method may be applied to an Application Programming Interface (API) server (API Server) and executed by an API server.
  • the API server may be a component in the cluster management platform and may be implemented in software and/or hardware.
  • the cluster management platform can be a platform for managing a large number of resources in the cluster.
  • the cluster management platform includes, but is not limited to, Kubernetes and Mesos, and can be integrated into multiple computer devices.
  • the task creation method provided in the embodiment of the present application includes S110-S120.
  • the API server obtains a task creation request.
  • the method provided in the embodiment of the present application may be applied to a cluster, and the cluster may include multiple physical nodes.
  • the resources on each physical node may be shared resources of multiple tenants, and multiple physical nodes may be managed by the cluster management platform.
  • the cluster management platform assigns tasks to physical nodes to enable the physical nodes to perform corresponding tasks.
  • the cluster management platform can be integrated in multiple computer devices. Multiple computer devices can be operated by users. Users can log in to the cluster management platform and submit task creation requests.
  • the API server in the cluster management platform obtains the task creation request submitted by the user and creates the task. Task scheduling is performed by the scheduler in the cluster management platform, which reasonably allocates tasks to the corresponding physical nodes, and the physical nodes then execute the tasks.
  • the API server can be a component in the cluster management platform, which can complete the creation of tasks; provide rich functional plug-ins, and improve the management of the cluster.
  • the API server may obtain a request for creating a task submitted by a user.
  • the API server may obtain a request for creating an application submitted by the user.
  • the priority of the task can be set.
  • the quota can be a group of resources.
  • this group of resources can include processors (Central Processing Unit). , CPU), memory, graphics processing unit (GPU), field programmable gate array (Field Programmable Gate Array, FPGA), artificial intelligence (Artificial intelligence) chip, processor priority and memory priority Level, GPU priority, FPGA priority, AI chip priority, etc., resource quotas can be set in advance for each tenant.
  • each task carries identification information.
  • the API server can identify the tenant to which each task belongs based on the identification information and determine whether the tenant's quota contains resources whose priority matches the task. If such matching resources are included, it continues to determine whether the matching resources meet the creation conditions of the task, and when the matching resources meet the creation conditions of the task, the task is created.
  • the creation conditions can be the number of CPUs and/or the memory usage, or other conditions. For example, when a user submits a task creation request, the priority of the task can be set to high through the cluster management platform, and the priority of the resources required by the task is then also high, for example 10 high-priority CPUs. If the quota of the tenant to which the task belongs contains high-priority CPUs, and the number of high-priority CPUs is greater than or equal to 10, the task is created.
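  • the quota check described above can be sketched as follows. This is a minimal illustration under assumed data structures; Task, Quota and can_create are hypothetical names, not the API server's actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    tenant: str
    priority: str          # e.g. "high", "medium", "low"
    cpu: int               # CPUs requested at that priority
    memory_gb: int

@dataclass
class Quota:
    # available resources per priority level for one tenant,
    # e.g. {"high": {"cpu": 20, "memory_gb": 64}}
    by_priority: dict = field(default_factory=dict)

def can_create(task: Task, quota: Quota) -> bool:
    """Create the task only if the tenant's quota contains resources at the
    task's priority AND those resources satisfy the creation condition."""
    matching = quota.by_priority.get(task.priority)
    if matching is None:                       # no resources at this priority
        return False
    return (matching["cpu"] >= task.cpu and
            matching["memory_gb"] >= task.memory_gb)

# The example from the text: a task asking for 10 high-priority CPUs is
# created because the tenant has 20 CPUs at high priority.
quota = Quota(by_priority={"high": {"cpu": 20, "memory_gb": 64}})
print(can_create(Task("tenant-a", "high", cpu=10, memory_gb=4), quota))  # True
```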
  • the API server acquiring a task creation request includes: the API server acquiring a pod creation request.
  • the API server detecting that the quota of the tenant to which the task belongs contains resources matching the priority of the task, and that the matching resources meet the creation conditions of the task, and creating the task according to the creation request includes: the API server detecting that the quota of the tenant namespace to which the pod belongs contains a resource quota value matching the priority of the pod, and that the quota value meets the creation conditions of the pod, and creating the pod according to the creation request.
  • pod is the smallest and simplest unit that can be created and deployed in Kubernetes.
  • a pod represents a process running in the cluster.
  • a pod is a component in Kubernetes. For example, you can create an application, start a process, and so on.
  • a pod encapsulates an application container (or, in some cases, several containers), storage resources, an independent network IP, and policy options that manage how the container runs.
  • a pod represents a unit of deployment: a single instance of an application in Kubernetes, which may consist of one or more containers that share resources.
  • creating a pod can be creating an application and so on.
  • a namespace is an abstract collection of a set of resources and objects. For example, it can be used to divide objects inside the kubernetes system into different project groups or user groups.
  • a namespace is often used to isolate different tenants or users.
  • quota can be used for resource management and resource limiting, and the size of the quota value can represent the amount of resources. For example, if the resources set for a tenant are 20 high-priority CPUs, the corresponding quota value in the namespace can be 20.
  • An embodiment of the present application provides a task creation method.
  • when a task creation request is obtained, it is detected whether the quota of the tenant to which the task belongs contains resources matching the task priority, and whether those matching resources meet the task creation condition.
  • the task is created only when both conditions are met (the quota of the owning tenant includes resources matching the task priority, and those resources also meet the task creation condition).
  • by setting priorities on resource quotas and matching the priority of a task against the resources of each priority under its tenant to decide whether to create the task, this embodiment allows tenants to use resources preferentially when resources are tight and prevents tenants from abusing high-priority resources, which would otherwise lead to the "starvation" phenomenon in which low-priority tasks cannot continuously obtain resources.
  • FIG. 2 is a flowchart of a task scheduling method provided by an embodiment of the present application.
  • the method may be applied to a scheduler, which may be a component of a cluster management platform and implemented in software and / or hardware.
  • the cluster management platform can be a platform for managing a large number of hardware resources in a cluster.
  • the cluster management platform includes, but is not limited to, Kubernetes and Mesos, and can be integrated in multiple computer devices.
  • a cluster may include multiple physical nodes, the resources on these physical nodes may be shared by multiple tenants, and the physical nodes may be managed by the cluster management platform, which assigns tasks to physical nodes so that the physical nodes perform the corresponding tasks.
  • the cluster management platform can be integrated in multiple computer devices. Multiple computer devices can be operated by users. Users can log in to the cluster management platform and submit task creation requests.
  • the API server in the cluster management platform obtains the task creation request submitted by the user and creates the task; task scheduling is performed by the scheduler in the cluster management platform, which reasonably assigns the task to the corresponding physical node, and the physical node executes the task.
  • the embodiment of the present application is applied at a stage where a scheduler performs task scheduling.
  • the technical solution provided in the embodiment of the present application includes S210-S230.
  • the scheduler obtains a current task to be scheduled from a task scheduling queue, and obtains a task with a priority greater than or equal to the current task on each physical node to form a node-task mapping table.
  • the scheduler may be a component of the cluster management platform; it can listen for the tasks created in the API server, read those tasks from the API server, and form the read tasks into a task scheduling queue. The scheduler schedules tasks in the order in which they appear in the task scheduling queue.
  • the physical node can be various physical machines, and the scheduler can obtain resource information (including all resources and available resources) and the task queue running on each physical node from each physical node. Among them, each task in the task queue has a priority.
  • when the scheduler obtains the current task to be scheduled from the task scheduling queue, it obtains, from each physical node, the tasks whose priority is greater than or equal to that of the current task, and forms a node-task mapping table. For example, if the priority of the current task is high and the tasks on physical node 1 whose priority is greater than or equal to high are task 1, task 2, and task 3, the scheduler obtains task 1, task 2, and task 3 on physical node 1 and forms the node-task mapping table.
  • the tasks on the physical node that have a priority equal to or greater than the current task include: tasks that are running on the physical node and that are greater than or equal to the current task priority, and tasks that are to be run on the physical node and that are greater than or equal to the current task priority.
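  • a minimal sketch of how such a node-task mapping table might be assembled, assuming tasks are plain dictionaries and a larger number means a higher priority (names are illustrative, not the patent's implementation):

```python
def build_node_task_map(nodes, current_priority):
    """For each physical node, keep only the tasks (running or pending on
    that node) whose priority is greater than or equal to the priority of
    the task currently being scheduled."""
    mapping = {}
    for node in nodes:
        mapping[node["name"]] = [
            t for t in node["tasks"] if t["priority"] >= current_priority
        ]
    return mapping

nodes = [
    {"name": "node-1", "tasks": [{"id": "task-1", "priority": 2},
                                 {"id": "task-2", "priority": 3},
                                 {"id": "task-3", "priority": 1}]},
]
# Scheduling a priority-2 task: task-1 and task-2 are kept, task-3 is not.
print(build_node_task_map(nodes, current_priority=2))
```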
  • the scheduler determines a target physical node that best meets the preset filtering condition according to the mapping table and a preset filtering condition.
  • the scheduler performs filtering based on the preset filtering conditions from the node-task mapping table to select the target physical nodes that best meet the preset filtering conditions.
  • the preset filtering conditions may include a matching condition between a resource required by the current task and a remaining resource on the physical node, a matching condition between a port required by the current task and a port on the physical node, and the like.
  • the scheduler determining a target physical node that best meets the preset filtering condition according to the mapping table and the preset filtering condition includes: the scheduler filters out, from the mapping table, the physical nodes that meet the first-stage screening conditions to form a node group; the physical nodes of the node group are scored according to the mapping table and the second-stage preferred conditions, and the physical node with the highest score is selected as the target physical node.
  • the screening conditions in the first stage and the preferred conditions in the second stage are not the same.
  • the filtering conditions in the first stage can be the matching conditions between the ports required by the current task and the ports on the physical node, whether the physical node has a special label, etc.
  • the preferred conditions in the second stage can be matching conditions between the resources required by the current task and the remaining resources on the physical nodes, and the second-stage preferred conditions may include one condition or multiple conditions.
  • weights can also be set for each condition, and the score of the physical node can be determined according to the weights.
  • for example, the first-stage screening condition can be that the physical node has a GPU, and the second-stage preferred condition can be a matching condition between the resources required by the current task and the remaining resources on the physical node.
  • in that case, the scheduler selects the physical nodes with a GPU to form a node group, determines whether the remaining resources on the physical nodes in the node group meet the resource conditions required by the current task, removes the physical nodes that do not meet the conditions, and scores the physical nodes that do. The more resources remaining, the higher the physical node's score; the physical node with the highest score is the target physical node.
  • the method for selecting a target physical node includes, but is not limited to, the foregoing methods.
  • the physical node with the highest score is selected as the target physical node, that is, as the most suitable physical node.
  • in this way, the amount of data processed in each screening can be reduced, improving the efficiency of task scheduling.
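  • as a rough sketch, the two-stage node selection might look like the following; the particular screening predicate, scoring rule, and data shapes are assumptions for illustration rather than the patent's fixed conditions:

```python
def pick_target_node(node_infos, node_task_map, task):
    """Two-stage selection: hard screening first, then scoring.

    node_infos:    {name: {"cpu": total_cpu, "gpus": count, "used_ports": set}}
    node_task_map: {name: [tasks with priority >= the current task's priority]}
                   (lower-priority tasks are left out, i.e. treated as
                   logically preemptible when counting remaining resources)
    """
    # Stage 1: screening conditions, e.g. required device present, port free.
    def passes_filters(name):
        info = node_infos[name]
        return info["gpus"] >= task["gpus"] and task["port"] not in info["used_ports"]

    candidates = [name for name in node_task_map if passes_filters(name)]

    # Stage 2: score each remaining node by the CPU left after subtracting
    # only the equal-or-higher-priority tasks; the highest score wins.
    def score(name):
        used = sum(t["cpu"] for t in node_task_map[name])
        return node_infos[name]["cpu"] - used - task["cpu"]

    scored = [(score(name), name) for name in candidates if score(name) >= 0]
    return max(scored)[1] if scored else None
```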
  • the scheduler binds the current task to the target physical node, and sends the bound information to an API server.
  • the scheduler binds the current task to the physical node (i.e., the target physical node) that most closely matches the preset filtering conditions, and sends the binding information to the API server so that each physical node can read its own tasks from the API server.
  • the scheduler filters multiple physical nodes based on the priority of the task and preset filtering conditions, screens out the most appropriate physical node (that is, the target physical node) for each task, and schedules the current task to be scheduled onto the most appropriate physical node.
  • when resources are tight, only logical resource preemption is performed, and resources are not immediately preempted.
  • this delayed preemption scheduling method can logically free up resources for high-priority tasks; when resources are not fully utilized, keeping the preempted tasks running can improve resource utilization.
  • FIG. 3 is a flowchart of a task scheduling method according to an embodiment of the present application.
  • the method provided by this embodiment can be applied to a kubernetes system.
  • the technical solution provided by this embodiment includes S310-S340.
  • the scheduler obtains the current pod to be scheduled from the pod scheduling queue, and obtains a pod with a priority greater than or equal to the current pod on each physical node to form a node-pod mapping table.
  • the scheduler screens out physical nodes that meet the first-stage screening conditions from the mapping table to form a node group.
  • the physical nodes of the node group are scored according to the mapping table and the second stage preferred conditions, and the physical node with the highest score is selected as the target physical node.
  • the scheduler binds the current pod to the target physical node, and sends the bound information to the API server.
  • the scheduler filters the physical nodes based on the priority of the task and preset filtering conditions, selects the most suitable physical node, and schedules the current task to be scheduled to the most suitable physical node.
  • when resources are tight, only logical resource preemption is performed, and resources are not immediately preempted.
  • this delayed preemption scheduling method can logically free up resources for high-priority tasks; when resources are not fully utilized, keeping the preempted tasks running can improve resource utilization.
  • FIG. 4 is a flowchart of a task preemption method provided by an embodiment of the present application.
  • the method may be executed by a task preemption device, the device may be implemented by software and / or hardware, and the device may be integrated in a computer device.
  • the task preemption method provided in the embodiment of the present application is applicable to a scenario where a physical node processes a task.
  • the technical solution provided in the embodiment of the present application includes: S410-S440.
  • the physical node may be a computer device, for example, a physical machine.
  • the physical node can obtain the corresponding task by monitoring the binding information between the task and the physical node in the API server, and the obtained task forms a task queue.
  • the physical nodes process the tasks in sequence according to the order of the tasks in the task queue.
  • the currently processed task is called the target task to be run.
  • the physical node obtains a list of tasks that are running on the physical node. Among them, the task list records information of tasks running on the physical node. There can be one or more tasks running on the physical node.
  • the physical node detects whether its remaining resources meet the resources required for the target task to run.
  • the resources required for the target task to run may include CPU, memory, and so on.
  • the remaining resources of the physical node can be understood as the resources available on the physical node. For example, if the physical node has 10 CPUs and 1 GB of memory remaining while the target task requires 10 CPUs and 2 GB of memory to run, the remaining resources on the physical node cannot meet the resources required for the target task to run.
  • the physical node moves the tasks in the task list whose priority is lower than that of the target task into the to-be-removed queue in order of priority from low to high (the physical node does not execute the tasks in the to-be-removed queue), until the remaining resources of the physical node, counted after excluding the tasks moved into the to-be-removed queue, satisfy the resources required for the target task to run, and uses the target task to preempt the tasks in the to-be-removed queue.
  • if the physical node detects that its remaining resources can meet the resources required for the target task to run, it directly invokes the execution environment to run the target task. If the physical node detects that its remaining resources do not meet the resources required for the target task to run, it sorts the tasks in the task list in order of priority from low to high and moves the tasks whose priority is lower than that of the target task into the to-be-removed queue from low to high, until the remaining resources, counted after excluding the tasks moved into the to-be-removed queue, meet the resources required for the target task to run. The target task then preempts the resources of the tasks in the to-be-removed queue, that is, the tasks in the to-be-removed queue are stopped.
  • if, even after all lower-priority tasks have been moved into the to-be-removed queue, the remaining resources still cannot meet the resources required for the target task to run, the target task is refused execution.
  • for example, suppose the tasks running on the physical node are A, B, C, D and E, of which A, B and C have a priority lower than that of the target task, with A having the lowest priority. The tasks are moved into the to-be-removed queue in order of priority from low to high: A is moved in first, and it is then determined whether the remaining resources obtained when the physical node runs only tasks B, C, D and E satisfy the resource conditions required for the target task to run. If they do, the target task preempts the running task A, that is, A is stopped.
  • the physical node invokes an execution environment to run the target task.
  • preemption of low-priority tasks enables the physical node to prioritize important tasks and can improve resource utilization.
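  • a minimal sketch of this node-side preemption step, assuming a numeric priority where a smaller value means a lower priority and CPU is the only resource considered (function and field names are illustrative):

```python
def preempt_for(target, running_tasks, free_cpu):
    """Move running tasks whose priority is lower than the target's into a
    to-be-removed queue, lowest priority first, until the CPU they would free
    plus the node's current spare CPU covers what the target task needs.
    Returns the queue of tasks to stop, or None if preemption cannot help."""
    to_remove = []
    reclaimable = free_cpu
    victims = sorted((t for t in running_tasks if t["priority"] < target["priority"]),
                     key=lambda t: t["priority"])
    for victim in victims:
        if reclaimable >= target["cpu"]:
            break
        to_remove.append(victim)
        reclaimable += victim["cpu"]
    return to_remove if reclaimable >= target["cpu"] else None

running = [{"id": "A", "priority": 1, "cpu": 4},
           {"id": "B", "priority": 2, "cpu": 4},
           {"id": "C", "priority": 3, "cpu": 4}]
target = {"id": "T", "priority": 4, "cpu": 6}
print(preempt_for(target, running, free_cpu=1))   # A is moved in first, then B
```

  • in this sketch, if even preempting every lower-priority task cannot free enough resources, None is returned, which corresponds to the case where the target task is refused execution.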
  • the task preemption method further includes: the physical node acquires resource usage information at a set time interval; if the physical node determines that the resource usage information reaches a preset limit condition, the tasks in the task list are moved into the to-be-removed queue in order of priority from low to high until the resource usage information, determined after excluding the tasks moved into the to-be-removed queue, no longer reaches the preset limit condition, and the tasks in the to-be-removed queue are stopped.
  • the task in the task list is a task running on the current physical node.
  • each physical node acquires resource usage information at a set interval, determines whether the resource usage information reaches a preset limit condition, and determines whether task preemption needs to be triggered. If resource usage information reaches a preset limit, task preemption is triggered. If resource usage information does not reach a preset limit, task preemption is not required.
  • the task preemption process sorts the tasks in the task list by priority and moves them into the to-be-removed queue in order from low to high priority, until the resource usage information determined after excluding the tasks moved into the to-be-removed queue no longer reaches the preset limit, and the tasks in the to-be-removed queue are then stopped.
  • the preset limiting condition may be that the resource usage reaches a set value, or it may be another limiting condition. For example, if the resource usage of a physical node reaches a set value, task preemption is triggered.
  • the physical node triggers task preemption according to the resource usage information, which can improve resource utilization; when resources are tight, it can preempt resources from low-priority tasks and prioritize important tasks.
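  • this periodic check could be sketched roughly as follows; get_usage, get_running_tasks and stop_task stand in for node-local facilities, and the 90% limit, 30-second interval and per-task "share" field are illustrative assumptions rather than values from the patent:

```python
import time

def monitor(get_usage, get_running_tasks, stop_task, limit=0.9, interval=30):
    """Periodically check node resource usage; when it reaches the preset
    limit, stop running tasks in ascending priority order until the estimated
    usage drops back below the limit."""
    while True:
        usage = get_usage()                          # fraction of resources in use
        if usage >= limit:
            for task in sorted(get_running_tasks(), key=lambda t: t["priority"]):
                if usage < limit:
                    break
                stop_task(task)                      # i.e. move it into the to-be-removed queue
                usage -= task["share"]               # usage expected to be freed by this task
        time.sleep(interval)
```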
  • FIG. 5 is a flowchart of a task preemption method provided by an embodiment of the present application.
  • the method provided by the embodiment of the present application runs in a Kubernetes system. As shown in FIG. 5, the method provided in this embodiment of the present application includes: S510-S540.
  • kubelet is a component of the Kubernetes system which can monitor pods, mount the volumes a pod needs, download the pod's secrets, run the containers in the pod through docker/rkt, periodically execute the liveness probes defined for the containers in the pod, report the status of the pod back to other components of the system, and report the status of the node.
  • the physical node detects whether its remaining resources meet the resources required for the target pod to run through a kubelet.
  • the physical node uses the kubelet to move the pods in the pod list whose priority is lower than that of the target pod into the to-be-removed queue in order from low to high, until the remaining resources of the physical node, counted after excluding the pods moved into the to-be-removed queue, satisfy the resources required for the target pod to run, and uses the target pod to preempt the pods in the to-be-removed queue.
  • the physical node invokes an execution environment through a kubelet to run the target pod.
  • when a physical node processes a target pod to be run, if the remaining resources of the physical node do not meet the conditions required for the target pod to run, a low-priority pod (one whose priority is lower than that of the target pod) is preempted.
  • the physical node can be made to process an important task first (in this embodiment, the important task refers to a target pod), and the utilization rate of resources can be improved.
  • FIG. 6 is a flowchart of a resource sharing and using method based on preemptive scheduling according to an embodiment of the present application.
  • the method is executed by a resource sharing and usage system based on preemptive scheduling, and the system may be implemented by software and/or hardware.
  • the method provided in the embodiment of the present application may be applied to a cluster, and the cluster may include multiple physical nodes.
  • the resources on each physical node may be shared by multiple tenants, and the physical nodes may be managed by the cluster management platform, which assigns tasks to the physical nodes so that they perform the corresponding tasks.
  • the cluster management platform can be integrated in multiple computer devices. Multiple computer devices can be operated by users. Users can log in to the cluster management platform and submit task creation requests.
  • the API server in the cluster management platform obtains the task creation request submitted by the user and creates the task; task scheduling is performed by the scheduler in the cluster management platform, which reasonably assigns the task to the corresponding physical node, and the physical node executes the task.
  • the technical solution provided in the embodiment of the present application includes: S610-S692.
  • the API server obtains a task creation request.
  • the scheduler obtains the task created by the API server and forms a task scheduling queue.
  • the API server acquires task creation requests in real time, and creates tasks for creation requests that meet the conditions. So there is at least one task in the API server.
  • the scheduler obtains all tasks created by the API server and forms these tasks into a task scheduling queue.
  • the scheduler obtains a current task to be scheduled from the task scheduling queue, and obtains a task with a priority equal to or greater than the current task on each physical node to form a node-task mapping table.
  • the scheduler schedules tasks in turn according to the order of the tasks in the task scheduling queue.
  • the current task to be scheduled refers to the task currently being scheduled by the scheduler in the task scheduling queue.
  • the scheduler determines a target physical node that best meets the preset filtering condition according to the mapping table and a preset filtering condition.
  • the scheduler binds the current task to the target physical node, and sends the bound information to an API server.
  • the scheduler schedules the tasks in the task scheduling queue in order to obtain the binding information of each task in the task scheduling queue and its corresponding target physical node.
  • the physical node obtains the corresponding task by monitoring the binding information between the task and the physical node in the API server, and forms a task queue.
  • the physical node listens to the binding information in the API server, and determines the task corresponding to the physical node through the target physical node in the binding information.
  • the task can be one or multiple, and these tasks form a task queue.
  • the physical node processes the tasks in turn according to the order of the tasks in the task queue.
  • the target task to be run refers to the task currently being processed by the physical node in the task queue.
  • the physical node detects whether its remaining resources meet the resources required for the target task to run.
  • if the remaining resources do not meet the resources required for the target task to run, the physical node moves the tasks in the task list whose priority is lower than that of the target task into the to-be-removed queue in order of priority from low to high, until the remaining resources, counted after excluding the tasks moved into the to-be-removed queue, satisfy the resources required for the target task to run, and uses the target task to preempt the tasks in the to-be-removed queue.
  • the physical node invokes an execution environment to run the target task.
  • in the related art, Kubernetes version 1.3 provides a resource sharing scheme based on Quality of Service (QoS), which is used to manage shared resources.
  • the quality-of-service classes, from high to low, are Guaranteed, Burstable, and BestEffort.
  • a BestEffort task can be scheduled and run when the cluster resources are not fully used.
  • when cluster resources are tight, BestEffort tasks are preempted first. This scheme does not take the task scheduling stage into account.
  • when the cluster is full, it cannot free up resources for tasks of higher quality of service, cannot limit the number of BestEffort tasks under a tenant, and cannot distinguish the order in which BestEffort tasks are preempted.
  • in the related art, Kubernetes version 1.8 introduced a priority-based scheduling scheme: tasks can be assigned a priority, and when resources are tight the scheduler preempts low-priority tasks to provide sufficient resources for high-priority tasks. However, preemption in this scheme occurs in the scheduler, so when the logical scheduling of the cluster is full there are cases where the resources in the cluster are not fully utilized, resource utilization is not high, and the number of tasks at each priority cannot be accurately limited.
  • the method provided in the embodiment of the present application sets the priority of the resource quota of the tenant, which can accurately limit the number of multiple priority tasks under each tenant.
  • the order in which tasks are preempted can be distinguished.
  • priority-based task preemption occurs at the physical node rather than at the scheduler. It can logically free up resources for high-priority tasks and continue to run the preempted tasks while resources are not fully utilized, which improves resource utilization.
  • the method provided in the embodiment of the present application determines the priority of a resource quota of a tenant, and matches the priority of a task with resources of multiple priorities under the owning tenant to determine whether to create a task.
  • resources are prioritized to prevent tenants from abusing high-priority resources, which would cause low-priority tasks to continuously fail to acquire resources, resulting in "starvation".
  • the scheduler filters the physical nodes based on the priority of the task and preset filtering conditions, selects the most suitable physical node, and schedules the current task to be scheduled to the most suitable physical node.
  • when resources are tight, only logical resource preemption is performed and resources are not immediately preempted. This delayed preemption scheduling method can logically free up resources for high-priority tasks.
  • FIG. 7 is a structural block diagram of an API server according to an embodiment of the present application. As shown in FIG. 7, the API server includes a request obtaining module 710 and a task creation module 720.
  • the request acquisition module 710 is configured to acquire a creation request of a task.
  • the task creation module 720 is configured to, when it is detected that the quota of the tenant to which the task belongs includes resources matching the priority of the task, and the matching resources satisfy the task creation conditions, create the task according to the creation request.
  • the apparatus is applied in a Kubernetes system, and the request acquisition module 710 is configured to acquire a pod creation request.
  • the task creation module 720 is configured to: when it is detected that the quota of the namespace to which the pod belongs contains a resource quota value that matches the priority of the pod, and the quota value meets the creation conditions of the pod, create a pod according to the creation request .
  • the above task creation device can execute the task creation method provided in any embodiment of the present application, and has the corresponding functional modules and beneficial effects of executing the task creation method.
  • FIG. 8a is a structural block diagram of a scheduler according to an embodiment of the present application. As shown in FIG. 8a, the scheduler includes a mapping table forming module 810, a screening module 820, and a binding module 830.
  • the mapping table forming module 810 is configured to obtain a current task to be scheduled from a task scheduling queue, and obtain a task on each physical node whose priority is greater than or equal to the current task to form a node-task mapping table.
  • the filtering module 820 is configured to determine, according to the mapping table and a preset filtering condition, a target physical node that best meets the preset filtering condition.
  • the binding module 830 is configured to bind the current task to the target physical node, and send the bound information to an API server.
  • the screening module 820 is configured to screen out, from the mapping table, the physical nodes that meet the first-stage screening conditions to form a node group; the physical nodes of the node group are then scored, and the physical node with the highest score is selected as the target physical node.
  • the apparatus is applied in a Kubernetes system, and the mapping table forming module 810 is configured to obtain a current pod to be scheduled from a pod scheduling queue and to obtain, from each physical node, the pods whose priority is greater than or equal to that of the current pod to form a node-pod mapping table.
  • the binding module 830 is configured to bind the current pod to the target physical node and send the binding information to the API server.
  • the scheduler structure can also be other structural forms, so that the task scheduling method can be executed.
  • the scheduler may include a scheduling system.
  • the scheduling system may include four parts: a node information list 840, a screening algorithm library 850, a preferred algorithm library 860, and an unscheduled queue 870.
  • the node information list 840 is set to record the information of the currently available physical nodes, including the resource information on each physical node (all resources and available resources) and the task queue already running on the physical node. This information is key when scheduling decisions are made and needs to be synchronized in real time to ensure that the scheduling system has a comprehensive view of resources and tasks.
  • the screening algorithm library 850 is set to predefine a variety of algorithms for screening physical nodes to ensure removal of physical nodes that do not meet the task execution conditions.
  • the optimization algorithm library 860 is set to define a plurality of preferred node algorithms and algorithm weights in advance.
  • the physical node with the highest score calculated by the preferred algorithm will be selected as the scheduling node, that is, the target physical node.
  • the unscheduled queue 870 is a queue formed by unscheduled tasks; it is a priority queue, ensuring that high-priority tasks are scheduled first.
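  • such an unscheduled priority queue could be realized, for example, with a binary heap; the sketch below (class and method names are illustrative, not the patent's code) pops the highest-priority task first and breaks ties by insertion order:

```python
import heapq
import itertools

class UnscheduledQueue:
    """Priority queue of unscheduled tasks: higher-priority tasks pop first."""
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, task, priority):
        # negate the priority so that heapq's min-heap yields the highest first
        heapq.heappush(self._heap, (-priority, next(self._counter), task))

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = UnscheduledQueue()
q.push("low-task", 1)
q.push("high-task", 10)
print(q.pop())   # "high-task" is scheduled first
```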
  • the above device can execute the task scheduling method provided by any embodiment of the present application, and has corresponding function modules and beneficial effects for executing the task scheduling method.
  • FIG. 9 is a structural block diagram of a task preemption device according to an embodiment of the present application.
  • the task preemption device includes a task list acquisition module 910, a detection module 920, a preemption module 930, and a task execution module 940.
  • the task list obtaining module 910 is configured to obtain a running task list on the physical node when processing a target task to be run.
  • the detection module 920 is configured to detect whether the remaining resources on the physical node meet the resources required for the target task to run.
  • the preemption module 930 is configured to: when the remaining resources on the physical node do not meet the resources required for the target task to run, move the tasks in the task list whose priority is lower than that of the target task into the to-be-removed queue in order of priority from low to high (that is, remove tasks from the task list in order from low to high priority), until the remaining resources, counted after excluding the tasks moved into the to-be-removed queue, satisfy the resources required for the target task to run, and use the target task to preempt the tasks in the to-be-removed queue.
  • the task execution module 940 is configured to invoke an execution environment by the physical node to run the target task.
  • the preemption module may be further configured to acquire resource usage information at a set interval.
  • if the resource usage information reaches a preset limit condition, the tasks in the task list are sequentially moved into the to-be-removed queue in order of priority from low to high until the resource usage information, determined after excluding the tasks moved into the to-be-removed queue, no longer reaches the preset limit condition, and the tasks in the to-be-removed queue are stopped.
  • the device is applied in a Kubernetes system, in which the target task is a target pod, the task list is a pod list, and the tasks in the task list are the pods running on the physical node.
  • the above device can execute the task preemption method provided by any embodiment of the present application, and has the corresponding functional modules and beneficial effects of executing the task preemption method.
  • FIG. 10 is a schematic structural diagram of a task preemption system according to an embodiment of the present application.
  • the task preemption system includes an API server 1010 provided by the foregoing embodiments, a scheduler 1020 provided by the foregoing embodiments, and a task preemption device 1030 provided by the foregoing embodiments.
  • the API server 1010 and the scheduler 1020 are components of a cluster management platform, and the cluster management platform is integrated on a computer device used by a user.
  • the task preemption device 1030 may be integrated on a physical machine in a physical node.
  • FIG. 11 is a schematic structural diagram of a device according to an embodiment of the present application. As shown in FIG. 11, the device includes one or more processors 1110 (one processor 1110 is taken as an example in FIG. 11) and a memory 1120.
  • the device may further include an input device 1130 and an output device 1140.
  • the processor 1110, the memory 1120, the input device 1130, and the output device 1140 in the device may be connected through a bus or in other manners; in FIG. 11, a bus connection is taken as an example.
  • the memory 1120, as a non-transitory computer-readable storage medium, can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the task creation method in the embodiments of the present application (for example, the request acquisition module 710 and the task creation module 720 shown in FIG. 7), the program instructions/modules corresponding to the task scheduling method in the embodiments of the present application (for example, the mapping table forming module 810, the screening module 820, and the binding module 830 shown in FIG. 8), or the program instructions/modules corresponding to the task preemption method in the embodiments of the present application (for example, the task list acquisition module 910, the detection module 920, the preemption module 930, and the task execution module 940 shown in FIG. 9).
  • the processor 1110 runs the software programs, instructions, and modules stored in the memory 1120, thereby executing the various functional applications and data processing of the computer device, that is, implementing the task creation method of the foregoing method embodiment: the API server acquires a task creation request; and when the API server detects that the quota of the tenant to which the task belongs contains resources matching the priority of the task, and the matching resources meet the conditions for creating the task, the task is created according to the creation request.
  • alternatively, the processor 1110 implements the task scheduling method of the foregoing method embodiment by running the software programs, instructions, and modules stored in the memory 1120: the scheduler obtains the current task to be scheduled from the task scheduling queue, and acquires, on each physical node, the tasks whose priority is greater than or equal to the priority of the current task, to form a node-task mapping table; the scheduler determines, according to the mapping table and a preset filtering condition, the target physical node that best meets the preset filtering condition; and the scheduler binds the current task to the target physical node and sends the binding information to an API server.
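As a rough illustration of the scheduling flow just described (building the node-task mapping table, two-stage screening, and binding), here is a minimal Python sketch. It is not the patent's or Kubernetes' implementation; the data shapes, the GPU-label predicate, the free-CPU scoring function, and the binding dictionary sent back to the API server are assumptions introduced only for this example.

```python
def build_node_task_table(nodes_tasks, current_priority):
    """node -> tasks on that node whose priority is >= the current task's priority."""
    return {node: [t for t in tasks if t["priority"] >= current_priority]
            for node, tasks in nodes_tasks.items()}


def first_stage_filter(nodes, predicates):
    """Stage 1: drop nodes that fail any hard condition (labels, ports, ...)."""
    return [n for n in nodes if all(p(n) for p in predicates)]


def second_stage_score(nodes, score_fns):
    """Stage 2: weighted scoring; the highest-scoring node is the target node."""
    return max(nodes, key=lambda n: sum(w * f(n) for f, w in score_fns))


def bind(task_name, node_name):
    """Binding information that would be reported back to the API server."""
    return {"task": task_name, "node": node_name}


if __name__ == "__main__":
    nodes_tasks = {
        "node-1": [{"name": "a", "priority": 2}, {"name": "b", "priority": 5}],
        "node-2": [{"name": "c", "priority": 4}],
    }
    node_info = {"node-1": {"gpu": True, "free_cpu": 4},
                 "node-2": {"gpu": True, "free_cpu": 10}}

    table = build_node_task_table(nodes_tasks, current_priority=4)
    candidates = first_stage_filter(list(table), [lambda n: node_info[n]["gpu"]])
    target = second_stage_score(candidates,
                                [(lambda n: node_info[n]["free_cpu"], 1.0)])
    print(bind("current-task", target))  # {'task': 'current-task', 'node': 'node-2'}
```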
  • alternatively, the processor 1110 implements the task preemption method of the foregoing method embodiment by running the software programs, instructions, and modules stored in the memory 1120: when a physical node processes a target task to be run, it acquires a list of tasks running on the physical node; the physical node detects whether its remaining resources meet the resources required for the target task to run; if the remaining resources do not meet the resources required for the target task to run, the physical node moves the tasks in the task list whose priority is lower than the priority of the target task into the queue to be removed, in order of priority from low to high, until the remaining resources obtained after the physical node executes the tasks remaining in the task list meet the resources required for the target task to run, and uses the target task to preempt the tasks in the queue to be removed; and the physical node invokes an execution environment to run the target task.
  • the memory 1120 may include a storage program area and a storage data area, where the storage program area may store an operating system and application programs required for at least one function; the storage data area may store data created according to the use of the computer device, and the like.
  • the memory 1120 may include a high-speed random access memory, and may further include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage device.
  • the memory 1120 may optionally include memories remotely disposed relative to the processor 1110, and these remote memories may be connected to the terminal device through a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the input device 1130 may be used to receive inputted numeric or character information, and generate key signal inputs related to user settings and function control of a computer device.
  • the output device 1140 may include a display device such as a display screen.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the task creation method provided in the embodiments of the present application: obtaining a task creation request; and when it is detected that the quota of the tenant to which the task belongs contains resources matching the priority of the task, and the matching resources meet the creation conditions of the task, creating the task according to the creation request.
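The quota check performed when a task is created (the tenant's quota must contain resources matching the task's priority, and the matching resources must satisfy the creation condition) can be sketched as follows. This is an illustrative sketch only; the per-priority quota dictionaries and the CPU/memory fields are assumptions, not the patent's data model.

```python
def can_create_task(tenant_quota, used, task):
    """Admission check sketch: the tenant's quota must contain resources of the
    task's priority, and the unused part of that matching quota must cover the
    task's resource request.  Both dictionaries map a priority level to
    {"cpu": ..., "mem": ...}."""
    matching = tenant_quota.get(task["priority"])
    if matching is None:
        return False  # the quota contains no resources matching this priority
    consumed = used.get(task["priority"], {"cpu": 0, "mem": 0})
    return (matching["cpu"] - consumed["cpu"] >= task["cpu"]
            and matching["mem"] - consumed["mem"] >= task["mem"])


if __name__ == "__main__":
    quota = {"high": {"cpu": 10, "mem": 32}, "low": {"cpu": 4, "mem": 8}}
    used = {"high": {"cpu": 2, "mem": 8}}
    print(can_create_task(quota, used, {"priority": "high", "cpu": 8, "mem": 16}))   # True
    print(can_create_task(quota, used, {"priority": "high", "cpu": 10, "mem": 16}))  # False
```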
  • or, when the program is executed by a processor, a task scheduling method of the foregoing method embodiment is implemented: the scheduler obtains the current task to be scheduled from the task scheduling queue, and acquires, on each physical node, the tasks whose priority is greater than or equal to the priority of the current task, to form a node-task mapping table; the scheduler determines, according to the mapping table and a preset filtering condition, the target physical node that best meets the preset filtering condition; and the scheduler binds the current task to the target physical node and sends the binding information to the API server.
  • or, when the program is executed by a processor, a task preemption method of the foregoing method embodiment is implemented: when a physical node processes a target task to be run, it obtains a list of tasks running on the physical node; the physical node detects whether its remaining resources meet the resources required for the target task to run; if the remaining resources do not meet the resources required for the target task to run, the physical node moves the tasks in the task list whose priority is lower than the priority of the target task into the queue to be removed, in order of priority from low to high, until the remaining resources obtained after the physical node executes the tasks remaining in the task list meet the resources required for the target task to run, and uses the target task to preempt the tasks in the queue to be removed; and the physical node invokes an execution environment to run the target task.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer-readable storage media include: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), Erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or transmitted as part of a carrier wave, which carries a computer-readable program code. Such a propagated data signal may take a variety of forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of this application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through an Internet connection provided by an Internet service provider).

Abstract

Embodiments of the present application disclose a resource sharing use method, system, and device based on preemptive scheduling. The method includes: an API server creates a task; when a scheduler processes a current task, it screens out, based on priority, the target physical node that best meets a preset filtering condition, and sends the binding information between the current task and the target physical node to the API server; when a physical node processes a target task to be run in a task queue, a list of tasks running on the physical node is acquired; if the physical node detects that its remaining resources do not meet the resources required for the target task to run, the tasks in the task list whose priority is lower than the priority of the target task are moved into a queue to be removed in order of priority from low to high, until the remaining resources obtained after the physical node executes the tasks in the task list meet the resources required for the target task to run, and the tasks in the queue to be removed are preempted.

Description

基于抢占式调度的资源共享使用方法、系统及设备
本申请要求在2018年6月25日提交中国专利局、申请号为201810659298.5的中国专利申请的优先权,该申请的全部内容通过引用结合在本申请中。
技术领域
本申请涉及计算机技术领域,例如涉及一种基于抢占式调度的资源共享使用方法系统及设备。
背景技术
在资源共享的分布式系统中,多个租户共享使用资源;同时每个租户使用的资源又需要有一定的限制,以保证每个租户都能够有资源使用,不出现租户资源“饿死”的现象。分布式系统中的调度器是通过有效地调度多个租户的作业或任务,从而保证每个租户的作业或任务得到稳定而又快速地执行,同时分布式系统内的资源得到充分利用。
传统的分布式管理系统提供了多个调度策略,以保证多个租户的任务可以在分布式系统中均衡分配到物理节点,再由物理节点运行分配的任务。但是任务的处理方式存在资源没有充分利用的现象。
发明内容
本申请实施例提供一种基于抢占式调度的资源共享使用方法系统及设备,可以提高资源的利用率。
第一方面,本申请实施例提供了一种任务创建方法,包括:API服务器获取任务的创建请求;以及当API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
第二方面,本申请实施例还提供了一种任务调度方法,包括:调度器从任务调度队列获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务优先级的任务,形成节点-任务映射表;所述调度器根据所述映射表以及预设筛选条件确定符合所述预设筛选条件的目标物理节点;以及所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
第三方面,本申请实施例还提供了一种任务抢占方法,包括:当物理节点处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表;所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源;若剩余资源不满足目标任务运行所需的资源,所述物理节点将所述任务列表中的优先级低于所述目标任务优先级的任务,按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务; 以及所述物理节点调用执行环境,运行所述目标任务。
第四方面,本申请实施例还提供了一种基于抢占式调度的资源共享使用方法,包括:API服务器获取任务的创建请求;当所述API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务;调度器获取所述API服务器创建的任务,并形成任务调度队列;所述调度器从所述任务调度队列获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务优先级的任务,形成节点-任务映射表;所述调度器根据所述映射表以及预设筛选条件确定符合所述预设筛选条件的目标物理节点;所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器;物理节点监听所述API服务器中任务与物理节点的绑定信息,基于监听到的所述绑定信息获取对应的任务,并形成任务队列;当所述物理节点处理所述任务队列中待运行的目标任务时,获取所述物理节点上正在运行的任务列表;所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源;若剩余资源不满足目标任务运行所需的资源,所述物理节点将所述任务列表中的优先级低于所述目标任务优先级的任务,按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用目标任务抢占所述待移除队列中的任务;以及所述物理节点调用执行环境,运行所述目标任务。
第五方面,本申请实施例还提供了一种API服务器,包括:请求获取模块,设置为获取任务的创建请求;任务创建模块,设置为当检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
第六方面,本申请实施例还提供了一种调度器,包括:映射表形成模块,设置为从任务调度队列中获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务优先级的任务,形成节点-任务映射表;筛选模块,设置为根据所述映射表以及预设筛选条件确定符合所述预设筛选条件的目标物理节点;绑定模块,设置为将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
第七方面,本申请实施例还提供了一种任务抢占装置,包括:任务列表获取模块,设置为当处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表;检测模块,设置为检测物理节点上的剩余资源是否满足所述目标任务运行所需的资源;抢占模块,设置为当剩余资源不满足目标任务运行所需的资源时,将所述任务列表中的优先级低于所述目标任务优先级的任务,按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务;任务执行模块,设置为调用执行环境,运行所述目标任务。
第八方面,本申请实施例还提供了一种基于抢占式调度的资源共享使用系 统,包括本申请实施例提供的API服务器、调度器以及任务抢占装置。
第九方面,本申请实施例提供了一种设备,包括:一个或多个处理器;存储装置,用于存储一个或多个程序。
当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现本申请实施例提供的任务创建方法,或者实现本申请实施例提供的任务调度方法,或者实现本申请实施例提供的任务抢占方法。
第十方面,本申请实施例提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现本申请实施例提供的任务创建方法,或者实现本申请实施例提供的任务调度方法,或者实现本申请实施例提供的任务抢占方法。
本申请实施例提供的技术方案,通过对租户的资源配额进行优先级的设置,并通过将任务的优先级与所属租户下每个优先级的资源进行匹配,从而确定是否创建任务,可以让租户在资源紧张时优先使用资源,防止租户滥用高优先级的资源,导致低优先级的任务持续获取不了资源出现的“饿死”现象。调度器通过任务的优先级以及预设筛选条件对物理节点进行筛选,筛选出最合适的物理节点,将待调度的当前任务调度到最合适的物理节点,当资源紧张时,只进行逻辑上的资源抢占,并没有立即抢占资源,这种延后抢占的调度方法,可以在逻辑上为高优先级的任务腾出资源,在资源没有被充分利用时,继续运行被抢占的任务,可以提高资源的利用率。当物理节点处理待运行的目标任务时,若物理节点的剩余资源不满足目标任务运行的所需的条件,基于将低优先级的任务进行抢占,可以使物理节点优先处理重要任务,可以提高资源的利用率。
附图说明
图1是本申请实施例提供的一种任务创建方法流程图。
图2是本申请实施例提供的一种任务调度方法流程图。
图3是本申请实施例提供的一种任务调度方法流程图。
图4是本申请实施例提供的一种任务抢占方法流程图。
图5是本申请实施例提供的一种任务抢占方法流程图。
图6是本申请实施例提供的一种基于抢占式调度的资源共享使用方法流程图。
图7是本申请实施例提供的一种API服务器的结构框图。
图8a是本申请实施例提供的一种调度器的结构框图。
图8b是本申请实施例提供的一种调度系统的结构示意图。
图9是本申请实施例提供的一种任务抢占装置的结构框图。
图10是本申请实施例提供的一种基于抢占式调度的资源共享使用系统结构框图。
图11是本申请实施例提供的一种设备的结构示意图。
具体实施方式
下面结合附图和实施例对本申请作说明。可以理解的是,此处所描述的具体实施例仅仅用于解释本申请,而非对本申请的限定。另外还需要说明的是,为了便于描述,附图中仅示出了与本申请相关的部分而非全部内容。
图1是本申请实施例提供的一种任务创建方法流程图,所述方法可应用于应用编程接口(Application Programming Interface,API)服务器(API Server),由API服务器来执行,该API服务器可以是集群管理平台中的一个组件,并采用软件和/硬件的方式来实现,集群管理平台可以是对集群中的大量的资源进行管理的平台,集群管理平台包括但不限于Kubernetes和Mesos,并可集成在多个计算机设备中。如图1所示,本申请实施例提供的任务创建方法包括S110-S120。
在S110中,API服务器获取任务的创建请求。
本申请实施例提供的方法可以应用在集群中,在集群中可以包含多个物理节点,每个物理节点上的资源可以是多个租户的共享资源,多个物理节点可以由集群管理平台进行管理,集群管理平台将任务分配给物理节点以使物理节点执行相应的任务。集群管理平台可以集成在多个计算机设备中,多个计算机设备可以由用户进行操作,用户可以登录集群管理平台,并递交任务的创建请求,集群管理平台中的API服务器获取用户提交的任务的创建请求,创建任务。由集群管理平台中的调度器进行任务调度,将任务合理分配给对应的物理节点。由物理节点执行该任务。
其中,API服务器可以为集群管理平台中的一个组件,可以完成任务的创建;提供丰富的功能性插件,完善对集群的管理等。
其中,API服务器可以获取用户提交的一个任务的创建请求,例如,可以是获取用户提交的创建一个应用的请求。其中,当用户提交任务的创建请求时,可以对任务的优先级进行设置。
在S120中,当API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
用户在管理粒度上被分到若干组内,每组称为一个租户,可以根据需要为每个租户预先设置配额,配额可以是一组资源,例如,该组资源可以包括处理器(Central Processing Unit,CPU)、内存、图形处理器(Graphics Processing Unit,GPU)、现场可编程门阵列(Field Programmable Gate Array,FPGA)、人工智能(Artificial Intelligence,AI)芯片、处理器的优先级以及内存的优先级、GPU优先级、FPGA优先级、AI芯片优先级等,可以为每个租户预先设置资源配额。通过合理的设置配额,可以授权给租户使用合适优先级的资源,能够让租户在资源紧张的时候优先使用资源,还可以限制租户滥用高优先级的资源,导致低优先级的任务持续获取不了资源,出现“饿死”的现象。
当用户通过集群管理平台提交任务的创建请求时,每个任务携带标识信息, API服务器可以根据标识信息识别每个任务所属的租户,判断该租户的配额里是否包含与该任务的优先级匹配的资源,如果包含于该任务的优先级匹配的资源,继续判断该匹配资源是否满足该任务的创建条件,当该匹配资源满足该任务的创建条件时,则创建任务。
创建条件可以是CPU的数量和/或内存的占有率,还可以是其他条件。例如,当用户提交任务的创建请求时,可以通过集群管理平台设置该任务的优先级为高优先级,那么该任务所需资源的优先级也为高优先级,如:该任务所需10个高优先级的CPU。若该任务所属的租户的配额中包含高优先级的CPU,且高优先级的CPU的数量大于或等于10,则创建该任务。
在本申请的一个实施方式中,可选的,本申请实施例提供的方法可以应用在Kubernetes环境下。所述服务器获取任务创建请求包括:API服务器获取任务pod的创建请求。当API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务包括:当所述API服务器检测到所述任务pod所属的租户namespace的配额quota里包含与所述pod对应的优先级匹配的资源quota值,且该quota值满足所述任务pod的创建条件,则API服务器根据所述创建请求创建pod。
其中,pod是kubernetes中可以创建和部署的最小,也是最简的单位。一个pod代表着集群中运行的一个进程。pod是Kubernetes中的组件,例如,可以创建一个应用,可以启动一个进程等。pod中封装着应用的容器(有的情况下是好几个容器),该容器可以存储独立的网络IP、管理容器如何运行的策略选项等。其中,pod代表着部署的一个单位:kubernetes中应用的一个实例,可能由一个或者多个容器组合在一起共享资源。其中,创建一个pod可以是创建一个应用等。
其中,Namespace是对一组资源和对象的抽象集合,比如可以用来将kubernetes系统内部的对象划分为不同的项目组或用户组,namespace常用来隔离不同的租户或者用户。在kubernetes中,quota可以用来进行资源管理和资源限制,quota值的大小可以代表资源的多少,例如,在一个租户下设置的资源是20个高优先级的CPU,namespace中的quota值可以是20,即quota值可以代表资源的数量。
本申请实施例提供的一种任务创建方法,当获取到任务的创建请求时,通过检测任务所属租户的配额里是否包含与任务优先级匹配的资源,以及检测与任务优先级匹配的资源是否满足任务的创建条件,当两个条件均满足(所属租户的配额中既包含于该任务优先级匹配的资源,且该资源也满足该任务的创建条件)时创建该任务,本实施例通过对租户的资源配额进行优先级的设置,并通过将任务的优先级与所属租户下各优先级的资源进行匹配,从而确定是否创建任务,可以让租户在资源紧张时优先使用资源,防止租户滥用高优先级的资源,导致低优先级的任务持续获取不了资源出现的“饿死”现象。
图2是本申请实施例提供的一种任务调度方法流程图,所述方法可应用于调度器,调度器可以是集群管理平台的组件,并采用软件和/硬件的方式来实现,集群管理平台可以是对集群中的大量的硬件资源进行管理的平台,集群管理平台包括但不限于Kubernetes和Mesos,并可集成在多个计算机设备中。本申请实施例提供的方法可以应用在该环境下:在集群中可以包含多个物理节点,多个物理节点上的资源可以是多个租户的共享资源,多个物理节点可以由集群管理平台进行管理,集群管理平台将任务分配给物理节点以使物理节点执行相应的任务。集群管理平台可以集成在多个计算机设备中,多个计算机设备可以由用户进行操作,用户可以登录集群管理平台,并递交任务的创建请求,集群管理平台中的API服务器获取用户提交的任务的创建请求,创建任务,由集群管理平台中的调度器进行任务调度,将任务合理分配给对应的物理节点,由物理节点执行该任务。其中,本申请实施例应用在调度器进行任务调度的阶段。
如图2所示,本申请实施例提供的技术方案包括S210-S230。
在S210中,调度器从任务调度队列获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务的优先级的任务,形成节点-任务映射表。
其中,调度器可以是集群管理平台的组件,可以从API服务中监听API服务器中创建的任务,并从API服务器中读取任务,读取的任务形成任务调度队列。其中,调度器按照任务调度队列中任务的顺序,对任务进行调度。
物理节点可以是各种物理机,调度器可以从每个物理节点获取资源信息(包括全部资源和可用资源)和在每个物理节点上正在运行的任务队列。其中,该任务队列中的每个任务均有优先级。当调度器从任务调度队列获取待调度的当前任务时,获取每个物理节点上大于等于当前任务的优先级的任务,并形成节点-任务映射表。例如,当前任务的优先级为高优先级,物理节点1上大于或等于高优先级的任务有任务1、任务2和任务3,则调度器获取物理节点1上的任务1、任务2和任务3、并形成节点-任务映射表。其中,物理节点上大于等于当前任务的优先级的任务包括:物理节点上正在运行的且大于等于当前任务优先级的任务,以及物理节点上待运行的且大于等于当前任务优先级的任务。
在S220中,所述调度器根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点。
调度器从节点-任务映射表中根据预设筛选条件进行筛选,筛选出最符合预设筛选条件的目标物理节点。其中,预设筛选条件可以包括当前任务所需的资源与物理节点上的剩余资源之间的匹配条件、当前任务所需的端口与物理节点上的端口之间的匹配条件等。
在本申请一个实施方式中,可选的,所述调度器根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点包括:所述调度器从映射表筛选出符合第一阶段筛选条件的物理节点,形成节点组;根据映射表以及第二阶段优选条件,对所述节点组的物理节点进行评分,并筛选出分数最高的物理节点作为目标物理节点。
其中,第一阶段筛选条件和第二阶段优选条件并不相同。例如,第一阶段筛选条件可以是当前任务所需的端口与物理节点上的端口之间的匹配条件、物理节点是否有特殊标签等,第二阶段优选条件可以是当前任务所需的资源与物理节点上的剩余资源之间的匹配条件,并且第二阶段优选条件中可以包括一个条件,也可以包含多个条件。当第二阶段优选条件中包含多个条件时,也可以为每个条件设置权重,根据权重确定物理节点的评分。
对该实施方式进行举例说明,若第一阶段筛选条件为物理节点需要有GPU的标签,第二预设当前任务所需的资源与物理节点上的剩余资源之间的匹配条件。对照节点-任务映射表以及获取的物理节点的信息,调度器选取具有GPU的物理节点,形成节点组。调度器判断节点组中物理节点上的剩余资源是否满足当前任务所需的资源条件,将不满足条件的物理节点去除,并对满足当前任务所需的资源条件的物理节点进行打分,物理节点上的剩余资源越多,对该物理节点的打分越高,则分数最高的物理节点是目标物理节点。其中,筛选目标物理节点的方法包括但并不限于上述的方法。
通过上述两次筛选,筛选出分数最高的物理节点作为目标物理节点,即筛选出分数最高的物理节点作为最合适的物理节点,相对于一次筛选的情况,可以减少每次筛选时数据的处理量,提高任务调度的效率。
在S230中,所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
调度器通过将当前任务与筛选出的最符合预设筛选条件的物理节点(即目标物理节点)进行绑定,将绑定的信息发送到API服务器,以使每个物理节点可以从API服务器读取各自执行的任务。
本申请实施例调度器基于任务的优先级以及预设筛选条件对多个物理节点进行筛选,筛选出每个任务对应的最合适的物理节点(即目标物理节点),将待调度的当前任务调度到最合适的物理节点。当资源紧张时,只进行逻辑上的资源抢占,并没有立即抢占资源,这种延后抢占的调度方法,可以在逻辑上为高优先级的任务让出资源,在资源没有被充分利用时,继续保留被抢占的任务,可以提高资源的利用率。
图3是本申请实施例提供的一种任务调度方法流程图,其中,该实施例提供的方法可以应用于kubernetes系统中,如图3所示,本实施例提供的技术方案包括S310-S340。
在S310中,调度器从pod调度队列中获取待调度的当前pod,并获取每个物理节点上大于等于所述当前pod的优先级的pod,形成节点-pod映射表。
在S320中,所述调度器从映射表筛选出符合第一阶段筛选条件的物理节点,形成节点组。
在S330中,根据映射表以及第二阶段优选条件,对所述节点组的物理节点进行评分,并筛选出分数最高的物理节点作为目标物理节点。
在S340中,所述调度器将所述当前pod与所述目标物理节点绑定,并将 绑定的信息发送到API服务器。
由此,调度器基于任务的优先级以及预设筛选条件对物理节点进行筛选,筛选出最合适的物理节点,将待调度的当前任务调度到最合适的物理节点。当资源紧张时,只进行逻辑上的资源抢占,并没有立即抢占资源,这种延后抢占的调度方法,可以在逻辑上为高优先级的任务让出资源,在资源没有被充分利用时,继续保留被抢占的任务,可以提高资源的利用率。
图4是本申请实施例提供的一种任务抢占方法流程图,所述方法可由任务抢占装置来执行,所述装置由软件和/或硬件来实现,所述装置可集成在计算机设备中。本申请实施例提供的任务抢占方法适用于物理节点处理任务的场景下。如图4所示,本申请实施例提供的技术方案包括:S410-S440。
在S410中,当物理节点处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表。
其中,物理节点可以是计算机设备,例如,物理机等。物理节点可以通过监听API服务器中任务与物理节点的绑定信息,获取对应的任务,由获取的任务形成任务队列。物理节点根据任务队列中的任务的顺序依次进行处理,当前处理的任务称为待运行的目标任务。当物理节点处理待运行的目标任务时,物理节点获取该物理节点上正在运行的任务列表。其中,任务列表记载了物理节点上正在运行的任务的信息。物理节点上正在运行的任务可以有一个,也可以有多个。
在S420中,所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源。
若物理节点检测其剩余资源不满足目标任务运行所需的资源,执行S430,若物理节点检测其剩余资源满足目标任务运行所需的资源,执行S440。
在本步骤中,目标任务运行所需的资源可以包括CPU、内存等。物理节点的剩余资源可以理解为物理节点上的可用资源。例如,若物理节点剩余CPU的数量为10个,内存为1G。目标任务运行所需的CPU为10,内存为2G,则物理节点上的剩余资源不能满足目标任务运行所需的资源。
在S430中,所述物理节点将所述任务列表中的优先级低于所述目标任务的优先级的任务按照优先级从低到高的顺序依次移入待移除队列(该物理节点不执行被移入待移除队列中的任务),直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务。
在本步骤中,若物理节点检测其剩余资源可以满足目标任务运行所需的资源,直接调用执行环境,运行目标任务。若物理节点检测其剩余资源并不能满足目标任务运行所需的资源,将任务列表中的任务按照优先级由低到高的顺序进行排序,并将优先级低于目标任务的优先级的任务按照优先级从低到高的顺序移入到待移除队列,直至物理节点执行任务列表中的任务后所得到的剩余资源满足目标任务运行所需的资源,采用目标任务抢占待移除队列中的任务的资 源,即停止运行待移除队列中的任务。
在将任务列表中的任务移入待移除队列过程中,若物理节点上的剩余资源不满足目标任务的运行条件,则拒绝执行目标任务。
需要说明的是,当将任务列表中优先级低于目标任务优先级的任务移入待移除队列的过程中,并不停止执行待移除队列中的任务,当判断物理节点执行任务列表中的任务后所得到的剩余资源满足目标任务运行所需的资源时,才停止执行待移除队列中的任务。
对本步骤进行举例说明,若任务列表中共有5个任务(物理节点上正在运行的任务有5个),分别是A,B,C,D和E,优先级分别为1,2,3,4和5,其中,任务列表中的任务按照优先级由低到高的顺序排序是A,B,C,D和E。待处理的目标任务的优先级为4。若物理节点上的剩余资源并不能满足目标任务运行所需的资源条件,则将物理节点上正在运行的任务列表中的优先级低于目标任务优先级的任务(分别是A,B和C),按照优先级从低到高的顺序依次移入待移除队列,即先将A移入待移除队列,判断物理节点运行任务列表中的B,C,D和E后所得到的剩余资源是否满足目标任务运行所需的资源条件。若物理节点运行任务B,C,D和E后所得到的剩余资源满足目标任务运行所需的资源条件,采用目标任务将正在运行的A抢占,即将A停止。
若将A移入待移除队列后,物理节点运行任务列表中的B,C,D和E所得到的剩余资源不满足目标任务运行所需的资源条件,则继续将B移入到待移除队列中,再次检测运行任务列表中的任务C,D和E后所得到的剩余资源是否满足目标任务运行所需的资源条件,重复上述的判断步骤,直至检测到运行任务列表中的任务后所得到的剩余资源满足目标任务运行所需的资源条件。若直至任务列表中的任务的优先级不低于目标任务的优先级时(即任务列表中的任务A,B和C均被移入到待移出队列中,只剩下任务D和E时),物理节点运行任务列表中的任务D和E后所得到的剩余资源不满足目标任务运行所需的资源条件,则拒绝运行目标任务。
在S440中,所述物理节点调用执行环境,运行所述目标任务。
本申请实施例当物理节点处理待运行的目标任务时,若物理节点的剩余资源不满足目标任务运行的所需的条件,将低优先级的任务进行抢占,可以使物理节点优先处理重要任务,可以提高资源的利用率。
在上述实施例的基础上,所述的任务抢占方法还包括:所述物理节点每间隔设定时间获取资源使用信息;所述物理节点若确定所述资源使用信息达到预设限制条件,将所述任务列表中的任务按照优先级由低到高的顺序移入到所述待移除队列,直至所述物理节点执行所述任务列表中的任务后所确定的资源使用信息没有达到预设限制条件,并停止所述待移除队列中的任务。
可选地,任务列表中的任务为当前物理节点上正在运行的任务。其中,每个物理节点间隔设定时间获取资源使用信息,判断资源的使用信息是否达到预设限制条件,来判断是否需要触发任务抢占。若资源的使用信息达到预设限制条件,则触发任务抢占,否资源使用信息未达到预设限制条件,不需要触发任 务抢占。任务抢占的过程是:将任务列表中的任务按照优先级进行排序,并将任务列表中任务按照优先级由低到高的顺序依次移入到待移除队列,直至所述物理节点执行任务列表中的任务后所确定的资源使用信息没有达到预设限制条件,并停止待移除队列中的任务
其中,预设限制条件可以是资源使用达到设置值,也可以是其他限制条件。例如,若物理节点的资源使用达到设定值,则触发任务抢占。
由此,物理节点通过根据资源使用信息触发任务抢占,可以提高资源利用率以及当资源紧张时,可以从低优先级任务中抢占资源,优先处理重要任务。
图5是本申请实施例提供的一种任务抢占方法流程图,本申请实施例提供的方法运行在Kubernetes系统中。如图5所示,本申请实施例提供的方法包括:S510-S540。
在S510中,当物理节点通过kubelet处理待运行的目标pod时,获取所述物理节点上正在运行的pod列表。
其中,kubelet是Kubernetes系统的组件,可以监视pod,挂载pod所需要的volumes,下载pod的secret,通过docker/rkt来运行pod中的容器,周期的执行pod中为容器定义的liveness探针,上报pod的状态给系统的其他组件,以及节点的状态。
在S520中,所述物理节点通过kubelet检测其剩余资源是否满足所述目标pod运行所需的资源。
若物理节点检测其剩余资源不满足目标pod运行所需的资源,执行S530,若物理节点检测其剩余资源满足目标pod运行所需的资源,执行S540。
在S530中,所述物理节点通过kubelet将所述pod列表中的优先级低于所述目标pod优先级的pod,按照优先级由低到高的顺序依次移入待移除队列,直至所述物理节点执行所述pod列表中的pod后所得到的剩余资源满足所述目标pod运行所需的资源,并采用所述目标pod抢占所述待移除队列中的pod。
在S540中,所述物理节点通过kubelet调用执行环境,运行所述目标pod。
由此,当物理节点处理待运行的目标pod时,若物理节点的剩余资源不满足目标pod运行的所需的条件,将低优先级的pod(优先级低于目标pod的优先级)进行抢占,可以使物理节点优先处理重要任务(在本实施例中,该重要任务指的是目标pod),可以提高资源的利用率。
图6是本申请实施例提供的一种基于抢占式调度的资源共享使用方法流程图,所述方法由基于抢占式调度的资源共享使用系统来执行,所述系统可通过软件和/或硬件来实现。本申请实施例提供的方法可以应用在集群中,在集群中可以包含多个物理节点,每个物理节点上的资源可以是多个租户的共享资源,多个物理节点可以由集群管理平台进行管理,将任务分配给物理节点以使物理节点执行相应的任务。集群管理平台可以集成在多个计算机设备中,多个计算机设备可以由用户进行操作,用户可以登录集群管理平台,并递交任务的创建 请求,集群管理平台中的API服务器获取用户提交的任务的创建请求,创建任务,由集群管理平台中的调度器进行任务调度,将任务合理分配给对应的物理节点,由物理节点执行该任务。
如图6所示,本申请实施例提供的技术方案包括:S610-S692。
在S610中,API服务器获取任务的创建请求。
在S620中,当所述API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
在S630中,调度器获取所述API服务器创建的任务,并形成任务调度队列。
API服务器实时获取任务的创建请求,并为满足条件的创建请求创建任务。因此API服务器中有至少一个任务。调度器获取API服务器创建的所有任务,并将这些任务形成任务调度队列。
在S640中,所述调度器从所述任务调度队列获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务的优先级的任务,形成节点-任务映射表。
调度器根据任务调度队列中任务的排列顺序,依次对任务进行调度。待调度的当前任务指的是任务调度队列中调度器当前正在进行调度的任务。
在S650中,所述调度器根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点。
在S660中,所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
调度器依次对任务调度队列中的任务进行调度,得到任务调度队列中每个任务与其对应的目标物理节点的绑定信息。
在S670中,物理节点通过监听所述API服务器中任务与物理节点的绑定信息,获取对应的任务,并形成任务队列。
物理节点监听API服务器中的绑定信息,通过绑定信息中的目标物理节点确定与该物理节点对应的任务。该任务可以是一个也可以是多个,将这些任务形成任务队列。
在S680中,当所述物理节点处理所述任务队列中待运行的目标任务时,获取所述物理节点上正在运行的任务列表。
物理节点根据任务队列中任务的排列顺序,依次对任务进行处理。待运行的目标任务指的是任务队列中物理节点当前正在处理的任务。
在S690中,所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源。
若物理节点检测其剩余资源不满足目标任务运行所需的资源,执行S691,若物理节点检测其剩余资源满足目标任务运行所需的资源,执行S692。
在S691中,所述物理节点将所述任务列表中的优先级低于所述目标任务的任务,按照优先级由低到高的顺序移入待移除队列,直至所述物理节点执行 所述任务列表中的任务所得到的剩余资源满足所述目标任务运行所需的资源,并采用目标任务抢占所述待移除队列中的任务。
在S692中,所述物理节点调用执行环境,运行所述目标任务。
相关技术的Kubernetes 1.3版本基于服务质量(Quality Of Service)的资源共享方案,用来进行管理共享资源,服务质量从高到低分别是Guarantee,Burstable以及Best Effort。其中,Best Effort的任务可以在集群资源没有充分使用时,进行调度并运行。当集群资源紧张时,Best Effort的任务优先被抢占。这种方案并没有考虑任务调度的环节,当集群调度满时,不能为高服务质量的任务腾出资源,并且无法限制租户内Best Effort任务的数目,以及无法区分Best Effort任务被抢占的顺序。
相关技术的Kubernetes 1.8版本引入的基于优先级的调度方案,任务可以设置优先级,当资源紧张的时,调度器会将低优先级的任务抢占,为高优先级的任务提供充分的资源。但该种方案中任务的抢占发生在调度器,即在集群逻辑调度满的时候发生,集群中会有资源没有充分利用的情况,资源的利用率并不高,并且无法精确限制每个优先级任务的数目。
相对于相关技术中的调度方案,本申请实施例提供的方法,对租户的资源配额进行优先级的设置,可以精确限制每个租户下多种优先级任务的数目,通过对任务优先级的设置,当任务被抢占时,可以区分任务被抢占的顺序。本申请实施例中基于优先级的任务抢占发生在物理节点,并没有发生在调度器,可以在逻辑上为高优先级的任务腾出资源,在资源没有充分利用时,继续运行被抢占的任务,可以提高资源的利用率。
本申请实施例提供的方法,通过对租户的资源配额进行优先级的设置,并通过将任务的优先级与所属租户下多个优先级的资源进行匹配,从而确定是否创建任务,可以让租户在资源紧张时优先使用资源,防止租户滥用高优先级的资源,导致低优先级的任务持续获取不了资源,出现“饿死”现象。调度器基于任务的优先级以及预设筛选条件对物理节点进行筛选,筛选出最合适的物理节点,将待调度的当前任务调度到最合适的物理节点,当资源紧张时,只进行逻辑上的资源抢占,并没有立即抢占资源,这种延后抢占的调度方法,可以在逻辑上为高优先级的任务让出资源,在资源没有被充分利用时,继续运行被抢占的任务,可以提高资源的利用率。当物理节点处理待运行的目标任务时,若物理节点的剩余资源不满足目标任务运行的所需的条件,基于将低优先级的任务进行抢占,可以使物理节点优先处理重要任务,提高资源的利用率。
图7是本申请实施例提供的一种API服务器结构框图,如图7所示,所述API服务器包括:请求获取模块710和任务创建模块720。
请求获取模块710设置为获取任务的创建请求。
任务创建模块720设置为当检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
可选地,所述装置应用于Kubernetes系统中,请求获取模块710设置为获取pod的创建请求。
任务创建模块720设置为当检测到所述pod所属的namespace的quota里包含与所述pod的优先级匹配的资源quota值,且quota值满足所述pod的创建条件,根据所述创建请求创建pod。
上述任务创建装置可执行本申请任意实施例所提供的任务创建方法,具备执行任务创建方法相应的功能模块和有益效果。
图8a是本申请实施例提供的一种调度器的结构框图,如图8a所示,所述调度器包括:映射表形成模块810、筛选模块820和绑定模块830。
映射表形成模块810设置为从任务调度队列获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务的优先级的任务,形成节点-任务映射表。
筛选模块820设置为根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点。
绑定模块830设置为将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
可选地,筛选模块820设置为从所述映射表筛选出符合第一阶段筛选条件的物理节点,形成节点组;以及
根据所述映射表以及第二阶段优选条件,对所述节点组的物理节点进行评分,并筛选出分数最高的物理节点作为目标物理节点。
可选地,所述装置应用于Kubernetes系统中,映射表形成模块810设置为从pod调度队列中获取待调度的当前pod,并获取每个物理节点上大于等于所述当前pod指定优先级的pod,形成节点-pod映射表。
相应的,绑定模块830设置为将所述当前pod与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
其中,调度器结构还可以是其他的结构形式,以可以执行任务调度方法即可。例如,调度器中可包括调度系统,如图8b所示,调度系统可以包括四个部分:节点信息列表840、筛选算法库850、优选算法库860和未调度队列870。
其中,节点信息列表840设置为记载当前可用的物理节点信息,包括物理节点上的资源信息(全部资源和可用资源),以及已经在物理节点上运行的任务队列。这部分信息是调度方法指定时关键的信息,需要实时同步,以保证调度系统对资源以及任务有全面的认知。
筛选算法库850设置为预先定义了多种筛选物理节点的算法,保证去除不满足任务执行条件的物理节点。
优选算法库860设置为预先定义了多种优选节点的算法以及算法的权重,优选算法计算出打分最高的物理节点会被选为调度节点,即目标物理节点。
调度队列870设置为未调度的任务形成的队列,是一个优先级队列以保证高优先级的任务先调度。
上述装置可执行本申请任意实施例所提供的任务调度方法,具备执行任务调度方法相应的功能模块和有益效果。
图9是本申请实施例提供的一种任务抢占装置的结构框图,如图9所示,所述任务抢占装置包括:任务列表获取模块910、检测模块920、抢占模块930和任务执行模块940。
其中,任务列表获取模块910设置为当处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表。
检测模块920设置为检测物理节点上的剩余资源是否满足所述目标任务运行所需的资源。
抢占模块930设置为当物理节点上的剩余资源不满足目标任务运行所需的资源时,所述物理节点将所述任务列表中的优先级低于所述目标任务的任务,按照优先级由低到高的顺序依次移入待移除队列(即按照优先级由低到高的顺序依次将任务移出任务列表),直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务。
任务执行模块940设置为所述物理节点调用执行环境,运行所述目标任务。
可选地,所述抢占模块还可以设置为每间隔设定时间获取资源使用信息。
若确定所述资源使用信息达到预设限制条件,将所述任务列表中的任务按照优先级由低到高的顺序依次移入到所述待移除队列,直至所述物理节点执行所述任务列表中的任务后所确定的资源使用信息没有达到预设限制条件,并停止所述待移除队列中的任务。
可选地,所述装置应用于Kubernetes系统中,所述目标任务为目标pod,所述任务列表为pod列表,所述任务列表中的任务为所述物理节点上正在运行的pod。
上述装置可执行本申请任意实施例所提供的任务抢占方法,具备执行任务抢占方法相应的功能模块和有益效果。
图10是本申请实施例提供的一种任务抢占系统的结构示意图,如图10所示,所述任务抢占系统包括上述实施例提供的API服务器1010、上述实施例提供的调度器1020以及上述实施例提供的任务抢占装置1030。
可选地,API服务器1010和调度器1020分别是集群管理平台的组件,集群管理平台集成在用户所用的计算机设备上。任务抢占装置1030可集成在物理节点中的物理机上。
图11是本申请实施例提供的一种设备结构示意图,如图11所示,该设备包括:一个或多个处理器1110,图11中以一个处理器1110为例;存储器1120。
所述设备还可以包括:输入装置1130和输出装置1140。
所述设备中的处理器1110、存储器1120、输入装置1130和输出装置1140可以通过总线或者其他方式连接,图11中以通过总线连接为例。
存储器1120作为一种非暂态计算机可读存储介质,可用于存储软件程序、计算机可执行程序以及模块,如本申请实施例中的一种任务创建方法对应的程序指令/模块(例如,附图7所示的请求获取模块710和任务创建模块720)或者如本申请实施例中的一种任务调度方法对应的程序指令/模块(例如,附图8所示的映射表形成模块810、筛选模块820和绑定模块830),或者如本申请实施例中的一种任务抢占方法对应的程序指令/模块(例如,附图9所示的任务列表获取模块910、检测模块920、抢占模块930和任务执行模块940)。处理器1110通过运行存储在存储器1120中的软件程序、指令以及模块,从而执行计算机设备的多种功能应用以及数据处理,即实现上述方法实施例的一种任务创建方法:API服务器获取任务的创建请求;以及当API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
或者处理器1110通过运行存储在存储器1120中的软件程序、指令以及模块,实现上述方法实施例的一种任务调度方法:调度器从任务调度队列中获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务的优先级的任务,形成节点-任务映射表;所述调度器根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点;以及所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
或者处理器1110通过运行存储在存储器1120中的软件程序、指令以及模块,实现上述方法实施例的一种任务抢占方法:当物理节点处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表;所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源;若物理节点检测其剩余资源不满足目标任务运行所需的资源,所述物理节点将所述任务列表中的优先级低于所述目标任务优先级的任务,按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务;以及所述物理节点调用执行环境,运行所述目标任务。
存储器1120可以包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需要的应用程序;存储数据区可存储根据计算机设备的使用所创建的数据等。此外,存储器1120可以包括高速随机存取存储器,还可以包括非暂态性存储器,例如至少一个磁盘存储器件、闪存器件、或其他非暂态性固态存储器件。在一些实施例中,存储器1120可选包括相对于处理器1110远程设置的存储器,这些远程存储器可以通过网络连接至终端设备。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。
输入装置1130可用于接收输入的数字或字符信息,以及产生与计算机设备的用户设置以及功能控制有关的键信号输入。输出装置1140可包括显示屏 等显示设备。
本申请实施例提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时,实现如本申请实施例提供的一种任务创建方法:获取任务的创建请求;以及当检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
或者该程序被处理器执行时,实现上述方法实施例的一种任务调度方法,即:调度器从任务调度队列中获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务的优先级的任务,形成节点-任务映射表;所述调度器根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点;以及所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
或者该程序被处理器执行时,实现上述方法实施例的一种任务抢占方法:物理节点处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表;所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源;若物理节点检测其剩余资源不满足目标任务运行所需的资源,所述物理节点将所述任务列表中的优先级低于所述目标任务优先级的任务,按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务;以及所述物理节点调用执行环境,运行所述目标任务。
可以采用一个或多个计算机可读的介质的任意组合。计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本文件中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。
计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括——但不限于——电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。
计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括——但不限于——无线、电线、光缆、RF等等,或者上述的任意合适的组合。
可以以一种或多种程序设计语言或其组合来编写用于执行本申请操作的计 算机程序代码,所述程序设计语言包括面向对象的程序设计语言-诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言-诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)-连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。

Claims (15)

  1. 一种任务创建方法,包括:
    应用编码接口API服务器获取任务的创建请求;以及
    当API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
  2. 根据权利要求1所述的方法,其中,
    所述API服务器获取任务的创建请求包括:
    API服务器获取pod的创建请求;以及
    当API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务,包括:
    当所述API服务器检测到所述pod所属的namespace的quota里包含与所述pod对应的优先级匹配的资源quota值,且所述quota值满足所述pod的创建条件,根据所述创建请求创建pod。
  3. 一种任务调度方法,包括:
    调度器从任务调度队列中获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务优先级的任务,形成节点-任务映射表;
    所述调度器根据所述映射表以及预设筛选条件确定符合所述预设筛选条件的目标物理节点;以及
    所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
  4. 根据权利要求3所述的方法,其中,所述调度器根据所述映射表以及预设筛选条件确定符合所述预设筛选条件的目标物理节点,包括:
    所述调度器从所述映射表筛选出符合第一阶段筛选条件的物理节点,形成节点组;以及
    根据所述映射表以及第二阶段优选条件,对所述节点组的物理节点进行评分,并筛选出分数最高的物理节点作为目标物理节点。
  5. 根据权利要求3或4所述的方法,其中,
    所述调度器从任务调度队列中获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务指定优先级的任务,形成节点-任务映射表,包括:所述调度器从pod调度队列中获取待调度的当前pod,并获取每个物理节点上大于等于所述当前pod优先级的pod,形成节点-pod映射表;以及
    所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器,包括:
    所述调度器将所述当前pod与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
  6. 一种任务抢占方法,包括:
    当物理节点处理待运行的目标任务时,获取所述物理节点上正在运行的任务列表;
    所述物理节点检测其剩余资源是否满足所述目标任务运行所需的资源;
    若物理节点检测其剩余资源不满足目标任务运行所需的资源时,所述物理节点将所述任务列表中的优先级低于所述目标任务优先级的任务按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,并采用所述目标任务抢占所述待移除队列中的任务;以及
    所述物理节点调用执行环境,运行所述目标任务。
  7. 根据权利要求6所述的方法,还包括:
    所述物理节点每间隔设定时间获取资源使用信息;
    所述物理节点若确定所述资源使用信息达到预设限制条件,将所述任务列表中的任务按照优先级由低到高的顺序依次移入到所述待移除队列,直至所述物理节点执行所述任务列表中的任务时所确定的资源使用信息没有达到预设限制条件,并停止所述待移除队列中的任务。
  8. 根据权利要求6或者7所述的方法,其中,所述目标任务为目标pod,所述任务列表为pod列表,所述任务列表中的任务为所述物理节点上正在运行的pod。
  9. 一种基于抢占式调度的资源共享使用方法,包括:
    应用编码接口API服务器获取任务的创建请求;
    当所述API服务器检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务;
    调度器获取所述API服务器创建的任务,并形成任务调度队列;
    所述调度器从所述任务调度队列获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务指定优先级的任务,形成节点-任务映射表;
    所述调度器根据所述映射表以及预设筛选条件确定最符合所述预设筛选条件的目标物理节点;
    所述调度器将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器;
    物理节点监听所述API服务器中任务与物理节点的绑定信息,基于监听到的所述绑定信息获取对应的任务,并形成任务队列;
    当所述物理节点处理所述任务队列中待运行的目标任务时,获取所述物理节点上正在运行的任务列表;
    所述物理节点检测其执行所述任务列表中的任务后的剩余资源是否满足所述目标任务运行所需的资源;
    若不满足目标任务运行所需的资源,所述物理节点将所述任务列表中的优先级低于所述目标任务的任务,按照优先级从低到高的顺序依次移入待移除队列,直至所述物理节点的剩余资源满足所述目标任务运行所需的资源,并采用目标任务抢占所述待移除队列中的任务;以及
    所述物理节点调用执行环境,运行所述目标任务。
  10. 一种API服务器,包括:
    请求获取模块,设置为获取任务的创建请求;以及
    任务创建模块,设置为当检测到所述任务所属租户的配额里包含与所述任务的优先级匹配的资源,且匹配资源满足所述任务的创建条件,根据所述创建请求创建所述任务。
  11. 一种调度器,包括:
    映射表形成模块,设置为从任务调度队列中获取待调度的当前任务,并获取每个物理节点上大于等于所述当前任务优先级的任务,形成节点-任务映射表;
    筛选模块,设置为根据所述映射表以及预设筛选条件确定符合所述预设筛选条件的目标物理节点;以及
    绑定模块,设置为将所述当前任务与所述目标物理节点绑定,并将绑定的信息发送到API服务器。
  12. 一种任务抢占装置,包括:
    任务列表获取模块,设置为当处理待运行的目标任务时,获取物理节点上正在运行的任务列表;
    检测模块,设置为检测物理节点上的剩余资源是否满足所述目标任务运行所需的资源;
    抢占模块,设置为当检测模块检测到物理节点上的剩余资源不满足目标任务运行所需的资源时,将所述任务列表中的任务按照优先级由低到高的顺序依次移入待移除队列,直至所述物理节点执行所述任务列表中的任务后所得到的剩余资源满足所述目标任务运行所需的资源,且所述待移除队列中任务的优先级小于所述目标任务的优先级,并采用所述目标任务抢占所述待移除队列中的任务;
    任务执行模块,设置为调用执行环境,运行所述目标任务。
  13. 一种基于抢占式调度的资源共享使用系统,包括如权利要求10所述的API服务器、权利要求11所述的调度器以及权利要求12所述的任务抢占装置。
  14. 一种设备,包括:
    一个或多个处理器;
    存储装置,用于存储一个或多个程序,
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求1或2所述的任务创建方法,或者实现如权利要求3-5中任一所述的任务调度方法,或者实现如权利要求6-8中任一所述的任务抢占方法。
  15. 一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现如权利要求1或2所述的任务创建方法,或者实现如权利要求3-5中任一所述的任务调度方法,或者实现如权利要求6-8中任一所述的任务抢占方法。
PCT/CN2018/123464 2018-06-25 2018-12-25 基于抢占式调度的资源共享使用方法、系统及设备 WO2020000944A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2020573022A JP7060724B2 (ja) 2018-06-25 2018-12-25 タスクスケジューリング方法、リソース共有使用方法、スケジューラ、コンピュータ可読記憶媒体および装置
CA3104806A CA3104806C (en) 2018-06-25 2018-12-25 Method for scheduling a task, resource sharing use method and system based on preemptive scheduling, scheduler, device, and storage medium
EP18924598.8A EP3799390A4 (en) 2018-06-25 2018-12-25 METHOD, SYSTEM AND DEVICE FOR USING PRE-EMPTIVE SCHEDULING RESOURCE SHARING
SG11202013049XA SG11202013049XA (en) 2018-06-25 2018-12-25 Method for scheduling a task, resource sharing use method and system based on preemptive scheduling, scheduler, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810659298.5A CN108769254B (zh) 2018-06-25 2018-06-25 基于抢占式调度的资源共享使用方法、系统及设备
CN201810659298.5 2018-06-25

Publications (1)

Publication Number Publication Date
WO2020000944A1 true WO2020000944A1 (zh) 2020-01-02

Family

ID=63977138

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123464 WO2020000944A1 (zh) 2018-06-25 2018-12-25 基于抢占式调度的资源共享使用方法、系统及设备

Country Status (6)

Country Link
EP (1) EP3799390A4 (zh)
JP (1) JP7060724B2 (zh)
CN (1) CN108769254B (zh)
CA (1) CA3104806C (zh)
SG (1) SG11202013049XA (zh)
WO (1) WO2020000944A1 (zh)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111399989A (zh) * 2020-04-10 2020-07-10 中国人民解放军国防科技大学 一种面向容器云的任务抢占调度方法及系统
CN111488206A (zh) * 2020-03-08 2020-08-04 苏州浪潮智能科技有限公司 一种深度学习任务调度方法、系统、终端及存储介质
CN111796933A (zh) * 2020-06-28 2020-10-20 北京小米松果电子有限公司 资源调度方法、装置、存储介质和电子设备
CN112181645A (zh) * 2020-09-21 2021-01-05 中国建设银行股份有限公司 一种资源调度的方法、装置、设备及存储介质
CN112328403A (zh) * 2020-11-25 2021-02-05 北京中天孔明科技股份有限公司 一种SparkContext的配置方法、装置及服务端
CN112445591A (zh) * 2020-11-03 2021-03-05 北京电子工程总体研究所 一种面向复杂任务集的任务调度系统及方法
CN112486648A (zh) * 2020-11-30 2021-03-12 北京百度网讯科技有限公司 任务调度方法、装置、系统、电子设备和存储介质
CN112685158A (zh) * 2020-12-29 2021-04-20 杭州海康威视数字技术股份有限公司 一种任务调度方法、装置、电子设备及存储介质
CN112749000A (zh) * 2021-01-31 2021-05-04 云知声智能科技股份有限公司 基于k8s自动拓展强化学习任务调度方法、装置及系统
CN112749221A (zh) * 2021-01-15 2021-05-04 长鑫存储技术有限公司 数据任务调度方法、装置、存储介质及调度工具
CN112783659A (zh) * 2021-02-01 2021-05-11 北京百度网讯科技有限公司 一种资源分配方法、装置、计算机设备及存储介质
CN113076188A (zh) * 2020-01-03 2021-07-06 阿里巴巴集团控股有限公司 一种分布式系统的调度方法及装置
CN113110927A (zh) * 2021-04-19 2021-07-13 上海商汤科技开发有限公司 一种任务调度方法、装置、计算机设备和存储介质
CN113301087A (zh) * 2020-07-21 2021-08-24 阿里巴巴集团控股有限公司 资源调度方法、装置、计算设备和介质
CN113419831A (zh) * 2021-06-23 2021-09-21 上海观安信息技术股份有限公司 一种沙箱任务调度方法和系统
CN113434270A (zh) * 2021-06-15 2021-09-24 北京百度网讯科技有限公司 数据资源调度方法、装置、电子设备及存储介质
CN113672391A (zh) * 2021-08-23 2021-11-19 烽火通信科技股份有限公司 一种基于Kubernetes的并行计算任务调度方法与系统
CN113742036A (zh) * 2020-05-28 2021-12-03 阿里巴巴集团控股有限公司 指标处理方法、装置及电子设备
KR20220011063A (ko) * 2020-07-20 2022-01-27 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. 서버 리소스 할당 방법, 장치, 전자 기기 및 저장 매체
CN114064296A (zh) * 2022-01-18 2022-02-18 北京建筑大学 一种Kubernetes调度方法、装置和存储介质
CN114513547A (zh) * 2020-10-29 2022-05-17 浙江宇视科技有限公司 模块的节点调度方法、装置、电子设备及存储介质
CN115145711A (zh) * 2022-09-02 2022-10-04 北京睿企信息科技有限公司 一种获取有向无环图任务结果的数据处理系统
CN115277579A (zh) * 2022-07-25 2022-11-01 广州品唯软件有限公司 仓库视频调取方法及云平台
CN115915457A (zh) * 2023-01-30 2023-04-04 阿里巴巴(中国)有限公司 资源调度方法、车辆控制方法、设备及系统
CN116192222A (zh) * 2023-04-27 2023-05-30 中国西安卫星测控中心 面向天线组阵需求的资源调度方法、装置和计算机设备
CN116719628A (zh) * 2023-08-09 2023-09-08 东莞信宝电子产品检测有限公司 一种并发任务抢占式调度方法、系统及介质
US11861397B2 (en) 2021-02-15 2024-01-02 Kyndryl, Inc. Container scheduler with multiple queues for special workloads
CN117435142A (zh) * 2023-12-12 2024-01-23 苏州元脑智能科技有限公司 Io请求调度方法及存储装置
CN111831390B (zh) * 2020-01-08 2024-04-16 北京嘀嘀无限科技发展有限公司 服务器的资源管理方法、装置及服务器

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108769254B (zh) * 2018-06-25 2019-09-20 星环信息科技(上海)有限公司 基于抢占式调度的资源共享使用方法、系统及设备
CN109656716B (zh) * 2018-12-13 2020-12-01 苏州浪潮智能科技有限公司 一种Slurm作业调度方法及系统
CN111381956B (zh) * 2018-12-28 2024-02-27 杭州海康威视数字技术股份有限公司 一种任务处理的方法、装置及云分析系统
CN109960585B (zh) * 2019-02-02 2021-05-14 浙江工业大学 一种基于kubernetes的资源调度方法
CN109933420A (zh) * 2019-04-02 2019-06-25 深圳市网心科技有限公司 节点任务调度方法、电子设备及系统
CN111800446B (zh) * 2019-04-12 2023-11-07 北京沃东天骏信息技术有限公司 调度处理方法、装置、设备和存储介质
CN112114958A (zh) * 2019-06-21 2020-12-22 上海哔哩哔哩科技有限公司 资源隔离方法、分布式平台、计算机设备和存储介质
CN112214288B (zh) * 2019-07-10 2023-04-25 中国移动通信集团上海有限公司 基于Kubernetes集群的Pod调度方法、装置、设备和介质
CN110362407A (zh) * 2019-07-19 2019-10-22 中国工商银行股份有限公司 计算资源调度方法及装置
CN110457135A (zh) * 2019-08-09 2019-11-15 重庆紫光华山智安科技有限公司 一种资源调度方法、装置及共享gpu显存的方法
CN110515730A (zh) * 2019-08-22 2019-11-29 北京宝兰德软件股份有限公司 基于kubernetes容器编排系统的资源二次调度方法及装置
CN110515704B (zh) * 2019-08-30 2023-08-04 广东浪潮大数据研究有限公司 基于Kubernetes系统的资源调度方法及装置
CN110737572B (zh) * 2019-08-31 2023-01-10 苏州浪潮智能科技有限公司 大数据平台资源抢占测试方法、系统、终端及存储介质
CN110532082A (zh) * 2019-09-04 2019-12-03 厦门商集网络科技有限责任公司 一种基于任务预分配的任务申请装置和方法
CN110727512B (zh) * 2019-09-30 2020-06-26 星环信息科技(上海)有限公司 集群资源调度方法、装置、设备及储存介质
CN110716809B (zh) * 2019-10-21 2022-06-21 北京百度网讯科技有限公司 用于调度云资源的方法和装置
CN110851236A (zh) * 2019-11-11 2020-02-28 星环信息科技(上海)有限公司 一种实时资源调度方法、装置、计算机设备及存储介质
CN110990154B (zh) * 2019-11-28 2024-02-23 曙光信息产业股份有限公司 一种大数据应用优化方法、装置及存储介质
CN111736965A (zh) * 2019-12-11 2020-10-02 西安宇视信息科技有限公司 任务调度方法、装置、调度服务器和机器可读存储介质
CN113051064A (zh) * 2019-12-26 2021-06-29 中移(上海)信息通信科技有限公司 任务调度方法、装置、设备及存储介质
CN113127178B (zh) * 2019-12-30 2024-03-29 医渡云(北京)技术有限公司 资源抢占方法及装置、计算机可读存储介质、电子设备
CN111459666A (zh) * 2020-03-26 2020-07-28 北京金山云网络技术有限公司 任务派发方法、装置、任务执行系统和服务器
CN111506404A (zh) * 2020-04-07 2020-08-07 上海德拓信息技术股份有限公司 一种基于Kubernetes的共享GPU调度方法
CN111464659A (zh) * 2020-04-27 2020-07-28 广州虎牙科技有限公司 节点的调度、节点的预选处理方法、装置、设备及介质
CN111641678A (zh) * 2020-04-29 2020-09-08 深圳壹账通智能科技有限公司 任务调度方法、装置、电子设备及介质
CN111694646B (zh) * 2020-05-29 2023-11-07 北京百度网讯科技有限公司 资源调度方法、装置、电子设备及计算机可读存储介质
CN112015549B (zh) * 2020-08-07 2023-01-06 苏州浪潮智能科技有限公司 一种基于服务器集群的调度节点的选择抢占方法及系统
CN112181517A (zh) * 2020-09-24 2021-01-05 北京达佳互联信息技术有限公司 一种应用软件的启动方法、装置、设备和介质
CN112035220A (zh) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 开发机操作任务的处理方法、装置、设备以及存储介质
CN112162865B (zh) * 2020-11-03 2023-09-01 中国工商银行股份有限公司 服务器的调度方法、装置和服务器
CN112486642B (zh) * 2020-11-25 2024-01-19 广州虎牙科技有限公司 资源调度方法、装置、电子设备及计算机可读存储介质
CN112528450A (zh) * 2021-01-15 2021-03-19 博智安全科技股份有限公司 网络拓扑结构构建方法、终端设备和计算机可读存储介质
CN112799787B (zh) * 2021-02-07 2023-10-03 北京华如科技股份有限公司 一种在仿真运行中改进的并行行为执行冲突消解方法及其存储介质
CN115033352A (zh) * 2021-02-23 2022-09-09 阿里云计算有限公司 多核处理器任务调度方法、装置及设备、存储介质
CN113608852A (zh) * 2021-08-03 2021-11-05 科大讯飞股份有限公司 任务调度方法、调度模块、推理节点和协同作业系统
CN113783797B (zh) * 2021-09-13 2023-11-07 京东科技信息技术有限公司 云原生容器的网络流量控制方法、装置、设备及存储介质
CN113886052A (zh) * 2021-10-26 2022-01-04 上海商汤科技开发有限公司 任务调度方法、装置、设备、存储介质
CN114138500B (zh) * 2022-01-29 2022-07-08 阿里云计算有限公司 资源调度系统及方法
FR3133934A1 (fr) * 2022-03-24 2023-09-29 Vitesco Technologies Procédé de gestion d’exécution d’une pluralité de fonctions
CN114860403B (zh) * 2022-05-11 2023-07-07 科东(广州)软件科技有限公司 一种任务调度方法、装置、设备和存储介质
CN115658332A (zh) * 2022-12-28 2023-01-31 摩尔线程智能科技(北京)有限责任公司 一种gpu共享方法及装置、电子设备和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073546A (zh) * 2010-12-13 2011-05-25 北京航空航天大学 一种云计算环境中分布式计算模式下的任务动态调度方法
CN103810046A (zh) * 2012-11-15 2014-05-21 百度在线网络技术(北京)有限公司 一种单机资源管理方法及系统
CN104838360A (zh) * 2012-09-04 2015-08-12 微软技术许可有限责任公司 基于配额的资源管理
AU2018100381A4 (en) * 2018-03-27 2018-05-10 Chongqing University Of Posts And Telecommunications A physical resource scheduling method in cloud cluster
CN108769254A (zh) * 2018-06-25 2018-11-06 星环信息科技(上海)有限公司 基于抢占式调度的资源共享使用方法、系统及设备

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8782047B2 (en) * 2009-10-30 2014-07-15 Hitachi Data Systems Corporation Fixed content storage within a partitioned content platform using namespaces
CN101227713A (zh) * 2007-01-19 2008-07-23 华为技术有限公司 一种用户接入控制的方法及其装置
US8458712B2 (en) * 2008-04-30 2013-06-04 International Business Machines Corporation System and method for multi-level preemption scheduling in high performance processing
EP2893683A1 (en) * 2012-09-07 2015-07-15 Oracle International Corporation Ldap-based multi-customer in-cloud identity management system
US9684787B2 (en) * 2014-04-08 2017-06-20 Qualcomm Incorporated Method and system for inferring application states by performing behavioral analysis operations in a mobile device
CN107491351B (zh) * 2016-06-13 2021-07-27 阿里巴巴集团控股有限公司 一种基于优先级的资源分配方法、装置和设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073546A (zh) * 2010-12-13 2011-05-25 北京航空航天大学 一种云计算环境中分布式计算模式下的任务动态调度方法
CN104838360A (zh) * 2012-09-04 2015-08-12 微软技术许可有限责任公司 基于配额的资源管理
CN103810046A (zh) * 2012-11-15 2014-05-21 百度在线网络技术(北京)有限公司 一种单机资源管理方法及系统
AU2018100381A4 (en) * 2018-03-27 2018-05-10 Chongqing University Of Posts And Telecommunications A physical resource scheduling method in cloud cluster
CN108769254A (zh) * 2018-06-25 2018-11-06 星环信息科技(上海)有限公司 基于抢占式调度的资源共享使用方法、系统及设备

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3799390A4

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076188A (zh) * 2020-01-03 2021-07-06 阿里巴巴集团控股有限公司 一种分布式系统的调度方法及装置
CN111831390B (zh) * 2020-01-08 2024-04-16 北京嘀嘀无限科技发展有限公司 服务器的资源管理方法、装置及服务器
CN111488206A (zh) * 2020-03-08 2020-08-04 苏州浪潮智能科技有限公司 一种深度学习任务调度方法、系统、终端及存储介质
CN111399989B (zh) * 2020-04-10 2022-11-18 中国人民解放军国防科技大学 一种面向容器云的任务抢占调度方法及系统
CN111399989A (zh) * 2020-04-10 2020-07-10 中国人民解放军国防科技大学 一种面向容器云的任务抢占调度方法及系统
CN113742036A (zh) * 2020-05-28 2021-12-03 阿里巴巴集团控股有限公司 指标处理方法、装置及电子设备
CN113742036B (zh) * 2020-05-28 2024-01-30 阿里巴巴集团控股有限公司 指标处理方法、装置及电子设备
CN111796933B (zh) * 2020-06-28 2023-11-21 北京小米松果电子有限公司 资源调度方法、装置、存储介质和电子设备
CN111796933A (zh) * 2020-06-28 2020-10-20 北京小米松果电子有限公司 资源调度方法、装置、存储介质和电子设备
JP2022023769A (ja) * 2020-07-20 2022-02-08 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド サーバリソースを割り当てるための方法、装置、電子デバイス、コンピュータ可読記憶媒体及びコンピュータプログラム
KR20220011063A (ko) * 2020-07-20 2022-01-27 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. 서버 리소스 할당 방법, 장치, 전자 기기 및 저장 매체
US11601378B2 (en) 2020-07-20 2023-03-07 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for allocating server resource, electronic device and storage medium
KR102549821B1 (ko) 2020-07-20 2023-06-29 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. 서버 리소스 할당 방법, 장치, 전자 기기 및 저장 매체
CN113301087A (zh) * 2020-07-21 2021-08-24 阿里巴巴集团控股有限公司 资源调度方法、装置、计算设备和介质
CN113301087B (zh) * 2020-07-21 2024-04-02 阿里巴巴集团控股有限公司 资源调度方法、装置、计算设备和介质
CN112181645A (zh) * 2020-09-21 2021-01-05 中国建设银行股份有限公司 一种资源调度的方法、装置、设备及存储介质
CN114513547B (zh) * 2020-10-29 2024-02-13 浙江宇视科技有限公司 模块的节点调度方法、装置、电子设备及存储介质
CN114513547A (zh) * 2020-10-29 2022-05-17 浙江宇视科技有限公司 模块的节点调度方法、装置、电子设备及存储介质
CN112445591A (zh) * 2020-11-03 2021-03-05 北京电子工程总体研究所 一种面向复杂任务集的任务调度系统及方法
CN112328403A (zh) * 2020-11-25 2021-02-05 北京中天孔明科技股份有限公司 一种SparkContext的配置方法、装置及服务端
CN112486648A (zh) * 2020-11-30 2021-03-12 北京百度网讯科技有限公司 任务调度方法、装置、系统、电子设备和存储介质
CN112685158B (zh) * 2020-12-29 2023-08-04 杭州海康威视数字技术股份有限公司 一种任务调度方法、装置、电子设备及存储介质
CN112685158A (zh) * 2020-12-29 2021-04-20 杭州海康威视数字技术股份有限公司 一种任务调度方法、装置、电子设备及存储介质
CN112749221A (zh) * 2021-01-15 2021-05-04 长鑫存储技术有限公司 数据任务调度方法、装置、存储介质及调度工具
CN112749000A (zh) * 2021-01-31 2021-05-04 云知声智能科技股份有限公司 基于k8s自动拓展强化学习任务调度方法、装置及系统
CN112783659B (zh) * 2021-02-01 2023-08-04 北京百度网讯科技有限公司 一种资源分配方法、装置、计算机设备及存储介质
CN112783659A (zh) * 2021-02-01 2021-05-11 北京百度网讯科技有限公司 一种资源分配方法、装置、计算机设备及存储介质
US11861397B2 (en) 2021-02-15 2024-01-02 Kyndryl, Inc. Container scheduler with multiple queues for special workloads
CN113110927A (zh) * 2021-04-19 2021-07-13 上海商汤科技开发有限公司 一种任务调度方法、装置、计算机设备和存储介质
CN113434270A (zh) * 2021-06-15 2021-09-24 北京百度网讯科技有限公司 数据资源调度方法、装置、电子设备及存储介质
CN113434270B (zh) * 2021-06-15 2023-06-23 北京百度网讯科技有限公司 数据资源调度方法、装置、电子设备及存储介质
CN113419831A (zh) * 2021-06-23 2021-09-21 上海观安信息技术股份有限公司 一种沙箱任务调度方法和系统
CN113419831B (zh) * 2021-06-23 2023-04-11 上海观安信息技术股份有限公司 一种沙箱任务调度方法和系统
CN113672391A (zh) * 2021-08-23 2021-11-19 烽火通信科技股份有限公司 一种基于Kubernetes的并行计算任务调度方法与系统
CN113672391B (zh) * 2021-08-23 2023-11-28 烽火通信科技股份有限公司 一种基于Kubernetes的并行计算任务调度方法与系统
CN114064296A (zh) * 2022-01-18 2022-02-18 北京建筑大学 一种Kubernetes调度方法、装置和存储介质
CN115277579A (zh) * 2022-07-25 2022-11-01 广州品唯软件有限公司 仓库视频调取方法及云平台
CN115277579B (zh) * 2022-07-25 2024-03-19 广州品唯软件有限公司 仓库视频调取方法及云平台
CN115145711A (zh) * 2022-09-02 2022-10-04 北京睿企信息科技有限公司 一种获取有向无环图任务结果的数据处理系统
CN115915457A (zh) * 2023-01-30 2023-04-04 阿里巴巴(中国)有限公司 资源调度方法、车辆控制方法、设备及系统
CN116192222B (zh) * 2023-04-27 2023-08-29 中国西安卫星测控中心 面向天线组阵需求的资源调度方法、装置和计算机设备
CN116192222A (zh) * 2023-04-27 2023-05-30 中国西安卫星测控中心 面向天线组阵需求的资源调度方法、装置和计算机设备
CN116719628A (zh) * 2023-08-09 2023-09-08 东莞信宝电子产品检测有限公司 一种并发任务抢占式调度方法、系统及介质
CN116719628B (zh) * 2023-08-09 2024-04-19 东莞信宝电子产品检测有限公司 一种并发任务抢占式调度方法、系统及介质
CN117435142A (zh) * 2023-12-12 2024-01-23 苏州元脑智能科技有限公司 Io请求调度方法及存储装置
CN117435142B (zh) * 2023-12-12 2024-03-01 苏州元脑智能科技有限公司 Io请求调度方法及存储装置

Also Published As

Publication number Publication date
EP3799390A4 (en) 2022-06-22
CN108769254B (zh) 2019-09-20
JP7060724B2 (ja) 2022-04-26
EP3799390A1 (en) 2021-03-31
CN108769254A (zh) 2018-11-06
JP2021522621A (ja) 2021-08-30
SG11202013049XA (en) 2021-02-25
CA3104806A1 (en) 2020-01-02
CA3104806C (en) 2021-05-18

Similar Documents

Publication Publication Date Title
WO2020000944A1 (zh) 基于抢占式调度的资源共享使用方法、系统及设备
WO2021063339A1 (zh) 集群资源调度方法、装置、设备及储存介质
US10908954B2 (en) Quality of service classes
US9501319B2 (en) Method and apparatus for scheduling blocking tasks
US10896065B2 (en) Efficient critical thread scheduling for non privileged thread requests
JP2002063148A (ja) 多重プロセッサ・システム
WO2015101091A1 (zh) 一种分布式资源调度方法及装置
TW200401529A (en) System and method for the allocation of grid computing workload to network workstations
WO2021093783A1 (zh) 实时资源调度方法、装置、计算机设备及存储介质
WO2024007849A1 (zh) 面向智能计算的分布式训练容器调度
US9882973B2 (en) Breadth-first resource allocation system and methods
CN110837401A (zh) 一种java线程池分级处理方法和装置
WO2024021489A1 (zh) 一种任务调度方法、装置及Kubernetes调度器
US8392577B2 (en) Reduction of message flow between bus-connected consumers and producers
US20240061710A1 (en) Resource allocation method and system after system restart and related component
CN113626173B (zh) 调度方法、装置及存储介质
US20190266005A1 (en) Method and apparatus for a virtual machine
US7822918B2 (en) Preallocated disk queuing
CN113760499A (zh) 调度计算单元的方法、装置、计算设备及介质
CN107634978B (zh) 一种资源调度方法及装置
US9213575B2 (en) Methods and systems for energy management in a virtualized data center
CN112685158B (zh) 一种任务调度方法、装置、电子设备及存储介质
CN116185772B (zh) 文件批量检测方法及装置
US20230418667A1 (en) Computing device for handling tasks in a multi-core processor, and method for operating computing device
CN111459653A (zh) 集群调度方法、装置和系统以及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18924598

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 3104806

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2020573022

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018924598

Country of ref document: EP

Effective date: 20201222