WO2020052301A1 - Resource scheduling method and apparatus - Google Patents

Resource scheduling method and apparatus

Info

Publication number
WO2020052301A1
WO2020052301A1 (application PCT/CN2019/090886)
Authority
WO
WIPO (PCT)
Prior art keywords
resource
task
resources
request message
scheduling request
Application number
PCT/CN2019/090886
Other languages
English (en)
French (fr)
Inventor
易小萌
顾炯炯
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2020052301A1
Priority to US17/199,121 (published as US20210200587A1)

Classifications

    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4887 Scheduling strategies for dispatcher involving deadlines, e.g. rate based, periodic
    • G06F9/4812 Task transfer initiation or dispatching by interrupt, e.g. masked
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/505 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals, considering the load
    • G06F9/5072 Partitioning or combining of resources; grid computing
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F9/546 Interprogram communication; message passing systems or structures, e.g. queues
    • H04L41/5051 Network service management; service on demand, e.g. definition and deployment of services in real time
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L67/1014 Server selection for load balancing based on the content of a request
    • H04L67/1029 Load balancing using data related to the state of servers by a load balancer
    • H04L67/56 Provisioning of proxy services
    • G06F2209/5011 Indexing scheme relating to G06F9/50: pool
    • H04L43/0817 Monitoring availability by checking functioning

Definitions

  • the present application relates to the field of computer technology, and in particular, to a method and an apparatus for resource scheduling.
  • the operators of cloud data centers invest large amounts of money to purchase computing facilities such as servers and switches to provide computing resources for cloud computing services.
  • operators use resource reuse technologies such as virtualization to schedule computing tasks of different tenants into the same computing facilities.
  • the tenant selects a cloud host with the appropriate resource configuration from the list of cloud host types provided by the operator for lease according to the needs of its own task execution.
  • the operator, through the public cloud scheduling system, selects a physical server among all the physical servers in the cloud data center based on the resource configuration of the cloud host selected by the tenant, and starts a virtual machine on it; the virtual machine serves as the cloud host rented by the tenant.
  • a reasonable virtual machine scheduling method can effectively reduce resource fragments in each physical server in the cloud data center, thereby ensuring higher resource utilization.
  • the embodiments of the present application provide a resource scheduling method and device, which are used to improve resource utilization and improve service quality.
  • an embodiment of the present application provides a resource scheduling method.
  • When a first scheduling request message is obtained, a first resource server is determined from a resource pool according to a first number of resources requested by the first scheduling request message, and the first number of resources is scheduled in the first resource server; the resource pool includes at least one resource server; the first scheduling request message is used to request resources for a first type of task.
  • When a second scheduling request message is obtained, if it is determined according to the resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, a second resource server is determined from the resource pool according to a second number of resources requested by the second scheduling request message, and a third number of resources is scheduled in the second resource server; the third number is less than or equal to the second number; the second scheduling request message is used to request resources for a second type of task.
  • In this way, the first type of task and the second type of task can be scheduled to the same resource server, so that the idle resources of the servers serving the first type of task can be effectively used.
  • the method further includes: if it is determined that the resource load rate of the resource pool is greater than or equal to a first threshold, selecting M tasks of the second type from a plurality of tasks performed by the at least one resource server and releasing the resources occupied by the M tasks of the second type, where M is an integer greater than 0.
  • the method further includes: if it is determined that the number of idle resources in the second resource server is less than a second threshold, selecting N tasks of the second type from a plurality of tasks performed by the second resource server and releasing the resources occupied by the N tasks of the second type, where N is an integer greater than 0.
  • In this way, the second type of task on a resource server with a higher load can be interrupted in time, preventing the second type of task from preempting the resources required by the first type of task and avoiding any impact on the resource usage of the first type of task.
  • the method further includes: placing the N tasks of the second type in a waiting queue, where the waiting queue includes a scheduling request message for at least one task of the second type.
  • When determining the first resource server from the resource pool according to the first number of resources requested by the first scheduling request message, a resource server whose number of idle resources is greater than the first number may be selected from the at least one resource server included in the resource pool as the first resource server.
  • When determining a second resource server from the resource pool according to the second number of resources requested by the second scheduling request message, a resource server whose number of idle resources is greater than the third number may be selected from the at least one resource server included in the resource pool as the second resource server.
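  • The selection rule above (pick a server whose number of idle resources exceeds the requested number) can be sketched in Python. This is a minimal illustration; the `ResourceServer` structure and the first-fit traversal order are assumptions, not an implementation mandated by this application:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ResourceServer:
    name: str
    capacity: int   # total schedulable resources
    used: int = 0   # resources currently scheduled

    @property
    def idle(self) -> int:
        return self.capacity - self.used

def select_server(pool: List[ResourceServer], requested: int) -> Optional[ResourceServer]:
    """Pick a resource server whose idle resources are greater than the requested number."""
    for server in pool:
        if server.idle > requested:  # strictly greater, per the claim wording
            return server
    return None  # no server in the pool can satisfy the request
```

With a pool of two servers, a request of 8 units skips a nearly full server and lands on the one with enough idle capacity.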
  • the first type of task may be a task for resource scheduling based on a resource request amount
  • the second type of task may be a task for resource scheduling based on a resource usage amount
  • an embodiment of the present application provides a resource scheduling apparatus, where the resource scheduling apparatus includes a processor coupled to a memory; the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory to implement the method in the first aspect or any possible design of the first aspect.
  • an embodiment of the present application provides a resource scheduling apparatus for implementing the method in the first aspect or any possible design of the first aspect, including corresponding function modules, for example, a first scheduler, a second scheduler, and a load control module, which are used to implement the steps in the foregoing method.
  • an embodiment of the present application provides a computer-readable storage medium, where computer-readable instructions are stored in the computer-readable storage medium, and when a computer reads and executes the computer-readable instructions, the computer is caused to execute the method in any one of the foregoing aspects or any one of the possible designs.
  • an embodiment of the present application provides a computer program product.
  • When a computer reads and executes the computer program product, the computer is caused to execute the method in any one of the foregoing aspects or any one of the possible designs.
  • an embodiment of the present application provides a chip, where the chip is connected to a memory and is configured to read and execute a software program stored in the memory to implement the method in any one of the foregoing aspects or any one of the possible designs.
  • an embodiment of the present application provides a resource scheduling system, including the resource scheduling apparatus and multiple resource servers in the second aspect.
  • FIG. 1 is a schematic structural diagram of a resource scheduling system according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application.
  • FIG. 3 is a structural framework diagram of a resource scheduling device according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of monitoring data sending according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of resource scheduling of a first type of task according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of resource scheduling provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of resource scheduling provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of resource scheduling of a second type of task according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of resource scheduling according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of the present application.
  • FIG. 1 shows a schematic architecture diagram of a public cloud service system applicable to an embodiment of the present application.
  • a resource pool and a resource scheduling device for controlling the resource pool are included, where the resource pool may include at least one resource server.
  • the tenant submits scheduling request messages for different types of tasks to the resource scheduling device of the public cloud service through its console (not shown in FIG. 1), using different service interfaces for different task types.
  • the resource scheduling device schedules the resource server from the resource pool according to the scheduling request messages of different types of tasks, and schedules the corresponding processing resources from the scheduled resource server to be allocated to the tenant for use.
  • FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application.
  • the method includes:
  • Step 201 When the resource scheduling device obtains a first scheduling request message, where the first scheduling request message is used to request resources for a first type of task, the resource scheduling device determines a first resource server from the resource pool according to the first number of resources requested by the first scheduling request message, and schedules the first number of resources in the first resource server.
  • the task type here refers to the business type corresponding to the task.
  • a task used to perform a type of business type can be called a type of task.
  • the resources here include, but are not limited to, processor resources, storage resources, bandwidth resources, and so on.
  • Step 202 When the resource scheduling device obtains a second scheduling request message, where the second scheduling request message is used to request resources for a second type of task, if the resource scheduling device determines, according to the current resource load rate of the resource pool, that resources are to be scheduled for the task corresponding to the second scheduling request message, it determines a second resource server from the resource pool according to the second number of resources requested by the second scheduling request message, and schedules a third number of resources in the second resource server, where the third number is less than or equal to the second number.
  • the first resource server and the second resource server may be the same resource server, or may be two different resource servers, which is not limited in this embodiment of the present application.
  • the resource scheduling device includes a first scheduler, a second scheduler, a load control module, a waiting queue, and a message queue module.
  • the scheduling request message of the first type of task may be submitted to the first scheduler, and the scheduling request message of the second type of task may be submitted to the second scheduler.
  • the first scheduler is configured to obtain a scheduling request message for a task of a first type
  • the second scheduler is configured to obtain a scheduling request message for a task of a second type.
  • the first type of task may be a task for resource scheduling based on a resource request amount
  • the second type of task may be a task for resource scheduling based on a resource usage amount.
  • the first type of task may also be referred to as a service level agreement (SLA) sensitive task
  • the second type of task may also be referred to as an SLA insensitive task.
  • SLA-sensitive tasks can obtain resources that do not exceed their resource request amount at any time during the execution process.
  • SLA-insensitive tasks may obtain fewer resources than their resource request amount during execution, and when the resource pool load is too high, the resources they are using can be reclaimed, which interrupts task execution.
  • corresponding types are set for each task in advance, that is, the first type and the second type. Each type of task can only apply for resources through the corresponding scheduler, so that resource utilization can be improved through the tasks of the second type while avoiding impact on the tasks of the first type.
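  • The routing of request messages by task type can be illustrated with a minimal Python sketch; the `Scheduler` class and the message fields are hypothetical names for illustration, not part of this application:

```python
class Scheduler:
    """A stand-in for the first or second scheduler: it just collects requests."""
    def __init__(self, name):
        self.name = name
        self.requests = []

    def submit(self, msg):
        self.requests.append(msg)

first_scheduler = Scheduler("first")    # handles SLA-sensitive (first-type) tasks
second_scheduler = Scheduler("second")  # handles SLA-insensitive (second-type) tasks

def route(msg):
    """Route a scheduling request message to the scheduler matching its task type."""
    if msg["task_type"] == 1:
        first_scheduler.submit(msg)
    elif msg["task_type"] == 2:
        second_scheduler.submit(msg)
    else:
        raise ValueError("unknown task type")
```

Each task type only ever reaches its own scheduler, which is what lets the two scheduling policies coexist without interfering.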
  • After the first scheduler obtains the scheduling request message of a task of the first type, it allocates resources according to the task's resource request amount, ensuring that the task can obtain an amount of resources equal to the requested amount at any time.
  • After the second scheduler obtains the scheduling request message of a task of the second type, it does not immediately allocate the requested amount of resources; instead, it first puts the scheduling request message of the task into the waiting queue. When it is determined, according to the resource load rate of the resource pool, that resources are to be scheduled for this type of task, the second scheduler allocates, according to the task's actual resource usage, at most a number of resources not greater than the requested amount.
  • the second scheduler can monitor and predict the actual resource usage of each resource server through the load control module. When the predicted value of the actual resource usage of the tasks increases, it closes some second-type tasks in time to ensure that the first-type tasks have sufficient resources to use.
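  • A hedged sketch of what such a reclamation decision might look like in Python; the 90% threshold, the largest-usage-first order, and the data layout are assumptions for illustration, not specified by this application:

```python
def reclaim_if_overloaded(server_tasks, capacity, predicted_usage, threshold=0.9):
    """If the predicted load exceeds the threshold, interrupt second-type tasks
    (largest usage first) until the prediction drops back under the threshold.
    Returns the ids of the interrupted tasks."""
    interrupted = []
    # only SLA-insensitive (second-type) tasks may be interrupted
    candidates = sorted((t for t in server_tasks if t["type"] == 2),
                        key=lambda t: t["usage"], reverse=True)
    for task in candidates:
        if predicted_usage / capacity < threshold:
            break  # load is back under control
        predicted_usage -= task["usage"]
        interrupted.append(task["id"])
    return interrupted
```

Note that first-type tasks are never candidates: only second-type tasks give resources back.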
  • Each resource server in the resource pool includes an agent module.
  • The agent module is responsible, on the one hand, for executing the resource allocation decisions of the scheduler and, on the other hand, for monitoring the resource load of the resource server on which it is located and the actual resource usage of the tasks on that server.
  • When the scheduler selects a resource server for task execution, it transmits the relevant data and information of the task to the agent module on the resource server.
  • the agent module prepares the execution environment in the resource server for the tasks to be executed, allocates the resources it needs and creates task instances.
  • When the scheduler decides to interrupt some SLA-insensitive tasks on a resource server, it passes the relevant information of the interrupted tasks to the agent module on the resource server, and the agent module interrupts the execution of the tasks and releases the occupied resources.
  • the agent module in each resource server periodically reads the actual resource usage data of each task on that resource server and, after analysis and summarization, periodically sends monitoring data to the message queue module.
  • the monitoring data includes, but is not limited to, the resource load rate of the resource server, the actual resource usage of each task performed, and the task type of each task performed.
  • FIG. 4 is a schematic diagram of a monitoring data sending process according to an embodiment of the present application.
  • Step 401 The resource server 1 to the resource server K included in the resource pool periodically send monitoring data to the message queue module.
  • Step 402 The message queue module sorts and summarizes the monitoring data sent by each resource server and provides it to the load control module for reading.
  • Step 403 The load control module periodically sends a monitoring data request message to the message queue module, and the monitoring data request message is used to request monitoring data.
  • Step 404 The message queue module sends a monitoring data response message to the load control module, and the monitoring data response message includes the monitoring data requested by the load control module.
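  • The agent-to-load-control monitoring flow above can be mimicked with a simple in-process queue. This is a sketch only; the real message queue module is not specified beyond this behavior, and the payload fields mirror the monitoring data listed earlier:

```python
import queue

monitor_q = queue.Queue()  # stands in for the message queue module

def agent_report(server_name, load_rate, tasks):
    """Agent module: periodically publish monitoring data to the message queue."""
    monitor_q.put({
        "server": server_name,
        "load_rate": load_rate,  # resource load rate of the server
        "tasks": [{"id": t_id, "type": t_type, "usage": usage}
                  for (t_id, t_type, usage) in tasks],
    })

def load_control_read_all():
    """Load control module: drain the queue and collect the monitoring records."""
    records = []
    while not monitor_q.empty():
        records.append(monitor_q.get())
    return records
```

Decoupling agents from the load control module through a queue lets each side run on its own period.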
  • the load control module reads the monitoring data of each agent module from the message queue module and, based on the read monitoring data, predicts and analyzes the resource load of the resource pool and the actual resource usage of the tenant tasks within a coming period of time (such as 1 hour), then makes two logical judgments based on the prediction results. On the one hand, the load control module determines, according to the prediction result, whether to select a task cached in the waiting queue for execution. When the predicted load is low, the load control module obtains a scheduling request message from the waiting queue (a scheduling request message for a task that has not yet been executed or whose execution was interrupted), filters out the tasks that are suitable for execution, and allocates computing resources to them through the scheduler.
  • On the other hand, the load control module also needs to determine whether it is necessary to interrupt the running SLA-insensitive tasks.
  • When the predicted load of a resource server is too high, the load control module passes the information of that resource server to the second scheduler, and the second scheduler chooses to close some SLA-insensitive tasks on the resource server, ensuring that the remaining tasks can obtain sufficient resources during execution.
  • It should be noted that how the load control module specifically performs prediction analysis on the resource load of the resource pool and the actual resource usage of the tenant tasks based on the read monitoring data is not limited in the embodiments of the present application and is not repeated here.
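  • The two logical judgments can be condensed into one decision function. The low/high watermarks (0.6 and 0.9 below) are made-up values for illustration; the application does not fix them:

```python
def load_control_decide(predicted_load, low_mark=0.6, high_mark=0.9):
    """Map the predicted pool load to one of the load control module's actions:
    - below the low mark: admit waiting second-type tasks from the waiting queue;
    - at or above the high mark: interrupt running SLA-insensitive tasks;
    - otherwise: leave the pool as it is."""
    if predicted_load < low_mark:
        return "admit_from_waiting_queue"
    if predicted_load >= high_mark:
        return "interrupt_sla_insensitive"
    return "no_action"
```

The band between the watermarks provides hysteresis, so the pool does not flap between admitting and interrupting tasks.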
  • the process of creating and closing the first type of task may be as shown in FIG. 5, including the following processing steps:
  • Step 501 The tenant sends a first scheduling request message to the console of the resource scheduling device.
  • the first scheduling request message is used to request a first number of resources for a first type of task.
  • the first scheduling request message may further include identity information of the tenant, information of a corresponding task, and the like, which are not limited in the embodiment of the present application, and are not described herein again.
  • Step 502 The console of the resource scheduling device verifies the identity information of the tenant and the validity of the first scheduling request message. How the console performs the verification is not limited in this embodiment of the present application and is not described here. If the verification passes, step 503 is performed; otherwise, the tenant's request is rejected. The following takes the case where the verification passes as an example.
  • Step 503 The console of the resource scheduling device submits the first scheduling request message to the first scheduler.
  • Step 504 The first scheduler determines a first resource server from a resource pool according to the first number of resources requested by the first scheduling request message.
  • the first scheduler may select, from at least one resource server included in the resource pool, a resource server whose number of idle resources is greater than the first number as the first resource server.
  • the first scheduler of the public cloud can perform resource scheduling according to the task resource request amount. As shown in FIG. 6, assume the tenant submits task 1, task 2, and task 3 in this order. The first scheduler performs resource scheduling according to the task resource request amounts and schedules task 1 and task 2 to resource server 1. When the tenant submits the execution request of task 3, if resource server 1 still has unused resources but the remaining resource amount is less than the resource request amount of task 3, the first scheduler schedules task 3 to the free resource server 2.
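  • The FIG. 6 placement can be reproduced with a first-fit pass over the servers. The capacities and request amounts below are made-up numbers for illustration:

```python
def first_fit(tasks, servers):
    """Place each task, in submission order, on the first server whose
    remaining capacity covers the task's resource request amount."""
    placement = {}
    for name, request in tasks:
        for server in servers:
            if server["free"] >= request:
                server["free"] -= request
                placement[name] = server["name"]
                break
        else:
            placement[name] = None  # no server can fit this task
    return placement
```

With two 10-unit servers and requests of 4, 4, and 5, tasks 1 and 2 land on server 1; task 3 no longer fits there (2 units remain) and goes to server 2, as in FIG. 6.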
  • Step 505 The first scheduler sends a task creation request to the first resource server, where the task creation request is used to request creation of a task corresponding to the first scheduling request message, and requests to schedule a first number of resources for the task.
  • Step 506 The proxy module in the first resource server creates a task according to the task creation request, and schedules a first number of resources.
  • Step 507 The first resource server sends a task creation response to the first scheduler, where the task creation response is used to indicate a request result of the task creation request.
  • Step 508 The first resource server sends a task creation notification message to the console, where the task creation notification message is used to indicate a request result of the first scheduling request message.
  • Step 509 The console sends a first scheduling response message to the tenant according to the task creation notification message, where the first scheduling response message is used to indicate a request result of the first scheduling request message.
  • the resource is scheduled for the task corresponding to the first scheduling request message according to the first scheduling request message of the tenant, and a task is created.
  • Further, if it is determined that the resource load rate of the resource pool is greater than or equal to the first threshold, M tasks of the second type may be selected from the plurality of tasks performed by the at least one resource server, and the resources occupied by the M tasks of the second type are released, where M is an integer greater than 0.
  • the M tasks of the second type may also be placed in a waiting queue, waiting for subsequent calls to be executed.
  • a resource server may execute a task of a first type and a task of a second type simultaneously.
  • the first scheduler analyzes and predicts the future resource load of each resource server by using the amount of resources used by each task on each resource server.
  • If the first scheduler determines that the number of idle resources in the first resource server is less than the second threshold, or that the load rate of the first resource server is greater than a preset load rate (for example, greater than 90%), it selects at least one task of the second type from the first resource server and releases the resources occupied by the at least one task of the second type.
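  • The interruption trigger just described reduces to a simple predicate. The threshold values are whatever the operator configures; 90% is only the example given above:

```python
def must_interrupt(idle, idle_threshold, load_rate, max_load=0.9):
    """The first scheduler triggers interruption of second-type tasks when the
    server's idle resources fall below a threshold OR its load rate exceeds
    a preset load rate (90% in the example above)."""
    return idle < idle_threshold or load_rate > max_load
```

Either condition alone is enough to start reclaiming resources from second-type tasks.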
  • the embodiment of the present application schedules the tasks of the first type according to the resource request amount, and by monitoring and predicting and analyzing the resource load, the tasks of the second type on the resource server with a higher load are interrupted in time. It is possible to avoid the situation in which the second type of task preempts the resources required by the first type of task, and to avoid affecting the resource usage of the first type of task. That is, through the above method, the resource allocation of the tasks of the first type can be preferentially guaranteed.
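The load-based interruption decision described above can be sketched in a few lines. This is a minimal illustration under assumptions, not the patented implementation: the server data shape, `idle_threshold`, and the 90% `max_load_rate` default are illustrative names taken from the example in the text.

```python
# Illustrative sketch: decide which second-type (SLA-insensitive) tasks to
# interrupt when a server's idle resources fall below a threshold or its
# load rate exceeds a preset rate (e.g. 90%). Data shapes are assumptions.

def tasks_to_interrupt(server, idle_threshold, max_load_rate=0.9):
    """Return the second-type tasks to interrupt on an overloaded server.

    `server` is a dict such as:
      {"capacity": 100, "tasks": [{"id": 1, "type": 1, "usage": 40}, ...]}
    """
    used = sum(t["usage"] for t in server["tasks"])
    idle = server["capacity"] - used
    load_rate = used / server["capacity"]
    if idle >= idle_threshold and load_rate <= max_load_rate:
        return []  # server healthy: nothing is interrupted
    # Only second-type tasks may be interrupted; first-type tasks are protected.
    return [t for t in server["tasks"] if t["type"] == 2]
```

First-type tasks are never returned, matching the text's guarantee that their resource allocation takes priority.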
• For example, as shown in FIG. 7, it is assumed that the tenant submits task 1, task 2, and task 3 in this order. Task 1 is of the first type, and tasks 2 and 3 are of the second type. The first scheduler performs resource scheduling according to resource request amounts and schedules task 1 to resource server 1; the second scheduler performs resource scheduling according to resource usage and schedules tasks 2 and 3 to resource server 1. The amount of resources scheduled for task 1 equals its resource request amount, while the amounts scheduled for tasks 2 and 3 are each smaller than their respective request amounts. Over time, the actual resource usage of tasks 2 and 3 gradually grows to their respective request amounts, and the combined actual usage of the three tasks is about to exceed the total resource capacity of resource server 1.
• At this point, execution of some tasks may need to be interrupted to ensure that other tasks can obtain sufficient computing resources. Interrupting a first-type task during execution would seriously harm the tenant experience and damage the operator's brand and reputation. Therefore, second-type tasks can be forcibly interrupted: here, tasks 2 and 3 are both interrupted and their resources released. Further, the interrupted second-type tasks may be put back into the waiting queue to be invoked and executed again later.
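The forcible interruption and re-queueing of second-type tasks can be illustrated with a small sketch. The `interrupt_and_requeue` helper, its data shapes, and the choice to zero a task's usage on release are all illustrative assumptions, not an API from this document.

```python
# Illustrative sketch: forcibly interrupt second-type tasks, release the
# resources they occupy, and put them back into the waiting queue so they
# can be invoked and executed again later. Names are assumptions.
from collections import deque

def interrupt_and_requeue(running, waiting, victim_ids):
    """Move interrupted second-type tasks from `running` to `waiting`.

    Returns the total amount of resources freed.
    """
    freed = 0
    for vid in victim_ids:
        task = running.pop(vid)       # stop tracking the task as running
        freed += task["usage"]        # its resources become available again
        task["usage"] = 0             # resources are released
        waiting.append(task)          # re-enter the waiting queue (FIFO)
    return freed
```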
• Optionally, the tenant may also actively request to close a task. Continuing with FIG. 5 above, refer to the following procedure:
  • Step 510 The tenant sends a task close request to the console, where the task close request is used to request to close the task corresponding to the first scheduling request message.
  • Step 511 The console forwards the task close request to the first resource server.
• Step 512 The first resource server closes the task corresponding to the first scheduling request message according to the task close request, and releases the resources scheduled for the task.
  • Step 513 The first resource server sends a task shutdown completion notification message to the console, where the task shutdown completion notification message is used to indicate a result of task shutdown.
  • Step 514 The console forwards the task shutdown completion notification message to the tenant.
• Unlike a first-type task, the scheduling request message of a second-type task cannot obtain the requested resources immediately; it must wait in a queue. Specifically, the procedure for creating and interrupting a second-type task may be as shown in FIG. 8 and may include:
  • Step 801 The tenant sends a second scheduling request message to the console of the resource scheduling device.
  • the second scheduling request message is used to request a second amount of resources for a second type of task.
  • the second scheduling request message may further include identity information of the tenant, information of the corresponding task, and the like, which are not limited in the embodiment of the present application, and are not described herein again.
• Step 802 The console of the resource scheduling device verifies the tenant's identity information and the validity of the second scheduling request message. How the console performs the verification is not limited in the embodiment of the present application and is not described here again. If the verification succeeds, step 803 is performed; otherwise the tenant's request is rejected. The following description assumes the verification succeeds.
  • Step 803 The console places the second scheduling request message in the waiting queue.
  • Step 804 The console sends a queue notification message to the tenant, where the queue notification message is used to indicate that the second scheduling request message is in a waiting queue.
  • Step 805 The load control module sends a queuing information request message, where the queuing information request message is used to request to obtain all the scheduling request messages in the waiting queue.
  • Step 806 The load control module receives a queuing information response message, and the queuing information response message includes information such as all queued scheduling request messages in the waiting queue.
• Step 807 The load control module determines that resources are to be scheduled for the task corresponding to the second scheduling request message. It should be noted that, when the load control module predicts that the resource load rate of the resource pool is less than the first threshold, it filters the scheduling request messages queued in the waiting queue, and then submits the filtered scheduling request messages, together with the load rate of each resource server, to the second scheduler as a task scheduling request.
• For example, if it is determined that the resource load rate of the resource pool is less than the first threshold, and the task corresponding to the second scheduling request message is the task in the waiting queue that requests the smallest amount of resources or has waited the longest, it may be determined that resources are to be scheduled for the task corresponding to the second scheduling request message.
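The queue-selection rule above (smallest request amount, or longest wait) might be sketched as follows. Combining the two criteria, with the longest-waiting request breaking ties on equal request size, is an assumption; the document presents them as alternative policies.

```python
# Illustrative sketch: pick the next queued second-type request.
# Policy (an assumption): smallest resource request first; among equal
# requests, the one enqueued earliest (longest wait) wins.

def pick_next(waiting):
    """Return the chosen request dict, or None if the queue is empty.

    Each request looks like {"id": ..., "request": ..., "enqueued_at": ...}.
    """
    if not waiting:
        return None
    return min(waiting, key=lambda r: (r["request"], r["enqueued_at"]))
```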
  • Step 808 The load control module sends a second scheduling request message to the second scheduler.
  • Step 809 The second scheduler determines a second resource server from the resource pool according to the second number of resources requested by the second scheduling request message.
  • the second scheduler may select, from at least one resource server included in the resource pool, a resource server whose number of idle resources is greater than a third number as the second resource server.
• The third quantity may be the actual resource usage of the task corresponding to the second scheduling request message, or the product of the second quantity and a preset weight value, where the preset weight value is a number greater than 0 and less than or equal to 1.
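The two ways of obtaining the third quantity can be written down directly. The function name and the optional-argument convention are assumptions for illustration.

```python
# Illustrative sketch: the "third quantity" is either the task's actual
# resource usage, or the second quantity scaled by a preset weight in (0, 1].

def third_quantity(actual_usage, second_quantity, weight=None):
    """Compute the amount of resources actually scheduled for a second-type task."""
    if weight is None:
        return actual_usage            # option 1: actual resource usage
    assert 0 < weight <= 1             # preset weight must be in (0, 1]
    return second_quantity * weight    # option 2: request amount times weight
```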
• To ensure that first-type tasks submitted by tenants can obtain sufficient resources while improving resource utilization, the second scheduler of the public cloud can perform resource scheduling according to task resource usage. As shown in FIG. 9, it is assumed that the tenant submits task 1, task 2, and task 3 in this order. Task 1 is of the first type, and tasks 2 and 3 are of the second type. The first scheduler performs resource scheduling according to resource request amounts and schedules task 1 to resource server 1; the second scheduler performs resource scheduling according to resource usage and schedules task 2 to resource server 1. When the tenant submits the execution request for task 3, even if the resources remaining unused on resource server 1 are fewer than task 3's resource request amount, the second scheduler can still schedule task 3 to resource server 1 to improve utilization. This effectively avoids resource fragments (resources that cannot be allocated to tenant cloud hosts) on the resource servers in the cloud data center, thereby ensuring high resource utilization.
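The second-server selection described above can be sketched as follows. The rule "idle resources greater than the third quantity" comes from the text; preferring the fullest qualifying server, to reduce fragmentation, is an added assumption, as are the data shapes.

```python
# Illustrative sketch: choose a second resource server whose idle resources
# exceed the third quantity. Picking the fullest such server is an assumed
# bin-packing heuristic to limit resource fragments.

def choose_server(pool, third_qty):
    """Return the chosen server dict, or None if no server qualifies."""
    candidates = [s for s in pool if s["capacity"] - s["used"] > third_qty]
    if not candidates:
        return None
    # Least idle space among qualifying servers -> fewer fragments overall.
    return min(candidates, key=lambda s: s["capacity"] - s["used"])
```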
  • Step 810 The second scheduler sends a task creation request to the second resource server, where the task creation request is used to request creation of a task corresponding to the second scheduling request message, and requests to schedule a third number of resources for the task.
  • Step 811 The proxy module in the second resource server creates a task according to the task creation request, and schedules a third number of resources.
  • Step 812 The second resource server sends a task creation response to the second scheduler, where the task creation response is used to indicate a request result of the task creation request.
  • Step 813 The second resource server sends a task creation notification message to the console, where the task creation notification message is used to indicate a request result of the second scheduling request message.
  • Step 814 The console sends a second scheduling response message to the tenant according to the task creation notification message, where the second scheduling response message is used to indicate a request result of the second scheduling request message.
  • the resource is scheduled for the task corresponding to the second scheduling request message according to the second scheduling request message of the tenant, and a task is created.
• Further, in this embodiment a resource server may execute first-type and second-type tasks simultaneously. If it is determined that the number of idle resources in the second resource server is less than a second threshold, N second-type tasks are selected from the tasks executed by the second resource server, and the resources occupied by these N tasks are released, where N is an integer greater than 0. Optionally, the N second-type tasks may also be placed in the waiting queue to be invoked and executed later. For details, continue to refer to FIG. 8; the procedure may further include the following steps.
  • Step 815 When the load control module predicts that the number of idle resources in the second resource server is less than a second threshold, it sends a resource release request to the second scheduler.
  • the resource release request is used to request to release some resources.
  • Step 816 The second scheduler determines at least one task of the second type that needs to be interrupted.
• The second scheduler may determine the M second-type tasks with the highest resource usage as the tasks to be interrupted, or determine the tasks to be interrupted in another way. The following description takes as an example the case where the second scheduler decides to interrupt the task corresponding to the second scheduling request message; other cases are not described again.
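The "M highest-usage second-type tasks" selection rule can be sketched briefly. The function name and task representation are assumptions for illustration.

```python
# Illustrative sketch: select the M second-type tasks with the highest
# resource usage as interruption victims. First-type tasks are excluded.

def select_victims(tasks, m):
    """Return up to M second-type tasks, highest usage first."""
    second = [t for t in tasks if t["type"] == 2]
    return sorted(second, key=lambda t: t["usage"], reverse=True)[:m]
```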
  • Step 817 The second scheduler sends a task interrupt request to the second resource server, where the task interrupt request is used to request that the task corresponding to the second scheduling request message be interrupted and the corresponding resource is released.
  • Step 818 The second resource server interrupts execution of the task corresponding to the second scheduling request message according to the task interruption request, and releases resources scheduled for the task.
  • Step 819 The second resource server sends a task interrupt response to the second scheduler, where the task interrupt response is used to indicate a result of the task interrupt.
  • Step 820 The second resource server sends a task interrupt notification message to the console, where the task interrupt notification message is used to indicate that the task corresponding to the second scheduling request message is interrupted.
  • Step 821 The console forwards the task interruption notification message to the tenant.
• Through the above method, according to the differences among tenant tasks in a public cloud environment, SLA-sensitive tasks are scheduled with priority according to their resource request amounts, ensuring that every SLA-sensitive task can obtain sufficient resources during execution. For SLA-insensitive tasks, a scheduling method based on actual resource usage is adopted: while fully using resources requested but not used by SLA-sensitive tasks, the creation and interruption of SLA-insensitive tasks are dynamically controlled through load monitoring and prediction, avoiding any impact on the resource usage of SLA-sensitive tasks.
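The overall division of labor, in which each task type is bound to exactly one scheduler, can be shown in a couple of lines. The scheduler callables here are placeholders, not interfaces defined by this document.

```python
# Illustrative sketch: each task type is handled only by its own scheduler,
# as described above. The scheduler arguments are placeholder callables.

def route(task, first_scheduler, second_scheduler):
    """Dispatch a task to the scheduler bound to its type."""
    if task["type"] == 1:
        return first_scheduler(task)   # SLA-sensitive: by request amount
    return second_scheduler(task)      # SLA-insensitive: by actual usage
```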
  • the device 1000 includes a communication module 1001, a processor 1002, a memory 1003, and the like.
• The communication module 1001 in the embodiment of the present application may be a communication chip with wired or wireless communication capability, such as a radio frequency transceiver or a network cable interface, configured to perform processing such as obtaining the first scheduling request message and the second scheduling request message in the foregoing method procedures.
  • the processor 1002 in the embodiment of the present application may be an integrated circuit chip and has a signal processing capability.
  • each step of the foregoing method embodiment may be completed by using an integrated logic circuit of hardware in a processor or an instruction in a form of software.
• The above processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
• With reference to the steps of the methods disclosed in the embodiments of the present application, operations such as determining whether to schedule a resource server and scheduling corresponding resources on the determined resource server for the corresponding scheduling request message may be implemented; tasks whose resources need to be released may also be selected, and the server resources occupied by the selected tasks released.
• The memory 1003 in the embodiment of the present application may be a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, a register, or another storage medium mature in the art.
  • the processor 1002 reads the information in the memory 1003 and, in combination with its hardware, can complete the steps of the above method.
  • this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, etc.) containing computer-usable program code.
• These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)

Abstract

A resource scheduling method and apparatus, the method including: when a first scheduling request message is obtained, determining a first resource server from a resource pool according to a first quantity of resources requested by the first scheduling request message, and scheduling the first quantity of resources on the first resource server, where the resource pool includes at least one resource server, and the first scheduling request message is used to request resources for a task of a first type; and when a second scheduling request message is obtained, if it is determined according to a resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, determining a second resource server from the resource pool according to a second quantity of resources requested by the second scheduling request message, and scheduling a third quantity of resources on the second resource server, where the third quantity is less than or equal to the second quantity, and the second scheduling request message is used to request resources for a task of a second type.

Description

Resource scheduling method and apparatus
Technical Field
This application relates to the field of computer technologies, and in particular, to a resource scheduling method and apparatus.
Background
When building a cloud data center, the operator invests heavily in servers, switches, and other computing facilities that provide computing resources for cloud services. To improve the utilization of the cloud data center's computing resources, the operator uses resource multiplexing technologies such as virtualization to schedule the computing tasks of different tenants onto the same computing facilities. Taking a cloud host service as an example, a tenant selects a cloud host with an appropriate resource configuration from the operator's list of cloud host types according to the needs of its tasks. When the tenant issues a start request for the cloud host, the operator's public cloud scheduling system selects, according to the resource configuration of the chosen cloud host, one physical server among all physical servers in the cloud data center, and starts a virtual machine on it as the cloud host rented by the tenant. In this process, a sound virtual machine scheduling method can effectively reduce resource fragments on the physical servers in the cloud data center, thereby ensuring high resource utilization.
Therefore, how to schedule resources well is an urgent problem to be solved.
Summary
Embodiments of this application provide a resource scheduling method and apparatus, to improve service quality while improving resource utilization.
According to a first aspect, an embodiment of this application provides a resource scheduling method: when a first scheduling request message is obtained, a first resource server is determined from a resource pool according to a first quantity of resources requested by the first scheduling request message, and the first quantity of resources is scheduled on the first resource server; the resource pool includes at least one resource server; the first scheduling request message is used to request resources for a task of a first type. When a second scheduling request message is obtained, if it is determined according to the resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, a second resource server is determined from the resource pool according to a second quantity of resources requested by the second scheduling request message, and a third quantity of resources is scheduled on the second resource server; the third quantity is less than or equal to the second quantity; the second scheduling request message is used to request resources for a task of a second type.
With the foregoing method, first, when the first resource server and the second resource server are the same resource server, first-type and second-type tasks can be scheduled onto the same resource server. Server resources that first-type tasks have requested but are not using can thus be used effectively, which avoids resource waste in the public cloud scenario; the public cloud operator can also purchase less hardware, such as servers for executing second-type tasks, thereby reducing the operator's service cost.
In a possible design, after the second scheduling request message is obtained, the second scheduling request message may further be placed in a waiting queue, where the waiting queue includes a scheduling request message of at least one second-type task. Determining, according to the resource load rate of the resource pool, that resources are to be scheduled for the task corresponding to the second scheduling request message may then include: if it is determined that the resource load rate of the resource pool is less than a first threshold, and the task corresponding to the second scheduling request message is the task in the waiting queue that requests the smallest quantity of resources or has waited the longest, determining that resources are to be scheduled for the task corresponding to the second scheduling request message.
In a possible design, the method further includes: if it is determined that the resource load rate of the resource pool is greater than or equal to the first threshold, selecting M second-type tasks from the tasks executed by the at least one resource server, and releasing the resources occupied by the M second-type tasks, where M is an integer greater than 0.
In another possible design, the method further includes: if it is determined that the quantity of idle resources in the second resource server is less than a second threshold, selecting N second-type tasks from the tasks executed by the second resource server, and releasing the resources occupied by the N second-type tasks, where N is an integer greater than 0.
With the foregoing method, the resource load can be monitored and predicted, second-type tasks on heavily loaded resource servers can be interrupted in time, second-type tasks are prevented from preempting resources needed by first-type tasks, and impact on the resource usage of first-type tasks is avoided.
In a possible design, the method further includes: placing the N second-type tasks in a waiting queue, where the waiting queue includes a scheduling request message of at least one second-type task.
In a possible design, when the first resource server is determined from the resource pool according to the first quantity of resources requested by the first scheduling request message, a resource server whose quantity of idle resources is greater than the first quantity may be selected from the at least one resource server in the resource pool as the first resource server.
In a possible design, when the second resource server is determined from the resource pool according to the second quantity of resources requested by the second scheduling request message, a resource server whose quantity of idle resources is greater than the third quantity may be selected from the at least one resource server in the resource pool as the second resource server.
In the foregoing method, the first-type task may be a task for which resource scheduling is performed according to a resource request amount, and the second-type task may be a task for which resource scheduling is performed according to resource usage.
According to a second aspect, an embodiment of this application provides a resource scheduling apparatus including a processor coupled to a memory, where the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, to perform the method in the first aspect or any possible design of the first aspect.
According to a third aspect, an embodiment of this application provides a resource scheduling apparatus configured to implement any one of the methods in the first aspect, including corresponding functional modules, for example a first scheduler, a second scheduler, and a load control module, respectively configured to implement the steps of the foregoing methods.
According to a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method in any one of the foregoing aspects or any possible design thereof.
According to a fifth aspect, an embodiment of this application provides a computer program product which, when read and executed by a computer, causes the computer to perform the method in any one of the foregoing aspects or any possible design thereof.
According to a sixth aspect, an embodiment of this application provides a chip connected to a memory, configured to read and execute a software program stored in the memory, to implement the method in any one of the foregoing aspects or any possible design thereof.
According to a seventh aspect, an embodiment of this application provides a resource scheduling system, including the resource scheduling apparatus in the second aspect and a plurality of resource servers.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a resource scheduling system according to an embodiment of this application;
FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of this application;
FIG. 3 is a structural framework diagram of a resource scheduling device according to an embodiment of this application;
FIG. 4 is a schematic diagram of a monitoring data sending procedure according to an embodiment of this application;
FIG. 5 is a schematic diagram of resource scheduling for a first-type task according to an embodiment of this application;
FIG. 6 is a schematic diagram of resource scheduling according to an embodiment of this application;
FIG. 7 is a schematic diagram of resource scheduling according to an embodiment of this application;
FIG. 8 is a schematic diagram of resource scheduling for a second-type task according to an embodiment of this application;
FIG. 9 is a schematic diagram of resource scheduling according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of this application.
Detailed Description of Embodiments
The following describes the embodiments of this application in detail with reference to the accompanying drawings.
To facilitate understanding of the embodiments of this application, the system to which they apply is first described in detail using the system architecture shown in FIG. 1 as an example. FIG. 1 is a schematic architectural diagram of a public cloud service system applicable to the embodiments of this application. As shown in FIG. 1, the system includes a resource pool and a resource scheduling device for controlling the resource pool, where the resource pool may include at least one resource server.
Through a console (not shown in FIG. 1) in the resource scheduling device of the public cloud service, tenants submit scheduling request messages for different types of tasks to the resource scheduling device through different service interfaces. The resource scheduling device schedules resource servers from the resource pool according to the scheduling request messages of the different task types, and schedules corresponding processing resources on the scheduled resource servers for the tenants' use.
The architecture and service scenarios described in the embodiments of this application are intended to describe the technical solutions more clearly and do not constitute a limitation on them. A person of ordinary skill in the art knows that, as the architecture evolves and new service scenarios emerge, the technical solutions provided in the embodiments of this application are also applicable to similar system architectures.
With reference to the foregoing application scenario, FIG. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of this application.
The method includes the following steps:
Step 201: When the resource scheduling device obtains a first scheduling request message, where the first scheduling request message is used to request resources for a first-type task, the resource scheduling device determines a first resource server from the resource pool according to the first quantity of resources requested by the first scheduling request message, and schedules the first quantity of resources on the first resource server.
Here, the task type refers to the service type corresponding to a task; tasks used to execute one class of service may be called one type of task. The resources here include, but are not limited to, processor resources, storage resources, bandwidth resources, and the like.
Step 202: When the resource scheduling device obtains a second scheduling request message, where the second scheduling request message is used to request resources for a second-type task, if the resource scheduling device determines according to the current resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, it determines a second resource server from the resource pool according to the second quantity of resources requested by the second scheduling request message, and schedules a third quantity of resources on the second resource server, where the third quantity is less than or equal to the second quantity.
The first resource server and the second resource server may be the same resource server or two different resource servers; this is not limited in the embodiments of this application.
Through the foregoing embodiment, by using two different scheduling methods for the two different task types, resource utilization can be improved, the operator's cost reduced, and impact on service level agreement (SLA)-sensitive tasks avoided. First, by adding scheduling according to actual resource usage to a system that schedules according to resource request amounts, first-type and second-type tasks can be scheduled onto the same resource server, so that resources requested but not used by first-type tasks on that server are used effectively, avoiding resource waste in the public cloud scenario. Second, because second-type tasks can be scheduled onto resources left spare by first-type tasks, the public cloud operator can purchase less hardware, such as servers for executing second-type tasks, thereby reducing the operator's service cost.
FIG. 3 is a schematic architectural diagram of the resource scheduling device in a public cloud service system applicable to the embodiments of this application. As shown in FIG. 3, the resource scheduling device includes modules such as a first scheduler, a second scheduler, a load control module, a waiting queue, and a message queue module. Scheduling request messages of first-type tasks may be submitted to the first scheduler, and scheduling request messages of second-type tasks to the second scheduler.
The first scheduler obtains scheduling request messages of first-type tasks, and the second scheduler obtains scheduling request messages of second-type tasks. The first-type task may be a task scheduled according to its resource request amount, and the second-type task a task scheduled according to its resource usage. In the embodiments of this application, first-type tasks may also be called service level agreement (SLA)-sensitive tasks, and second-type tasks SLA-insensitive tasks. An SLA-sensitive task can, at any time during execution, obtain resources up to its request amount as needed. An SLA-insensitive task may obtain fewer resources than it requested during execution, and when the resource pool load is too high, the resources it is using may be reclaimed, interrupting execution of the task.
In the embodiments of this application, a corresponding type, namely the first type or the second type, is set for each task in advance, and each type of task can request resources only through the corresponding scheduler. Resource utilization can thus be improved by second-type tasks while impact on first-type tasks is avoided.
After obtaining the scheduling request message of a first-type task, the first scheduler ensures, according to the task's resource request amount, that the task can obtain a quantity of resources equal to its request amount at any time.
After obtaining the scheduling request message of a second-type task, the second scheduler does not allocate the requested quantity of resources right away; it first places the task's scheduling request message in the waiting queue. When it is determined according to the resource load rate of the resource pool that resources are to be scheduled for this type of task, the task is allocated, according to its actual resource usage, at most a quantity of resources no greater than its request amount. Through the load control module, the second scheduler can monitor and predict the actual resource usage of each resource server; when the predicted actual resource usage of tasks increases, some second-type tasks are closed in time, so that first-type tasks have sufficient resources to use.
Each resource server in the resource pool includes an agent module. The agent module, on the one hand, executes the scheduler's resource allocation decisions and, on the other hand, monitors the resource load of its resource server and the actual resource usage of each task on it. After the scheduler selects a resource server for task execution, it sends the task's data and information to the agent module on that server. According to the scheduler's decision, the agent module prepares an execution environment on the resource server for the task to be executed, allocates the resources it needs, and creates a task instance. When the scheduler decides to interrupt some SLA-insensitive tasks on a resource server, it sends the information about the tasks to be interrupted to the agent module on that server; the agent module interrupts execution of the tasks and releases the resources they occupy.
In the embodiments of this application, the agent module on each resource server periodically reads the actual resource usage data of each task on the server and, after analysis and aggregation, periodically sends monitoring data to the message queue module. The monitoring data includes, but is not limited to, the resource load rate of the resource server, the actual resource usage of each executed task, and the task type of each executed task.
For example, FIG. 4 is a schematic diagram of a monitoring data sending procedure according to an embodiment of this application.
Step 401: Resource servers 1 to K in the resource pool periodically send monitoring data to the message queue module.
Step 402: The message queue module classifies and aggregates the monitoring data sent by each resource server, and finally provides it for the load control module to read.
Step 403: The load control module periodically sends a monitoring data request message to the message queue module, where the monitoring data request message is used to request monitoring data.
Correspondingly, step 404: The message queue module sends a monitoring data response message to the load control module, where the monitoring data response message includes the monitoring data requested by the load control module.
The load control module reads the monitoring data of each agent module from the message queue module and, based on the data read, performs predictive analysis of the resource load of the resource pool and the actual resource usage of tenant tasks over a coming period (for example, one hour), and makes two logical judgments based on the prediction results. On the one hand, the load control module judges, according to the prediction results, whether to select tasks buffered in the request queue for execution: when the predicted load is low, it obtains scheduling request messages from the waiting queue (requests not yet executed or whose execution was interrupted), filters out the tasks suitable for execution, and allocates computing resources to them through the scheduler. On the other hand, the load control module also needs to judge whether running SLA-insensitive tasks need to be interrupted: when the predicted resource load of a resource server is high and there is a risk that the actual resource usage of its tasks exceeds the server's resource capacity, the load control module passes the server's information to the second scheduler, and the second scheduler chooses to close some SLA-insensitive tasks on that server, ensuring that the remaining tasks can obtain sufficient resources during execution. How the load control module performs the predictive analysis of the resource pool's load and tenant tasks' actual resource usage based on the monitoring data read is not limited in the embodiments of this application and is not described here again.
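One decision step of the load control module described above can be sketched in simplified form. All names (`control_step`, the action labels, the server fields) and the use of scalar predicted loads are illustrative assumptions; the actual prediction method is, as stated, not limited by this document.

```python
# Illustrative sketch of one load-control decision step: dequeue queued
# requests when the predicted pool load is below the first threshold, and
# request release of resources on servers predicted to run short of idle
# resources (below the second threshold). Names are assumptions.

def control_step(predicted_pool_load, first_threshold, servers, second_threshold):
    """Return the list of actions the controller would take this step."""
    actions = []
    if predicted_pool_load < first_threshold:
        actions.append("schedule_from_waiting_queue")
    for s in servers:
        predicted_idle = s["capacity"] - s["predicted_used"]
        if predicted_idle < second_threshold:
            actions.append(("release_on", s["name"]))
    return actions
```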
In the embodiments of this application, the procedure for creating and closing a first-type task may be as shown in FIG. 5 and includes the following processing steps:
Step 501: The tenant sends a first scheduling request message to the console of the resource scheduling device, where the first scheduling request message is used to request a first quantity of resources for a first-type task. The first scheduling request message may further include the tenant's identity information, information about the corresponding task, and the like, which are not limited in the embodiments of this application and are not described here again.
Step 502: The console of the resource scheduling device verifies the tenant's identity information and the validity of the first scheduling request message. How the console performs the verification is not limited in the embodiments of this application and is not described here again. If the verification succeeds, step 503 is performed; otherwise the tenant's request is rejected. The following description assumes the verification succeeds.
Step 503: The console of the resource scheduling device submits the first scheduling request message to the first scheduler.
Step 504: The first scheduler determines a first resource server from the resource pool according to the first quantity of resources requested by the first scheduling request message. The first scheduler may select, from the at least one resource server in the resource pool, a resource server whose quantity of idle resources is greater than the first quantity as the first resource server.
To ensure that first-type tasks submitted by tenants can obtain sufficient resources, the first scheduler of the public cloud may perform resource scheduling according to task resource request amounts. As shown in FIG. 6, suppose the tenant submits task 1, task 2, and task 3 in this order. The first scheduler performs resource scheduling according to resource request amounts and schedules task 1 and task 2 to resource server 1. When the tenant submits the execution request for task 3, if resources not actually used remain on resource server 1 but the remaining amount is smaller than task 3's resource request amount, the first scheduler schedules task 3 to the idle resource server 2.
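The first scheduler's server selection (step 504 above) can be sketched as follows; a minimal illustration, assuming a mutable reservation field and first-fit selection, neither of which is specified by this document.

```python
# Illustrative sketch: the first scheduler reserves the full request amount
# (first quantity) on any server whose idle resources exceed it, matching
# the request-amount-based scheduling of first-type tasks described above.

def choose_first_server(pool, first_quantity):
    """Reserve `first_quantity` on a qualifying server; return its name or None."""
    for s in pool:
        idle = s["capacity"] - s["reserved"]
        if idle > first_quantity:
            s["reserved"] += first_quantity  # reserve by request amount, not usage
            return s["name"]
    return None  # no server has enough idle resources (e.g. task 3 in FIG. 6)
```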
Step 505: The first scheduler sends a task creation request to the first resource server, where the task creation request is used to request creation of the task corresponding to the first scheduling request message and to request that the first quantity of resources be scheduled for the task.
Step 506: The agent module in the first resource server creates the task according to the task creation request and schedules the first quantity of resources.
Step 507: The first resource server sends a task creation response to the first scheduler, where the task creation response is used to indicate the request result of the task creation request.
Step 508: The first resource server sends a task creation notification message to the console, where the task creation notification message is used to indicate the request result of the first scheduling request message.
Step 509: The console sends a first scheduling response message to the tenant according to the task creation notification message, where the first scheduling response message is used to indicate the request result of the first scheduling request message.
Through the foregoing process, resources are scheduled for the task corresponding to the tenant's first scheduling request message, and the task is created.
Further, if it is determined that the resource load rate of the resource pool is greater than or equal to the first threshold, M second-type tasks may be selected from the tasks executed by the at least one resource server, and the resources occupied by the M second-type tasks may be released, where M is an integer greater than 0. Optionally, the M second-type tasks may also be placed in the waiting queue to be invoked and executed later.
Further, in the embodiments of this application, one resource server may execute first-type and second-type tasks at the same time. Based on the amount of resources used by each task on each resource server, the first scheduler analyzes and predicts each server's resource load over a coming period. When the first scheduler determines that the quantity of idle resources on the first resource server is less than a second threshold, or that the load rate of the first resource server exceeds a preset load rate, for example 90%, it selects at least one second-type task on the first resource server and releases the resources occupied by that task. Therefore, the embodiments of this application schedule first-type tasks according to resource request amounts and, by monitoring and predicting the resource load, promptly interrupt second-type tasks on heavily loaded resource servers. This prevents second-type tasks from preempting the resources required by first-type tasks and avoids affecting their resource usage; that is, resource allocation for first-type tasks is guaranteed preferentially.
For example, as shown in FIG. 7, suppose the tenant submits task 1, task 2, and task 3 in this order. Task 1 is of the first type, and tasks 2 and 3 are of the second type. The first scheduler performs resource scheduling according to resource request amounts and schedules task 1 to resource server 1; the second scheduler performs resource scheduling according to resource usage and schedules tasks 2 and 3 to resource server 1. The amount of resources scheduled for task 1 equals its request amount, while the amounts scheduled for tasks 2 and 3 are each smaller than their respective request amounts. Over time, the actual resource usage of tasks 2 and 3 gradually grows to their respective request amounts, and the combined actual usage of the three tasks is about to exceed the total resource capacity of resource server 1. At this point, execution of some tasks may need to be interrupted to ensure that other tasks can obtain sufficient computing resources. Interrupting a first-type task during execution would seriously harm the tenant experience and damage the operator's brand and reputation. Therefore, second-type tasks can be forcibly interrupted: here, tasks 2 and 3 are both interrupted and their resources released. Further, the interrupted second-type tasks may be put back into the waiting queue to be invoked and executed again later.
Optionally, in the embodiments of this application, the tenant may also actively request to close a task. Continuing with FIG. 5 above, refer to the following procedure:
Step 510: The tenant sends a task close request to the console, where the task close request is used to request closing of the task corresponding to the first scheduling request message.
Step 511: The console forwards the task close request to the first resource server.
Step 512: The first resource server closes the task corresponding to the first scheduling request message according to the task close request and releases the resources scheduled for the task.
Step 513: The first resource server sends a task close completion notification message to the console, where the message is used to indicate the result of closing the task.
Step 514: The console forwards the task close completion notification message to the tenant.
In the embodiments of this application, unlike a first-type task, the scheduling request message of a second-type task cannot obtain the requested resources immediately and must wait in a queue. Specifically, the procedure for creating and interrupting a second-type task may be as shown in FIG. 8 and may include:
Step 801: The tenant sends a second scheduling request message to the console of the resource scheduling device, where the second scheduling request message is used to request a second quantity of resources for a second-type task. The second scheduling request message may further include the tenant's identity information, information about the corresponding task, and the like, which are not limited in the embodiments of this application and are not described here again.
Step 802: The console of the resource scheduling device verifies the tenant's identity information and the validity of the second scheduling request message. How the console performs the verification is not limited in the embodiments of this application and is not described here again. If the verification succeeds, step 803 is performed; otherwise the tenant's request is rejected. The following description assumes the verification succeeds.
Step 803: The console places the second scheduling request message in the waiting queue.
Step 804: The console sends a queuing notification message to the tenant, where the queuing notification message is used to indicate that the second scheduling request message is in the waiting queue.
Step 805: The load control module sends a queuing information request message, which is used to request all scheduling request messages currently queued in the waiting queue.
Step 806: The load control module receives a queuing information response message, which includes information such as all scheduling request messages currently queued in the waiting queue.
Step 807: The load control module determines that resources are to be scheduled for the task corresponding to the second scheduling request message. It should be noted that, when the load control module predicts that the resource load rate of the resource pool is less than the first threshold, it filters the scheduling request messages queued in the waiting queue, and then submits the filtered scheduling request messages, together with the load rate of each resource server, to the second scheduler as a task scheduling request.
For example, if it is determined that the resource load rate of the resource pool is less than the first threshold, and the task corresponding to the second scheduling request message is the task in the waiting queue that requests the smallest quantity of resources or has waited the longest, it may be determined that resources are to be scheduled for the task corresponding to the second scheduling request message.
Step 808: The load control module sends the second scheduling request message to the second scheduler.
Step 809: The second scheduler determines a second resource server from the resource pool according to the second quantity of resources requested by the second scheduling request message. The second scheduler may select, from the at least one resource server in the resource pool, a resource server whose quantity of idle resources is greater than a third quantity as the second resource server.
In the embodiments of this application, the third quantity may be the actual resource usage of the task corresponding to the second scheduling request message, or the product of the second quantity and a preset weight value, where the preset weight value is a number greater than 0 and less than or equal to 1.
To ensure that first-type tasks submitted by tenants can obtain sufficient resources while improving resource utilization, the second scheduler of the public cloud may perform resource scheduling according to task resource usage. As shown in FIG. 9, suppose the tenant submits task 1, task 2, and task 3 in this order. Task 1 is of the first type, and tasks 2 and 3 are of the second type. The first scheduler performs resource scheduling according to resource request amounts and schedules task 1 to resource server 1; the second scheduler performs resource scheduling according to resource usage and schedules task 2 to resource server 1. When the tenant submits the execution request for task 3, even if resources not actually used remain on resource server 1 but the remaining amount is smaller than task 3's resource request amount, the second scheduler can still schedule task 3 to resource server 1 to improve resource utilization. This effectively avoids resource fragments (resources that cannot be allocated to tenant cloud hosts) on the resource servers in the cloud data center, thereby ensuring high resource utilization.
Step 810: The second scheduler sends a task creation request to the second resource server, where the task creation request is used to request creation of the task corresponding to the second scheduling request message and to request that a third quantity of resources be scheduled for the task.
Step 811: The agent module in the second resource server creates the task according to the task creation request and schedules the third quantity of resources.
Step 812: The second resource server sends a task creation response to the second scheduler, where the task creation response is used to indicate the request result of the task creation request.
Step 813: The second resource server sends a task creation notification message to the console, where the task creation notification message is used to indicate the request result of the second scheduling request message.
Step 814: The console sends a second scheduling response message to the tenant according to the task creation notification message, where the second scheduling response message is used to indicate the request result of the second scheduling request message.
Through the foregoing process, resources are scheduled for the task corresponding to the tenant's second scheduling request message, and the task is created.
Further, in the embodiments of this application, one resource server may execute first-type and second-type tasks at the same time. If it is determined that the quantity of idle resources in the second resource server is less than the second threshold, N second-type tasks are selected from the tasks executed by the second resource server, and the resources occupied by the N second-type tasks are released, where N is an integer greater than 0. Optionally, the N second-type tasks may also be placed in the waiting queue to be invoked and executed later. For details, continue to refer to FIG. 8; the procedure may further include the following steps.
Step 815: When the load control module predicts that the quantity of idle resources in the second resource server is less than the second threshold, it sends a resource release request to the second scheduler, where the resource release request is used to request release of some resources.
Step 816: The second scheduler determines at least one second-type task whose execution needs to be interrupted. The second scheduler may determine the M second-type tasks with the highest resource usage as the tasks to be interrupted, or determine them in another way. The following description takes as an example the case where the second scheduler decides to interrupt the task corresponding to the second scheduling request message; other cases are not described again.
Step 817: The second scheduler sends a task interruption request to the second resource server, where the task interruption request is used to request interruption of the task corresponding to the second scheduling request message and release of the corresponding resources.
Step 818: The second resource server interrupts execution of the task corresponding to the second scheduling request message according to the task interruption request and releases the resources scheduled for the task.
Step 819: The second resource server sends a task interruption response to the second scheduler, where the task interruption response is used to indicate the result of the task interruption.
Step 820: The second resource server sends a task interruption notification message to the console, where the task interruption notification message is used to indicate that execution of the task corresponding to the second scheduling request message has been interrupted.
Step 821: The console forwards the task interruption notification message to the tenant.
Through the foregoing method, according to the differences among tenant tasks in a public cloud environment, the embodiments of this application adopt, for SLA-sensitive tasks, priority scheduling according to the resource request amount, ensuring that every SLA-sensitive task can obtain sufficient resources during execution. For SLA-insensitive tasks, a scheduling method based on actual resource usage is adopted: while fully using resources that SLA-sensitive tasks have requested but not used, the creation and interruption of SLA-insensitive tasks are dynamically controlled through load monitoring and prediction, avoiding any impact on the resource usage of SLA-sensitive tasks.
FIG. 10 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of this application. The apparatus 1000 includes a communication module 1001, a processor 1002, a memory 1003, and the like.
The communication module 1001 in the embodiments of this application may be a communication chip with wired or wireless communication capability, for example a radio frequency transceiver or a network cable interface, configured to perform processing such as obtaining the first scheduling request message and the second scheduling request message in the foregoing method procedures.
The processor 1002 in the embodiments of this application may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in software form. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; it can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. With reference to the steps of the methods disclosed in the embodiments of this application, the processor can implement operations in the foregoing procedures such as determining whether to schedule a resource server and scheduling the corresponding resources on the determined resource server for the corresponding scheduling request message; it can also select tasks whose resources need to be released and release the server resources occupied by the selected tasks.
The memory 1003 in the embodiments of this application may be a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, a register, or another storage medium mature in the art. The processor 1002 reads the information in the memory 1003 and, in combination with its hardware, can complete the steps of the foregoing methods.
A person skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, optical storage, and the like) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, a person skilled in the art can make various modifications and variations to this application without departing from its scope. If these modifications and variations fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to cover them.

Claims (19)

  1. A resource scheduling method, comprising:
    when a first scheduling request message is obtained, determining a first resource server from a resource pool according to a first quantity of resources requested by the first scheduling request message, and scheduling the first quantity of resources on the first resource server, wherein the resource pool comprises at least one resource server, and the first scheduling request message is used to request resources for a task of a first type; and
    when a second scheduling request message is obtained, if it is determined according to a resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, determining a second resource server from the resource pool according to a second quantity of resources requested by the second scheduling request message, and scheduling a third quantity of resources on the second resource server, wherein the third quantity is less than or equal to the second quantity, and the second scheduling request message is used to request resources for a task of a second type.
  2. The method according to claim 1, wherein after the second scheduling request message is obtained, the method further comprises:
    placing the second scheduling request message in a waiting queue, wherein the waiting queue comprises a scheduling request message of at least one second-type task; and
    the determining, according to the resource load rate of the resource pool, that resources are to be scheduled for the task corresponding to the second scheduling request message comprises:
    if it is determined that the resource load rate of the resource pool is less than a first threshold, and the task corresponding to the second scheduling request message is the task in the waiting queue that requests the smallest quantity of resources or has waited the longest, determining that resources are to be scheduled for the task corresponding to the second scheduling request message.
  3. The method according to claim 1 or 2, further comprising:
    if it is determined that the resource load rate of the resource pool is greater than or equal to a first threshold, selecting M second-type tasks from a plurality of tasks executed by the at least one resource server, and releasing the resources occupied by the M second-type tasks, wherein M is an integer greater than 0.
  4. The method according to any one of claims 1 to 3, further comprising:
    if it is determined that the quantity of idle resources in the second resource server is less than a second threshold, selecting N second-type tasks from a plurality of tasks executed by the second resource server, and releasing the resources occupied by the N second-type tasks, wherein N is an integer greater than 0.
  5. The method according to claim 4, further comprising:
    placing the N second-type tasks in a waiting queue, wherein the waiting queue comprises a scheduling request message of at least one second-type task.
  6. The method according to any one of claims 1 to 5, wherein the determining a first resource server from a resource pool according to the first quantity of resources requested by the first scheduling request message comprises:
    selecting, from the at least one resource server comprised in the resource pool, a resource server whose quantity of idle resources is greater than the first quantity as the first resource server.
  7. The method according to any one of claims 1 to 6, wherein the determining a second resource server from the resource pool according to the second quantity of resources requested by the second scheduling request message comprises:
    selecting, from the at least one resource server comprised in the resource pool, a resource server whose quantity of idle resources is greater than the third quantity as the second resource server.
  8. The method according to any one of claims 1 to 7, wherein the first-type task is a task for which resource scheduling is performed according to a resource request amount; and
    the second-type task is a task for which resource scheduling is performed according to resource usage.
  9. A resource scheduling apparatus, comprising:
    a first scheduler, configured to obtain a first scheduling request message, determine a first resource server from a resource pool according to a first quantity of resources requested by the first scheduling request message, and schedule the first quantity of resources on the first resource server, wherein the resource pool comprises at least one resource server, and the first scheduling request message is used to request resources for a task of a first type;
    a load control module, configured to determine a resource load rate of the resource pool; and
    a second scheduler, configured to obtain a second scheduling request message and obtain the resource load rate of the resource pool determined by the load control module, and if it is determined according to the resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, determine a second resource server from the resource pool according to a second quantity of resources requested by the second scheduling request message, and schedule a third quantity of resources on the second resource server, wherein the third quantity is less than or equal to the second quantity, and the second scheduling request message is used to request resources for a task of a second type.
  10. The apparatus according to claim 9, wherein the second scheduler is further configured to, after obtaining the second scheduling request message, place the second scheduling request message in a waiting queue, wherein the waiting queue comprises a scheduling request message of at least one second-type task; and
    when determining according to the resource load rate of the resource pool that resources are to be scheduled for the task corresponding to the second scheduling request message, the second scheduler is specifically configured to:
    if it is determined that the resource load rate of the resource pool is less than a first threshold, and the task corresponding to the second scheduling request message is the task in the waiting queue that requests the smallest quantity of resources or has waited the longest, determine that resources are to be scheduled for the task corresponding to the second scheduling request message.
  11. The apparatus according to claim 9 or 10, wherein the second scheduler is further configured to:
    if it is determined that the resource load rate of the resource pool is greater than or equal to a first threshold, select M second-type tasks from a plurality of tasks executed by the at least one resource server, and release the resources occupied by the M second-type tasks, wherein M is an integer greater than 0.
  12. The apparatus according to any one of claims 9 to 11, wherein the second scheduler is further configured to:
    if it is determined that the quantity of idle resources in the second resource server is less than a second threshold, select N second-type tasks from a plurality of tasks executed by the second resource server, and release the resources occupied by the N second-type tasks, wherein N is an integer greater than 0.
  13. The apparatus according to any one of claims 9 to 12, wherein the second scheduler is further configured to:
    place the N second-type tasks in a waiting queue, wherein the waiting queue comprises a scheduling request message of at least one second-type task.
  14. The apparatus according to any one of claims 9 to 13, wherein when determining the first resource server from the resource pool, the first scheduler is specifically configured to:
    select, from the at least one resource server comprised in the resource pool, a resource server whose quantity of idle resources is greater than the first quantity as the first resource server.
  15. The apparatus according to any one of claims 9 to 14, wherein when determining the second resource server from the resource pool, the second scheduler is specifically configured to:
    select, from the at least one resource server comprised in the resource pool, a resource server whose quantity of idle resources is greater than the third quantity as the second resource server.
  16. The apparatus according to any one of claims 9 to 15, wherein the first-type task is a task for which resource scheduling is performed according to a resource request amount; and
    the second-type task is a task for which resource scheduling is performed according to resource usage.
  17. A resource scheduling apparatus, comprising at least one processor coupled to at least one memory, wherein:
    the at least one processor is configured to execute the computer programs or instructions stored in the at least one memory, so that the apparatus performs the method according to any one of claims 1 to 8.
  18. A resource scheduling system, comprising at least one resource server and the resource scheduling apparatus according to any one of claims 9 to 17.
  19. A readable storage medium, comprising a program or instructions, wherein when the program or instructions are executed, the method according to any one of claims 1 to 8 is performed.
PCT/CN2019/090886 2018-09-11 2019-06-12 一种资源调度方法及装置 WO2020052301A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/199,121 US20210200587A1 (en) 2018-09-11 2021-03-11 Resource scheduling method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811055961.7 2018-09-11
CN201811055961.7A CN109298936B (zh) 2018-09-11 2018-09-11 一种资源调度方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/199,121 Continuation US20210200587A1 (en) 2018-09-11 2021-03-11 Resource scheduling method and apparatus

Publications (1)

Publication Number Publication Date
WO2020052301A1 true WO2020052301A1 (zh) 2020-03-19

Family

ID=65166860

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090886 WO2020052301A1 (zh) 2018-09-11 2019-06-12 一种资源调度方法及装置

Country Status (3)

Country Link
US (1) US20210200587A1 (zh)
CN (2) CN113407317A (zh)
WO (1) WO2020052301A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282395A (zh) * 2021-06-09 2021-08-20 中国农业银行股份有限公司 基于Redis的作业请求调度方法、装置、设备及介质

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113407317A (zh) 2018-09-11 2021-09-17 华为技术有限公司 Resource scheduling method and apparatus
CN109871489A (zh) 2019-03-06 2019-06-11 网宿科技股份有限公司 Resource retrieval method in an intelligent recognition system, and intelligent recognition system
US11765618B2 (en) * 2020-03-20 2023-09-19 Nokia Technologies Oy Wireless communication system
CN111679900B (zh) * 2020-06-15 2023-10-31 杭州海康威视数字技术股份有限公司 Task processing method and apparatus
CN113590326B (zh) * 2021-07-30 2024-02-02 北京百度网讯科技有限公司 Service resource scheduling method and apparatus
CN114679495B (zh) * 2022-02-08 2024-01-05 阿里云计算有限公司 Scheduling orchestration method and scheduling execution method for resource service operation requests
CN114327841B (zh) * 2022-03-16 2022-06-21 上海闪马智能科技有限公司 Resource scheduling method and apparatus, storage medium, and electronic apparatus
CN115061800A (zh) * 2022-06-30 2022-09-16 中国联合网络通信集团有限公司 Edge computing task processing method, edge server, and storage medium
CN115292006B (zh) * 2022-09-02 2023-04-14 北京睿芯高通量科技有限公司 Resource synchronization method in a PaaS platform
CN115994019B (zh) * 2023-01-10 2023-06-06 杭州比智科技有限公司 Policy method and system for dynamic multi-tenant resource computation in a big data cluster
CN115834714B (zh) * 2023-02-09 2023-06-16 中国证券登记结算有限责任公司 Cross-platform task scheduling method, server, and system
CN116185310B (zh) * 2023-04-27 2023-07-14 中茵微电子(南京)有限公司 Memory data read/write scheduling method and apparatus
CN117407178B (zh) * 2023-12-14 2024-04-02 成都凯迪飞研科技有限责任公司 Accelerator daughter-card management method and system with adaptive load distribution

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102937912A (zh) * 2012-11-28 2013-02-20 华为技术有限公司 Virtual machine scheduling method and device
CN104079503A (zh) * 2013-03-27 2014-10-01 华为技术有限公司 Resource allocation method and apparatus
US9262224B2 (en) * 2012-09-07 2016-02-16 International Business Machines Corporation Resource management via iterative negotiation
CN107634978A (zh) * 2016-07-19 2018-01-26 华为技术有限公司 Resource scheduling method and apparatus
CN108268318A (zh) * 2016-12-30 2018-07-10 华为技术有限公司 Method and apparatus for task allocation in a distributed system
CN108429631A (zh) * 2017-02-15 2018-08-21 华为技术有限公司 Method and apparatus for network service instantiation
CN109298936A (zh) * 2018-09-11 2019-02-01 华为技术有限公司 Resource scheduling method and apparatus

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7203943B2 (en) * 2001-10-31 2007-04-10 Avaya Technology Corp. Dynamic allocation of processing tasks using variable performance hardware platforms
US20070016907A1 (en) * 2005-07-12 2007-01-18 Fabio Benedetti Method, system and computer program for automatic provisioning of resources to scheduled jobs
JP5256744B2 (ja) * 2008-01-16 2013-08-07 日本電気株式会社 Resource allocation system, resource allocation method, and program
EP2369478A1 (en) * 2010-02-22 2011-09-28 Telefonaktiebolaget L M Ericsson (PUBL) Technique of scheduling tasks in a system
JP5733302B2 (ja) * 2010-03-11 2015-06-10 日本電気株式会社 Resource allocation apparatus and program
ES2413562B1 (es) * 2011-07-01 2014-08-18 Telefónica, S.A. Method and system for managing resource allocation in scalable deployments
CN103795804A (zh) * 2014-02-24 2014-05-14 华为技术有限公司 Storage resource scheduling method and storage computing system
US9749208B2 (en) * 2014-06-30 2017-08-29 Microsoft Technology Licensing, Llc Integrated global resource allocation and load balancing
CN105988872B (zh) * 2015-02-03 2020-02-18 阿里巴巴集团控股有限公司 CPU resource allocation method, apparatus, and electronic device
CN106326002B (zh) * 2015-07-10 2020-10-20 阿里巴巴集团控股有限公司 Resource scheduling method, apparatus, and device
US9678796B2 (en) * 2015-07-24 2017-06-13 Xerox Corporation Methods and systems for determining computational resource requirement
US9569277B1 (en) * 2016-01-29 2017-02-14 International Business Machines Corporation Rebalancing virtual resources for virtual machines based on multiple resource capacities
CN107018091B (zh) * 2016-02-29 2021-04-27 阿里巴巴集团控股有限公司 Resource request scheduling method and apparatus
CN106201723A (zh) * 2016-07-13 2016-12-07 浪潮(北京)电子信息产业有限公司 Data center resource scheduling method and apparatus
US10733024B2 (en) * 2017-05-24 2020-08-04 Qubole Inc. Task packing scheduling process for long running applications
JP6924083B2 (ja) * 2017-06-22 2021-08-25 株式会社日立製作所 Information processing system and resource allocation method
CN107357661B (zh) * 2017-07-12 2020-07-10 北京航空航天大学 Fine-grained GPU resource management method for mixed workloads
WO2019033428A1 (zh) * 2017-08-18 2019-02-21 北京小米移动软件有限公司 Uplink resource allocation method, apparatus, and terminal
US20190391851A1 (en) * 2018-06-21 2019-12-26 Nutanix, Inc. System and method for managing memory in virtual machines

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9262224B2 (en) * 2012-09-07 2016-02-16 International Business Machines Corporation Resource management via iterative negotiation
CN102937912A (zh) * 2012-11-28 2013-02-20 华为技术有限公司 Virtual machine scheduling method and device
CN104079503A (zh) * 2013-03-27 2014-10-01 华为技术有限公司 Resource allocation method and apparatus
CN107634978A (zh) * 2016-07-19 2018-01-26 华为技术有限公司 Resource scheduling method and apparatus
CN108268318A (zh) * 2016-12-30 2018-07-10 华为技术有限公司 Method and apparatus for task allocation in a distributed system
CN108429631A (zh) * 2017-02-15 2018-08-21 华为技术有限公司 Method and apparatus for network service instantiation
CN109298936A (zh) * 2018-09-11 2019-02-01 华为技术有限公司 Resource scheduling method and apparatus

Also Published As

Publication number Publication date
CN109298936B (zh) 2021-05-18
CN113407317A (zh) 2021-09-17
US20210200587A1 (en) 2021-07-01
CN109298936A (zh) 2019-02-01

Similar Documents

Publication Publication Date Title
WO2020052301A1 (zh) Resource scheduling method and apparatus
US9727372B2 (en) Scheduling computer jobs for execution
US8424007B1 (en) Prioritizing tasks from virtual machines
CN107291547A (zh) Task scheduling processing method, apparatus, and system
US8694644B2 (en) Network-aware coordination of virtual machine migrations in enterprise data centers and clouds
US20190303200A1 (en) Dynamic Storage-Aware Job Scheduling
JP5954074B2 (ja) Information processing method, information processing apparatus, and program
CN109697122B (zh) Task processing method, device, and computer storage medium
US8056083B2 (en) Dividing a computer job into micro-jobs for execution
US20160306647A1 (en) Method for affinity binding of interrupt of virtual network interface card, and computer device
US20140033212A1 (en) Multi-Tenant Queue Controller
US20180365075A1 (en) Resource Management Method and Apparatus
CN105022668A (zh) Job scheduling method and system
CN107430526B (zh) 用于调度数据处理的方法和节点
CN115617497B (zh) Thread processing method, scheduling component, monitoring component, server, and storage medium
CN113626173B (zh) Scheduling method, apparatus, and storage medium
CN114327894A (zh) Resource allocation method and apparatus, electronic device, and storage medium
CN111831408A (zh) Asynchronous task processing method and apparatus, electronic device, and medium
US20140380304A1 (en) Methods and systems for energy management in a virtualized data center
US11388050B2 (en) Accelerating machine learning and profiling over a network
US10303580B2 (en) Controlling debug processing
CN105159620A (zh) Implementation method and apparatus of storage QoS control policy
WO2024109787A1 (zh) Data processing method, apparatus, and system
Liu et al. Cooperative job scheduling and data allocation for busy data-intensive parallel computing clusters
CN117472570A (zh) Method, apparatus, electronic device, and medium for scheduling accelerator resources

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19859710

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19859710

Country of ref document: EP

Kind code of ref document: A1