WO2022002247A1 - Resource scheduling method, electronic device and storage medium - Google Patents

Resource scheduling method, electronic device and storage medium

Info

Publication number
WO2022002247A1
WO2022002247A1, PCT/CN2021/104248, CN2021104248W
Authority
WO
WIPO (PCT)
Prior art keywords
application
model
intelligent
resource
smart
Prior art date
Application number
PCT/CN2021/104248
Other languages
English (en)
French (fr)
Inventor
金士英
王振宇
韩炳涛
屠要峰
高洪
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Priority to EP21834695.5A (EP4177745A4)
Priority to US18/014,125 (US20230273833A1)
Publication of WO2022002247A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5019 Workload prediction

Definitions

  • the present application relates to the technical field of intelligent applications, and in particular, to a resource scheduling method, an electronic device and a storage medium.
  • Intelligent application refers to the application of intelligent technology and management, based mainly on artificial intelligence and driven by big-data intelligence.
  • An intelligent application model database can be built to store commonly used intelligent application models. In different intelligent application scenarios (such as smart home, intelligent transportation, intelligent education, intelligent retail, etc.), the corresponding intelligent application model is called directly from the database, without repeating the model creation process. This accelerates the deployment of intelligent applications and is of great significance for their deployment and promotion.
  • However, calling a smart application model occupies a certain amount of system resources (for example, CPU cores, GPU, memory, chip resources, etc.). When known smart application models are deployed on devices with limited system resources, problems easily arise: poor deployment flexibility of the intelligent application model, low overall operating efficiency of the device, and interference with the operation of the device's existing functional modules.
  • Embodiments of the present application provide a resource scheduling method, an electronic device, and a storage medium.
  • In a first aspect, the embodiments of the present application provide a resource scheduling method, which includes: acquiring an intelligent application processing request; acquiring current resource usage information; matching an intelligent application instance according to the intelligent application processing request; and creating a task to handle the intelligent application processing request.
  • an embodiment of the present application provides an electronic device.
  • the electronic device includes a memory, a processor, and a program stored in the memory and running on the processor.
  • When the program is executed by the processor, the resource scheduling method of the first aspect of the present application is implemented.
  • Embodiments of the present application provide a storage medium for computer-readable storage, where the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the resource scheduling method of some embodiments of the first aspect of the present application.
  • FIG. 1 is a flowchart of resource scheduling provided by an embodiment of the present application
  • FIG. 2 is a flowchart of an embodiment before step S0130 in FIG. 1;
  • FIG. 3 is a flowchart of an embodiment of step S0220 in FIG. 2;
  • FIG. 4 is a flowchart of another embodiment before step S0130 in FIG. 1;
  • FIG. 5 is a flowchart of an embodiment of step S0330 in FIG. 3;
  • FIG. 6 is a flowchart of an embodiment of step S0140 in FIG. 1;
  • FIG. 7 is a flowchart of an embodiment of step S0640 in FIG. 6;
  • FIG. 8 is a flowchart of an embodiment before step S0140 in FIG. 1;
  • FIG. 9 is a flowchart of another embodiment before step S0140 in FIG. 1;
  • FIG. 10 is a system block diagram of a resource scheduling system provided by an embodiment of the present application.
  • FIG. 11 is a functional schematic diagram of the resource monitoring unit in FIG. 10 .
  • the intelligent application model includes a face/fingerprint recognition model, an image/file classification model, a network traffic/data traffic/traffic control model, and the like.
  • the intelligent application processing request is the application requirement of face recognition
  • the corresponding face recognition model in the model database is queried, or the instance corresponding to the face recognition model is queried directly; the idle computing resources of the system are used to run a face recognition instance, a face recognition task is created, and face recognition is then performed through that task.
  • intelligent application models are generally deployed on devices with "unlimited" system resources (which can be dynamically expanded), such as servers.
  • This method has large delays, poor flexibility, difficult management of intelligent application services, and high data security risks.
  • For example, in a CS architecture composed of servers and base stations, a common way to deploy intelligent applications is to use the system resources of existing devices (such as servers) in the core network to add one or more intelligent application services for a specific model.
  • edge devices such as base stations send requests such as inference (prediction) through the network interface, and the server processes and returns the results.
  • This approach has the following disadvantages:
  • First, the time delay is large. There is a certain communication overhead between the base station and the server; for scenarios with high requirements on inference delay or with large data volumes, the effect is not satisfactory.
  • Second, the base station is responsible for generating training and inference requests, but the requests generated are not of a single type and usually involve multiple services of multiple models (solving problems of different scenarios with different algorithms). This brings design difficulty and complexity to the base station, and also inconvenience in management.
  • Third, the data security risk is high. More importantly, in the current situation where data privacy matters more than ever, if the base station and the server belong to different service providers, there is a risk of exposing user data.
  • Fourth, the overall operating efficiency is low. Although the models on the server do not affect each other and can be scheduled independently, independent scheduling between models leads to resource preemption: models with frequent demands cannot execute tasks in time, while models with less demand keep occupying resources without releasing them, so the overall operating efficiency of the system is low.
  • The embodiments of the present application provide a resource scheduling method, an electronic device, and a storage medium, which can adaptively schedule system resources and use the system's idle resources to call an intelligent application model, improving resource utilization. This effectively improves the flexibility of intelligent application model deployment and the overall operating efficiency of the device, without affecting the operation of the device's existing functional modules.
  • an embodiment of the present application provides a resource scheduling method. Referring to FIG. 1 , the method includes the following specific steps:
  • one task can only process one smart application processing request, and one smart application instance can only process one task at a time.
  • Obtain an intelligent application processing request and match a corresponding intelligent application instance for it; obtain the resource requirements of the intelligent application instance and the current resource usage information. If there are currently idle resources, and the amount of idle resources is not less than the estimated resource amount of the intelligent application instance, a task is created by using the intelligent application instance, and the task is used to process the intelligent application processing request.
  • the resource usage information includes resource information being used by the system and current idle resource information.
  • The resource requirement of an intelligent application instance is the amount of resources required to run the instance, that is, the estimated resource amount of the instance. If the estimated resource amount is unknown, the amount of resources actually occupied when running the instance needs to be recorded and used as the estimated resource amount.
  • The estimated resource amount can be the actual amount recorded from a single run of the smart application instance, or the average, minimum, or maximum computed from the actual amounts recorded each time when the instance is run multiple times.
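  • The estimation rule above can be sketched as follows (a minimal illustrative sketch; the function and parameter names are assumptions, not the patent's API):

```python
def estimate_resources(recorded_usages, strategy="average"):
    """Estimate an instance's resource amount from the actual usage recorded
    over one or more runs, as described above (names are illustrative)."""
    if not recorded_usages:
        raise ValueError("run the instance at least once and record its usage")
    if len(recorded_usages) == 1:          # a single run: use the recorded amount
        return recorded_usages[0]
    if strategy == "average":              # mean over multiple recorded runs
        return sum(recorded_usages) / len(recorded_usages)
    if strategy == "minimum":
        return min(recorded_usages)
    if strategy == "maximum":
        return max(recorded_usages)
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, recorded usages of 4 and 6 resource units yield an average estimate of 5, a minimum of 4, or a maximum of 6, depending on the strategy chosen.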
  • Before step S0130, the following steps are further included:
  • S0220 Create at least one intelligent application instance according to the intelligent application model, resource requirements and resource usage information corresponding to the intelligent application model.
  • a smart application processing request is obtained, and a corresponding smart application instance is matched for the smart application processing request. If no corresponding smart application instance currently exists, a smart application instance needs to be created.
  • the resource usage information includes resource information being used by the system and current idle resource information.
  • The resource requirement of an intelligent application model is the amount of resources required to run the model, that is, the estimated resource amount of the model. If the estimated resource amount is unknown, the amount of resources actually occupied when running the model needs to be recorded and used as the estimated resource amount.
  • The estimated resource amount can be the actual amount recorded from a single run of the intelligent application model, or the average, minimum, or maximum computed from the actual amounts recorded each time when the model is run multiple times.
  • Each intelligent application model can be used to create a corresponding intelligent application instance. If tasks need to be processed concurrently, additional smart application instances need to be created: the model database is traversed again, an intelligent application model is matched for the processing request once more, and each matched model is used again to create a corresponding instance.
  • In traversing the model database, either only one smart application instance is created per traversal, or multiple instances are created in a single traversal; that is, multiple instances can be created from one intelligent application model in a single traversal, or, across multiple traversals, exactly one corresponding instance is created for each model in each traversal.
  • the network topology structure of the model, weights obtained by training, data information, and the like may also be obtained according to the intelligent application model.
  • Each model has at least one corresponding instance, and the total number of models satisfies the following condition: the total number of models is not greater than the total number of instances, so that each model can provide services and functions externally.
  • Each task has a corresponding instance, and the number of active tasks satisfies the following condition: the total number of active tasks is not greater than the total number of active model instances.
  • step S0220 includes the following steps:
  • In step S0330, compare the first estimated resource amount with the idle resource amount; if the first estimated resource amount is not greater than the idle resource amount, execute step S0340; if the first estimated resource amount is greater than the idle resource amount, execute step S0350.
  • To create an intelligent application instance, an intelligent application model needs to be obtained: match a corresponding intelligent application model for the intelligent application processing request, and obtain the model's resource requirements and the current resource usage information. If there are currently idle resources, and the idle resource amount is not less than the first estimated resource amount of the intelligent application model, an intelligent application instance is created from the model, and the instance is used to create a task. If there is no idle resource currently, the intelligent application processing request is buffered in a queue to await scheduling. The number of queues is not less than the number of intelligent application models stored in the model unit, and queue scheduling can be performed using the Round Robin scheduling algorithm or according to the first-in-first-out rule.
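  • The create-or-buffer decision above can be sketched as follows (an illustrative sketch; the `Model` class and return values are assumptions, and "creating an instance" is represented symbolically):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Model:
    # hypothetical stand-in for an intelligent application model entry
    name: str
    first_estimated_resources: int

def handle_request(request, model, idle_resources, queue):
    """If the model's first estimated resource amount fits into the idle
    resources, 'create' an instance and a task for the request; otherwise
    buffer the request in the queue to await scheduling."""
    if idle_resources >= model.first_estimated_resources:
        return ("task", model.name, request)   # instance created, task handles it
    queue.append(request)                      # no idle resources: cache and wait
    return None
```

With 8 units idle, a model needing 4 units gets a task immediately, while a model needing 16 units sees its request queued.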
  • Before step S0130, the following steps are further included:
  • the intelligent application models in the model database are prioritized, and the intelligent application models with high priority are used first.
  • the high-priority model has an advantage in initial resource allocation, and the initialization priority can be default or artificially configured.
  • the priority can be set according to the usage frequency of each smart application model, and the smart application model with high usage frequency is set as a high priority. If multiple smart application models have the same priority, one smart application model is randomly selected for calling.
  • Obtain an intelligent application processing request and match a corresponding intelligent application model for it according to the priority information of the intelligent application models; obtain the model's resource requirements and the current resource usage information. If there are currently idle resources, and the idle resource amount is not less than the first estimated resource amount of the intelligent application model, an intelligent application instance is created from the model, and the instance is used to create a task.
  • step S0430 includes the following steps:
  • In step S0550, compare the second estimated resource amount with the idle resource amount; if the second estimated resource amount is not greater than the idle resource amount, execute step S0560; if the second estimated resource amount is greater than the idle resource amount, execute step S0570.
  • If there is no idle resource currently, the intelligent application processing request is buffered in the queue to await scheduling. The number of queues is not less than the number of intelligent application models stored in the model unit, and queue scheduling can be performed using the Round Robin scheduling algorithm or according to the first-in-first-out rule.
  • step S0140 further includes: merging, caching or sorting the smart application processing request and the current pending request according to the current pending request information.
  • the pending request information includes the number of pending smart application processing requests buffered in the queue, the scheduling sequence of the queue, and the data information of each pending smart application processing request. If there are multiple pending smart application processing requests of the same or the same type in the queue, the multiple pending smart application processing requests may be combined into one pending smart application processing request.
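  • The merging of same-type pending requests can be sketched as follows (an illustrative sketch; the `(type, payload)` tuple layout is an assumption, not the patent's data format):

```python
def merge_pending(pending):
    """Merge pending requests of the same type into one request carrying all
    their payloads, preserving first-seen order of the types."""
    merged = {}
    for req_type, payload in pending:
        merged.setdefault(req_type, []).append(payload)
    return list(merged.items())
```

Two face-recognition requests interleaved with a traffic request thus collapse into one face-recognition entry holding both payloads, plus the traffic entry.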
  • step S0140 includes the following specific steps:
  • the task concurrency information includes the number of concurrent tasks, and the number of concurrent tasks refers to the number of tasks that are processed at the same time.
  • the status information of the smart application instance refers to whether the smart application instance is in an idle state or a running state.
  • the current status information of a smart application instance depends on the current resource usage information and the resource requirements of the smart application instance.
  • The current resource usage information is affected by the current number of concurrent tasks: the more concurrent tasks there are, the fewer idle resources remain.
  • the processing order of the intelligent application processing request is determined according to the scheduling order of the queue.
  • the intelligent application processing request corresponds to an intelligent application instance.
  • When the queue starts to schedule the intelligent application processing request, if the intelligent application instance is currently in an idle state, a task can be created by using the instance, and the task is used to process the intelligent application processing request. If the smart application instance is currently in a running state, the smart application processing request continues to be cached in the queue.
  • step S0640 includes the following specific steps:
  • In step S0720, compare the current number of concurrent tasks with the maximum number of concurrent tasks. If the number of concurrent tasks is not greater than the maximum number of concurrent tasks, perform step S0730; if the number of concurrent tasks is greater than the maximum number of concurrent tasks, perform step S0740.
  • the current status information of a smart application instance depends on the current resource usage information and the resource requirements of the smart application instance.
  • The current resource usage information is affected by the current number of concurrent tasks: the more concurrent tasks there are, the fewer idle resources remain.
  • The maximum number of concurrent tasks is the upper limit of the number of concurrent tasks, which is limited by the amount of system resources. If the current number of concurrent tasks is not greater than the maximum number of concurrent tasks, it is considered that there are currently idle resources; if the current number of concurrent tasks is greater than the maximum number of concurrent tasks, it is considered that there are currently no idle resources available for scheduling the acquired smart application processing request.
  • the smart application processing request needs to be cached in a queue, and multiple pending smart application processing requests can be buffered in the queue.
  • the processing order of the intelligent application processing request is determined according to the scheduling order of the queue.
  • the intelligent application processing request corresponds to an intelligent application instance. If the current number of concurrent tasks is not greater than the maximum number of concurrent tasks, it means that there are currently idle resources.
  • When the queue starts to schedule the intelligent application processing request, if the intelligent application instance is currently in an idle state, a task can be created by using the instance, and the task is used to process the smart application processing request. If the smart application instance is currently in a running state, the smart application processing request continues to be cached in the queue.
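  • One scheduling step of this loop can be sketched as follows (an illustrative sketch; the instance-state map and the choice to re-cache a busy request at the tail of the queue are assumptions):

```python
from collections import deque

def schedule_once(queue, instance_state):
    """Pop the head request; if its instance is idle, mark it running and
    create a task for the request; if the instance is running, keep the
    request cached. `instance_state` maps instance id -> "idle"/"running"."""
    if not queue:
        return None
    request, instance_id = queue.popleft()
    if instance_state.get(instance_id) == "idle":
        instance_state[instance_id] = "running"   # the new task now occupies it
        return ("task", instance_id, request)
    queue.append((request, instance_id))          # instance busy: stay cached
    return None
```

A request bound to an idle instance is dispatched immediately; one bound to a running instance simply remains in the queue for a later scheduling round.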
  • the current resource usage information can be obtained, and then the status information of the smart application instance can be obtained according to the current resource usage information and the resource requirements of the smart application instance.
  • Before step S0720, the method further includes: obtaining the current number of waiting tasks; if the current number of waiting tasks is greater than a preset waiting-task-number threshold, adjusting the maximum number of concurrent tasks according to the current maximum number of concurrent tasks, the system upper-limit task number, and a preset adjustment factor.
  • The number of waiting tasks refers to the number of tasks cached in the queue. If, at a certain moment, the number of tasks cached in the queue exceeds the upper limit of the number of cached tasks, that is, the current number of waiting tasks is greater than the preset threshold, the current maximum number of concurrent tasks needs to be adjusted. Equation (1) is used to adjust the maximum number of concurrent tasks:
  • T_{t+1} = min(T_t · (1 + a), T_top)    (1)
  • T_t represents the maximum number of concurrent tasks at time t; T_{t+1} represents the maximum number of concurrent tasks at time t+1; T_top represents the system upper-limit task number, which is limited by the amount of system resources.
  • a represents the adjustment factor, and a is a natural number. The adjustment factor is used to adjust the current maximum number of concurrent tasks, and the adjusted value is compared with the system upper-limit task number. If the adjusted maximum number of concurrent tasks is greater than the system upper-limit task number, the system upper limit is determined as the maximum number of concurrent tasks at the next moment; if the adjusted value is less than the system upper limit, the adjusted value is determined as the maximum number of concurrent tasks at the next moment; if the two are equal, either may be determined as the maximum number of concurrent tasks at the next moment.
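  • The adjustment of equation (1) can be sketched directly (function and parameter names are illustrative):

```python
def adjust_max_concurrency(t_current, a, t_top):
    """Equation (1): T_{t+1} = min(T_t * (1 + a), T_top).
    Grows the maximum concurrent task number by a factor of (1 + a) but
    never past the system upper limit T_top."""
    return min(t_current * (1 + a), t_top)
```

For example, with a = 1 and T_top = 15, a current maximum of 4 grows to 8, while a current maximum of 10 is capped at 15 rather than doubling to 20.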
  • Before step S0140, the following specific steps are further included:
  • a smart application model can create at least one smart application instance, and a smart application instance can create at least one task.
  • One task is used to process one smart application processing request. Therefore, each smart application instance has a corresponding smart application model, and each smart application processing request has a corresponding smart application instance; that is, one smart application model can be used to process at least one smart application processing request. All pending requests corresponding to an intelligent application model can be extracted from the pending request information, so that the number of pending requests corresponding to that model is known.
  • the number of concurrent instances is the number of instances running at the same time, and the number of concurrent instances is limited by the amount of system resources. The current resource usage information is affected by the number of concurrent instances.
  • Before step S0140, the following specific steps are further included:
  • each smart application model has an initial priority that does not take into account pending requests in the cache queue.
  • the initial priority is determined by prioritizing all intelligent application models in the model database, and intelligent application models with higher priorities are used first.
  • the initial priority can be set according to the usage frequency of each smart application model, and the smart application model with high usage frequency is set as a high priority. If multiple smart application models have the same priority, one smart application model is randomly selected for calling.
  • the initial priority information for each smart application model can be obtained.
  • The pending requests corresponding to each intelligent application model can be extracted from the pending request information, so that the number of pending requests and all the pending requests corresponding to each model are obtained.
  • According to the number of pending requests corresponding to an intelligent application model, the priority of that model can be adjusted; the priority of each model is thus obtained, and the number of concurrent instances corresponding to each model can be adjusted in order of priority. Equation (2) is used to adjust the priority of the smart application model:
  • P_i' = P_i · (1 + b · Cache_i / Σ_j Cache_j)    (2)
  • P_i represents the original priority of model i; P_i' represents the updated priority of model i; b represents the weight factor when updating the priority, weighting the ratio of the number of pending requests corresponding to model i to the total number of pending requests, and b is a natural number; Cache_i represents the number of pending requests for model i in a period of time; Σ_j Cache_j represents the number of pending requests for all models in that period of time.
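  • A minimal sketch of the priority update, under the assumption that equation (2) takes the form P_i' = P_i · (1 + b · Cache_i / Σ_j Cache_j), consistent with the symbol definitions above (the exact form is an assumption, as are the function and parameter names):

```python
def update_priority(p_i, b, cache_i, cache_total):
    """Assumed form of equation (2):
        P_i' = P_i * (1 + b * Cache_i / sum_j Cache_j)
    Models holding a larger share of the pending requests gain priority."""
    if cache_total == 0:
        return p_i                               # nothing pending: unchanged
    return p_i * (1 + b * cache_i / cache_total)
```

With b = 1, a model holding 5 of 10 pending requests has its priority multiplied by 1.5, while a model with no pending requests keeps its original priority.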
  • With the resource scheduling method of the embodiments, system resources can be adaptively scheduled, and the intelligent application model can be called using the system's idle resources to improve resource utilization. Intelligent application processing requests are then handled quickly and effectively, improving the processing efficiency of intelligent applications, the flexibility of intelligent application model deployment, and the overall operating efficiency of the device, without affecting the operation of the device's existing functional modules.
  • the relationship between computing resources and computing tasks can be dynamically adjusted to achieve a balance between the two.
  • a resource scheduling system 1000 is shown.
  • The resource scheduling system 1000 can execute the resource scheduling method of the above-mentioned embodiments, and includes: a model unit 1010, configured to store at least one intelligent application model and record the first resource requirement and first concurrency requirement of each intelligent application model; an instance unit 1020, connected to the model unit 1010 and configured to store at least one intelligent application instance and record the second resource requirement and second concurrency requirement of each intelligent application instance; a task unit 1030, connected to the instance unit 1020 and configured to obtain at least one intelligent application processing request and process it; and a resource monitoring unit 1040, connected to the model unit 1010, the instance unit 1020 and the task unit 1030, and configured to monitor the resource usage of the resource scheduling system 1000 in real time and obtain resource usage information.
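  • The four units can be outlined as a minimal skeleton (unit names follow the description above; the fields and methods are illustrative placeholders, not the patent's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class ModelUnit:                      # stores models and their requirements
    models: dict = field(default_factory=dict)

    def register(self, name, resource_req, concurrency_req=1):
        self.models[name] = (resource_req, concurrency_req)

@dataclass
class InstanceUnit:                   # stores instances and their requirements
    instances: dict = field(default_factory=dict)

@dataclass
class TaskUnit:                       # receives and processes requests
    pending: list = field(default_factory=list)

    def submit(self, request):
        self.pending.append(request)

@dataclass
class ResourceMonitor:                # tracks resource usage in real time
    total: int = 0
    used: int = 0

    def idle(self):
        return self.total - self.used
```

The monitor's idle amount is what the scheduling decisions above compare against a model's or instance's estimated resource amount.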
  • the first resource requirement of an intelligent application model is the amount of resources required to run the model, that is, the model's estimated resource amount. If the estimated amount is unknown, the amount of resources actually occupied by the model must be recorded when it is run, and that actual amount is used as the estimate.
  • the estimated resource amount may be the actual amount recorded when the model is run once, or the average, minimum, or maximum amount calculated from the actual amounts recorded on each of multiple runs.
  • the first concurrency requirement of an intelligent application model is the number of copies of the model expected to run at the same time; this number is preset to 1 and can be adjusted according to actual task requests.
  • the second resource requirement of an intelligent application instance is the amount of resources occupied by running the instance, that is, the instance's estimated resource amount. If the estimated amount is unknown, the amount of resources actually occupied by the instance must be recorded when it is run, and that actual amount is used as the estimate.
  • the estimated resource amount may be the actual amount recorded when the instance is run once, or the average, minimum, or maximum amount calculated from the actual amounts recorded on each of multiple runs.
  • the second concurrency requirement of an intelligent application instance is the number of copies of the instance expected to run at the same time; this number is preset to 1 and can be adjusted according to actual task requests.
  • a task can process only one intelligent application processing request. If there are not enough resources to process all intelligent application processing requests, the unprocessed requests are cached in queues; the number of queues is not less than the number of intelligent application models stored in the model unit, and the queues may be scheduled using the Round Robin algorithm or according to a first-in-first-out rule. Creating a task requires finding an instance that can process it: if the instance's estimated resource amount is not greater than the current amount of idle resources, a task is created from the instance; otherwise, the corresponding intelligent application processing request is cached in a queue to await scheduling.
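The per-model queues and the Round Robin pass over them can be sketched as follows. This is an illustrative sketch, not the patent's implementation, and all names (`queues`, `cost`, the return convention) are assumptions:

```python
from collections import deque

def round_robin_dispatch(queues: dict[str, deque], idle_resources: int,
                         cost: dict[str, int]) -> list[str]:
    """Round-robin over per-model queues, dispatching at most one cached
    request per model per pass while its estimated resource cost still
    fits within the idle resources.

    queues maps model id -> FIFO of pending requests; cost maps model id ->
    estimated resources per task.
    """
    dispatched = []
    progress = True
    while progress:
        progress = False
        for model_id, q in queues.items():
            if q and cost[model_id] <= idle_resources:
                dispatched.append(q.popleft())  # oldest request for this model
                idle_resources -= cost[model_id]
                progress = True
    return dispatched
```

Requests that do not fit remain queued, matching the behavior of caching a request until the queue schedules it.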
  • the resource monitoring unit monitors the resource usage of the resource scheduling system in real time, and synchronizes the resource usage information to the model unit, instance unit, and task unit in real time, so as to further adjust the relationship between resources and operating conditions.
  • the resource usage information includes the resource information being used by the system and the current idle resource information.
  • for any request, the resource monitoring unit checks whether an active and idle instance of the requested model exists. If so, it uses the request and the instance to create a task and allocates resources to run it; otherwise, according to the policy, it decides whether to cache the request and wait or to return failure directly.
  • during system operation, the resource monitoring unit dynamically tracks the running status of tasks and the active status of instances and, if necessary, adjusts the upper limits on the numbers of instances and tasks; that is, it establishes a closed loop between system resource occupation and allocation, achieving rational resource utilization by dynamically adjusting the model concurrency and the task concurrency.
  • the system reports the number of active tasks, waiting tasks, active instances, or cached requests.
  • the resource monitoring unit monitors the operation and outputs adjustments to the number of active tasks, waiting tasks, active instances, or cached requests.
  • the model unit is further configured to set the priority of the intelligent application model, and the intelligent application model with a higher priority is used preferentially.
  • the high-priority model has an advantage in initial resource allocation, and the initial priority can be a default or configured manually.
  • the priority can be set according to the usage frequency of each smart application model, and the smart application model with high usage frequency is set as a high priority. If multiple smart application models have the same priority, one smart application model is randomly selected for calling.
  • the model unit is further configured to store network topology, weight and data information of each intelligent application model.
  • the instance unit is further configured to create an intelligent application instance according to the intelligent application model, resource requirements corresponding to the intelligent application model, and current resource usage information.
  • An intelligent application model can be configured to create at least one intelligent application instance, and one intelligent application instance can be configured to create at least one task, and one task can only handle one intelligent application processing request.
  • at any given moment, an intelligent application instance can process only one task, so the number of concurrent tasks is not greater than the number of concurrent instances.
  • the number of concurrent tasks refers to the number of tasks being processed at the same time, and the number of concurrent instances refers to the number of concurrently running instances.
  • when the system is powered on, the corresponding number of active model instances is determined according to the current resource situation, the number of models, the models' estimated resource amounts, and the models' expected concurrency, and a corresponding number of running instances is created for each model.
  • while the system is running, for a specific training or inference request of a model, whether to create a task ready for execution or to cache the data is determined according to the idle or active state of the corresponding instance.
  • for a created task, whether it is processed immediately or waits is determined according to the current task concurrency of the system.
  • while processing tasks, the active running state of instances, the time-shared running of tasks, and the caching of data are counted and evaluated to determine whether to dynamically load instances or release instance resources.
  • the system creates a model instance when it is powered on, and sorts the models in the model database according to the priority from high to low.
  • the priority of the model can be configured by default or manually.
  • for the first traversal of all models, let i = 0; for the i-th model, prepare to create its first running instance and judge whether the resources requested by model i are less than the system's idle resources. If so, create an instance for model i; otherwise end the power-on procedure.
  • after creating an instance for model i, if model i already knows its own resource requirement, judge whether all models in the repository have been traversed; otherwise, record the minimum amount of resources the model needs to start an instance, which facilitates evaluation at subsequent system start-ups.
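The first power-on traversal described above can be sketched as follows. This is an illustrative sketch using the patent's strict "less than the idle resources" check; the data layout and names are assumptions:

```python
def power_on_init(models: list[tuple[str, int, int]],
                  idle_resources: int) -> dict[str, int]:
    """First power-on traversal: sort models by priority (high to low) and
    create one running instance per model while the model's requested
    resources are less than the remaining idle resources.

    Each model is a (name, priority, estimated_resources) tuple; returns a
    map of model name -> number of instances created.
    """
    instances = {}
    for name, _prio, need in sorted(models, key=lambda m: -m[1]):
        if need < idle_resources:   # model i's request fits the idle resources
            instances[name] = 1     # first running instance for this model
            idle_resources -= need
        else:
            break                   # end the power-on procedure
    return instances
```

With this ordering, if resources run out only the high-priority models obtain instances, so only some models can offer services externally, as the description states.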
  • each model has at least one instance corresponding to it, and the total number of models satisfies the following conditions: the total number of models is not greater than the total number of instances, and each model can provide external services and functions.
  • Each task has a corresponding instance, and the number of active tasks meets the following conditions: the total number of active tasks is not greater than the total number of active model instances.
  • Models can be deployed and scheduled flexibly, making full use of resources without wasting, and can dynamically adjust the relationship between computing resources and computing tasks to achieve a balance between the two.
  • when performing model inference and training, since data processing takes place only inside the device and no data is exchanged with other devices, user data can be protected and task processing latency can be reduced.
  • an embodiment of the present application provides an electronic device.
  • the electronic device includes a memory, a processor, and a program stored in the memory and running on the processor.
  • when the program is executed by the processor, the resource scheduling method of some embodiments of the first aspect of the present application is implemented.
  • the electronic device may be a mobile terminal device or a non-mobile terminal device.
  • mobile terminal devices may be mobile phones, tablet computers, notebook computers, palmtop computers, vehicle-mounted terminal devices, wearable devices, ultra-mobile personal computers, netbooks, personal digital assistants, and the like.
  • non-mobile terminal devices may be personal computers, televisions, teller machines, self-service machines, and the like; the embodiments of the present application are not specifically limited in this respect.
  • embodiments of the present application provide a storage medium for computer-readable storage, where the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the resource scheduling method of some embodiments of the first aspect of the present application.
  • the embodiments of the present application include: acquiring smart application processing requests; acquiring current resource usage information; matching smart application instances according to smart application processing requests; creating tasks according to the resource usage information and the smart application instances to process the smart applications Process the request.
  • the embodiments of the present application can adaptively schedule system resources and use the system's idle resources to invoke the intelligent application model, improving resource utilization, thereby effectively improving the flexibility of intelligent application model deployment and the overall operating efficiency of the device, without affecting the operation of the device's existing functional modules.
  • the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components.
  • some or all physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data.
  • computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media, as is well known to those of ordinary skill in the art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A resource scheduling method, an electronic device, and a storage medium. Embodiments of the present application include: acquiring an intelligent application processing request (S0110); acquiring current resource usage information (S0120); matching an intelligent application instance according to the intelligent application processing request (S0130); and creating a task according to the resource usage information and the intelligent application instance to process the intelligent application processing request (S0140).

Description

Resource Scheduling Method, Electronic Device, and Storage Medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202010635990.1 filed on July 3, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of intelligent applications, and in particular to a resource scheduling method, an electronic device, and a storage medium.
Background
Intelligent applications are applications of intelligent technology and management that are centered on artificial intelligence and driven by big-data intelligence. By building an intelligent application model database that stores commonly used intelligent application models, the corresponding model can be invoked directly from the database in different intelligent application scenarios (for example, smart home, intelligent transportation, intelligent education, intelligent retail) without repeating the model creation process. This accelerates the deployment of intelligent applications and is of great significance for their deployment and promotion.
However, invoking an intelligent application model occupies a certain amount of system resources (for example, CPU cores, GPU, memory, or chip resources). When known intelligent application models are deployed on devices with limited system resources, problems easily arise, such as poor deployment flexibility of the models, low overall operating efficiency of the device, and interference with the operation of the device's existing functional modules.
Summary
Embodiments of the present application provide a resource scheduling method, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present application provides a resource scheduling method, including: acquiring an intelligent application processing request; acquiring current resource usage information; matching an intelligent application instance according to the intelligent application processing request; and creating a task according to the resource usage information and the intelligent application instance, to process the intelligent application processing request.
In a second aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, where the program, when executed by the processor, implements the resource scheduling method of some embodiments of the first aspect of the present application.
In a third aspect, an embodiment of the present application provides a storage medium for computer-readable storage, where the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the resource scheduling method of some embodiments of the first aspect of the present application.
Other features and advantages of the present application will be set forth in the following description and will in part be apparent from the description or be understood by practicing the application. The objects and other advantages of the application may be realized and obtained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Brief Description of the Drawings
FIG. 1 is a flowchart of resource scheduling provided by an embodiment of the present application;
FIG. 2 is a flowchart of an embodiment preceding step S0130 in FIG. 1;
FIG. 3 is a flowchart of an embodiment of step S0220 in FIG. 2;
FIG. 4 is a flowchart of another embodiment preceding step S0130 in FIG. 1;
FIG. 5 is a flowchart of an embodiment of step S0330 in FIG. 3;
FIG. 6 is a flowchart of an embodiment of step S0140 in FIG. 1;
FIG. 7 is a flowchart of an embodiment of step S0640 in FIG. 6;
FIG. 8 is a flowchart of an embodiment preceding step S0140 in FIG. 1;
FIG. 9 is a flowchart of another embodiment preceding step S0140 in FIG. 1;
FIG. 10 is a system block diagram of a resource scheduling system provided by an embodiment of the present application;
FIG. 11 is a functional schematic diagram of the resource monitoring unit in FIG. 10.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application, not to limit it. Provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
It should be noted that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that in the flowcharts. The terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
In the embodiments of the present application, intelligent application models include face/fingerprint recognition models, image/file classification models, network traffic/data traffic/telephone traffic control models, and the like. For example, if an intelligent application processing request is an application demand for face recognition, the corresponding face recognition model is looked up in the model database according to that demand, or the instance corresponding to that model is looked up directly; a face recognition instance is run using the system's idle computing resources, a face recognition task is created, and face recognition is then performed through that task.
As the application scenarios of artificial intelligence and deep learning become increasingly widespread across industries, intelligent applications can perform more and more types of tasks, typically involving multiple services of multiple models (solving problems of different scenarios and different algorithms).
In some cases, intelligent application models are generally deployed on devices whose system resources are "unlimited" (dynamically expandable), such as servers. This approach suffers from large latency, poor flexibility, difficult management of intelligent application services, and high data security risks. For example, in a client-server architecture composed of servers and base stations, a common way to deploy intelligent applications is to use the system resources of existing servers in the core network and add on them one or more intelligent application services for specific models, while edge devices such as base stations send inference (prediction) and other requests through a network interface, and the server processes the requests and returns the results. This approach has the following drawbacks:
1) Large latency. There is a certain communication overhead between the base station and the server, and for scenarios that require low inference latency or involve large data volumes the results are unsatisfactory.
2) Poor flexibility. Deploying all models on the server is insufficiently flexible. For example, on some base stations there may be a need to flexibly adjust the deployment and operation of models according to the actual situation (such as adjusting the number of concurrent requests a model supports); in this scheme such adjustment is impossible, i.e. personalization cannot be achieved.
3) Difficult management of intelligent application services. The base station, as the true "source" of data, is responsible for generating training and inference requests, but the generated request types are not uniform and typically involve multiple services of multiple models (solving problems of different scenarios and different algorithms). This approach brings a certain design difficulty and complexity to the base station, and also some inconvenience in management.
4) High data security risk. More importantly, at a time when data privacy is highly valued, if the base station and the server belong to different service providers, there is a risk of exposing user data.
In resource-constrained devices, on the other hand, limited system resources easily lead to problems such as poor deployment flexibility of intelligent application models, low overall operating efficiency of the device, and interference with the device's existing functional modules. For example, in a client-server architecture composed of servers and base stations, deploying intelligent application models in the base station (a resource-constrained device) has the following problems:
1) Poor deployment flexibility. Although devices such as base stations have "idle" resources, these resources have an upper limit. If each model is likewise deployed individually, the running of subsequent tasks will inevitably be affected, and tasks will suffer from severe "queuing".
2) Low overall operating efficiency of the base station. On the server, models do not affect one another and can each be scheduled independently; on the base station, however, because resources are limited, independent scheduling among models causes resource contention, so that frequently demanded models cannot execute tasks in time while rarely demanded models keep occupying resources without releasing them, and the overall operating efficiency of the system is low.
3) Impact on the base station's existing functional modules. The base station itself provides many functions and already hosts many functional modules. If the intelligent system is deployed improperly, in the extreme case it will affect the operation of the existing functional modules and instead introduce hidden risks.
On this basis, embodiments of the present application provide a resource scheduling method, an electronic device, and a storage medium, which can adaptively schedule system resources and use the system's idle resources to invoke intelligent application models, improving resource utilization, thereby effectively improving the flexibility of intelligent application model deployment and the overall operating efficiency of the device without affecting the operation of the device's existing functional modules.
In a first aspect, an embodiment of the present application provides a resource scheduling method. Referring to FIG. 1, the method includes the following steps:
S0110. Acquire an intelligent application processing request;
S0120. Acquire current resource usage information;
S0130. Match an intelligent application instance according to the intelligent application processing request;
S0140. Create a task according to the resource usage information and the intelligent application instance, to process the intelligent application processing request.
In some embodiments, one task can process only one intelligent application processing request, and one intelligent application instance can process only one task at any given moment. An intelligent application processing request is acquired and matched with a corresponding intelligent application instance, and the resource requirement of that instance and the current resource usage information are acquired. If idle resources currently exist and the amount of idle resources is not less than the estimated resource amount of the instance, a task is created from the intelligent application instance to process the request. The resource usage information includes information on the resources the system is using and on the currently idle resources. The resource requirement of an intelligent application instance is the amount of resources occupied by running the instance, that is, the instance's estimated resource amount. If the estimated resource amount is unknown, the amount of resources actually occupied by the instance must be recorded when it is run, and that actual amount is used as the estimate; the estimate may be the actual amount recorded from a single run of the instance, or the average, minimum, or maximum amount calculated from the actual amounts recorded over multiple runs.
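The admission check in this paragraph (create a task only when the instance's estimated resource amount does not exceed the current idle resources, otherwise cache the request) can be sketched as follows; the function and variable names are illustrative, not from the patent:

```python
from collections import deque

def handle_request(request, instance_need: int, idle: int, queue: deque) -> str:
    """Create a task if the matched instance's estimated resource amount
    fits within the current idle resources; otherwise cache the request
    in the wait queue until the queue schedules it."""
    if instance_need <= idle:
        return "task_created"  # run the request on the matched instance
    queue.append(request)      # request waits for queue scheduling
    return "cached"
```

The same check is applied again later for the model-level first and second estimated resource amounts when an instance must first be created.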
In some embodiments, referring to FIG. 1 and FIG. 2, step S0130 is further preceded by the following steps:
S0210. Acquire at least one intelligent application model according to the intelligent application processing request;
S0220. Create at least one intelligent application instance according to the intelligent application model, the resource requirement corresponding to the intelligent application model, and the resource usage information.
An intelligent application processing request is acquired and matched with a corresponding intelligent application instance; if no corresponding instance currently exists, one must be created. Creating an intelligent application instance requires acquiring an intelligent application model: the model database is traversed to match the request with a corresponding model, and the model's resource requirement and the current resource usage information are acquired. If idle resources currently exist and their amount is not less than the model's estimated resource amount, an intelligent application instance is created from the model, and the instance is used to create a task. The resource usage information includes information on the resources the system is using and on the currently idle resources. The resource requirement of a model is the amount of resources occupied by running the model, that is, the model's estimated resource amount. If this estimate is unknown, the amount of resources actually occupied by the model must be recorded when it is run and used as the estimate; the estimate may be the actual amount recorded from a single run, or the average, minimum, or maximum amount calculated from the actual amounts recorded over multiple runs. In some embodiments, one traversal of the model database according to the intelligent application processing request finds at least one corresponding model, and each model is used to create a corresponding instance. If tasks must be processed concurrently, additional instances must be created: in a second traversal of the model database, the models matching the request are executed again, and each model is again used to create a corresponding instance. During traversal, each pass may create only one instance, or a single pass may create multiple instances. When concurrent tasks must be processed, multiple instances may be created from one model in a single pass, or multiple passes may each create only one instance per model. In some embodiments, the model's network topology, trained weights, data information, and the like can also be obtained from the model. Each model has at least one corresponding instance, and the total number of models satisfies the following conditions: the total number of models is not greater than the total number of instances, and every model can provide services and functions externally. Each task has a corresponding instance, and the number of active tasks satisfies the following condition: the total number of active tasks is not greater than the total number of active model instances.
In some embodiments, referring to FIG. 2 and FIG. 3, step S0220 includes the following steps:
S0310. Acquire a first estimated resource amount according to the resource requirement corresponding to the intelligent application model;
S0320. Acquire the current amount of idle resources according to the resource usage information;
S0330. Compare the first estimated resource amount with the amount of idle resources; if the first estimated resource amount is not greater than the amount of idle resources, perform step S0340; if it is greater, perform step S0350;
S0340. Create at least one intelligent application instance according to the intelligent application model;
S0350. Cache the intelligent application processing request.
Creating an intelligent application instance requires acquiring an intelligent application model and matching the processing request with a corresponding model; the model's resource requirement and the current resource usage information are acquired. If idle resources currently exist and their amount is not less than the model's first estimated resource amount, an instance is created from the model, and the instance is used to create a task. If no idle resources currently exist, the processing request is cached in a queue to await scheduling. The number of queues is not less than the number of intelligent application models stored in the model unit, and the queues may be scheduled using the Round Robin algorithm or according to a first-in-first-out rule.
In other embodiments, referring to FIG. 1 and FIG. 4, step S0130 is further preceded by the following steps:
S0410. Sort the intelligent application models in the model database by priority and acquire priority information;
S0420. Acquire one intelligent application model according to the intelligent application processing request and the priority information;
S0430. Create at least one intelligent application instance according to the intelligent application model, the resource requirement corresponding to the intelligent application model, and the resource usage information.
The intelligent application models in the model database are sorted by priority, and models with higher priority are used preferentially. High-priority models have an advantage in initial resource allocation; the initial priority may be a default or configured manually. Priority may be set according to each model's usage frequency, with frequently used models set to high priority. If multiple models have the same priority, one of them is selected at random for invocation. An intelligent application processing request is acquired and matched with a corresponding model according to the models' priority information, and the model's resource requirement and the current resource usage information are acquired. If idle resources currently exist and their amount is not less than the model's first estimated resource amount, an instance is created from the model, and the instance is used to create a task.
In some embodiments, referring to FIG. 4 and FIG. 5, step S0430 includes the following steps:
S0510. Acquire scheduling information of the intelligent application models according to the priority information;
S0520. Determine a to-be-scheduled intelligent application model according to the scheduling information;
S0530. Acquire a second estimated resource amount according to the resource requirement of the to-be-scheduled intelligent application model;
S0540. Acquire the current amount of idle resources according to the resource usage information;
S0550. Compare the second estimated resource amount with the amount of idle resources; if the second estimated resource amount is not greater than the amount of idle resources, perform step S0560; if it is greater, perform step S0570;
S0560. Create at least one intelligent application instance according to the to-be-scheduled intelligent application model;
S0570. Cache the intelligent application processing request.
Creating an intelligent application instance requires acquiring an intelligent application model. The scheduling order of the models is obtained from their priority ranking, and the processing request is matched, in scheduling order from front to back, with a corresponding model, namely the to-be-scheduled model; its resource requirement and the current resource usage information are acquired. If idle resources currently exist and their amount is not less than the model's second estimated resource amount, an instance is created from the to-be-scheduled model, and the instance is used to create a task. If no idle resources currently exist, the processing request is cached in a queue to await scheduling. The number of queues is not less than the number of intelligent application models stored in the model unit, and the queues may be scheduled using the Round Robin algorithm or according to a first-in-first-out rule.
In some embodiments, referring to FIG. 1, step S0140 is further preceded by: merging, caching, or sorting the intelligent application processing request with the current pending requests according to the current pending-request information.
When an intelligent application processing request is acquired and no idle resources currently exist, the request must be cached in a queue, which may hold multiple pending requests. The pending-request information includes the number of pending requests cached in the queue, the scheduling order of the queue, and the data information of each pending request. If the queue contains multiple pending requests that are identical or of the same type, they may be merged into one pending request.
In some embodiments, referring to FIG. 1 and FIG. 6, step S0140 includes the following steps:
S0610. Acquire state information of the intelligent application instance;
S0620. Acquire current task concurrency information;
S0630. Create a task according to the intelligent application instance, the state information of the intelligent application instance, and the task concurrency information.
The task concurrency information includes the number of concurrent tasks, that is, the number of tasks being processed at the same time. The state information of an intelligent application instance indicates whether the instance is idle or running. An instance's current state depends on the current resource usage information and the instance's resource requirement; the current resource usage information is affected by the current number of concurrent tasks: the larger the number of concurrent tasks, the more resources are in use and the fewer resources are idle. When an intelligent application processing request is acquired and no idle resources exist, the request must be cached in a queue, which may hold multiple pending requests; the processing order of the request is determined by the queue's scheduling order. The request corresponds to one intelligent application instance. When the queue begins to schedule the request, if the instance is currently idle, a task can be created from the instance to process the request; if the instance is currently running, the request continues to be cached in the queue.
In some embodiments, referring to FIG. 6 and FIG. 7, step S0640 includes the following steps:
S0710. Acquire the number of concurrent tasks according to the task concurrency information;
S0720. Compare the number of concurrent tasks with the maximum number of concurrent tasks; if the number of concurrent tasks is not greater than the maximum, perform step S0730; if it is greater, perform step S0740.
S0730. Create a task according to the intelligent application instance and the state information of the intelligent application instance;
S0740. Continue to cache the intelligent application processing request.
An instance's current state depends on the current resource usage information and the instance's resource requirement; the current resource usage information is affected by the current number of concurrent tasks: the more concurrent tasks, the more resources are in use and the fewer are idle. Before determining an instance's state, the current resource usage information must be confirmed, and the maximum number of concurrent tasks can be acquired; this maximum is the upper limit on task concurrency and is constrained by the amount of system resources. If the current number of concurrent tasks is not greater than the maximum, idle resources are considered to exist; if it is greater, no idle resources are considered available for scheduling. When a processing request is acquired and no idle resources exist, the request is cached in a queue, which may hold multiple pending requests, and the processing order is determined by the queue's scheduling order. The request corresponds to one instance. If the current number of concurrent tasks is not greater than the maximum, idle resources exist; when the queue begins to schedule the request, if the instance is idle, a task can be created from it to process the request, and if the instance is running, the request continues to be cached. By comparing the current number of concurrent tasks with the maximum, the current resource usage information can be learned, and the instance's state information can then be obtained from the current resource usage information and the instance's resource requirement.
In some embodiments, referring to FIG. 7, step S0720 is further preceded by: acquiring the current number of waiting tasks, and, if the current number of waiting tasks is greater than a preset waiting-task threshold, adjusting the maximum number of concurrent tasks according to the current maximum number of concurrent tasks, the system upper-limit task number, and a preset adjustment factor.
The number of waiting tasks is the number of tasks cached in the queue. If at some moment the number of tasks cached in the queue exceeds the caching upper limit, that is, the current number of waiting tasks is greater than the preset waiting-task threshold, the current maximum number of concurrent tasks must be adjusted, using formula (1):
T_{t+1} = min(T_t · (1 + a), T_top)       (1)
where T_t denotes the maximum number of concurrent tasks at time t, T_{t+1} the maximum number of concurrent tasks at time t+1, and T_top the system upper-limit task number, which is constrained by the amount of system resources; a denotes the adjustment factor and is a natural number. The adjustment factor is used to adjust the current maximum number of concurrent tasks, and the adjusted maximum is compared with the system upper-limit task number: if the adjusted maximum is greater than the upper limit, the upper limit is taken as the maximum number of concurrent tasks for the next moment; if it is smaller, the adjusted maximum is taken; if they are equal, either may be taken as the maximum number of concurrent tasks for the next moment.
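Formula (1) can be implemented as a small helper; rounding the adjusted value down to an integer task count is an assumption not stated in the text:

```python
def adjust_max_concurrency(t_now: int, a: float, t_top: int) -> int:
    """Formula (1): T_{t+1} = min(T_t * (1 + a), T_top).

    t_now is the current maximum task concurrency T_t, a the adjustment
    factor, and t_top the system upper-limit task number; the result is
    capped at t_top and truncated to an integer (an assumption).
    """
    return min(int(t_now * (1 + a)), t_top)
```

The cap ensures the concurrency limit never exceeds what the system resources allow, while the factor a controls how aggressively the limit grows when the wait queue overflows.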
In some embodiments, referring to FIG. 1 and FIG. 8, step S0140 is further preceded by the following steps:
S0810. Acquire the corresponding intelligent application model according to the intelligent application instance;
S0820. Acquire the number of pending requests corresponding to the intelligent application model in the cache queue;
S0830. Adjust the number of concurrent instances of the intelligent application model according to the number of pending requests.
One intelligent application model can create at least one instance, one instance can create at least one task, and one task processes one processing request; therefore each instance has one corresponding model, and each processing request has one corresponding instance. In other words, one model can be used to process at least one processing request. All pending requests corresponding to a model can be extracted from the pending-request information, giving the number of pending requests for that model. The number of concurrent instances is the number of instances running at the same time, and its setting is constrained by the amount of system resources. The current resource usage information is affected by the number of concurrent instances: the more concurrent instances, the more resources are occupied by running instances and the fewer idle resources are available for scheduling. If at some moment the number of pending requests corresponding to a model reaches the caching upper limit, the model's number of concurrent instances must be increased. If over some period the instances corresponding to a model remain idle, that is, the model's pending requests remain unprocessed, the model's number of concurrent instances must be decreased.
In some embodiments, referring to FIG. 1 and FIG. 9, step S0140 is further preceded by the following steps:
S0910. Acquire at least two corresponding intelligent application models according to the currently concurrent intelligent application instances;
S0920. Acquire the number of pending requests corresponding to each intelligent application model in the cache queue;
S0930. Acquire the priority information of the intelligent application models;
S0940. Acquire the weight information corresponding to each intelligent application model according to the number of pending requests corresponding to the model;
S0950. Adjust the priority of the intelligent application models according to the priority information and the weight information;
S0960. Adjust the number of concurrent instances corresponding to each intelligent application model according to the model's priority and the number of pending requests.
If at some moment the number of pending requests in the cache queue reaches the caching upper limit, the current number of concurrent instances must be increased; if the currently concurrent instances involve multiple different intelligent application models, the priority of each model must be determined, and the number of concurrent instances of each model is adjusted in priority order. Each model has an initial priority, which does not take the pending requests in the cache queue into account. The initial priority is determined by ranking all the models in the model database, with higher-priority models used preferentially; it may be set according to each model's usage frequency, with frequently used models set to high priority. If multiple models have the same priority, one of them is selected at random for invocation. The initial priority information of each model can be acquired. The pending requests corresponding to each model can be extracted from the pending-request information, giving the number of pending requests for each model and the total number of pending requests, from which the weight of a model's pending requests relative to the total can be obtained. The weight information may also be set according to the importance, urgency, or data volume of the pending requests corresponding to the model. A model's priority is adjusted according to its initial priority information and its corresponding weight information, so that the priority of every model is obtained, and the number of concurrent instances of each model is adjusted in priority order. The priority of an intelligent application model is adjusted using formula (2):
[Formula (2) appears as an image in the original (PCTCN2021104248-appb-000001); it updates the priority P_i of model i to P_i' using the weight factor b, Cache_i, and the pending-request counts Cache_j of all models.]
where P_i denotes the original priority of model i and P_i' the updated priority of model i; b denotes the weight factor used when updating the priority, that is, the weight of the number of pending requests corresponding to model i relative to the total number of pending requests, and b is a natural number; Cache_i denotes the number of pending requests for model i over a period of time, and Cache_j denotes the number of pending requests of all the models over that period.
In the embodiments of the present application, through layer-by-layer control of models, instances, and tasks, system resources can be scheduled adaptively and the system's idle resources used to invoke intelligent application models, improving resource utilization so that intelligent application processing requests are handled quickly and effectively, the processing efficiency of intelligent applications is improved, and the flexibility of intelligent application model deployment and the overall operating efficiency of the device are increased, without affecting the operation of the device's existing functional modules. The relationship between computing resources and computing tasks can be adjusted dynamically to achieve a balance between the two. When performing model inference and training, since data processing takes place only inside the device and no data is exchanged with other devices, user data can be protected and task processing latency reduced.
In some embodiments, referring to FIG. 10 and FIG. 11, a resource scheduling system 1000 is shown. The resource scheduling system 1000 can execute the resource scheduling method of the above embodiments and includes: a model unit 1010 configured to store at least one intelligent application model and record the first resource requirement and first concurrency requirement of each intelligent application model; an instance unit 1020, connected to the model unit 1010, configured to store at least one intelligent application instance and record the second resource requirement and second concurrency requirement of each intelligent application instance; a task unit 1030, connected to the instance unit 1020, configured to acquire at least one intelligent application processing request and process it; and a resource monitoring unit 1040, connected to the model unit 1010, the instance unit 1020, and the task unit 1030, configured to monitor the resource usage of the resource scheduling system 1000 in real time and acquire resource usage information.
In the model unit, the first resource requirement of an intelligent application model is the amount of resources occupied by running the model, that is, the model's estimated resource amount. If this estimate is unknown, the amount of resources actually occupied by the model must be recorded when it is run and used as the estimate; the estimate may be the actual amount recorded from a single run of the model, or the average, minimum, or maximum amount calculated from the actual amounts recorded over multiple runs. The first concurrency requirement of a model is the number of copies of the model expected to run at the same time; this number is preset to 1 and can be adjusted according to actual task requests.
In the instance unit, the second resource requirement of an intelligent application instance is the amount of resources occupied by running the instance, that is, the instance's estimated resource amount. If this estimate is unknown, the amount of resources actually occupied by the instance must be recorded when it is run and used as the estimate; the estimate may be the actual amount recorded from a single run of the instance, or the average, minimum, or maximum amount calculated from the actual amounts recorded over multiple runs. The second concurrency requirement of an instance is the number of copies of the instance expected to run at the same time; this number is preset to 1 and can be adjusted according to actual task requests.
In the task unit, one task can process only one intelligent application processing request. If there are not enough resources to process all processing requests, the unprocessed requests are cached in queues; the number of queues is not less than the number of intelligent application models stored in the model unit, and the queues may be scheduled using the Round Robin algorithm or according to a first-in-first-out rule. Creating a task requires finding an instance that can process it: if the instance's estimated resource amount is not greater than the current amount of idle resources, a task is created from the instance; otherwise, the corresponding processing request is cached in a queue to await scheduling.
The resource monitoring unit monitors the resource usage of the resource scheduling system in real time and synchronizes the resource usage information to the model unit, the instance unit, and the task unit in real time, to further adjust the relationship between resources and operating conditions. The resource usage information includes information on the resources the system is using and on the currently idle resources. For any request, the resource monitoring unit checks whether an active and idle instance of the requested model exists; if so, it uses the request and the instance to create a task and allocates resources to run it, otherwise it decides, according to the policy, whether to cache the request and wait or to return failure directly. During system operation, the resource monitoring unit dynamically tracks the running status of tasks and the active status of instances and, if necessary, adjusts the upper limits on the numbers of instances and tasks; that is, it establishes a closed loop between system resource occupation and allocation, achieving rational resource utilization by dynamically adjusting the model concurrency and the task concurrency. In some embodiments, as shown in FIG. 11, the system reports the number of active tasks, waiting tasks, active instances, or cached requests; the resource monitoring unit monitors the operation and outputs adjustments to the number of active tasks, waiting tasks, active instances, or cached requests.
In some embodiments, the model unit is further configured to set the priority of the intelligent application models, and models with higher priority are used preferentially. High-priority models have an advantage in initial resource allocation, and the initial priority may be a default or configured manually. Priority may be set according to each model's usage frequency, with frequently used models set to high priority; if multiple models have the same priority, one of them is selected at random for invocation. In other embodiments, the model unit is further configured to store the network topology, weights, and data information of each intelligent application model.
In some embodiments, the instance unit is further configured to create an intelligent application instance according to the intelligent application model, the resource requirement corresponding to the model, and the current resource usage information. One model may be configured to create at least one instance, one instance may be configured to create at least one task, and one task can process only one processing request. At any given moment, one instance can process only one task, so the number of concurrent tasks is not greater than the number of concurrent instances. The number of concurrent tasks is the number of tasks being processed at the same time, and the number of concurrent instances is the number of instances running at the same time.
When the system is powered on, the corresponding number of active model instances is determined according to the current resource situation, the number of models, the models' estimated resource amounts, and the models' expected concurrency, and a corresponding number of running instances is created for each model. While the system is running, for a specific training or inference request of a model, whether to create a task ready for execution or to cache the data is determined according to the idle or active state of the corresponding instances. For a created task, whether it is processed immediately or waits is determined according to the current task concurrency of the system. While processing tasks, the active running state of instances, the time-shared running of tasks, and the caching of data are counted and evaluated to determine whether to dynamically load instances or release instance resources. In some embodiments, the system creates model instances at power-on and sorts the models in the model database by priority from high to low; the initial priority of a model may be a default or configured manually. For the first traversal of all models, let i = 0; for the i-th model, prepare to create its first running instance and judge whether the resources requested by model i are less than the system's idle resources; if so, create an instance for model i, otherwise end the power-on procedure. After creating an instance for model i, if model i already knows its own resource requirement, judge whether all models in the repository have been traversed; otherwise, record the minimum amount of resources the model needs to start an instance, which facilitates evaluation at subsequent system start-ups. Judge whether all models in the repository have been traversed; if so, prepare for a second traversal of the model database, otherwise continue the traversal with the next model. In the second traversal of the model database, start again from the first model. For model i, if model i needs to process tasks concurrently, it is considered to need additional active instances (that is, for model i, the total number of instances >= 2). If additional active instances are needed, judge whether the resources requested by model i are less than the system's idle resources; if so, create a new instance for model i, otherwise judge whether all models have been traversed. If no additional active instances are needed, judge whether all models have been traversed; if so, end the whole procedure, otherwise move to the next model and continue the traversal. The system thus guarantees an initial balance between system resources and models at power-on: if all models power on successfully, every model can provide basic services externally with at least one active instance, without affecting the existing system; otherwise, the high-priority models create instances successfully first, and only the services of some models can be provided externally.
In the embodiments of the present application, each model has at least one corresponding instance, and the total number of models satisfies the following conditions: the total number of models is not greater than the total number of instances, and every model can provide services and functions externally. Each task has a corresponding instance, and the number of active tasks satisfies the following condition: the total number of active tasks is not greater than the total number of active model instances. Through layer-by-layer control of models, instances, and tasks, the resource scheduling system ensures that resource usage efficiency is maximized while the required inference and training functions are provided. Without affecting the original functions, the required intelligent inference and training system can be built using the existing idle computing resources. Models can be deployed and scheduled flexibly, making full use of resources without waste, and the relationship between computing resources and computing tasks can be adjusted dynamically to achieve a balance between the two. When performing model inference and training, since data processing takes place only inside the device and no data is exchanged with other devices, user data can be protected and task processing latency reduced.
In a second aspect, an embodiment of the present application provides an electronic device. The electronic device includes a memory, a processor, and a program stored in the memory and executable on the processor; when the program is executed by the processor, the resource scheduling method of some embodiments of the first aspect of the present application is implemented.
In some embodiments, the electronic device may be a mobile terminal device or a non-mobile terminal device. A mobile terminal device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted terminal device, a wearable device, an ultra-mobile personal computer, a netbook, a personal digital assistant, or the like; a non-mobile terminal device may be a personal computer, a television, a teller machine, a self-service machine, or the like; the embodiments of the present application are not specifically limited in this respect.
In a third aspect, an embodiment of the present application provides a storage medium for computer-readable storage, where the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the resource scheduling method of some embodiments of the first aspect of the present application.
Embodiments of the present application include: acquiring an intelligent application processing request; acquiring current resource usage information; matching an intelligent application instance according to the intelligent application processing request; and creating a task according to the resource usage information and the intelligent application instance to process the intelligent application processing request. Embodiments of the present application can adaptively schedule system resources and use the system's idle resources to invoke intelligent application models, improving resource utilization, thereby effectively improving the flexibility of intelligent application model deployment and the overall operating efficiency of the device without affecting the operation of the device's existing functional modules.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above and the functional modules/units in the systems and devices may be implemented as software, firmware, hardware, and appropriate combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The above is a specific description of some embodiments of the present application, but the present application is not limited to the above implementations. Those skilled in the art may also make various equivalent variations or substitutions without departing from the scope of the present application, and such equivalent variations or substitutions are all included within the scope defined by the claims of the present application.

Claims (12)

  1. A resource scheduling method, comprising:
    acquiring an intelligent application processing request;
    acquiring current resource usage information;
    matching an intelligent application instance according to the intelligent application processing request; and
    creating a task according to the resource usage information and the intelligent application instance, to process the intelligent application processing request.
  2. The resource scheduling method according to claim 1, wherein before the matching an intelligent application instance according to the intelligent application processing request, the method further comprises:
    acquiring at least one intelligent application model according to the intelligent application processing request; and
    creating at least one intelligent application instance according to the intelligent application model, a resource requirement corresponding to the intelligent application model, and the resource usage information.
  3. The resource scheduling method according to claim 2, wherein the creating at least one intelligent application instance according to the intelligent application model, the resource requirement corresponding to the intelligent application model, and the resource usage information comprises:
    acquiring a first estimated resource amount according to the resource requirement corresponding to the intelligent application model;
    acquiring a current amount of idle resources according to the resource usage information;
    comparing the first estimated resource amount with the amount of idle resources; and
    in response to the first estimated resource amount being not greater than the amount of idle resources, creating at least one intelligent application instance according to the intelligent application model.
  4. The resource scheduling method according to claim 1, wherein before the matching an intelligent application instance according to the intelligent application processing request, the method further comprises:
    sorting intelligent application models in a model database by priority to acquire priority information;
    acquiring one intelligent application model according to the intelligent application processing request and the priority information; and
    creating at least one intelligent application instance according to the intelligent application model, a resource requirement corresponding to the intelligent application model, and the resource usage information.
  5. The resource scheduling method according to claim 4, wherein the creating at least one intelligent application instance according to the intelligent application model, the resource requirement corresponding to the intelligent application model, and the resource usage information comprises:
    acquiring scheduling information of the intelligent application model according to the priority information;
    determining a to-be-scheduled intelligent application model according to the scheduling information;
    acquiring a second estimated resource amount according to a resource requirement of the to-be-scheduled intelligent application model;
    acquiring a current amount of idle resources according to the resource usage information;
    comparing the second estimated resource amount with the amount of idle resources; and
    in response to the second estimated resource amount being not greater than the amount of idle resources, creating at least one intelligent application instance according to the to-be-scheduled intelligent application model.
  6. The resource scheduling method according to any one of claims 1 to 5, wherein the creating a task according to the resource usage information and the intelligent application instance comprises:
    acquiring state information of the intelligent application instance;
    acquiring current task concurrency information; and
    creating a task according to the intelligent application instance, the state information of the intelligent application instance, and the task concurrency information.
  7. The resource scheduling method according to claim 6, wherein the creating a task according to the intelligent application instance, the state information of the intelligent application instance, and the task concurrency information comprises:
    acquiring a number of concurrent tasks according to the task concurrency information;
    comparing the number of concurrent tasks with a maximum number of concurrent tasks; and
    in response to the number of concurrent tasks being not greater than the maximum number of concurrent tasks, creating a task according to the intelligent application instance and the state information of the intelligent application instance.
  8. The resource scheduling method according to claim 7, wherein before the comparing the number of concurrent tasks with the maximum number of concurrent tasks, the method further comprises:
    acquiring a current number of waiting tasks; and
    in response to the current number of waiting tasks being greater than a preset waiting-task threshold, adjusting the maximum number of concurrent tasks according to the current maximum number of concurrent tasks, a system upper-limit task number, and a preset adjustment factor.
  9. The resource scheduling method according to any one of claims 1 to 5, wherein before the creating a task according to the resource usage information and the intelligent application instance, the method further comprises:
    acquiring a corresponding intelligent application model according to the intelligent application instance;
    acquiring a number of pending requests corresponding to the intelligent application model in a cache queue; and
    adjusting a number of concurrent instances of the intelligent application model according to the number of pending requests.
  10. The resource scheduling method according to any one of claims 1 to 5, wherein before the creating a task according to the resource usage information and the intelligent application instance, the method further comprises:
    acquiring at least two corresponding intelligent application models according to currently concurrent intelligent application instances;
    acquiring a number of pending requests corresponding to each intelligent application model in a cache queue;
    acquiring priority information of the intelligent application models;
    acquiring weight information corresponding to each intelligent application model according to the number of pending requests corresponding to the intelligent application model;
    adjusting priorities of the intelligent application models according to the priority information and the weight information; and
    adjusting a number of concurrent instances corresponding to each intelligent application model according to the priority of the intelligent application model and the number of pending requests.
  11. An electronic device, comprising a memory, a processor, and a program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the resource scheduling method according to any one of claims 1 to 10.
  12. A storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the resource scheduling method according to any one of claims 1 to 10.
PCT/CN2021/104248 2020-07-03 2021-07-02 Resource scheduling method, electronic device and storage medium WO2022002247A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21834695.5A EP4177745A4 (en) 2020-07-03 2021-07-02 RESOURCE PLANNING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIA
US18/014,125 US20230273833A1 (en) 2020-07-03 2021-07-02 Resource scheduling method, electronic device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010635990.1 2020-07-03
CN202010635990.1A CN113886030A (zh) Resource scheduling method, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2022002247A1 true WO2022002247A1 (zh) 2022-01-06

Family

ID=79013309

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104248 WO2022002247A1 (zh) 2020-07-03 2021-07-02 资源调度方法、电子设备及存储介质

Country Status (4)

Country Link
US (1) US20230273833A1 (zh)
EP (1) EP4177745A4 (zh)
CN (1) CN113886030A (zh)
WO (1) WO2022002247A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115061800A (zh) * 2022-06-30 2022-09-16 中国联合网络通信集团有限公司 Processing method for an edge computing task, edge server, and storage medium
CN115499439A (zh) * 2022-09-13 2022-12-20 阿里巴巴(中国)有限公司 Communication resource scheduling method and apparatus for cloud services, and electronic device
CN117649069A (zh) * 2023-11-07 2024-03-05 北京城建设计发展集团股份有限公司 Genetic-algorithm-based coordinated scheduling method for multi-area operation and maintenance resources

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105975340A (zh) * 2016-03-24 2016-09-28 国云科技股份有限公司 Virtual machine application allocation and deployment algorithm
CN108701149A (zh) * 2016-12-27 2018-10-23 华为技术有限公司 Intelligent recommendation method and terminal
CN109561024A (zh) * 2017-09-27 2019-04-02 南京中兴软件有限责任公司 Container scheduling processing method and apparatus
CN110830759A (zh) * 2018-08-09 2020-02-21 华为技术有限公司 Intelligent application deployment method, apparatus and system

Also Published As

Publication number Publication date
CN113886030A (zh) 2022-01-04
EP4177745A1 (en) 2023-05-10
EP4177745A4 (en) 2023-08-30
US20230273833A1 (en) 2023-08-31

Similar Documents

Publication Publication Date Title
WO2022002247A1 (zh) Resource scheduling method, electronic device, and storage medium
Santos et al. Towards network-aware resource provisioning in kubernetes for fog computing applications
US11582166B2 (en) Systems and methods for provision of a guaranteed batch
US11509596B2 (en) Throttling queue for a request scheduling and processing system
US8424007B1 (en) Prioritizing tasks from virtual machines
US10101910B1 (en) Adaptive maximum limit for out-of-memory-protected web browser processes on systems using a low memory manager
US20160210174A1 (en) Hybrid Scheduler and Power Manager
US10193973B2 (en) Optimal allocation of dynamically instantiated services among computation resources
US10289446B1 (en) Preserving web browser child processes by substituting a parent process with a stub process
WO2022095815A1 (zh) Graphics memory management method, apparatus, device, and system
US20160127382A1 (en) Determining variable wait time in an asynchronous call-back system based on calculated average sub-queue wait time
US20200159587A1 (en) Releasable resource based preemptive scheduling
KR102052964B1 (ko) Computing scheduling method and system
CN112486642B (zh) Resource scheduling method and apparatus, electronic device, and computer-readable storage medium
US10248321B1 (en) Simulating multiple lower importance levels by actively feeding processes to a low-memory manager
CN111625339A (zh) Cluster resource scheduling method, apparatus, medium, and computing device
CN109783236A (zh) Method and apparatus for outputting information
CN113127179A (zh) Resource scheduling method and apparatus, electronic device, and computer-readable medium
KR101377195B1 (ko) Computer micro-jobs
US11388050B2 (en) Accelerating machine learning and profiling over a network
CN117667332A (zh) Task scheduling method and system
CN117632461A (zh) Task scheduling method and apparatus, storage medium, and computer device
CN116467053A (zh) Resource scheduling method and apparatus, device, and storage medium
CN114416349A (zh) Resource allocation method and apparatus, device, storage medium, and program product
US12081454B2 (en) Systems and methods for provision of a guaranteed batch

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21834695

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2021834695

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE