CN113051054B - Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources - Google Patents

Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources

Info

Publication number
CN113051054B
CN113051054B (application CN202110313956.7A)
Authority
CN
China
Prior art keywords
task
priority
resources
platform
tasks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110313956.7A
Other languages
Chinese (zh)
Other versions
CN113051054A (en)
Inventor
齐文
李劲
郭玮
苏力强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bohan Intelligent Shenzhen Co ltd
Original Assignee
Bohan Intelligent Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bohan Intelligent Shenzhen Co ltd filed Critical Bohan Intelligent Shenzhen Co ltd
Priority to CN202110313956.7A priority Critical patent/CN113051054B/en
Publication of CN113051054A publication Critical patent/CN113051054A/en
Application granted granted Critical
Publication of CN113051054B publication Critical patent/CN113051054B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022Mechanisms to release resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/48Indexing scheme relating to G06F9/48
    • G06F2209/484Precedence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5021Priority
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to the field of artificial intelligence and provides a method, an apparatus and a computer readable storage medium for scheduling the resources of an artificial intelligence (AI) platform reasonably and efficiently. The method comprises the following steps: the AI platform receives an AI task request, where the type of the task corresponding to the AI task request includes a model training task, a model reasoning task or an interactive task; the priority of the task corresponding to the AI task request is determined according to the resource attribute of that task; and, according to the priority of the task corresponding to the AI task request, the resources of the AI platform are preferentially scheduled to the tasks with relatively higher priority. Compared with the prior-art approach of scheduling AI platform resources only according to the order in which task requests arrive, the technical scheme of the application always ensures that tasks with relatively higher priority have available resources, and is therefore a reasonable and efficient way of scheduling resources.

Description

Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a method, apparatus, and computer readable storage medium for scheduling resources of an artificial intelligence platform.
Background
With the rapid development of artificial intelligence (Artificial Intelligence, AI), AI technology is gradually being applied to fields closely related to people's lives. An AI platform carries various types of tasks, for example interactive tasks, inference tasks and training tasks, and the different types of tasks place different demands on resources.
When an existing artificial intelligence platform schedules resources, it does so mainly in chronological order: whichever task requests the AI platform first is allocated the platform's resources first. For example, suppose task A is an inference task, task B is an interactive task, and task A applies to the AI platform for resources before task B does. Task A then obtains the resources of the AI platform first, and only after task A is completed does the AI platform release resources for task B to use.
However, this existing way of scheduling artificial intelligence platform resources has a drawback: although task A applied to the AI platform for resources before task B, task B may actually have a higher priority than task A. Task A may therefore exhaust the resources first, so that when the higher-priority task B applies for the AI platform's resources, none are available.
Disclosure of Invention
The application provides a method, an apparatus, a device and a computer readable storage medium for scheduling the resources of an artificial intelligence platform, so as to schedule the resources of an AI platform reasonably and efficiently.
In one aspect, the present application provides a method for scheduling resources of an artificial intelligence platform, comprising:
receiving, by an artificial intelligence (AI) platform, an AI task request, wherein the type of the task corresponding to the AI task request comprises a model training task, a model reasoning task or an interactive task;
determining the priority of the task corresponding to the AI task request according to the resource attribute of the task corresponding to the AI task request;
and according to the priority of the task corresponding to the AI task request, preferentially scheduling the resources of the AI platform to the task with relatively higher priority in the tasks.
In another aspect, the present application provides an apparatus for scheduling resources of an artificial intelligence platform, comprising:
the task request receiving module is used for receiving at least one AI task request by the artificial intelligence AI platform; the types of tasks corresponding to the received task requests comprise model training tasks, model reasoning tasks or interactive tasks;
the priority determining module is used for determining the priority of the task corresponding to the AI task request according to the resource attribute of the task corresponding to the AI task request;
and the scheduling module is used for preferentially scheduling the resources of the AI platform to the tasks with relatively higher priorities among the tasks according to the priorities of the tasks corresponding to the AI task requests.
In a third aspect, the present application provides an apparatus, the apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the technical solution of the method for scheduling artificial intelligence platform resources as described above when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the solution of the method for scheduling artificial intelligence platform resources as described above.
According to the technical scheme provided by the application, after the artificial intelligence (AI) platform receives an AI task request, the priority of the task corresponding to the AI task request is determined according to the resource attribute of that task, and the resources of the AI platform are then preferentially scheduled to the tasks with relatively higher priority according to the priorities of the tasks corresponding to the AI task requests. Compared with the prior-art approach of scheduling AI platform resources only according to the order in which task requests arrive, the technical scheme of the application always ensures that tasks with relatively higher priority have available resources, and is therefore a reasonable and efficient way of scheduling resources.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the application, and that other drawings can be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a method of scheduling artificial intelligence platform resources provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of processing a task interrupt and restart provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an apparatus for scheduling artificial intelligence platform resources according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application;
FIG. 5 is a schematic diagram of an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application;
FIG. 6 is a schematic diagram of an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application;
FIG. 7 is a schematic diagram of an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application;
fig. 8 is a schematic structural diagram of an apparatus according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application is made clearly and completely with reference to the accompanying drawings. It is apparent that the embodiments described are only some, but not all, embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
In this specification, adjectives such as first and second may be used solely to distinguish one element or action from another, without necessarily requiring or implying any actual such relationship or order. Where the context permits, a reference to an element, component or step (etc.) should not be construed as being limited to only one of that element, component or step; there may be one or more of them.
In the present specification, for convenience of description, the dimensions of the various parts shown in the drawings are not drawn to actual scale.
The application provides a method for scheduling artificial intelligence platform resources. As shown in fig. 1, the method mainly includes steps S101 to S103, which are described in detail as follows:
step S101: the artificial intelligence AI platform receives AI task requests, wherein the types of tasks corresponding to the AI task requests include model training tasks, model reasoning tasks, or interactive tasks.
In embodiments of the present application, the AI platform may be used to complete various types of tasks, such as interactive tasks, inference tasks and training tasks. These tasks may be issued by users to the AI platform in the form of AI task requests, which the AI platform receives. In general, the type of the task corresponding to an AI task request may include a model training task, a model reasoning task or an interactive task.
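For illustration only, the following minimal Python sketch shows one way a task request of the kinds described in step S101 could be represented; the names TaskType and AITaskRequest, and the unit-count resource field, are assumptions of this sketch rather than anything defined in the patent.

```python
# Minimal sketch of an AI task request as described in step S101.
# The class and field names are illustrative assumptions, not the patent's API.
from dataclasses import dataclass
from enum import Enum, auto


class TaskType(Enum):
    MODEL_TRAINING = auto()   # model training task
    MODEL_INFERENCE = auto()  # model reasoning (inference) task
    INTERACTIVE = auto()      # interactive task


@dataclass
class AITaskRequest:
    task_id: str
    task_type: TaskType
    requested_resources: int  # e.g. number of compute units requested


# Example: the AI platform receives a request for an interactive task.
request = AITaskRequest(task_id="task-B", task_type=TaskType.INTERACTIVE,
                        requested_resources=2)
```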
Step S102: determining the priority of the task corresponding to the AI task request according to the resource attribute of that task.
Every task, whether it is already running on the AI platform or newly received, has a priority. The priority is the degree to which a task takes precedence over other tasks. In the embodiment of the application, the priorities mainly comprise a highest priority, a medium priority and a lowest priority. A task with the highest priority is never stopped unless the user actively exits it, even if the AI platform receives a request for another task of the same priority. In the embodiment of the application, the priority of the task corresponding to the AI task request can be determined according to the resource attribute of that task, where the resource attribute indicates whether the task's resources can be preempted; equivalently, tasks can be divided into non-preemptible tasks and preemptible tasks. The resources of a non-preemptible task are released only after the task has finished executing, whereas the resources of a preemptible task may be released by the AI platform at any time. Under this convention, the interactive task has the strongest real-time requirement, the inference task the next strongest, and the training task the weakest. Therefore, among the three task types, the resources of the interactive task have the non-preemptible attribute: while the AI platform is executing an interactive task, it does not stop executing it unless the user actively exits. The inference task and the training task may be preemptible or non-preemptible depending on the actual application scenario.
Specifically, determining the priority of the task corresponding to the AI task request according to its resource attribute may proceed as follows: the priority of the interactive task is determined to be the highest priority according to its non-preemptible resource attribute; when the resource attributes of the model training task and the model reasoning task are preemptible, the priority of the model reasoning task is determined to be the medium priority and the priority of the model training task is determined to be the lowest priority. Obviously, when the resource attribute of a model training task or a model reasoning task is non-preemptible, its priority can be determined to be the highest priority, just like the interactive task.
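The priority mapping described above (non-preemptible resources take the highest priority; preemptible inference and training tasks take the medium and lowest priorities respectively) can be sketched as follows. The Priority enum, the string task-type labels and the function name are illustrative assumptions.

```python
# Illustrative sketch of step S102: deriving a priority from a task's type
# and resource attribute (preemptible or not).
from enum import IntEnum


class Priority(IntEnum):
    LOWEST = 0
    MEDIUM = 1
    HIGHEST = 2


def determine_priority(task_type, preemptible):
    """Map a task's type and resource attribute to a priority level."""
    if not preemptible:
        # Non-preemptible resources (e.g. interactive tasks) get the highest
        # priority and are never stopped unless the user exits the task.
        return Priority.HIGHEST
    if task_type == "model_inference":
        return Priority.MEDIUM
    if task_type == "model_training":
        return Priority.LOWEST
    raise ValueError(f"unknown task type: {task_type}")


assert determine_priority("interactive", preemptible=False) == Priority.HIGHEST
assert determine_priority("model_inference", preemptible=True) == Priority.MEDIUM
assert determine_priority("model_training", preemptible=True) == Priority.LOWEST
```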
Step S103: preferentially scheduling the resources of the AI platform to the tasks with relatively higher priority according to the priority of the task corresponding to the AI task request.
In the prior art, the AI platform schedules resources only according to the order of task requests, so it may have no schedulable resources left when it receives an AI task request with a higher priority. In contrast, in the technical scheme of the application the resources of the AI platform are preferentially scheduled to the tasks with relatively higher priority according to the priorities of the tasks corresponding to the AI task requests, so that tasks with relatively higher priority are always guaranteed to have available resources. As an embodiment of the present application, preferentially scheduling the resources of the AI platform to the tasks with relatively higher priority according to the priority of the task corresponding to the AI task request may be implemented through step S1031 and step S1032:
step S1031: and calculating available resources currently possessed by the AI platform, wherein the available resources currently possessed by the AI platform comprise idle resources and/or resources being used by lower priority tasks, and the resources being used by lower priority tasks comprise resources being used by medium priority and/or lowest priority tasks.
The resources of the AI platform mainly refer to its computing resources, for example central processing unit (CPU) resources, and also include other resources such as memory. If the AI platform has free resources, these free resources are of course available resources of the AI platform. In addition, if the AI platform is running tasks whose priority is low, the resources those tasks are using are also potential available resources, since the lower-priority tasks can be suspended to free the resources they are using. Thus, in an embodiment of the present application, the available resources currently possessed by the AI platform include the free resources and/or the resources being used by lower-priority tasks, where the resources being used by lower-priority tasks include resources being used by medium-priority tasks and/or resources being used by lowest-priority tasks.
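A minimal sketch of step S1031 follows, assuming resources are modelled as simple unit counts and that any medium- or lowest-priority task could be suspended to reclaim what it holds; the RunningTask structure and the numeric priority levels are assumptions of the sketch.

```python
# Sketch of step S1031: available resources are the idle resources plus the
# resources held by lower-priority (medium or lowest) running tasks, which
# could be reclaimed by suspending those tasks.
from dataclasses import dataclass


@dataclass
class RunningTask:
    task_id: str
    priority: int          # 0 = lowest, 1 = medium, 2 = highest
    resources_in_use: int  # compute units currently held


def available_resources(idle, running_tasks):
    """Idle resources plus resources that lower-priority tasks are using."""
    reclaimable = sum(t.resources_in_use for t in running_tasks if t.priority < 2)
    return idle + reclaimable


running = [RunningTask("train-1", priority=0, resources_in_use=4),
           RunningTask("infer-1", priority=1, resources_in_use=2),
           RunningTask("chat-1", priority=2, resources_in_use=2)]
print(available_resources(idle=1, running_tasks=running))  # 1 + 4 + 2 = 7
```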
Step S1032: when the tasks corresponding to the AI task request include a task with the highest priority, if the idle resources can meet that task's demand for resources, scheduling the idle resources to the task with the highest priority.
Following the determination in step S102, when the tasks corresponding to the AI task request include a task with the highest priority, the idle resources are scheduled to that task if they can meet its demand for resources. As stated above, idle resources belong to the available resources currently possessed by the AI platform, so if they can satisfy the task with the highest priority, it suffices to schedule them to that task. For example, as described above, the interactive task has a stronger real-time requirement than the inference task and/or the training task; therefore, when the tasks corresponding to the AI task request received by the AI platform include an interactive task and the idle resources of the AI platform can satisfy the interactive task, the idle resources are scheduled to the interactive task.
Otherwise, if the idle resources of the AI platform cannot meet the demand of the task with the highest priority for resources, and the running tasks include medium-priority and/or lowest-priority tasks, the resources occupied by the running medium-priority and/or lowest-priority tasks are released, and the idle resources of the AI platform together with the released resources are scheduled to the task with the highest priority. Releasing these resources and scheduling them may proceed as follows: first suspend the lowest-priority tasks; if the resources released by suspending the lowest-priority tasks, together with the previously idle resources, can meet the demand of the task with the highest priority, schedule the idle resources of the AI platform and the resources released by suspending the lowest-priority tasks to the task with the highest priority. If the idle resources of the AI platform together with the resources released by suspending all lowest-priority tasks still cannot meet the demand of the task with the highest priority, also suspend the medium-priority tasks, and schedule the idle resources of the AI platform together with the resources released by suspending the medium-priority and lowest-priority tasks to the task with the highest priority. For example, when the AI platform receives an AI task request for an interactive task, if the idle resources of the AI platform cannot meet the interactive task's demand for resources and the AI platform is running inference tasks and/or training tasks, the training tasks are suspended first. If the resources released by suspending the training tasks, together with the previously idle resources, can meet the interactive task's demand, the idle resources of the AI platform and the resources released by suspending the training tasks are scheduled to the interactive task; if the idle resources of the AI platform together with the resources released by suspending all training tasks still cannot meet the interactive task's demand, the inference tasks are also suspended, and the idle resources of the AI platform together with the resources released by suspending the inference tasks and the training tasks are scheduled to the interactive task.
The above embodiment covers the case in which the determination in step S102 finds that the tasks corresponding to the AI task request include a task with the highest priority. The AI platform resource scheduling scheme obviously also covers the case in which the tasks corresponding to the AI task request do not include a task with the highest priority, that is, the AI platform receives AI task requests of medium and/or lowest priority. For this situation, the technical scheme of the application may be as follows: when the tasks corresponding to the AI task request do not include a task with the highest priority, if the idle resources of the AI platform can satisfy the medium-priority and/or lowest-priority tasks, the idle resources of the AI platform are scheduled to the medium-priority and/or lowest-priority tasks. If the idle resources of the AI platform cannot satisfy the medium-priority tasks and the tasks being run by the AI platform include lowest-priority tasks, the resources occupied by the running lowest-priority tasks are released, and the idle resources of the AI platform together with the released resources are scheduled to the medium-priority tasks.
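The two scheduling branches described in the preceding paragraphs (a request containing a highest-priority task, and a request containing only medium- and/or lowest-priority tasks) can be sketched together as follows. This is a simplified model under assumed conventions: resources are unit counts, suspending a task frees its units immediately, and the Platform class and its method names are illustrative rather than the patent's implementation.

```python
# Combined sketch of both scheduling branches: a highest-priority request may
# preempt lowest- then medium-priority tasks; a medium-priority request may
# preempt only lowest-priority tasks; a lowest-priority request only uses idle.
LOWEST, MEDIUM, HIGHEST = 0, 1, 2


class Platform:
    def __init__(self, idle):
        self.idle = idle
        self.running = []  # list of (task_id, priority, resources_in_use)

    def _suspend_tasks(self, priority, needed):
        """Suspend running tasks of the given priority until `needed` units
        are freed or no such tasks remain; returns the units freed."""
        freed = 0
        for task in [t for t in self.running if t[1] == priority]:
            if freed >= needed:
                break
            self.running.remove(task)
            freed += task[2]
            print(f"suspended {task[0]}, freed {task[2]} units")
        return freed

    def schedule(self, task_id, priority, requested):
        if self.idle >= requested:              # idle resources are enough
            self.idle -= requested
        elif priority == HIGHEST:
            # Free resources from the lowest priority first, then medium.
            freed = self._suspend_tasks(LOWEST, requested - self.idle)
            if self.idle + freed < requested:
                freed += self._suspend_tasks(MEDIUM, requested - self.idle - freed)
            if self.idle + freed < requested:
                return False                    # still not enough resources
            self.idle = self.idle + freed - requested
        elif priority == MEDIUM:
            # A medium-priority request may only preempt lowest-priority tasks.
            freed = self._suspend_tasks(LOWEST, requested - self.idle)
            if self.idle + freed < requested:
                return False
            self.idle = self.idle + freed - requested
        else:
            return False                        # lowest priority waits for idle
        self.running.append((task_id, priority, requested))
        return True


platform = Platform(idle=1)
platform.running = [("train-1", LOWEST, 4), ("infer-1", MEDIUM, 2)]
print(platform.schedule("chat-1", HIGHEST, 6))  # suspends train-1, then infer-1
```

A real platform would additionally restart the suspended tasks once resources are released again, as described later for fig. 2.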
As mentioned above, a lowest-priority task is a task that the AI platform can suspend at any time. A medium-priority task, however, is generally not suspended as readily; it needs to be treated according to the circumstances. Specifically, the above embodiment may further include: counting how the running medium-priority tasks occupy the resources of the AI platform; if the time during which a medium-priority task has continuously not used the resources of the AI platform exceeds a preset threshold, suspending that medium-priority task; and if a new model reasoning request for a suspended medium-priority task is received, restarting the suspended medium-priority task. In this way, the resources of the AI platform released by suspending medium-priority tasks can be scheduled to other tasks, which avoids the waste caused by resources being occupied for a long time without being used, while the medium-priority tasks are still guaranteed a certain degree of access to resources.
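Below is a sketch of the idle-time accounting for medium-priority tasks described above, assuming the platform tracks a last-used timestamp per task and runs a periodic monitor; the threshold value and the class and function names are illustrative assumptions.

```python
# Sketch of idle-time accounting: a medium-priority (model inference) task
# whose resources have gone unused for longer than a threshold is suspended,
# and is restarted when a new inference request for it arrives.
import time


class MediumPriorityTask:
    def __init__(self, task_id):
        self.task_id = task_id
        self.last_used = time.monotonic()  # last time its resources were used
        self.suspended = False

    def record_use(self):
        self.last_used = time.monotonic()

    def idle_seconds(self):
        return time.monotonic() - self.last_used


IDLE_THRESHOLD_S = 300.0  # preset threshold; the value is an example only


def monitor(tasks):
    """Suspend medium-priority tasks idle for longer than the threshold."""
    for task in tasks:
        if not task.suspended and task.idle_seconds() > IDLE_THRESHOLD_S:
            task.suspended = True  # its resources can now be scheduled elsewhere
            print(f"suspended idle inference task {task.task_id}")


def on_inference_request(task):
    """Restart a suspended medium-priority task when a new request arrives."""
    if task.suspended:
        task.suspended = False
        print(f"restarted inference task {task.task_id}")
    task.record_use()


task = MediumPriorityTask("infer-1")
monitor([task])  # nothing suspended yet: the task has just been used
```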
For tasks that are restarted after an interruption, the scheme of the embodiment of the present application is: if a model training task running on the AI platform is interrupted, the weight information of the model is saved at a break-point (check-point) at the time of interruption; when the interrupted model training task restarts, it is re-run from its break-point according to the model weight information saved there. As shown in fig. 2, assume a task of the lowest priority (for example a model training task) whose overall run involves 5 epochs, shown in fig. 2 as the small squares labelled 1 to 5 (epochs that have already run are drawn as dark squares). Because the AI platform receives an AI task request with the highest priority and currently has no idle resources, the lowest-priority model training task is forced to be interrupted while its 2nd epoch is running. At this point, the weight information can be saved once the 2nd epoch of the lowest-priority model training task has finished running. After the task with the highest priority has finished running and released its resources, the lowest-priority model training task can obtain the released resources again and is therefore restarted; it resumes from its 3rd epoch using the weight information previously saved at the break-point, and runs until it is completed. Of course, if the lowest-priority model training task is forced to be interrupted again in the 3rd or a later epoch, the processing is the same as when the 2nd epoch completed, and is not repeated here.
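The break-point behaviour of fig. 2 can be illustrated with the following sketch, which checkpoints toy "weights" after each completed epoch and resumes from the next epoch after an interruption. The JSON file layout, the interrupt callback and the five-epoch toy loop are assumptions of the sketch, not the patent's mechanism.

```python
# Sketch of checkpoint-and-restart at epoch boundaries, as in fig. 2.
import json
import os

CHECKPOINT_FILE = "train_task_checkpoint.json"  # assumed location
TOTAL_EPOCHS = 5


def save_checkpoint(epoch, weights):
    """Persist the weights at the break-point after `epoch` has completed."""
    with open(CHECKPOINT_FILE, "w") as f:
        json.dump({"completed_epoch": epoch, "weights": weights}, f)


def load_checkpoint():
    if not os.path.exists(CHECKPOINT_FILE):
        return 0, None  # never run before: start from epoch 1
    with open(CHECKPOINT_FILE) as f:
        state = json.load(f)
    return state["completed_epoch"], state["weights"]


def run_training(should_interrupt=lambda epoch: False):
    completed, weights = load_checkpoint()
    weights = weights or [0.0, 0.0]            # toy stand-in for model weights
    for epoch in range(completed + 1, TOTAL_EPOCHS + 1):
        weights = [w + 0.1 for w in weights]   # stand-in for one epoch of training
        save_checkpoint(epoch, weights)
        if should_interrupt(epoch):
            print(f"interrupted after epoch {epoch}; weights saved")
            return False                       # the platform reclaimed the resources
    print("training finished")
    return True


# First run is interrupted after epoch 2 (as in fig. 2); the restart resumes at epoch 3.
run_training(should_interrupt=lambda e: e == 2)
run_training()
```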
It should be noted that, in the above embodiment of the present application, if the task being run by the AI platform is a model inference task, it is not necessary to keep information at a break-point (check-point) when the model inference task is interrupted; when the interrupted model inference task is restarted, it is simply resumed directly.
The above processing of tasks restarted after an interruption applies to tasks that can be interrupted. As described earlier, a task with the highest priority is not stopped unless the user actively exits it. Therefore, the above processing of tasks restarted after an interruption presumes that the task is a medium-priority or lowest-priority task, not a highest-priority task.
As can be seen from the method of scheduling artificial intelligence platform resources illustrated in fig. 1, after the AI platform receives an AI task request, the priority of the task corresponding to the AI task request is determined according to the resource attribute of that task, and the resources of the AI platform are then preferentially scheduled to the tasks with relatively higher priority according to the priorities of the tasks corresponding to the AI task requests. Compared with the prior-art approach of scheduling AI platform resources only according to the order in which task requests arrive, the technical scheme of the application always ensures that tasks with relatively higher priority have available resources, and is therefore a reasonable and efficient way of scheduling resources.
Referring to fig. 3, an apparatus for scheduling resources of an artificial intelligence platform according to an embodiment of the present application may include a task request receiving module 301, a priority determining module 302, and a scheduling module 303, which are described in detail below:
the task request receiving module 301 is configured to receive an AI task request by using an artificial intelligence AI platform, where a type of a task corresponding to the AI task request includes a model training task, a model reasoning task or an interactive task;
the priority determining module 302 is configured to determine, according to the resource attribute of the task corresponding to the AI task request, a priority of the task corresponding to the AI task request;
and the scheduling module 303 is configured to schedule the resources of the AI platform to tasks with relatively higher priorities among the tasks preferentially according to the priorities of the tasks corresponding to the AI task requests.
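Purely as a structural illustration, the three modules of fig. 3 could be wired together as below; the class and method names are assumptions, since the patent describes the modules functionally rather than as a concrete API.

```python
# Structural sketch of the apparatus in fig. 3: a task request receiving
# module, a priority determining module and a scheduling module.
class TaskRequestReceivingModule:
    def receive(self, request):
        return request  # the AI platform accepts the AI task request


class PriorityDeterminingModule:
    def determine(self, request):
        # Priority follows from the resource attribute (preemptible or not).
        return "highest" if not request.get("preemptible", True) else (
            "medium" if request["type"] == "model_inference" else "lowest")


class SchedulingModule:
    def schedule(self, request, priority):
        print(f"scheduling resources to {request['id']} at {priority} priority")


class SchedulingApparatus:
    def __init__(self):
        self.receiver = TaskRequestReceivingModule()
        self.prioritizer = PriorityDeterminingModule()
        self.scheduler = SchedulingModule()

    def handle(self, request):
        request = self.receiver.receive(request)
        priority = self.prioritizer.determine(request)
        self.scheduler.schedule(request, priority)


SchedulingApparatus().handle({"id": "task-1", "type": "interactive", "preemptible": False})
```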
Optionally, the priority determining module 302 illustrated in fig. 3 may include a first determining unit and a second determining unit, where:
a first determining unit, configured to determine, according to a resource attribute of the interactive task that cannot be preempted, a priority of the interactive task as a highest priority;
and the second determining unit is used for determining the priority of the model reasoning task as medium priority and determining the priority of the model training task as the lowest priority when the resource attributes of the model training task and the model reasoning task are preemptive.
Optionally, the scheduling module 303 illustrated in fig. 3 may include an available resource calculating unit 401 and an idle resource scheduling unit 402; an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application is shown in fig. 4, where:
an available resource calculating unit 401, configured to calculate an available resource currently owned by the AI platform, where the available resource currently owned by the AI platform includes an idle resource and/or a resource being used by a task with a lower priority, and the resource being used by the task with a lower priority includes a resource being used by a task with a medium priority and/or a task with a lowest priority;
and the idle resource scheduling unit 402 is configured to, when the tasks corresponding to the AI task request include a task with the highest priority, schedule the idle resources of the AI platform to the task with the highest priority if those idle resources can meet that task's demand for resources.
Optionally, the apparatus illustrated in fig. 4 may further include a first resource releasing module 501 and a first resource scheduling module 503; an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application is shown in fig. 5, where:
the first resource releasing module 501 is configured to release the resources occupied by the running medium-priority and/or lowest-priority tasks if the idle resources of the AI platform cannot meet the demand of the task with the highest priority for resources and the running tasks include medium-priority and/or lowest-priority tasks;
and the first resource scheduling module 503 is configured to schedule the idle resources of the AI platform and the resources released from the running medium-priority and/or lowest-priority tasks to the task with the highest priority.
Optionally, the apparatus illustrated in fig. 4 may further include a second resource scheduling module 601, a second resource releasing module 602, and a third resource scheduling module 603; an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application is shown in fig. 6, where:
the second resource scheduling module 601 is configured to, when the tasks corresponding to the AI task request do not include a task with the highest priority, schedule the idle resources of the AI platform to the medium-priority and/or lowest-priority tasks if those idle resources can satisfy the medium-priority and/or lowest-priority tasks;
the second resource releasing module 602 is configured to release the resources occupied by the lowest-priority tasks being run by the AI platform if the idle resources of the AI platform cannot satisfy the medium-priority tasks and the tasks being run by the AI platform include lowest-priority tasks;
and the third resource scheduling module 603 is configured to schedule the idle resources of the AI platform and the resources released from the running lowest-priority tasks to the medium-priority tasks.
Optionally, the apparatus illustrated in fig. 3 may further include a statistics module, a suspension module, and a first task restart module, where:
the statistics module is configured to count how the running medium-priority tasks occupy the resources of the AI platform;
the suspension module is configured to suspend a medium-priority task if the time during which that task has continuously not used the resources of the AI platform exceeds a preset threshold;
and the first task restart module is configured to restart a suspended medium-priority task if a new model reasoning request for that task is received.
Optionally, the apparatus illustrated in fig. 3 may further include a breakpoint save module 701 and a second task restart module 702; an apparatus for scheduling artificial intelligence platform resources according to another embodiment of the present application is shown in fig. 7, where:
the breakpoint save module 701 is configured to save the weight information of the model at a break-point when a running model training task is interrupted;
and the second task restart module 702 is configured to, when the interrupted model training task is restarted, re-run the interrupted model training task from the break-point according to the saved model weight information.
From the description of the above technical scheme, after the artificial intelligence (AI) platform receives an AI task request, the priority of the task corresponding to the AI task request is determined according to the resource attribute of that task, and the resources of the AI platform are then preferentially scheduled to the tasks with relatively higher priority according to the priorities of the tasks corresponding to the AI task requests. Compared with the prior-art approach of scheduling AI platform resources only according to the order in which task requests arrive, the technical scheme of the application always ensures that tasks with relatively higher priority have available resources, and is therefore a reasonable and efficient way of scheduling resources.
Fig. 8 is a schematic structural diagram of an apparatus according to an embodiment of the present application. As shown in fig. 8, the apparatus 8 of this embodiment mainly includes: a processor 80, a memory 81, and a computer program 82 stored in the memory 81 and executable on the processor 80, such as a program for a method of scheduling artificial intelligence platform resources. The steps in the above-described method embodiment of scheduling artificial intelligence platform resources, such as steps S101 to S103 shown in fig. 1, are implemented when the processor 80 executes the computer program 82. Alternatively, the processor 80 may implement the functions of the modules/units in the above-described apparatus embodiments when executing the computer program 82, such as the functions of the task request receiving module 301, the priority determining module 302, and the scheduling module 303 shown in fig. 3.
Illustratively, the computer program 82 of the method of scheduling artificial intelligence platform resources essentially comprises: the artificial intelligent AI platform receives an AI task request, wherein the type of the task corresponding to the AI task request comprises a model training task, a model reasoning task or an interactive task; determining the priority of the task corresponding to the AI task request according to the resource attribute of the task corresponding to the AI task request; and according to the priority of the task corresponding to the AI task request, preferentially scheduling the resources of the AI platform to the task with relatively higher priority in the tasks. The computer program 82 may be divided into one or more modules/units, which are stored in the memory 81 and executed by the processor 80 to complete the present application. One or more of the modules/units may be a series of computer program instruction segments capable of performing particular functions to describe the execution of the computer program 82 in the device 8. For example, the computer program 82 may be divided into functions of a task request receiving module 301, a priority determining module 302, and a scheduling module 303 (a module in a virtual device), each of which has the following specific functions: the task request receiving module 301 is configured to receive an AI task request by using an artificial intelligence AI platform, where a type of a task corresponding to the AI task request includes a model training task, a model reasoning task or an interactive task; the priority determining module 302 is configured to determine, according to the resource attribute of the task corresponding to the AI task request, a priority of the task corresponding to the AI task request; and the scheduling module 303 is configured to schedule the resources of the AI platform to tasks with relatively higher priorities among the tasks preferentially according to the priorities of the tasks corresponding to the AI task requests.
The device 8 may include, but is not limited to, a processor 80, a memory 81. It will be appreciated by those skilled in the art that fig. 8 is merely an example of device 8 and is not intended to be limiting of device 8, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., a computing device may also include an input-output device, a network access device, a bus, etc.
The processor 80 may be a central processing unit (Central Processing Unit, CPU), or another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 81 may be an internal storage unit of the device 8, such as a hard disk or a memory of the device 8. The memory 81 may also be an external storage device of the device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the device 8. Further, the memory 81 may also include both internal storage units of the device 8 and external storage devices. The memory 81 is used to store computer programs and other programs and data required by the device. The memory 81 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that the above-described functional units and modules are merely illustrated for convenience and brevity of description, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or detailed in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/device and method may be implemented in other manners. For example, the apparatus/device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a non-transitory computer readable storage medium. Based on such an understanding, the present application may implement all or part of the processes in the methods of the above embodiments by instructing the relevant hardware through a computer program. When executed by a processor, the computer program may implement the steps of the method embodiments described above, namely: the artificial intelligence (AI) platform receives an AI task request, where the type of the task corresponding to the AI task request includes a model training task, a model reasoning task or an interactive task; the priority of the task corresponding to the AI task request is determined according to the resource attribute of that task; and, according to the priority of the task corresponding to the AI task request, the resources of the AI platform are preferentially scheduled to the tasks with relatively higher priority. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The non-transitory computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the non-transitory computer readable medium may be adjusted as required by the legislation and patent practice of the relevant jurisdiction; for example, in some jurisdictions the non-transitory computer readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only intended to illustrate the technical solution of the present application and are not limiting. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical schemes described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be included within the scope of the application. The foregoing description of the embodiments is provided to illustrate the general principles of the application and is not meant to limit its scope; any modifications, equivalents, improvements and the like made within the spirit and principles of the application are intended to be included within the scope of the application.

Claims (9)

1. A method of scheduling artificial intelligence platform resources, the method comprising:
the method comprises the steps that an artificial intelligent AI platform receives an AI task request, wherein the type of a task corresponding to the AI task request comprises a model training task, a model reasoning task or an interactive task;
determining the priority of a task corresponding to the AI task request according to the resource attribute of the task corresponding to the AI task request, wherein the resource attribute of the task comprises that the resource of the task can not be preempted or can be preempted, or the resource attribute of the task comprises that the task can be divided into a task which can not be preempted and a task which can be preempted;
according to the priority of the task corresponding to the AI task request, preferentially scheduling the resources of the AI platform to the task with relatively higher priority in the tasks;
the method comprises the steps that available resources which are currently possessed by the AI platform are calculated, wherein the available resources which are currently possessed by the AI platform comprise idle resources and/or resources which are being used by tasks with lower priority, and the resources which are being used by the tasks with lower priority comprise resources which are being used by tasks with medium priority and/or lowest priority;
when the task corresponding to the AI task request does not contain the task with the highest priority, if the idle resource of the AI platform cannot meet the task with the medium priority and the running task of the AI platform does not contain the task with the lowest priority, the occupation condition of the running task with the medium priority on the resource of the AI platform is counted; if the continuous non-use time of the resources of the AI platform by the medium priority task exceeds a preset threshold, suspending the medium priority task of which the continuous non-use time of the resources of the AI platform exceeds the preset threshold;
if the running model training task is interrupted, saving the weight information of the model at the break point during interruption;
when the interrupted model training task is restarted, the interrupted model training task is operated again from the breakpoint according to the stored model weight information;
if the running model reasoning task is interrupted, the information at the breakpoint is not required to be kept;
and directly resuming the interrupted model reasoning task when the interrupted model reasoning task is restarted.
2. The method for scheduling resources of an artificial intelligence platform according to claim 1, wherein determining the priority of the task corresponding to the AI task request according to the resource attribute of the task corresponding to the AI task request comprises:
determining the priority of the interactive task as the highest priority according to the non-preemptible resource attribute of the interactive task; and
when the resource attributes of the model training task and the model reasoning task are preemptible, determining the priority of the model reasoning task as the medium priority, and determining the priority of the model training task as the lowest priority.
3. The method for scheduling resources of an artificial intelligence platform according to claim 1, wherein said preferentially scheduling the resources of the AI platform to relatively higher priority tasks among the tasks according to the priorities of the tasks corresponding to the AI task requests comprises:
when the AI task request corresponding task contains the task with the highest priority, if the idle resource can meet the requirement of the task with the highest priority on the resource, the idle resource is scheduled to the task with the highest priority.
4. The method of scheduling artificial intelligence platform resources of claim 3, further comprising:
if the idle resources cannot meet the demands of the tasks with the highest priority on the resources, and the running tasks comprise the tasks with the medium priority and/or the lowest priority, releasing the resources occupied by the tasks with the medium priority and/or the lowest priority in running;
and scheduling the idle resources and the released resources to the task with the highest priority.
5. The method of scheduling artificial intelligence platform resources of claim 3, further comprising:
when the AI task requests that the corresponding task does not contain the task with the highest priority, if the idle resource can meet the task with the medium priority and/or the task with the lowest priority, the idle resource is scheduled to the task with the medium priority and/or the task with the lowest priority;
if the idle resources cannot meet the tasks with the medium priority and the running tasks comprise tasks with the lowest priority, releasing the resources occupied by the tasks with the lowest priority in running;
and scheduling the idle resources and the released resources to the medium priority task.
6. The method of scheduling artificial intelligence platform resources of claim 2, further comprising:
if a new model reasoning request for a suspended medium priority task is received, restarting the suspended medium priority task.
7. An apparatus for scheduling artificial intelligence platform resources, the apparatus comprising:
a task request receiving module, configured for an artificial intelligence (AI) platform to receive an AI task request, wherein the type of the task corresponding to the AI task request comprises a model training task, a model reasoning task, or an interactive task;
a priority determining module, configured to determine the priority of the task corresponding to the AI task request according to the resource attribute of the task, wherein the resource attribute of the task indicates that the resources of the task cannot be preempted or can be preempted, or, equivalently, that tasks are divided into non-preemptible tasks and preemptible tasks; and
a scheduling module, configured to preferentially schedule the resources of the AI platform to tasks with relatively higher priority among the tasks according to the priorities of the tasks corresponding to the AI task requests;
wherein the available resources currently possessed by the AI platform are calculated, the available resources comprising idle resources and/or resources being used by tasks with lower priority, and the resources being used by tasks with lower priority comprising resources being used by tasks with the medium priority and/or the lowest priority;
when the tasks corresponding to the AI task request do not include a task with the highest priority, if the idle resources of the AI platform cannot meet the tasks with the medium priority and the running tasks of the AI platform do not include a task with the lowest priority, the occupation of the resources of the AI platform by the running tasks with the medium priority is counted; if the continuous non-use time of the resources of the AI platform by a medium-priority task exceeds a preset threshold, that medium-priority task is suspended;
if a running model training task is interrupted, the weight information of the model at the breakpoint is saved at the time of the interruption;
when the interrupted model training task is restarted, the model training task is resumed from the breakpoint according to the saved model weight information;
if a running model reasoning task is interrupted, the information at the breakpoint does not need to be saved; and
when the interrupted model reasoning task is restarted, the interrupted model reasoning task is directly rerun.
8. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program.
9. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
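The priority mapping of claim 2 can be summarized as a small lookup from task type and resource attribute to one of three levels. The following is a minimal illustrative sketch in Python, not part of the claims; the names Priority and assign_priority and the task-type strings are assumptions made for the example.

```python
from enum import IntEnum


class Priority(IntEnum):
    LOWEST = 0   # model training task (resources preemptible)
    MEDIUM = 1   # model reasoning task (resources preemptible)
    HIGHEST = 2  # interactive task (resources non-preemptible)


def assign_priority(task_type: str, preemptible: bool) -> Priority:
    """Map a task's type and resource attribute to a scheduling priority."""
    if task_type == "interactive" and not preemptible:
        return Priority.HIGHEST
    if task_type == "reasoning" and preemptible:
        return Priority.MEDIUM
    if task_type == "training" and preemptible:
        return Priority.LOWEST
    raise ValueError(f"unsupported task: {task_type!r}, preemptible={preemptible}")
```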
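Claims 3 to 5 describe a preemption order: serve the highest-priority task from idle resources first, and release resources held by strictly lower-priority running tasks only when the idle resources fall short. The sketch below is a rough greedy rendering of that order under simplifying assumptions the claims do not specify: resources are treated as interchangeable units, and the function name and tuple layout are invented for illustration.

```python
def schedule(pending, idle_units, running):
    """Greedily allocate resource units to pending tasks in priority order.

    pending    : list of (task_id, priority, demand) awaiting resources
    idle_units : number of currently idle resource units
    running    : list of (task_id, priority, units_held) for running tasks

    Returns (allocations, preempted): task_id -> granted units, and the
    running tasks whose resources were released to satisfy higher priorities.
    """
    allocations, preempted = {}, []
    for task_id, prio, demand in sorted(pending, key=lambda t: -t[1]):
        if idle_units >= demand:
            # Idle resources suffice; no preemption needed.
            idle_units -= demand
            allocations[task_id] = demand
            continue
        # Release resources held by strictly lower-priority running tasks.
        for victim_id, victim_prio, held in list(running):
            if victim_prio < prio and idle_units < demand:
                idle_units += held
                running.remove((victim_id, victim_prio, held))
                preempted.append(victim_id)
        if idle_units >= demand:
            idle_units -= demand
            allocations[task_id] = demand
    return allocations, preempted


# Example: a highest-priority interactive task needing 4 units arrives while
# only 1 unit is idle and a lowest-priority training task holds 4 units.
# schedule([("interactive-1", 2, 4)], 1, [("train-1", 0, 4)])
# -> ({"interactive-1": 4}, ["train-1"])
```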
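Claim 7 distinguishes how interrupted tasks come back: an interrupted model training task saves its weight information at the breakpoint and later resumes from it, an interrupted model reasoning task restarts without saved state, and a medium-priority task whose platform resources stay continuously unused past a preset threshold is suspended. The sketch below illustrates those behaviors only; the class and function names, dictionary fields, and 600-second default threshold are assumptions, not taken from the patent.

```python
import time


class TrainingTask:
    """Training task that checkpoints its weights when preempted."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.checkpoint = None  # weights saved at the breakpoint

    def interrupt(self, current_weights, step):
        # Persist the model weights and the step reached at the breakpoint.
        self.checkpoint = {"weights": current_weights, "step": step}

    def resume(self):
        # Restart from the saved breakpoint instead of from scratch.
        if self.checkpoint is None:
            return {"weights": None, "step": 0}
        return self.checkpoint


def maybe_suspend(medium_tasks, idle_threshold_s=600, now=None):
    """Suspend medium-priority tasks idle for longer than the threshold.

    Each entry in medium_tasks is assumed to be a dict with "id",
    "last_used" (epoch seconds) and "state" keys.
    """
    now = time.time() if now is None else now
    suspended = []
    for task in medium_tasks:
        if now - task["last_used"] > idle_threshold_s:
            task["state"] = "suspended"
            suspended.append(task["id"])
    return suspended
```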
CN202110313956.7A 2021-03-24 2021-03-24 Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources Active CN113051054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110313956.7A CN113051054B (en) 2021-03-24 2021-03-24 Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources

Publications (2)

Publication Number Publication Date
CN113051054A CN113051054A (en) 2021-06-29
CN113051054B true CN113051054B (en) 2023-09-08

Family

ID=76514943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110313956.7A Active CN113051054B (en) 2021-03-24 2021-03-24 Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources

Country Status (1)

Country Link
CN (1) CN113051054B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114020470B (en) * 2021-11-09 2024-04-26 抖音视界有限公司 Resource allocation method and device, readable medium and electronic equipment
CN114201278B (en) * 2021-12-07 2023-12-15 北京百度网讯科技有限公司 Task processing method, task processing device, electronic equipment and storage medium
CN115061800A (en) * 2022-06-30 2022-09-16 中国联合网络通信集团有限公司 Edge computing task processing method, edge server and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9513962B2 (en) * 2013-12-03 2016-12-06 International Business Machines Corporation Migrating a running, preempted workload in a grid computing system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101847128A (en) * 2009-03-23 2010-09-29 国际商业机器公司 TLB management method and device
CN107977268A (en) * 2017-10-13 2018-05-01 北京百度网讯科技有限公司 Method for scheduling task, device and the computer-readable recording medium of the isomerization hardware of artificial intelligence
CN111768006A (en) * 2020-06-24 2020-10-13 北京金山云网络技术有限公司 Artificial intelligence model training method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Simulation analysis of a rationalized task scheduling model for cloud platforms; 程邺华 (Cheng Yehua); 《计算机仿真》 (Computer Simulation); Vol. 33, No. 5; pp. 376-379 *

Also Published As

Publication number Publication date
CN113051054A (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN113051054B (en) Method, apparatus and computer readable storage medium for scheduling artificial intelligence platform resources
CN111367679B (en) Artificial intelligence computing power resource multiplexing method and device
CN110837410A (en) Task scheduling method and device, electronic equipment and computer readable storage medium
CN112286644B (en) Elastic scheduling method, system, equipment and storage medium for GPU (graphics processing Unit) virtualization computing power
WO2017080273A1 (en) Task management methods and system, and computer storage medium
CN110413412B (en) GPU (graphics processing Unit) cluster resource allocation method and device
CN111768006A (en) Artificial intelligence model training method, device, equipment and storage medium
CN111798113B (en) Resource allocation method, device, storage medium and electronic equipment
CN105022668B (en) Job scheduling method and system
CN112486642B (en) Resource scheduling method, device, electronic equipment and computer readable storage medium
CN112988390A (en) Calculation power resource allocation method and device
CN112860387A (en) Distributed task scheduling method and device, computer equipment and storage medium
US20240202024A1 (en) Thread processing methods, scheduling component, monitoring component, server, and storage medium
CN116089040A (en) Service flow scheduling method and device, electronic equipment and storage medium
CN112948109B (en) Quota flexible scheduling method, device and medium for AI computing cluster
WO2024119930A1 (en) Scheduling method and apparatus, and computer device and storage medium
WO2019029721A1 (en) Task scheduling method, apparatus and device, and storage medium
CN116483546A (en) Distributed training task scheduling method, device, equipment and storage medium
CN109426556B (en) Process scheduling method and device
CN116467065A (en) Algorithm model training method and device, electronic equipment and storage medium
CN112540886B (en) CPU load value detection method and device
CN115373826A (en) Task scheduling method and device based on cloud computing
CN115543765A (en) Test case scheduling method and device, computer equipment and readable medium
CN108920722B (en) Parameter configuration method and device and computer storage medium
CN115373825B (en) Resource scheduling method, device and equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 18B, Microsoft Science Building, No. 55, Gaoxin South 9th Road, High tech Zone Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong

Applicant after: Bohan Intelligent (Shenzhen) Co.,Ltd.

Address before: 518000 18D, Microsoft tech building, 55 Gaoxin South 9th Road, high tech Zone community, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: Yitong Technology (Shenzhen) Co.,Ltd.

GR01 Patent grant