TWI786564B - Task scheduling method and apparatus, storage media and computer equipment - Google Patents

Task scheduling method and apparatus, storage media and computer equipment

Info

Publication number
TWI786564B
Authority
TW
Taiwan
Prior art keywords
task
subtasks
subtask
processor
node
Prior art date
Application number
TW110108474A
Other languages
Chinese (zh)
Other versions
TW202134870A (en
Inventor
葉志晟
陳遜
吳保東
孫鵬
顏深根
Original Assignee
大陸商上海商湯智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202010165763.7A external-priority patent/CN113391914A/en
Priority claimed from CN202010165543.4A external-priority patent/CN113391886A/en
Application filed by 大陸商上海商湯智能科技有限公司 filed Critical 大陸商上海商湯智能科技有限公司
Publication of TW202134870A publication Critical patent/TW202134870A/en
Application granted granted Critical
Publication of TWI786564B publication Critical patent/TWI786564B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Debugging And Monitoring (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

Examples of the present disclosure provide a task scheduling method and apparatus. According to the method, information of a task may be obtained, where the information of the task includes at least one of a task type and resource requirement information of the task; a target node can be assigned to each of a plurality of subtasks included in the task according to the information of the task.

Description

Task scheduling method and apparatus, storage medium, and computer device

The present disclosure relates to the technical field of distributed systems, and in particular to task scheduling in distributed systems.

At present, a growing number of tasks rely on distributed systems for resource allocation and computation. Many of these tasks comprise multiple subtasks, whose content and required resources may be identical or may differ. Task scheduling in conventional distributed systems generally allocates resource nodes one subtask at a time, which results in relatively low cluster resource utilization and low task execution efficiency.

The present disclosure provides a task scheduling scheme.

According to a first aspect of embodiments of the present disclosure, a task scheduling method is provided. The method includes: obtaining information of a task, the information of the task including at least one of a task type and resource requirement information of the task; and assigning, according to the information of the task, a target node to each of a plurality of subtasks included in the task.

In some embodiments, the resource requirement information of the task is the resource requirement information of the plurality of subtasks included in the task. The method further includes: in a case where the currently available resources in the distributed system satisfy the total resources corresponding to the resource requirement information of the plurality of subtasks of the task, assigning a corresponding target node to each of the plurality of subtasks of the task.

In some embodiments, the method further includes: determining target nodes for the plurality of subtasks of the task based on the resource requirement information of the plurality of subtasks. In that case, assigning corresponding target nodes to the plurality of subtasks when the currently available resources in the distributed system satisfy the total resources includes: in a case where a corresponding target node has been successfully determined for every subtask of the plurality of subtasks, assigning the corresponding target nodes to the plurality of subtasks of the task.

In some embodiments, the target nodes corresponding to the plurality of subtasks are determined one after another in a specific order, based on current resource state information of the distributed system and the resource requirement information of the plurality of subtasks.

In an optional example, the resource requirement information of the plurality of subtasks includes the resource requirement information of each subtask, where the resource requirement information may include the required resource types and the quantity of each type of resource, and may further include other information.

In an optional example, the current resource state information of the distributed system may include current state information of multiple nodes in the distributed system, where the current state information indicates at least one of: whether resources are available, the types and quantities of available resources, the load condition, and topology connection information. It may further include other information.

In some embodiments, each time the target node for one subtask has been determined, the current resource state information of the distributed system is updated, and the target node for the next subtask is determined based on the updated current resource state information.

In some embodiments, in a case where a corresponding target node cannot be determined for at least one of the plurality of subtasks, it is determined that the currently available resources of the distributed system do not satisfy the resources required by every subtask of the plurality of subtasks.
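
The sequential determination with state updates described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the first-fit choice of node, the dictionary layout of node resources, and the names `fits` and `plan_task` are all assumptions made for illustration. The plan is computed on a copy of the resource state, so a task that cannot be placed in full leaves the real state untouched.

```python
from copy import deepcopy
from typing import Dict, List, Optional

Resources = Dict[str, int]  # hypothetical layout, e.g. {"gpu": 2, "cpu": 8}


def fits(free: Resources, need: Resources) -> bool:
    """True if the free resources cover every requested resource type."""
    return all(free.get(kind, 0) >= amount for kind, amount in need.items())


def plan_task(nodes: Dict[str, Resources],
              subtasks: List[Resources]) -> Optional[List[str]]:
    """Return one node name per subtask, or None if any subtask cannot be placed."""
    free = deepcopy(nodes)  # simulate on a copy of the current resource state
    plan: List[str] = []
    for need in subtasks:
        # First-fit is an assumption; the source does not prescribe how to pick.
        target = next((name for name, avail in free.items() if fits(avail, need)), None)
        if target is None:
            return None  # one failure means the whole task is not satisfiable now
        for kind, amount in need.items():
            free[target][kind] -= amount  # update state before the next subtask
        plan.append(target)
    return plan


nodes = {"node-a": {"gpu": 2}, "node-b": {"gpu": 1}}
print(plan_task(nodes, [{"gpu": 2}, {"gpu": 1}]))  # ['node-a', 'node-b']
print(plan_task(nodes, [{"gpu": 2}, {"gpu": 2}]))  # None
```

Only when `plan_task` returns a complete plan would the subtasks actually be assigned to their nodes.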

In some embodiments, the order in which the target nodes of the plurality of subtasks are determined is derived from at least one of the priorities of the subtasks and the dependencies between the subtasks.
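
One way to derive such an order is a topological sort that breaks ties by priority. The sketch below is an assumption about how priority and dependency might be combined (the source only says the order follows at least one of them); the function name `scheduling_order` and the convention that a larger number means higher priority are hypothetical.

```python
import heapq
from typing import Dict, List, Set


def scheduling_order(priority: Dict[str, int],
                     depends_on: Dict[str, Set[str]]) -> List[str]:
    """Dependencies come first; among ready subtasks, higher priority comes first."""
    remaining = {name: set(depends_on.get(name, ())) for name in priority}
    # heapq is a min-heap, so priorities are negated to pop the highest first.
    ready = [(-priority[name], name) for name, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order: List[str] = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for other, deps in remaining.items():
            if name in deps:
                deps.remove(name)
                if not deps:  # all dependencies of `other` are now ordered
                    heapq.heappush(ready, (-priority[other], other))
    if len(order) != len(priority):
        raise ValueError("dependency cycle among subtasks")
    return order


order = scheduling_order(
    priority={"master": 1, "worker-0": 5, "worker-1": 3},
    depends_on={"worker-0": {"master"}, "worker-1": {"master"}},
)
print(order)  # ['master', 'worker-0', 'worker-1']
```

Here the master subtask is ordered first despite its low priority because both workers depend on it.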

In some embodiments, determining target nodes for the plurality of subtasks based on the current resource state information of the distributed system and the resource requirement information of the plurality of subtasks includes: determining a preselected node set for each of the plurality of subtasks based on the current resource state information of the distributed system and the resource requirement information of the plurality of subtasks; and selecting the target node of each subtask from that subtask's preselected node set.

The preselected node sets of different subtasks may be the same or different. In some examples, multiple subtasks may share the same preselected node set.

In some embodiments, determining the preselected node set for each of the plurality of subtasks based on the current resource state information of the distributed system and the resource requirement information of the plurality of subtasks includes: determining the preselected node set of each subtask based on the current resource state information of the distributed system and the resource requirement information of that subtask.

The preselected node sets of the subtasks may be determined independently of one another, with no mutual dependence; for example, the determinations may run in parallel or in any order. For example, the preselected node sets of the plurality of subtasks are determined based on the same current resource state information of the distributed system; that is, the current resource state information is not updated during preselection. For example, the preselected node set of each subtask depends only on that subtask's own resource requirement information and not on the resource requirement information of the other subtasks.
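
The independent preselection described above amounts to filtering one fixed snapshot of the resource state once per subtask. The following Python sketch shows this under assumed data shapes; the name `preselect`, the node names, and the resource keys are hypothetical.

```python
from typing import Dict, List

Resources = Dict[str, int]  # hypothetical layout, e.g. {"gpu": 2, "cpu": 8}


def preselect(snapshot: Dict[str, Resources], need: Resources) -> List[str]:
    """Nodes in the snapshot whose free resources cover one subtask's needs."""
    return [name for name, free in snapshot.items()
            if all(free.get(kind, 0) >= amount for kind, amount in need.items())]


# The same snapshot is used for every subtask and is never modified here,
# so the calls below could run in parallel or in any order.
snapshot = {
    "node-a": {"gpu": 4, "cpu": 16},
    "node-b": {"gpu": 1, "cpu": 32},
    "node-c": {"cpu": 8},
}
subtasks = {"trainer": {"gpu": 2, "cpu": 4}, "loader": {"cpu": 8}}

candidates = {name: preselect(snapshot, need) for name, need in subtasks.items()}
print(candidates)
# {'trainer': ['node-a'], 'loader': ['node-a', 'node-b', 'node-c']}
```

Because preselection ignores what other subtasks will consume, a node that appears in a preselected set may still be rejected later, when target nodes are selected against the updated state.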

In some embodiments, determining the preselected node set for each of the plurality of subtasks includes: determining the preselected node set of each of the plurality of subtasks one after another in a specific order.

In one example, the preselected nodes in the preselected node set of a first subtask of the plurality of subtasks are selected from the preselected node set of a second subtask of the plurality of subtasks, where the second subtask precedes the first subtask in the order.

In another example, the preselected node set of each of the plurality of subtasks is determined based on the resource requirement information of the plurality of subtasks.

In some embodiments, selecting the target node of each subtask from the preselected node set of each of the plurality of subtasks includes: selecting, one after another in a specific order, the target node of each subtask from that subtask's preselected nodes.

In an optional example, the target node of a second subtask of the plurality of subtasks is preferentially selected as the target node of a first subtask of the plurality of subtasks, where the second subtask precedes the first subtask in the order.

In some embodiments, selecting the target node of each subtask from the preselected node set of each of the plurality of subtasks includes: determining the target node of a first subtask of the plurality of subtasks from the first subtask's preselected node set based on the current resource state information of the distributed system; updating the current resource state information of the distributed system based on the resource requirement information of the first subtask; and determining the target node of a second subtask of the plurality of subtasks from the second subtask's preselected node set based on the updated current resource state information of the distributed system.

In an optional example, the target node of the first subtask is determined from the first subtask's preselected node set based on the current state information of each preselected node in that set, and the current state information of the first subtask's target node is updated based on the required resource information of the first subtask.

In some embodiments, selecting the target node of each subtask from the preselected node set of each of the plurality of subtasks includes: determining, according to the task type of the task, a score for each preselected node contained in each subtask's preselected node set; and selecting the target node of each subtask from that subtask's preselected node set based on the scores of the preselected nodes it contains.

In an optional example, a score determination strategy for each preselected node in the preselected node set is determined based on the task type of the task.
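
A task-type-dependent scoring strategy can be sketched as a lookup from task type to scoring function. The two strategies below are assumptions chosen for illustration, not formulas from the source: a compute-intensive task favors the node with the most free GPUs, and a communication-intensive task favors the node that already hosts the most sibling subtasks. The field names `free_gpu` and `siblings` and the function `pick_target` are hypothetical.

```python
from typing import Callable, Dict

NodeState = Dict[str, int]  # hypothetical fields: "free_gpu", "siblings"


def score_compute(state: NodeState) -> int:
    """Assumed strategy: more free compute is better."""
    return state["free_gpu"]


def score_communication(state: NodeState) -> int:
    """Assumed strategy: co-locating with sibling subtasks is better."""
    return state["siblings"]


STRATEGIES: Dict[str, Callable[[NodeState], int]] = {
    "compute-intensive": score_compute,
    "communication-intensive": score_communication,
}


def pick_target(task_type: str, candidates: Dict[str, NodeState]) -> str:
    """Pick the highest-scoring preselected node; ties go to the smallest name."""
    score = STRATEGIES[task_type]
    return max(sorted(candidates), key=lambda name: score(candidates[name]))


candidates = {
    "node-a": {"free_gpu": 8, "siblings": 0},
    "node-b": {"free_gpu": 2, "siblings": 3},
}
print(pick_target("compute-intensive", candidates))        # node-a
print(pick_target("communication-intensive", candidates))  # node-b
```

Keeping the strategies in a table means a new task type only needs a new scoring function, while the selection step stays unchanged.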

In some embodiments, assigning target nodes to the plurality of subtasks included in the task according to the information of the task includes: determining the task type of the task; and assigning, based on the task type of the task, a corresponding target node to each of the plurality of subtasks of the task.

In some embodiments, the task type of the task is compute-intensive or communication-intensive.

In some embodiments, assigning corresponding target nodes to the plurality of subtasks of the task includes: dividing the plurality of subtasks of the task into at least one group, where each group includes at least one of the plurality of subtasks; and assigning a corresponding target node to each of the at least one group, where all subtasks in the same group are assigned to the same target node.

In some embodiments, dividing the plurality of subtasks of the task into at least one group includes: dividing the plurality of subtasks of the task into at least one group according to the task type of the task.

In some embodiments, dividing the plurality of subtasks of the task into at least one group includes: in a case where the task type of the task is communication-intensive, determining the total number of nodes required by the plurality of subtasks according to the resource requirement information of the plurality of subtasks, and dividing the plurality of subtasks of the task into at least one group according to the total number of nodes.

In some embodiments, dividing the plurality of subtasks of the task into at least one group includes: in a case where the task type of the task is compute-intensive, treating each of the plurality of subtasks as one group.
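
The two grouping rules can be sketched as follows. The sketch makes simplifying assumptions that are not in the source: nodes are homogeneous with `gpus_per_node` GPUs each, GPUs are the only resource considered, and subtasks are split into equal-sized contiguous chunks (which respects node capacity only when subtasks have uniform demands). The function name `group_subtasks` is hypothetical.

```python
import math
from typing import List


def group_subtasks(task_type: str, gpu_needs: List[int],
                   gpus_per_node: int) -> List[List[int]]:
    """Return groups of subtask indices; each group is meant for one node."""
    indices = list(range(len(gpu_needs)))
    if not indices:
        return []
    if task_type == "compute-intensive":
        # Each subtask forms its own group and may land on any node.
        return [[i] for i in indices]
    # Communication-intensive: derive the total node count from total demand,
    # then pack subtasks into that many groups to keep traffic node-local.
    node_count = max(1, math.ceil(sum(gpu_needs) / gpus_per_node))
    size = math.ceil(len(indices) / node_count)
    return [indices[i:i + size] for i in range(0, len(indices), size)]


print(group_subtasks("communication-intensive", [2, 2, 2, 2], gpus_per_node=4))
# [[0, 1], [2, 3]]
print(group_subtasks("compute-intensive", [2, 2, 2, 2], gpus_per_node=4))
# [[0], [1], [2], [3]]
```

Four subtasks needing 2 GPUs each require 8 GPUs in total, hence 2 nodes of 4 GPUs, so the communication-intensive case yields two groups of two subtasks.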

In some embodiments, assigning a corresponding target node to each of the at least one group includes: determining a preselected node set for each of the at least one group, and selecting, one group after another, the target node of each group from that group's preselected node set.

In some embodiments, assigning corresponding target nodes to the plurality of subtasks of the task based on the task type of the task includes: determining target nodes for the plurality of subtasks of the task based on the task type of the task; and in a case where a corresponding target node has been successfully determined for every one of the plurality of subtasks of the task, assigning the target nodes to the plurality of subtasks included in the task.

In some embodiments, the method further includes: in a case where determining a corresponding target node fails for at least one of the subtasks, delaying the assignment of all of the plurality of subtasks of the task.

In some embodiments, the method further includes: in a case where the currently available resources in the distributed system do not satisfy the total resources corresponding to the resource requirement information of the plurality of subtasks of the task, delaying the assignment of all of the plurality of subtasks included in the task.
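
Delayed assignment at task granularity can be sketched as one pass over a pending queue. This is a deliberately coarse illustration under stated assumptions: the cluster is modeled as a single pool of GPUs, so only the total-resource check is shown and per-node placement is omitted; the function name `schedule_round` and the job names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple

Task = Tuple[str, List[int]]  # (task name, GPU demand of each subtask)


def schedule_round(pending: Deque[Task], free_gpus: int) -> Tuple[List[str], int]:
    """Admit a task only if all of its subtasks fit together; otherwise delay it."""
    admitted: List[str] = []
    for _ in range(len(pending)):  # visit each queued task exactly once
        name, needs = pending.popleft()
        total = sum(needs)
        if total <= free_gpus:
            free_gpus -= total
            admitted.append(name)
        else:
            pending.append((name, needs))  # the whole task waits for a later round
    return admitted, free_gpus


pending: Deque[Task] = deque([("job-1", [2, 2]), ("job-2", [4, 4]), ("job-3", [1])])
admitted, left = schedule_round(pending, free_gpus=6)
print(admitted, left, list(pending))
# ['job-1', 'job-3'] 1 [('job-2', [4, 4])]
```

No subtask of `job-2` is placed, so it holds no resources while it waits, and the smaller `job-3` can use what remains.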

In some embodiments, the method further includes: after corresponding target nodes have been assigned to the plurality of subtasks of the task, scheduling the plurality of subtasks of the task synchronously.

In some embodiments, the method is applied to a task orchestration system.

According to a second aspect of embodiments of the present disclosure, a task scheduling apparatus is provided. The apparatus includes: an acquisition module configured to obtain information of a task, the information of the task including at least one of a task type and resource requirement information of the task; and a first allocation module configured to assign, according to the information of the task, a target node to each of a plurality of subtasks included in the task.

According to a third aspect of embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored. When the program is executed by a processor, the method described in any of the embodiments is implemented.

According to a fourth aspect of embodiments of the present disclosure, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor. The processor implements the method described in any of the embodiments when executing the program.

In embodiments of the present disclosure, information of a task can be obtained, the information of the task including at least one of a task type and resource request information of the task, where the resource request information of the task is the resource requirement information of the plurality of subtasks included in the task. Target nodes can be assigned according to the task type or the resource requirement information of the task. For example, in a case where the currently available resources in the distributed system satisfy the total resources corresponding to the resource requirement information of the plurality of subtasks of the task, a corresponding target node is assigned to each of the plurality of subtasks. In this way, the resource requirements of the whole task are known, resources are allocated at the granularity of the whole task, and the resource utilization of the cluster is improved.

It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.

201: Obtain information of a task, the information of the task including at least one of a task type and resource requirement information of the task

202: Assign target nodes to the plurality of subtasks of the task according to the information of the task

701: Acquisition module

702: First allocation module

801: Processor

802: Random access memory

803: Network interface

804: Non-volatile memory

FIG. 1 is a schematic diagram of a distributed system according to an embodiment of the present disclosure.

FIG. 2 is a flowchart of a task scheduling method according to an embodiment of the present disclosure.

FIG. 3A is a schematic diagram of resource changes during simulated node allocation according to an embodiment of the present disclosure.

FIG. 3B is a flowchart of the simulated node allocation process according to an embodiment of the present disclosure.

FIG. 4 is a schematic diagram of a conventional task scheduling process.

FIG. 5 is a schematic diagram of a task scheduling process according to an embodiment of the present disclosure.

FIG. 6 is a schematic diagram of scheduling logic according to an embodiment of the present disclosure.

FIG. 7 is a schematic structural diagram of a task scheduling apparatus according to an embodiment of the present disclosure.

FIG. 8 is a schematic structural diagram of a computer device for the apparatus according to an embodiment of the present disclosure.

Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.

The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality.

It should be understood that although the terms first, second, third, and so on may be used in the present disclosure to describe various pieces of information, the information should not be limited by these terms. These terms are only used to distinguish pieces of information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information and, similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "at the time of", or "in response to determining".

目前,越來越多的任務開始使用分散式系統進行處理,這樣的任務稱為分散式任務。這些分散式任務中包含對GPU(Graphics Processing Unit,圖形處理器)、DSP(Digital Signal Processor,數位數位訊號處理器)、FPGA(Field Programmable Gate Array,現場可編程門陣列)等高性能的處理資源有著迫切需求的任務,例如,深度學習任務。如圖1所示,是一些實施例的分散式系統,該分散式系統包括一個或多個叢集,每個叢集包括一台或多台伺服器,每台伺服器可以視為一個節點(圖中的各個黑點所示),每個節點都包括CPU(Central Processing Unit,中央處理器)、GPU、隨機存取記憶體、磁碟、網路介面等資源中的至少一者。在將子任務分配到節點之後,可以通過該節點上的資源來執行子任務,每個 節點可以執行一個或多個子任務。同一叢集內的各個節點執行的子任務可以相同,也可以不同。在分散式系統中,無論是任務本身,亦或是任務對資源的需求,都與傳統任務截然不同,因此,需要對分散式系統中的任務調度方式進行相應的調整,從而提高分散式系統中叢集資源使用的合理性,進而提高任務的運行性能和叢集的資源利用率。 At present, more and more tasks are being processed by decentralized systems, and such tasks are called decentralized tasks. These distributed tasks include GPU (Graphics Processing Unit, graphics processing unit), DSP (Digital Signal Processor, digital signal processor), FPGA (Field Programmable Gate Array, field programmable gate array) and other high-performance processing resources Tasks with demanding requirements, e.g. deep learning tasks. As shown in Figure 1, it is a distributed system of some embodiments, the distributed system includes one or more clusters, each cluster includes one or more servers, and each server can be regarded as a node (in the figure Each node includes at least one of CPU (Central Processing Unit, central processing unit), GPU, random access memory, disk, network interface and other resources. After a subtask is assigned to a node, it can be executed by resources on that node, each A node can execute one or more subtasks. The subtasks performed by each node in the same cluster can be the same or different. In a distributed system, both the task itself and the resource requirements of the task are completely different from traditional tasks. Therefore, it is necessary to adjust the task scheduling method in the distributed system accordingly, so as to improve the efficiency of the distributed system. Rational use of cluster resources, thereby improving task performance and cluster resource utilization.

在為分散式系統中的任務分配資源時,不同的分配方式可能對任務的執行效率產生影響。例如,在一些情況下,一個任務中的多個子任務之間需要進行頻繁的通信,因此,該任務中的多個子任務適合分配到通信代價較小的節點上;在另一些情況下,一個任務中的多個子任務在執行過程中會產生比較大的計算量,因此,該任務中的多個子任務適合分配到計算資源較多的節點上。 When allocating resources for tasks in a decentralized system, different allocation methods may have an impact on the execution efficiency of tasks. For example, in some cases, frequent communication is required between multiple subtasks in a task, therefore, multiple subtasks in this task are suitable to be assigned to nodes with lower communication costs; in other cases, a task The multiple subtasks in will generate a relatively large amount of calculation during the execution process. Therefore, multiple subtasks in this task are suitable for allocation to nodes with more computing resources.

而且,在當前分散式系統中的任務大多具有如下的特徵:一個任務往往包括多個子任務,各個子任務的內容以及所需的資源可能完全相同也可能不同,子任務之間可能需要進行頻繁的通信。此外,在一些情況下,一個任務中的多個子任務需要一起被調度執行,否則任務並不能正常執行;在另一些情況下,各個子任務的執行順序會對整個任務的執行情況造成一定的影響,例如,從(slave)子任務需要在其從屬的主(master)子任務執行完成之後才能執行。 Moreover, most of the tasks in the current distributed system have the following characteristics: a task often includes multiple subtasks, the content and required resources of each subtask may be completely the same or different, and frequent subtasks may need to be performed. communication. In addition, in some cases, multiple subtasks in a task need to be scheduled for execution together, otherwise the task cannot be executed normally; in other cases, the execution order of each subtask will have a certain impact on the execution of the entire task , for example, a slave (slave) subtask needs to be executed after its subordinate master (master) subtask is executed.

目前有一些調度器(例如,kube-batch)根據以下方式對分散式系統中的任務進行批調度:調度器按次序嘗試對一個任務 中的各個子任務一個一個進行調度,並在該任務的每一個子任務的調度嘗試完成後判斷整個任務是否已經符合用戶預期的可以運行的狀態。這種調度策略稱為gang scheduling(組調度)。 There are currently some schedulers (for example, kube-batch) that batch tasks in a decentralized system according to the following: the scheduler tries to batch a task in order Each subtask in the task is scheduled one by one, and after the scheduling attempt of each subtask of the task is completed, it is judged whether the entire task has met the user's expected runnable state. This scheduling strategy is called gang scheduling (group scheduling).

However, current implementations of gang scheduling in essence schedule at subtask granularity. The scheduler does not obtain the resource requirements of the whole task, so the nodes allocated to the subtasks may not be the best choice, and both task execution efficiency and cluster resource utilization are relatively low. As a result, a large number of scheduling failures occur when cluster resources are tight or task types are complex.

In view of this, an embodiment of the present disclosure provides a task scheduling method. As shown in FIG. 2, the method may include steps 201 and 202.

Step 201: Obtain information about a task, where the information includes at least one of the task type and the resource requirement information of the task.

Step 202: According to the information about the task, assign a target node to each of multiple subtasks of the task.

The method in the embodiments of the present disclosure may be executed by any electronic device, such as a terminal device or a cloud server. In some embodiments, the method may be executed by a processor or a scheduler. Optionally, the processor or scheduler may run on a cloud platform such as Kubernetes, and specifically may be deployed on a container orchestration engine, but the embodiments of the present disclosure are not limited thereto. In step 201, information about one or more tasks may be obtained.

The multiple subtasks included in the task may be all subtasks of the task, or only some of them, for example, subtasks with large resource requirements, subtasks with higher priority, subtasks that depend on one another, or subtasks whose required resource types or quantities are similar. Subtasks that depend on one another are subtasks that have a dependency or master-slave relationship; for example, subtask A can start only after subtask B has finished executing. As a further example, the multiple subtasks may be subtasks of the same type or of at least one specific type. The task may be a neural network training task, an inference task, or another type of deep learning task. In some optional embodiments, the subtasks of the same task may be executed by the same node or by different nodes of the same cluster in the distributed system. Where the distributed system includes multiple clusters, the subtasks of the same task may be executed by nodes of the same cluster or of different clusters. Optionally, the task may first be assigned to a cluster, and the target nodes for the multiple subtasks of the task are then determined from that cluster. Assigning the multiple subtasks of a task to the same cluster in this way helps reduce communication overhead and improve task execution efficiency.

The information about the task may be obtained from user input or settings, from analysis of the task, and so on. It may include the task type and/or the resource requirement information of the task, where the resource requirement information of the task is specifically the resource requirement information of the multiple subtasks included in the task. In some embodiments, in addition to the task type and the resource requirement information, the information about the task may further include one or more of: priority information of the task, priority information of the multiple subtasks included in the task, dependency information between subtasks, user-provided information, historical information, computation amount information, and communication amount information.

The resource requirement information of a subtask may include the type and/or quantity of resources required to execute the subtask. The resource types may include, but are not limited to, at least one of CPU, GPU, DSP, FPGA, random access memory, disk, and network interface.
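The task information described above can be pictured as a small data model. The following is only a minimal sketch: the class names `SubTask` and `TaskInfo`, their fields, and the resource keys such as `"cpu"` and `"gpu"` are illustrative assumptions and are not defined by the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SubTask:
    """Hypothetical record for one subtask of a task."""
    name: str
    # Resource type -> quantity required, e.g. {"cpu": 4, "gpu": 1}.
    demand: Dict[str, int]
    priority: int = 0
    # Names of subtasks that must finish before this one can start.
    depends_on: List[str] = field(default_factory=list)


@dataclass
class TaskInfo:
    """Hypothetical container for the information obtained in step 201."""
    name: str
    task_type: Optional[str] = None  # e.g. "communication-intensive"
    subtasks: List[SubTask] = field(default_factory=list)

    def total_demand(self) -> Dict[str, int]:
        """Sum the per-subtask requirements by resource type."""
        total: Dict[str, int] = {}
        for subtask in self.subtasks:
            for rtype, qty in subtask.demand.items():
                total[rtype] = total.get(rtype, 0) + qty
        return total
```

Keeping the per-subtask requirements together in one task-level object is what lets a scheduler reason about the whole task rather than one subtask at a time.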

The priority information of the task may include, but is not limited to, the length of time the task has waited to be scheduled and/or the total resources required by the multiple subtasks of the task. The waiting time is the time between the moment the scheduler receives the task and the moment the task is scheduled. Tasks that have waited longer and tasks that require more total resources may be given higher priority, so that no task remains waiting to be scheduled for a long time.
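One way to read the priority rule above is as a score that grows with waiting time and with total demand. The sketch below is an assumption made for illustration: the linear formula, the weights, and the function names `task_priority` and `order_queue` do not come from the disclosure, and summing quantities across resource types is a deliberate simplification.

```python
from typing import Dict, List, Tuple


def task_priority(wait_seconds: float, total_demand: Dict[str, int],
                  wait_weight: float = 1.0, demand_weight: float = 1.0) -> float:
    """Hypothetical score: higher for longer waits and for larger total demand."""
    return wait_weight * wait_seconds + demand_weight * sum(total_demand.values())


def order_queue(tasks: List[Tuple[str, float, Dict[str, int]]], now: float) -> List[str]:
    """Return task names, highest priority first.

    Each entry is (name, time the scheduler received the task, total demand).
    """
    ranked = sorted(
        tasks,
        key=lambda t: task_priority(now - t[1], t[2]),
        reverse=True,
    )
    return [name for name, _received_at, _demand in ranked]
```

In practice the weights would need tuning so that neither waiting time nor task size dominates.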

The priority information of the multiple subtasks included in the task is used to determine the priorities of those subtasks. In some embodiments, within one scheduling cycle or resource allocation procedure, the subtasks may be scheduled based at least on the priority information of each subtask.

The dependency information between subtasks is used to determine the dependency or master-slave relationships among the multiple subtasks included in the task. If subtask B can start executing only after subtask A has finished, then subtask B depends on subtask A; equivalently, subtask A is the master subtask and subtask B is a slave subtask of subtask A.

The user-provided information is used to determine the computation amount and the communication amount of the multiple subtasks of the task.

The historical information may include historical scheduling information and/or historical execution records. The task type that the task had in past scheduling can be determined from the historical information.

Besides the information above, the information about the task may include other information depending on the actual application scenario, which is not detailed here.

In some embodiments, assigning target nodes to the multiple subtasks of the task according to the information about the task in step 202 includes: assigning a corresponding target node to each of the multiple subtasks when the currently available resources in the distributed system satisfy the total resources corresponding to the resource requirement information of the multiple subtasks. The total resources corresponding to the resource requirement information of the multiple subtasks are the total resources required by those subtasks; optionally, where the multiple subtasks are all the subtasks of the task, this total is also the total resources required by the task. If the currently available resources of the distributed system satisfy the total resources required by the multiple subtasks or by the task, the currently available resources are sufficient to run the multiple subtasks. If they do not, the currently available resources can support at most some of the multiple subtasks and cannot support all of them.

In the embodiments of the present disclosure, a corresponding target node is assigned to each of the multiple subtasks of the task only when the currently available resources in the distributed system satisfy the total resources required by those subtasks. In this way, the task serves as the unit of resource allocation, and nodes are allocated to its multiple subtasks together. This avoids the situation in which, when the currently available resources are insufficient, some subtasks of the task cannot be allocated resources, so that only part of the task's subtasks end up occupying resources of the distributed system. Resource utilization is improved as a result.
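A coarse form of the task-level check described above can be sketched as an aggregate comparison. This is only an illustration under stated assumptions: the names `can_admit` and `total_resources` and the dictionary layout are hypothetical, and an aggregate check is a necessary condition only, since the free resources may be fragmented across nodes.

```python
from collections import Counter
from typing import Dict, Iterable, List


def total_resources(items: Iterable[Dict[str, int]]) -> Counter:
    """Add up quantities per resource type."""
    total: Counter = Counter()
    for resources in items:
        total.update(resources)  # Counter.update adds counts
    return total


def can_admit(node_free: Dict[str, Dict[str, int]],
              subtask_demands: List[Dict[str, int]]) -> bool:
    """True if the summed free resources cover the summed subtask demand.

    node_free maps a node name to its free resources (hypothetical layout).
    """
    free = total_resources(node_free.values())
    need = total_resources(subtask_demands)
    return all(free[rtype] >= qty for rtype, qty in need.items())
```

Because a single subtask cannot be split across nodes, a per-subtask placement check is still needed before nodes are actually assigned.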

In some embodiments, target nodes may be assigned to the multiple subtasks based on the current resource state information of the distributed system and the resource requirement information of the multiple subtasks. Each subtask may be assigned a target node that can satisfy its resource requirements, and the target nodes of different subtasks may be the same or different.

In some embodiments, target nodes may be determined for the multiple subtasks based on the current resource state information of the distributed system and the resource requirement information of the multiple subtasks, and, when a corresponding target node has been successfully determined for every one of the multiple subtasks, the corresponding target nodes are assigned to the multiple subtasks of the task. The process of determining target nodes for the multiple subtasks is a simulated allocation. Simulated allocation is a virtual resource allocation process rather than one that is actually carried out; it determines by calculation whether the multiple subtasks can currently be allocated to nodes in the distributed system. By performing simulated allocation of nodes for the multiple subtasks of the task, it can be determined whether the distributed system currently satisfies the resources those subtasks require.

The current resource state information of the distributed system characterizes the resources currently available in the distributed system. It may include at least one of the quantity, type, and distribution of the currently available resources, or it may include the current state of multiple nodes in the distributed system, for example, whether a node is available or has available resources, and the type and quantity of its available resources. The resource requirement information of the multiple subtasks may include the resource requirement information of each subtask, for example, at least one of the required resource types and the required quantity of a given resource type, such as one or more of GPU, DSP, and CPU. The embodiments of the present disclosure are not limited thereto.

In some embodiments, based on the resource requirement information of the multiple subtasks and the current resource state information of the distributed system, it may be determined whether the distributed system contains target nodes that satisfy the requirements of the multiple subtasks, or the target node of each subtask may be determined in turn in a specific order. For example, if every one of the multiple subtasks can find a target node in the distributed system that satisfies its requirements, it can be determined that the currently available resources of the distributed system satisfy the total resources required by the multiple subtasks. Conversely, if at least one of the multiple subtasks cannot find such a target node, it can be determined that the currently available resources do not satisfy the total resources required by the multiple subtasks.

Unlike conventional node determination for a single subtask, in the embodiments of the present disclosure the target nodes of the multiple subtasks are determined in turn in a specific order, and the target node determined for a later subtask may be affected by the target nodes determined for earlier subtasks. In some embodiments, each time a corresponding target node has been determined for one subtask, the current resource state information of the distributed system is updated, for example by updating the current state information of that target node, and the target node for the next subtask is then determined based on the updated current resource state information. This avoids resource conflicts.

It should be noted that the update of the current resource state information mentioned here serves the simulated allocation, or target node determination, described above. It is a virtual update rather than a real one: it represents the virtual change in the currently available resources of the distributed system after one subtask's simulated node allocation or target node determination has been performed. Because the subtasks have not yet actually been allocated, the currently available resources in the distributed system have not actually changed. For example, after the target node of a subtask is determined, the target node can be regarded as virtually assigned to the subtask, and the portion of the target node's resources that corresponds to the subtask's requirements is virtually bound to the subtask, so that the virtually bound resources can no longer be assigned to other subtasks.

The specific order mentioned above may be a preset order. Alternatively, in some examples, the specific order may be determined according to the priorities of the multiple subtasks, for example, target nodes are determined first for subtasks with higher priority and then for subtasks with lower priority. The priority may be determined based on one or more factors such as the subtask type and the dependencies between the subtask and other subtasks. For example, for two subtasks that have a dependency or master-slave relationship, the master subtask has higher priority than the slave subtask, but the embodiments of the present disclosure are not limited thereto. In other examples, the specific order may be determined according to the dependency or master-slave relationships of the subtasks. Because a slave subtask cannot start running without its master subtask, assigning target nodes to master subtasks first helps improve the running efficiency of the whole task. The specific order may also be determined based on other factors, which the embodiments of the present disclosure do not limit.
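An order that places master subtasks before their slave subtasks, and breaks ties by priority, can be sketched with a topological sort. This is one possible reading rather than the method of the disclosure; the function name `scheduling_order` and the input layout are assumptions. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def scheduling_order(depends_on: Dict[str, Set[str]],
                     priority: Dict[str, int]) -> List[str]:
    """Order subtasks so that each one follows the subtasks it depends on.

    depends_on maps a subtask to the set of subtasks it must wait for
    (its masters). Among subtasks that are ready at the same time, the
    one with the higher priority value comes first.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order: List[str] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda s: -priority.get(s, 0))
        order.extend(ready)
        sorter.done(*ready)
    return order
```

For instance, with two workers that both depend on a master, the master is placed first even if its numeric priority is lower than that of the workers.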

In some embodiments, if the distributed system contains, for each of the multiple subtasks, a target node that satisfies the resources the subtask requires, it is determined that the currently available resources in the distributed system satisfy the total resources corresponding to the resource requirement information of the multiple subtasks of the task; otherwise, it is determined that they do not.

In the example shown in FIGS. 3A and 3B, suppose a task includes subtask 1, subtask 2, and subtask 3 in descending order of priority. A node is first allocated to subtask 1 by simulation according to the current available resource information of the distributed system and the resource requirement information of subtask 1, which yields the target node of subtask 1, and the current resource state information of the distributed system is then updated. Next, a node is allocated to subtask 2 by simulation according to the updated available resource information and the resource requirement information of subtask 2, which yields the target node of subtask 2, and the current resource state information is updated again. Finally, a node is allocated to subtask 3 by simulation according to the twice-updated available resource information and the resource requirement information of subtask 3, which yields the target node of subtask 3.

If the simulated node allocation succeeds for subtask 1, subtask 2, and subtask 3, it is determined that the currently available resources of the distributed system satisfy the requirements of each of the multiple subtasks, that is, they satisfy the total resources required by the multiple subtasks. If the simulated node allocation fails for subtask 1, subtask 2, or subtask 3, for example because no target node that satisfies the subtask's required resources can be found in the distributed system, it is determined that the currently available resources of the distributed system do not satisfy the total resources required by the multiple subtasks.
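The simulated allocation walked through above can be sketched as follows. The sketch works on a copy of the resource state so that the updates remain virtual, and it reports failure if any subtask cannot be placed. The first-fit node choice, the function names, and the dictionary layout are assumptions made for illustration only.

```python
import copy
from typing import Dict, List, Optional, Tuple


def fits(free: Dict[str, int], demand: Dict[str, int]) -> bool:
    """True if the free resources cover every requested quantity."""
    return all(free.get(rtype, 0) >= qty for rtype, qty in demand.items())


def simulate_allocation(
    node_free: Dict[str, Dict[str, int]],
    subtasks: List[Tuple[str, Dict[str, int]]],
) -> Optional[Dict[str, str]]:
    """Try to place every subtask, in the given order, without touching node_free.

    Returns a subtask -> node plan, or None if any subtask cannot be placed.
    """
    shadow = copy.deepcopy(node_free)  # virtual state; the real state is unchanged
    plan: Dict[str, str] = {}
    for name, demand in subtasks:
        # Assumption: first node that fits (the disclosure also describes scoring).
        target = next((n for n, free in shadow.items() if fits(free, demand)), None)
        if target is None:
            return None  # one failure means the whole task is not schedulable now
        for rtype, qty in demand.items():
            shadow[target][rtype] -= qty  # virtual binding of resources
        plan[name] = target
    return plan
```

Only when a complete plan is returned would the scheduler go on to perform the real assignment of each subtask to its target node.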

When it is determined from the simulated allocation that the currently available resources of the distributed system satisfy the resources required by each of the multiple subtasks, the target node determined for each subtask in the simulated allocation may be assigned to that subtask.

In some embodiments, the target node of each subtask may be determined based on the current state information of the nodes in the distributed system. For example, the nodes are ranked or scored based on their current state information, and the higher-ranked or higher-scored nodes are assigned to the multiple subtasks in turn in a specific order.

In some embodiments, when target nodes are determined for the multiple subtasks of the task, the target node corresponding to each subtask may be determined through two phases: preselection and preferred selection.

In the preselection phase, based on the resource requirement information of the multiple subtasks, at least one preselected node in the distributed system that can satisfy a subtask's requirements is initially selected for each subtask, or the nodes that cannot satisfy its requirements are eliminated. For example, a preselected node set may be determined for each of the multiple subtasks based on the current state information of multiple nodes in the distributed system and the resource requirement information of the multiple subtasks, where the preselected node set of each subtask includes at least one preselected node. The preselected node sets of different subtasks may include the same or different nodes.

In the preferred selection phase, the target node of each subtask is selected from the subtask's preselected node set. For example, the most suitable node in the preselected node set of a subtask may be chosen as the target node of that subtask. In some optional examples, the target node may be determined according to the current state information of each preselected node in the subtask's preselected node set, for example, at least one of the current load, the available resource types, the available resource quantities, the available resource distribution, the topological connections within the node, and the topological connections with other nodes. The available resource types include, but are not limited to, at least one of: communication interface, disk storage space, random access memory, CPU, and GPU. The available resource quantity may be the number of CPUs, the remaining disk capacity, the remaining random access memory capacity, and so on. The available resource distribution, also called resource fragmentation, is where the currently available resources are located, for example, how many disks the remaining disk capacity is spread over. The topology of the nodes in the distributed system may be, for example, a bus topology, a star topology, a ring topology, or a tree topology.

In some embodiments, in the preselection phase, the preselected node set of each subtask may be determined based on the current resource state information of the distributed system and the resource requirement information of that subtask. That is, the preselection of each of the multiple subtasks of the task may be independent of the others: the preselected node set of a subtask depends only on the resource requirement information of that subtask and not on that of other subtasks. Accordingly, the preselected node sets of the multiple subtasks may be determined in parallel or in any order. The preselected nodes can thus be determined quickly, which improves the overall efficiency of the simulated allocation.

For example, suppose the task includes subtask 1, subtask 2, and subtask 3 in descending order of priority. In the preselection phase, the following operations may be performed in parallel: the preselected node set of subtask 1 is determined from the resource requirement information of subtask 1 and the current resource state information of the distributed system; the preselected node set of subtask 2 is determined from the resource requirement information of subtask 2 and the current resource state information of the distributed system; and the preselected node set of subtask 3 is determined from the resource requirement information of subtask 3 and the current resource state information of the distributed system. Suppose the resulting preselected node set of subtask 1 is {node 1, node 2, node 3}, that of subtask 2 is {node 2, node 5, node 6, node 7}, and that of subtask 3 is {node 6, node 7}.
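Independent preselection can be sketched as a filter that is run once per subtask, here submitted to a thread pool to show that the calls do not depend on one another. The function names are hypothetical, and the node capacities and subtask demands in the usage example are invented so that the result matches the preselected node sets of the example above; the disclosure does not give these figures.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Set


def preselect(node_free: Dict[str, Dict[str, int]],
              demand: Dict[str, int]) -> Set[str]:
    """Keep only the nodes whose free resources cover the demand."""
    return {
        node
        for node, free in node_free.items()
        if all(free.get(rtype, 0) >= qty for rtype, qty in demand.items())
    }


def preselect_all(node_free: Dict[str, Dict[str, int]],
                  demands: Dict[str, Dict[str, int]]) -> Dict[str, Set[str]]:
    """Run preselection for every subtask; each call reads only its own demand."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(preselect, node_free, demand)
            for name, demand in demands.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Invented figures, chosen only to reproduce the sets in the example above.
NODE_FREE = {
    "node1": {"cpu": 8, "gpu": 0}, "node2": {"cpu": 8, "gpu": 1},
    "node3": {"cpu": 16, "gpu": 0}, "node4": {"cpu": 2, "gpu": 0},
    "node5": {"cpu": 4, "gpu": 1}, "node6": {"cpu": 4, "gpu": 2},
    "node7": {"cpu": 4, "gpu": 4},
}
DEMANDS = {"subtask1": {"cpu": 8}, "subtask2": {"gpu": 1}, "subtask3": {"gpu": 2}}
```

Calling `preselect_all(NODE_FREE, DEMANDS)` yields one set of node names per subtask. Because the filter only reads the shared state, no locking is needed in this sketch.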

In some embodiments, in the preselection phase, the preselected node sets of the multiple subtasks may be selected in turn in a specific order. In an optional example, the preselected node set of a later subtask is determined based on the preselected node set of an earlier subtask. For example, the preselected nodes of the later subtask are selected from the preselected node set of the earlier subtask, where there may be one earlier subtask or two or more. For example, the earlier subtasks may be all the subtasks that precede the later subtask, and an earlier subtask and the later subtask may be adjacent or separated by at least one subtask, but the embodiments of the present disclosure are not limited thereto.

In some embodiments, the preselected node sets of the multiple subtasks may also be determined by considering the resource requirement information of the multiple subtasks together. For example, a shared preselected node set is selected for the multiple subtasks, where the preselected nodes in the shared set can satisfy the resource requirements of at least two of the multiple subtasks.

In some embodiments, in the preferred selection phase, the target nodes of the multiple subtasks may be determined in parallel, or they may be determined jointly by combining the candidate node sets of the multiple subtasks. For example, the target nodes of at least some of the multiple subtasks are determined by intersecting the candidate node sets of those subtasks, but the embodiments of the present disclosure are not limited thereto.

In some embodiments, the target nodes corresponding to the multiple subtasks may be determined from their preselected node sets in turn in a specific order, for example based on the priorities of the multiple subtasks. In one optional example, each time the target node corresponding to a subtask is determined, the current state information of that target node is updated, for example based on the resource requirement information of the subtask. Then, while target nodes are being determined for subsequent subtasks, the portion of the target node's resources that corresponds to the subtask's requirements is unavailable, which ensures that the multiple subtasks can all execute smoothly. In another optional example, the target node of a later subtask is selected based on the target node selected for an earlier subtask. For example, the later subtask preferentially selects the target node of the earlier subtask as its own target node, unless the candidate node set of the later subtask does not contain that target node or other key factors of that target node do not satisfy the requirements of the later subtask, but the embodiments of the present disclosure are not limited thereto.

When resources are allocated to the multiple subtasks synchronously, the preselection and preferred selection described above may be a simulated allocation process. When resources are allocated to the multiple subtasks asynchronously, the preselection and preferred selection may instead be a real node allocation process. Likewise, the update of the current resource state information involved in the above process may be a real update performed because the currently available resources in the distributed system change after the node allocation of one subtask has been carried out.

In the embodiments of the present disclosure, the target node may be selected from the preselected node set according to a given policy. For example, the target node is selected based on the current resource state information of the distributed system, or the current state information of each preselected node, together with the resource requirement information of the multiple subtasks. In some embodiments, a score may be determined for each preselected node in the preselected node set of each subtask, for example based on the current state information of each preselected node in the subtask's preselected node set and the resource requirement information of the subtask, and the target node of each subtask is selected from its preselected node set based on these scores. For example, the preselected node with the highest score among the at least one preselected node in the set may be selected as the target node. Alternatively, the target node may be selected based on the score together with other factors.

Continuing the earlier example, suppose the target nodes of the multiple subtasks are determined one after another in order of task priority. First, the scores of the preselected nodes in the preselected node set of subtask 1 are determined. Assuming that node 1, node 2 and node 3 score 80, 90 and 70 respectively, node 2 is determined as the target node of subtask 1, and the current state information of node 2 is updated. Next, the scores of the preselected nodes in the preselected node set of subtask 2 are determined, where the score of node 2 is based on its updated current state information. Assuming that node 2, node 5, node 6 and node 7 score 60, 80, 75 and 70 respectively, node 5 is determined as the target node of subtask 2, and the current state information of node 5 is updated. Finally, the scores of the preselected nodes in the preselected node set of subtask 3 are determined. Assuming that node 6 and node 7 score 70 and 60 respectively, node 6 is determined as the target node of subtask 3.
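The sequential selection in this example can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `pick_targets`, the single integer resource per node, and the scoring rule (capacity left over after placement) are all assumptions made for the sketch. What it does show is that a node's score is recomputed from its updated state before the next subtask is handled.

```python
from typing import Dict, List, Optional


def pick_targets(
    order: List[str],
    preselected: Dict[str, List[str]],
    free: Dict[str, int],
    demand: Dict[str, int],
) -> Optional[Dict[str, str]]:
    """Choose one target node per subtask, re-scoring after every choice."""
    free = dict(free)  # copy, so the caller's resource state is not modified
    targets: Dict[str, str] = {}
    for sub in order:  # 'order' is assumed to be sorted by priority already
        feasible = [n for n in preselected[sub] if free[n] >= demand[sub]]
        if not feasible:
            return None  # no target node exists for this subtask
        # Hypothetical score: capacity left over after placing the subtask.
        best = max(feasible, key=lambda n: free[n] - demand[sub])
        targets[sub] = best
        free[best] -= demand[sub]  # update the chosen node's state
    return targets


targets = pick_targets(
    order=["subtask1", "subtask2"],
    preselected={"subtask1": ["node1", "node2", "node3"],
                 "subtask2": ["node2", "node5"]},
    free={"node1": 2, "node2": 4, "node3": 1, "node5": 3},
    demand={"subtask1": 2, "subtask2": 2},
)
```

Here node2 is the best choice for subtask1, but once its free capacity has been reduced it loses to node5 when subtask2 is scored.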

In some embodiments, the task type of the task may be determined based on the task information acquired in step 201, and target nodes may be allocated to the multiple subtasks based on that task type. Task types may be defined according to actual conditions. For deep learning tasks such as the training and inference of deep learning models, for example neural networks, tasks may optionally be divided into two types: communication-intensive tasks and computation-intensive tasks. A communication-intensive task is a task in which the subtasks communicate with each other frequently during processing, and a computation-intensive task is a task that involves a relatively large amount of computation during processing.

Specifically, the task type of the task may be determined based on the overall situation of the multiple subtasks of the task. As one optional example, the task type may be determined based on the amount of computation and the amount of communication of the multiple subtasks. For example, if the amount of computation is greater than a preset computation threshold, the task is regarded as a computation-intensive task; if the amount of communication is greater than a preset communication threshold, the task is regarded as a communication-intensive task. If both the amount of computation and the amount of communication exceed their respective preset thresholds, the task may be determined to be a communication-intensive task. As another optional example, the task type may be determined from historical scheduling experience. For a particular task, if the scheduler determined the task to be computation-intensive in a previous scheduling process, the task is determined to be computation-intensive; if the scheduler determined the task to be communication-intensive in a previous scheduling process, the task is determined to be communication-intensive. Optionally, the task type may also be determined based on at least one historical execution of the task. For example, if computation took up more time or more computing resources in the most recent one or more executions, the task is determined to be computation-intensive; if communication took up more time or more communication resources in the most recent one or more executions, the task is determined to be communication-intensive. Alternatively, the task type may be determined based on user-provided information, such as task type information provided by the user, or information on the amount of computation and/or communication of the task provided by the user. The task type may also be determined in other ways. Embodiments of the present disclosure do not limit the specific implementation of determining the task type.
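The threshold rule described above can be sketched as follows. The names `TaskType` and `classify_task`, the numeric units, and the choice to let a user-provided type take precedence are assumptions made for illustration. The text does not say what happens when neither threshold is exceeded, so the sketch returns `None` in that case.

```python
from enum import Enum
from typing import Optional


class TaskType(Enum):
    COMMUNICATION_INTENSIVE = "communication-intensive"
    COMPUTATION_INTENSIVE = "computation-intensive"


def classify_task(
    computation: float,
    communication: float,
    computation_threshold: float,
    communication_threshold: float,
    user_hint: Optional[TaskType] = None,
) -> Optional[TaskType]:
    """Classify a task from its overall computation and communication load."""
    if user_hint is not None:
        return user_hint  # assumption: a user-provided type takes precedence
    # Checking communication first makes a task that exceeds both thresholds
    # communication-intensive, as in the text.
    if communication > communication_threshold:
        return TaskType.COMMUNICATION_INTENSIVE
    if computation > computation_threshold:
        return TaskType.COMPUTATION_INTENSIVE
    return None  # not specified by the text; left undecided in this sketch
```

Checking the communication threshold first is the simplest way to encode the tie-break for tasks that exceed both thresholds.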

In embodiments of the present disclosure, the target nodes of the multiple subtasks may be determined based on the task type, so that the task as a whole can be executed more efficiently. For example, for a communication-intensive task, the multiple subtasks of the task may be allocated to as few target nodes as possible, thereby reducing communication overhead. As another example, for a computation-intensive task, it may optionally be sufficient that the resources on the target nodes satisfy the requirements of the multiple subtasks of the task; alternatively, nodes with better computing performance may be preferred, or fragmented resources on nodes may be used first, or communication overhead may be optimized as far as possible provided that the computing resources meet the user's requirements, and so on.

In some optional examples, the preselection and/or optimization processes above may be performed according to the task type of the task.

In some embodiments, the preselection process above is performed based on the task type of the task. For example, for a communication-intensive task, nodes that clearly cannot satisfy the requirements of most of the multiple subtasks, such as nodes with few available resource types and/or small amounts of available resources, may be eliminated. As another example, the number of nodes in the preselected node set may be kept within a certain range, where nodes with more resource types and/or larger amounts of resources, or nodes whose resource types closely match the overall resource requirements of the multiple subtasks, may be preferentially selected as preselected nodes. In some optional examples, a single preselected node set may be determined for some or all of the subtasks of a communication-intensive task, rather than a separate preselected node set for each subtask, although embodiments of the present disclosure are not limited thereto.

In other embodiments, the optimization process above is performed based on the task type. For example, the strategy for selecting target nodes is determined based on the task type. For example, when determining the target node of a subtask, the target nodes already determined for the preceding subtasks are preferred. For example, if a task includes subtask 1 and subtask 2, and the target nodes of subtask 1 include node 1 and node 2, then node 1 and node 2 may be preferred as target nodes for subtask 2. In some optional examples, the score of each preselected node in the preselected node set of a subtask may be determined according to the current state information of each node in that set and the task type of the task. In other embodiments, the score of each preselected node in the preselected node set of a subtask may be determined in other ways, which are not described again here.

In one example, for a communication-intensive task, it is desirable to use as few target nodes as possible so as to reduce the communication cost between subtasks. Therefore, for the subtasks of a communication-intensive task, a node whose resources are concentrated generally scores higher than a node whose resources are scattered. For example, if the available disk capacity of preselected node 1 of a subtask is spread over two disks while the available disk capacity of preselected node 2 is on a single disk, preselected node 2 scores higher than preselected node 1 for that subtask. In another example, for a computation-intensive task, a node with stronger computing capability generally scores higher than a node with weaker computing capability. For example, if the processor of preselected node 1 of a subtask has one processor core and the processor of preselected node 2 has two processor cores, preselected node 2 scores higher than preselected node 1 for that subtask.
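A type-dependent score of this kind might look like the sketch below. The function `node_score`, its arguments, and the concrete formulas (the reciprocal of the number of disks holding free capacity, and the core count) are hypothetical. They are chosen only so that the two examples above come out in the stated order.

```python
from typing import List


def node_score(task_type: str, cores: int, free_disk_per_device: List[int]) -> float:
    """Score a preselected node; the scoring formulas are illustrative only."""
    if task_type == "communication-intensive":
        # Fewer devices holding the free capacity means more concentrated
        # resources, which is scored higher.
        devices_in_use = sum(1 for capacity in free_disk_per_device if capacity > 0)
        return 1.0 / devices_in_use if devices_in_use else 0.0
    # Computation-intensive: more processor cores score higher.
    return float(cores)


# Disk example: node 1 spreads 100 units over two disks, node 2 uses one disk.
node1_comm = node_score("communication-intensive", cores=1, free_disk_per_device=[50, 50])
node2_comm = node_score("communication-intensive", cores=1, free_disk_per_device=[100])

# Core example: node 1 has one core, node 2 has two cores.
node1_comp = node_score("computation-intensive", cores=1, free_disk_per_device=[100])
node2_comp = node_score("computation-intensive", cores=2, free_disk_per_device=[100])
```

In both cases node 2 receives the higher score, matching the examples in the text.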

The flow above has been described with subtasks as the objects of resource allocation. In some embodiments, the multiple subtasks of the task may also be divided into at least one group, where each group includes at least one of the multiple subtasks, and a corresponding target node is allocated to each of the at least one group of the task, where the subtasks in the same group are allocated to the same target node.

The number of groups may equal the number of nodes required by the task. That is, when the multiple subtasks of a task need to be allocated to N target nodes, the subtasks are divided into N groups. The groups may contain the same or different numbers of subtasks. For example, the subtasks may be assigned to groups at random; or subtasks with similar priorities may be placed in one group; or subtasks with, or without, dependencies on each other may be placed in one group; or subtasks whose resource requirements differ considerably in amount may be placed in one group; or subtasks that require the same resource types may be placed in one group, and so on. Embodiments of the present disclosure do not limit the specific grouping manner. Determining target nodes in units of groups in this way can improve the efficiency of the resource allocation process without noticeably affecting the efficiency of task execution.

In some embodiments, the multiple subtasks of the task may be divided into at least one group according to the task type of the task. For example, for a communication-intensive task, the multiple subtasks may be allocated to the smallest possible number of target nodes so as to reduce the communication cost between subtasks; for example, all of the subtasks may be placed in the same group so that they all execute on the same target node. In some optional examples, the total number of nodes required by the multiple subtasks of the task may be determined according to the resource requirement information of the subtasks, and the subtasks may be divided into at least one group according to that total number of nodes. As another example, for a computation-intensive task, each of the multiple subtasks may be treated as its own group; that is, target nodes are determined in units of subtasks.

For example, suppose a task includes 4 subtasks, each subtask requires one GPU, and each node in the distributed system has 2 available GPUs. At least 2 nodes are then needed to execute the 4 subtasks of the task. Therefore, if the task is communication-intensive, the 4 subtasks may be divided evenly into 2 groups, where the grouping may be random, or may be based on the resource requirement information of the subtasks, and so on. If the task is computation-intensive, the 4 subtasks may be divided into 4 groups, with each subtask forming one group.
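The grouping arithmetic in this example can be sketched as below. The function `group_subtasks` and its parameters are hypothetical, and the sketch assumes every subtask needs the same number of GPUs and every node offers the same number of available GPUs. Round-robin slicing is only one of the grouping choices the text allows.

```python
import math
from typing import List


def group_subtasks(
    subtasks: List[str],
    gpus_per_subtask: int,
    gpus_per_node: int,
    task_type: str,
) -> List[List[str]]:
    """Split subtasks into groups; each group is meant for one target node."""
    if task_type == "computation-intensive":
        return [[s] for s in subtasks]  # one group per subtask
    fit_per_node = gpus_per_node // gpus_per_subtask
    if fit_per_node == 0:
        raise ValueError("a single subtask does not fit on one node")
    # Minimum number of nodes, and therefore groups, needed for all subtasks.
    n_groups = math.ceil(len(subtasks) / fit_per_node)
    # Round-robin slicing keeps group sizes within one subtask of each other.
    return [subtasks[i::n_groups] for i in range(n_groups)]


tasks = ["t1", "t2", "t3", "t4"]
comm_groups = group_subtasks(tasks, 1, 2, "communication-intensive")
comp_groups = group_subtasks(tasks, 1, 2, "computation-intensive")
```

With 4 single-GPU subtasks and 2 GPUs per node, the communication-intensive case yields 2 groups of 2 subtasks and the computation-intensive case yields 4 groups of 1.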

Where grouping is used, the determination of target nodes for subtasks described above may optionally also apply to the determination of target nodes for groups, as described in detail below.

For example, in some embodiments, if every one of the at least one group of the task has a corresponding target node, it is determined that the currently available resources in the distributed system satisfy the total resources required by the multiple subtasks of the task. In other embodiments, if at least one group of the task has no corresponding target node, it is determined that the currently available resources in the distributed system do not satisfy the total resources required by the multiple subtasks of the task.

As another example, target nodes may be determined for each of the at least one group one after another in a specific order, based on the current resource state information of the distributed system and the resource requirement information of each group. The resource requirement information of a group may be determined from the resource requirement information of the at least one subtask in the group; for example, the resources required by a group may be the sum, per resource type, of the resources required by all subtasks in the group. The specific order may optionally be determined according to at least one of the priorities of and the dependencies between the groups, or according to other factors. The priority of a group may be determined from the priorities of the two or more subtasks in the group, for example as the highest priority among those subtasks, or as their average priority, and so on. Dependencies between groups may be determined from the dependencies between the subtasks in different groups; for example, if one subtask in group 1 depends on one subtask in group 2, group 1 is determined to depend on group 2, and so on. Embodiments of the present disclosure are not limited thereto.
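The aggregation rules described above can be sketched as follows. The helper names `group_demand` and `group_priority` and the dictionary layout are assumptions for illustration. The two priority modes correspond to the "highest priority" and "average priority" options mentioned in the text.

```python
from collections import Counter
from typing import Dict, List


def group_demand(group: List[str], demand: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Sum the demand of all subtasks in a group, per resource type."""
    total: Counter = Counter()
    for subtask in group:
        total.update(demand[subtask])  # adds counts per resource type
    return dict(total)


def group_priority(group: List[str], priority: Dict[str, int], mode: str = "max") -> float:
    """Derive a group's priority from the priorities of its subtasks."""
    values = [priority[s] for s in group]
    if mode == "max":
        return float(max(values))
    return sum(values) / len(values)  # any other mode: average priority


demand = {"t1": {"gpu": 1, "cpu": 4}, "t2": {"gpu": 1, "cpu": 2}}
priority = {"t1": 3, "t2": 5}
total = group_demand(["t1", "t2"], demand)
highest = group_priority(["t1", "t2"], priority)
average = group_priority(["t1", "t2"], priority, mode="mean")
```

For this group the aggregate demand is 2 GPUs and 6 CPU units, the highest priority is 5, and the average priority is 4.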

In some embodiments, a preselected node set may also be determined for each group based on the current resource state information of the distributed system and the resource requirement information of that group, and the target node of each group may be determined from its preselected node set, one group after another in a certain order.

In embodiments of the present disclosure, target nodes are first determined for the task as a whole, for example by performing the preselection and optimization operations above, so as to obtain information on the target nodes corresponding to the whole task. The task is then split to the subtask level for binding; that is, target nodes are bound in units of subtasks.

In some embodiments, if the currently available resources in the distributed system do not satisfy the total resources required by the multiple subtasks of the task, or if a corresponding target node cannot be determined for at least one of the multiple subtasks, allocation may be deferred for all of the subtasks of the task. Deferred allocation may mean allocating the subtasks of the task in the next allocation cycle. In the next allocation cycle, corresponding target nodes are again allocated to the multiple subtasks of the task only if the currently available resources in the distributed system satisfy the total resources required by those subtasks. If, in the next allocation cycle, the currently available resources in the distributed system still do not satisfy the total resources required by the multiple subtasks of the task, allocation continues to be deferred.

In some embodiments, each time the target node corresponding to the current subtask or group is determined, the current subtask or group may be virtually allocated to that target node, or virtually bound to resources on that target node. If it is then determined that the currently available resources in the distributed system do not satisfy the total resources required by the multiple subtasks of the task, for example because no corresponding target node can be found for some subtask, the virtual allocations or virtual bindings may be treated as invalid, so that the corresponding target nodes and the resources on them can be allocated to other tasks.
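The all-or-nothing behaviour of virtual binding can be sketched as below. The function `try_allocate`, the single integer resource per node, and the first-fit node choice are assumptions made for this sketch. The bindings are recorded against a trial copy of the resource state and are simply discarded if any subtask cannot be placed, which leaves the real state untouched so that allocation can be deferred.

```python
from typing import Dict, List, Optional, Tuple


def try_allocate(
    subtasks: List[str],
    demand: Dict[str, int],
    free: Dict[str, int],
) -> Tuple[Optional[Dict[str, str]], Dict[str, int]]:
    """Virtually bind every subtask; commit only if all of them can be bound."""
    trial = dict(free)  # virtual bindings only touch this copy
    bindings: Dict[str, str] = {}
    for subtask in subtasks:
        # First-fit choice, for brevity; a real scheduler would score the nodes.
        node = next((n for n, f in trial.items() if f >= demand[subtask]), None)
        if node is None:
            return None, free  # invalidate the virtual bindings; defer the task
        trial[node] -= demand[subtask]
        bindings[subtask] = node
    return bindings, trial  # success: the caller adopts the updated state


free = {"n1": 2, "n2": 1}
demand = {"t1": 1, "t2": 1, "t3": 1, "t4": 1}
ok_bindings, ok_state = try_allocate(["t1", "t2", "t3"], demand, free)
failed_bindings, failed_state = try_allocate(["t1", "t2", "t3", "t4"], demand, free)
```

The three-subtask call succeeds and returns an updated state, whereas the four-subtask call returns no bindings and the original, unmodified state.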

In some embodiments, after corresponding target nodes have been allocated to the multiple subtasks of the task, the multiple subtasks of the task may be scheduled synchronously.

It should be noted that, in embodiments of the present disclosure, allocation refers to distributing subtasks to their corresponding target nodes, and scheduling refers to the corresponding target nodes executing the subtasks after the subtasks have been distributed to them.

As shown in FIG. 4, in the conventional scheduling approach, in order to implement gang scheduling, the scheduler processes the multiple subtasks of a task in order and, as soon as a subtask can be scheduled, actually allocates that subtask (as opposed to the simulated allocation described in embodiments of the present disclosure) to its target node and occupies resources on the target node to schedule the subtask. For a given task, the resources currently available when each of its subtasks is allocated change dynamically and gradually decrease. This approach is referred to as task by task scheduling. Moreover, when subtasks are scheduled in this way, the scheduling status and resource requirements of the multiple subtasks of the same task are not considered together, so each allocation is an independent action. In practice, however, the subtasks of the same task are often related. For example, some tasks to be scheduled involve a large amount of communication; if some subtasks of such a task are allocated to multiple nodes with high communication overhead between them, the running efficiency of the whole task is significantly reduced.

In embodiments of the present disclosure, by contrast, subtasks are allocated and scheduled at the granularity of tasks. As shown in FIG. 5, all tasks are still first obtained in order of priority. Each task is then considered as a whole: corresponding target nodes are allocated to the multiple subtasks of the task only if the currently available resources in the distributed system satisfy the total resources required by those subtasks, and the multiple subtasks are then scheduled synchronously. In other words, resource allocation and scheduling are performed in units of tasks.

Unlike the earlier task by task scheduling approach driven by the requirements of individual subtasks, embodiments of the present disclosure consider the task as a whole when selecting nodes for it. By treating the whole task as the smallest unit of scheduling, the relationships between and requirements of the subtasks within the task can be better exploited to select more suitable scheduling nodes from the cluster for the whole task, thereby improving the running efficiency of the task and the resource utilization of the cluster.

FIG. 6 is a schematic diagram of the scheduling logic of some embodiments of the present disclosure. To reduce scheduling complexity, the following assumptions may be made for each task to be scheduled: (1) each task has multiple subtasks; (2) the subtasks are homogeneous in their resource requirements, that is, they require the same resources; (3) for each task submitted by a user, the resource requirement of each subtask is specified at submission, and the number of subtasks and the resource requirement of each subtask remain unchanged throughout the processing of the task.

First, the preselection process is performed to eliminate nodes that cannot satisfy the user's task. Specifically, it includes the following steps:

(1) For each subtask of the task, a preselected node set of the subtask may be determined based on the current resource state information of the distributed system and the resource requirement information of the subtask; nodes outside the preselected node set of a subtask are necessarily unable to support running that subtask. Here, a task may be referred to as a job, and a subtask may be referred to as a task. The preselected node sets of different subtasks may be the same or different.

In some optional examples, the preselected nodes of a later subtask may be determined from the preselected node set of the preceding subtask, where the order of the subtasks may optionally be determined based on their priorities. For example, suppose the distributed system includes node 1 to node 5, and a task includes subtask 1, subtask 2 and subtask 3. The preselected nodes of subtask 1 may be determined first, say node 1, node 2 and node 3; the preselected nodes of subtask 2 are then determined from node 1, node 2 and node 3, say node 2 and node 3; and the preselected nodes of subtask 3 are then determined from node 2 and node 3, say node 3. For tasks with a certain amount of communication, and for communication-intensive tasks in particular, determining the preselected node sets in this way can make their determination more efficient and thereby improve the efficiency of task allocation.
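The narrowing preselection in this example can be sketched as follows. The function `narrowing_preselect` and the `fits` predicate are hypothetical; in practice `fits` would compare a node's available resources with the subtask's requirements. Each subtask is only checked against the nodes that survived for the previous subtask.

```python
from typing import Callable, Dict, List


def narrowing_preselect(
    order: List[str],
    all_nodes: List[int],
    fits: Callable[[str, int], bool],
) -> Dict[str, List[int]]:
    """Preselect nodes for each subtask from the previous subtask's set."""
    candidates = list(all_nodes)
    result: Dict[str, List[int]] = {}
    for subtask in order:  # assumed to be ordered by priority
        candidates = [n for n in candidates if fits(subtask, n)]
        result[subtask] = candidates
    return result


# Illustrative feasibility table: which nodes could run each subtask at all.
feasible = {
    "subtask1": {1, 2, 3},
    "subtask2": {2, 3, 4},
    "subtask3": {3, 5},
}
preselected = narrowing_preselect(
    ["subtask1", "subtask2", "subtask3"],
    [1, 2, 3, 4, 5],
    lambda subtask, node: node in feasible[subtask],
)
```

Node 4 could run subtask 2 and node 5 could run subtask 3, but neither is considered because it was not in the preceding preselected set, which reproduces the sets {1, 2, 3}, {2, 3} and {3} from the example.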

(2) Each preselected node in the preselected node set is verified to determine whether the total resources on that preselected node satisfy the total resource requirements of all subtasks allocated to it. If so, the optimization process is performed; otherwise, the preselection process is performed again.

This step is optional; verifying the preselected nodes can improve the reliability of the overall task allocation. Before the preselection process, the node set may also be partitioned, that is, the nodes may be divided into groups. The grouping may be based on the cluster to which each node belongs, so that nodes in different clusters are placed in different groups. Partitioning the node set allows different tasks to be allocated only to the nodes of specific groups.

Next, the optimization process is performed to select the target nodes best suited to the user's task. Specifically, it may include the following steps:

(1) From the information provided by the user with the task and from the scheduler's historical experience, obtain the amount of computation and the amount of communication of the task during training, and determine from these whether the user's task is computation-intensive or communication-intensive.

In some optional examples, the task type may also be determined during or before the preselection process above, and the preselected node sets of the subtasks may be determined based on the task type.

In some optional examples, the target nodes of the multiple subtasks are determined in the optimization process based on the task type. If the task is communication-intensive, it is placed on (that is, allocated to) the smallest possible number of physical nodes (that is, target nodes); the specific number of physical nodes is determined jointly by the available resources of the nodes and the resources required by the task. If this placement constraint cannot be satisfied, allocation is deferred for all of the subtasks of the task. If the task is computation-intensive, the placement constraint is relaxed: even if the minimum number of physical nodes cannot be achieved, allocation proceeds as long as nodes with sufficient resources to satisfy the task can be selected; otherwise, allocation is deferred.

(2) The cluster nodes are scored and sorted in descending order of free resources, giving the scores of the mapping from the preselected node set to the task. The score of a preselected node may be determined from the node load and the node resource fragmentation in the cluster of the distributed system. Node load refers to the total amount of currently available resources and the resource usage of each node in the cluster at the current moment. Resource fragmentation refers to how resources are distributed on a node.

(3) The subtasks (tasks) of the job are placed on the nodes selected in the previous step, from the highest score to the lowest, until no further placement is possible or all subtasks of the job have been placed.

(4) Task status check. If the job still has subtasks (tasks) that have not been placed, the cluster resources are insufficient and allocation fails; the resources occupied by the job are released and allocation is deferred. If the job has no unplaced subtasks and its task type is communication-intensive, it is determined whether the minimum-node requirement is met. If it is met, allocation succeeds; if not, allocation fails, the resources are released, and allocation is deferred.
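Steps (2) to (4) of the optimization process can be sketched together as below. This is a simplified illustration under stated assumptions: subtasks are homogeneous, each node has a single integer resource, the score is simply the amount of free resource, and the minimum-node requirement is computed from a hypothetical `node_capacity` parameter (the full capacity of one node). None of these details are fixed by the text.

```python
import math
from typing import Dict, Optional


def place_job(
    n_tasks: int,
    per_task: int,
    free: Dict[str, int],
    node_capacity: int,
    communication_intensive: bool,
) -> Optional[Dict[str, int]]:
    """Return {node: number of subtasks placed}, or None to defer allocation."""
    # Step (2): rank nodes by free resources, highest first.
    ranked = sorted(free, key=free.get, reverse=True)
    # Step (3): place subtasks node by node until none remain or nodes run out.
    placement: Dict[str, int] = {}
    remaining = n_tasks
    for node in ranked:
        if remaining == 0:
            break
        count = min(remaining, free[node] // per_task)
        if count > 0:
            placement[node] = count
            remaining -= count
    # Step (4): status check.
    if remaining > 0:
        return None  # cluster resources insufficient: release and defer
    if communication_intensive:
        min_nodes = math.ceil(n_tasks * per_task / node_capacity)
        if len(placement) > min_nodes:
            return None  # minimum-node requirement not met: release and defer
    return placement


free = {"n1": 4, "n2": 4}
comm_result = place_job(8, 1, free, node_capacity=8, communication_intensive=True)
comp_result = place_job(8, 1, free, node_capacity=8, communication_intensive=False)
```

Because the placement is only a returned plan, "releasing resources" on failure amounts to discarding it. In this usage an 8-subtask job could fit on one empty 8-unit node, so the communication-intensive job is deferred when it would have to span two half-free nodes, while the computation-intensive job is placed across both.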

The methods of embodiments of the present disclosure may be applied to a task orchestration system or a cloud platform, and the task orchestration system may be implemented on a platform such as Kubernetes.

Those skilled in the art will understand that, in the methods of the specific embodiments described above, the order in which the steps are written does not imply a strict order of execution and does not limit the implementation in any way; the specific order in which the steps are executed should be determined by their functions and possible internal logic.

As shown in FIG. 7, embodiments of the present disclosure further provide a task scheduling apparatus, which includes: an acquisition module 701, configured to acquire information of a task, the information of the task including at least one of a task type and resource requirement information of the task; and a first allocation module 702, configured to allocate target nodes to the multiple subtasks of the task respectively according to the information of the task.

在一些實施例中,所述任務的資源需求資訊為所述任務包括的多個子任務的資源需求資訊;該第一分配模組702還在分散式系統中的當前可用資源滿足所述任務的多個子任務的資源需求資訊對應的總資源的情況下,分別為所述任務的多個子任務分配對應的目標節點。 In some embodiments, the resource requirement information of the task is the resource requirement information of a plurality of subtasks included in the task; the first allocation module 702 also satisfies the multiplicity of the task with currently available resources in the distributed system In the case of the total resources corresponding to the resource requirement information of subtasks, the corresponding target nodes are assigned to the multiple subtasks of the task respectively.
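The total-resource condition above can be illustrated with a short check. This is a sketch under assumptions: resources are modeled as plain `{resource name: amount}` dictionaries, and the function names are hypothetical. Note that an aggregate check of this kind is only a necessary condition; resources that are fragmented across nodes can still make per-node placement fail.

```python
from collections import Counter


def total_demand(subtask_demands):
    """Sum the per-subtask demands into one {resource: amount} total."""
    total = Counter()
    for demand in subtask_demands:
        total.update(demand)  # Counter.update adds amounts per key
    return total


def cluster_can_satisfy(available, subtask_demands):
    """True if cluster-wide available resources cover the summed demand."""
    need = total_demand(subtask_demands)
    return all(available.get(res, 0) >= amount for res, amount in need.items())
```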

In some embodiments, the apparatus further includes: a first determination module, configured to determine target nodes for the multiple subtasks based on the current resource state information of the distributed system and the resource requirement information of the multiple subtasks; the first allocation module is configured to allocate the corresponding target nodes to the multiple subtasks of the task when a corresponding target node has been successfully determined for each of the multiple subtasks.

In some embodiments, the first determination module is configured to determine the target nodes corresponding to the multiple subtasks one by one, in a specific order, based on the current resource state information of the distributed system and the resource requirement information of the multiple subtasks.

In an optional example, the resource requirement information of the multiple subtasks includes the resource requirement information of each subtask, which may include the required resource types and the quantity of each type of resource, and may further include other information.

In an optional example, the current resource state information of the distributed system may include current state information of multiple nodes in the distributed system, where the current state information indicates at least one of: whether resources are available, the types and quantities of available resources, the load, and topology connection information; it may further include other information.

In some embodiments, the apparatus further includes an update module, configured to update the current resource state information of the distributed system each time the target node allocated to a subtask has been determined, so that the target node of the next subtask is determined based on the updated current resource state information.

In some embodiments, the order in which the target nodes of the multiple subtasks are determined is obtained according to at least one of the priorities of the subtasks and the dependencies between the subtasks.
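One way to derive such an order is sketched below. The text only says the order follows at least one of priority and dependency; combining both through a priority-aware topological sort (Kahn's algorithm with a heap) is an assumption made for this example, as are the function name and the input shapes.

```python
import heapq


def scheduling_order(priorities, depends_on):
    """Order subtasks so prerequisites come first, higher priority first among ready ones.

    priorities: {subtask: int}, larger means more urgent.
    depends_on: {subtask: set of prerequisite subtasks}; every prerequisite
                is assumed to appear in `priorities` as well.
    """
    pending = {t: set(depends_on.get(t, ())) for t in priorities}
    dependents = {t: [] for t in priorities}
    for task, deps in pending.items():
        for dep in deps:
            dependents[dep].append(task)
    # heapq is a min-heap, so negate the priority to pop the largest first.
    ready = [(-priorities[t], t) for t, deps in pending.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, task = heapq.heappop(ready)
        order.append(task)
        for child in dependents[task]:
            pending[child].discard(task)
            if not pending[child]:
                heapq.heappush(ready, (-priorities[child], child))
    if len(order) != len(priorities):
        raise ValueError("dependency cycle detected")
    return order
```

Here a high-priority subtask still waits for its prerequisites, so the dependency constraint always takes precedence over priority.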

In some embodiments, the first determination module includes: a first determination unit, configured to determine target nodes for the multiple subtasks one by one based on the current resource state information of the distributed system and the resource requirement information of the multiple subtasks; and an update unit, configured to update the current resource state information of the distributed system, each time the target node of a subtask has been determined, based on the resource requirement information of that subtask.
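The interplay of the determination unit and the update unit can be sketched as a loop over a working copy of the resource state. This is an illustration only: the first-fit node choice, the dictionary layout, and the function name are assumptions, not details taken from the disclosure.

```python
import copy


def assign_sequentially(subtask_demands, node_free):
    """Determine a target node per subtask, updating the state after each choice.

    subtask_demands: list of (subtask name, {resource: amount}) in scheduling order.
    node_free: {node: {resource: free amount}}; left unmodified.
    Returns {subtask: node}, or None if any subtask cannot be placed.
    """
    state = copy.deepcopy(node_free)  # work on a snapshot of the current state
    targets = {}
    for name, demand in subtask_demands:
        # Determination: first node whose free resources cover the demand.
        chosen = next(
            (node for node, free in state.items()
             if all(free.get(res, 0) >= amt for res, amt in demand.items())),
            None,
        )
        if chosen is None:
            return None  # nothing is committed, so nothing needs releasing
        # Update: subtract this subtask's demand before handling the next one.
        for res, amt in demand.items():
            state[chosen][res] -= amt
        targets[name] = chosen
    return targets
```

Because the real state is only a snapshot until every subtask has a target, a failure part-way through leaves the cluster view untouched.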

In some embodiments, the apparatus further includes: a second determination module, configured to determine, when a corresponding target node cannot be successfully determined for at least one of the multiple subtasks, that the currently available resources of the distributed system do not satisfy the resources required by every subtask of the multiple subtasks.

In some embodiments, the first determination unit includes: a determination subunit, configured to determine a preselected node set for each of the multiple subtasks based on the current resource state information of the distributed system and the resource requirement information of the multiple subtasks; and a selection subunit, configured to select the target node of each subtask from the preselected node set of that subtask.

The preselected node sets of different subtasks may be the same or different. In some examples, multiple subtasks may have the same preselected node set.

In some embodiments, the determination subunit is configured to determine the preselected node set of each subtask based on the current resource state information of the distributed system and the resource requirement information of that subtask.

The preselected node sets of the multiple subtasks may be determined independently, with no dependency between them; for example, they may be determined in parallel or in any order. For example, the preselected node sets of the multiple subtasks are determined based on the same current resource state information of the distributed system; that is, the current resource state information is not updated during preselection. For example, the preselected node set of each subtask depends only on that subtask's own resource requirement information and not on the resource requirement information of the other subtasks.
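Independent preselection can be sketched as a pure filter over one shared snapshot of the node state. The dictionary layout and function names below are assumptions made for illustration.

```python
def preselect(node_free, demand):
    """Return the set of nodes whose free resources cover one subtask's demand."""
    return {
        node
        for node, free in node_free.items()
        if all(free.get(res, 0) >= amt for res, amt in demand.items())
    }


def preselect_all(node_free, subtask_demands):
    """Preselect for every subtask against the same, un-updated snapshot.

    Each call to preselect() reads node_free without modifying it, so the
    per-subtask results do not depend on evaluation order.
    """
    return {name: preselect(node_free, demand)
            for name, demand in subtask_demands.items()}
```

Since no state is updated here, two subtasks may both list a node that can hold only one of them; resolving that conflict is left to the later selection stage.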

In some embodiments, the determination subunit is configured to determine the preselected node set of each of the multiple subtasks one by one in a specific order, where the preselected nodes in the preselected node set of a first subtask are selected from the preselected node set of a second subtask, the second subtask preceding the first subtask in the order.

In some embodiments, the target node of each subtask is selected from that subtask's preselected nodes one by one in a specific order, where the target node of a second subtask is preferentially selected as the target node of a first subtask, the second subtask preceding the first subtask in the order.
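The preference for reusing an earlier subtask's target node can be sketched as below. This is a simplified illustration: it assumes each candidate list already reflects remaining capacity and is sorted by preference, and both the function name and the fallback rule (take the first candidate) are hypothetical.

```python
def select_targets_with_affinity(ordered_subtasks, candidates):
    """Pick a target per subtask, preferring nodes chosen for earlier subtasks.

    ordered_subtasks: subtask names in scheduling order.
    candidates: {subtask: list of preselected nodes, best first}.
    Returns {subtask: node}, or None if some subtask has no candidate.
    """
    targets = {}
    used = []  # target nodes already chosen, in the order they were first used
    for name in ordered_subtasks:
        options = candidates[name]
        if not options:
            return None
        # Prefer a node that an earlier subtask was already assigned to.
        reused = next((node for node in used if node in options), None)
        target = reused if reused is not None else options[0]
        targets[name] = target
        if target not in used:
            used.append(target)
    return targets
```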

In some embodiments, the selection subunit is configured to: determine the target node of a first subtask from the preselected node set of the first subtask based on the current resource state information of the distributed system; update the current resource state information of the distributed system based on the resource requirement information of the first subtask; and determine the target node of a second subtask from the preselected node set of the second subtask based on the updated current resource state information of the distributed system.

In an optional example, the target node of the first subtask is determined from the first subtask's preselected node set based on the current state information of each preselected node in that set, and the current state information of the target node of the first subtask is updated based on the required-resource information of the first subtask.

In some embodiments, the selection subunit is configured to: determine, according to the task type of the task, the score of each preselected node in the preselected node set of each subtask; and select the target node of each subtask from that subtask's preselected node set based on those scores.

In an optional example, the score determination strategy for each preselected node in the preselected node set is determined based on the task type of the task.
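Choosing a scoring strategy by task type can be sketched as a lookup table of scoring functions. The two strategies below, and which task type each is mapped to, are purely illustrative assumptions; this passage does not specify the actual scoring formulas.

```python
def score_for_colocation(node):
    """Hypothetical strategy: favor nodes with more free GPUs, so that more
    subtasks of the same task can share a node."""
    return node["free_gpus"]


def score_for_fragment_filling(node):
    """Hypothetical strategy: favor nodes with fewer free GPUs, so that
    leftover fragments are used up first."""
    return -node["free_gpus"]


# Assumed mapping from task type to scoring strategy.
STRATEGIES = {
    "communication_intensive": score_for_colocation,
    "compute_intensive": score_for_fragment_filling,
}


def pick_target(task_type, preselected):
    """Return the name of the highest-scoring preselected node."""
    score = STRATEGIES[task_type]
    # Ties are broken by node name so the result is deterministic.
    return max(preselected, key=lambda n: (score(n), n["name"]))["name"]
```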

In some embodiments, the apparatus further includes: a third determination module, configured to determine the task type of the task, where the target nodes corresponding to the multiple subtasks are determined based on the task type of the task.

In some embodiments, the first allocation module includes: a grouping unit, configured to divide the multiple subtasks of the task into at least one group, each group including at least one of the multiple subtasks; and an allocation unit, configured to allocate a corresponding target node to each of the at least one group, where all subtasks in the same group are allocated to the same target node.

In some embodiments, the grouping unit is configured to divide the multiple subtasks of the task into at least one group according to the task type of the task.

In some embodiments, the task type of the task is computation-intensive or communication-intensive.

In some embodiments, the grouping unit is configured to: when the task type of the task is communication-intensive, determine the total number of nodes required by the multiple subtasks according to the resource requirement information of the multiple subtasks, and divide the multiple subtasks of the task into at least one group according to that total number of nodes.

In some embodiments, when the task type of the task is computation-intensive, each of the multiple subtasks is treated as one group.
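The two grouping rules can be sketched as follows. The example assumes identical subtasks and uniform node capacity measured in GPUs (`gpus_per_subtask`, `gpus_per_node`); those parameters, the task-type strings, and the function name are hypothetical simplifications.

```python
import math


def group_subtasks(subtasks, task_type, gpus_per_subtask=1, gpus_per_node=8):
    """Split subtasks into groups; each group is meant to land on one node."""
    if task_type == "compute_intensive":
        # Every subtask is its own group and can be placed independently.
        return [[s] for s in subtasks]
    if task_type == "communication_intensive":
        per_node = gpus_per_node // gpus_per_subtask
        if per_node < 1:
            raise ValueError("a single subtask does not fit on one node")
        # Total node count needed, derived from the resource requirements.
        node_count = math.ceil(len(subtasks) / per_node)
        return [subtasks[i * per_node:(i + 1) * per_node]
                for i in range(node_count)]
    raise ValueError(f"unknown task type: {task_type}")
```

Packing communication-intensive subtasks into as few groups as capacity allows keeps most traffic inside a node, which is the apparent motivation for grouping by total node count.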

In some embodiments, a preselected node set is determined for each of the at least one group, and the target node of each group is selected in turn from that group's preselected node set.

In some embodiments, the first allocation module further determines target nodes for the multiple subtasks of the task based on the task type of the task, and allocates target nodes to the multiple subtasks included in the task when a corresponding target node has been successfully determined for every one of the subtasks.

In some embodiments, the apparatus further includes: a delay module, configured to defer allocation for all of the multiple subtasks of the task when determining a corresponding target node fails for at least one of the subtasks.

In some embodiments, the delay module defers allocation for all of the multiple subtasks included in the task when the currently available resources in the distributed system do not satisfy the total resources corresponding to the resource requirement information of the multiple subtasks of the task.
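Deferred allocation can be sketched as a scheduling round that requeues any task whose all-or-nothing allocation fails. The queue discipline and the `try_allocate` callback are assumptions for illustration; the disclosure does not prescribe how deferred tasks are retried.

```python
from collections import deque


def run_scheduling_round(queue, try_allocate):
    """Attempt each queued task once; defer those that cannot be fully allocated.

    queue: deque of tasks, consumed from the left.
    try_allocate: hypothetical callable returning True only if every subtask
                  of the task received a target node.
    Returns the list of tasks scheduled in this round; deferred tasks are
    put back on the queue in their original order.
    """
    scheduled = []
    deferred = deque()
    while queue:
        task = queue.popleft()
        if try_allocate(task):
            scheduled.append(task)
        else:
            deferred.append(task)  # all of its subtasks wait together
    queue.extend(deferred)
    return scheduled
```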

In some embodiments, the apparatus further includes: a scheduling module, configured to schedule the multiple subtasks of the task synchronously after corresponding target nodes have been allocated to them.

In some embodiments, the apparatus is applied to a task orchestration system.

In some embodiments, the functions or modules of the apparatus provided by the embodiments of the present disclosure can be used to perform the methods described in the method embodiments above; for their specific implementation, refer to the description of those method embodiments, which is not repeated here for brevity.

The apparatus embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this specification. Those of ordinary skill in the art can understand and implement it without creative effort.

The apparatus embodiments of this specification can be applied to computer equipment, such as a server or a terminal device. The apparatus embodiments may be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the apparatus in the logical sense is formed when the processor of the device on which it resides reads the corresponding computer program instructions from non-volatile memory into random access memory and runs them. At the hardware level, FIG. 8 shows a hardware structure diagram of the computer equipment on which the apparatus of this specification resides. In addition to the processor 801, random access memory 802, network interface 803, and non-volatile memory 804 shown in FIG. 8, the server or electronic device on which the apparatus resides may include other hardware depending on the actual function of the computer equipment, which is not described further here.

Correspondingly, an embodiment of the present disclosure further provides a computer storage medium on which a computer program is stored; when executed by a processor, the program implements the method of any of the embodiments.

Correspondingly, an embodiment of the present disclosure further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements the method of any of the embodiments.

The present disclosure may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain program code. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change random access memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Other embodiments of the present disclosure will be readily apparent to those skilled in the art after considering the specification and practicing the disclosure herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only; the true scope and spirit of the present disclosure are indicated by the claims below.

It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

The above are merely preferred embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within its scope of protection.

201: Acquire information of a task, the information including at least one of the task type and the resource requirement information of the task

202: Allocate target nodes to the multiple subtasks of the task according to the information of the task

Claims (21)

一種任務調度方法,所述方法應用於計算機設備,所述計算機設備包括處理器與儲存器,所述方法包括:由所述處理器獲取任務的資訊,所述任務包括多個子任務,所述資訊包括所述任務的任務類型和資源需求資訊的至少一者,所述任務的資源需求資訊為所述任務包括的所述多個子任務的資源需求資訊;在分散式系統中的當前可用資源滿足所述多個子任務的所述資源需求資訊對應的總資源的情況下,由所述處理器分別為所述多個子任務分配對應的目標節點。 A task scheduling method, the method is applied to a computer device, the computer device includes a processor and a memory, the method includes: obtaining information of a task by the processor, the task includes a plurality of subtasks, the information Including at least one of the task type of the task and resource requirement information, the resource requirement information of the task is the resource requirement information of the plurality of subtasks included in the task; the currently available resources in the distributed system meet the required In the case of total resources corresponding to the resource requirement information of the multiple subtasks, the processor assigns corresponding target nodes to the multiple subtasks respectively. 如請求項1所述的任務調度方法,更包括:由所述處理器基於所述分散式系統的當前資源狀態資訊和所述多個子任務的所述資源需求資訊,為所述多個子任務確定目標節點;所述在分散式系統中的當前可用資源滿足所述多個子任務的所述資源需求資訊對應的總資源的情況下,由所述處理器分別為所述多個子任務分配對應的目標節點,包括:在成功為所述多個子任務中的每個子任務確定對應的目標節點的情況下,由所述處理器分別為所述多個子任務分配對應的目標節點。 The task scheduling method according to claim 1, further comprising: determining, by the processor, for the plurality of subtasks based on the current resource state information of the distributed system and the resource requirement information of the plurality of subtasks Target node: when the currently available resources in the distributed system satisfy the total resources corresponding to the resource demand information of the multiple subtasks, the processor assigns corresponding targets to the multiple subtasks respectively The node includes: if the corresponding target node is successfully determined for each subtask in the multiple subtasks, assigning the corresponding target node to the multiple subtasks by the processor. 
如請求項2所述的任務調度方法,其中,所述由所述處理器基於所述分散式系統的當前資源狀態資訊和所述多個子 任務的所述資源需求資訊,為所述多個子任務確定目標節點,包括:由所述處理器基於所述分散式系統的所述當前資源狀態資訊和所述多個子任務的所述資源需求資訊,依次為所述多個子任務確定目標節點;在每確定一個子任務的目標節點後,由所述處理器基於所述一個子任務的資源需求資訊,更新所述分散式系統的所述當前資源狀態資訊。 The task scheduling method according to claim 2, wherein the processor is based on the current resource state information of the distributed system and the plurality of child The resource requirement information of the task, determining the target node for the plurality of subtasks, comprising: by the processor based on the current resource status information of the distributed system and the resource requirement information of the plurality of subtasks , determining target nodes for the plurality of subtasks in turn; after each target node of a subtask is determined, the processor updates the current resource of the distributed system based on the resource requirement information of the subtask status information. 如請求項2或3所述的任務調度方法,其中,在未能成功為所述多個子任務中的至少一個子任務確定對應的目標節點的情況下,由所述處理器確定所述分散式系統的所述當前可用資源不滿足所述多個子任務中每個子任務所需求的資源。 The task scheduling method according to claim 2 or 3, wherein, in the case of failing to determine the corresponding target node for at least one of the plurality of subtasks, the processor determines the distributed The currently available resources of the system do not meet the resources required by each subtask in the plurality of subtasks. 
如請求項2至3中任一項所述的任務調度方法,其中,所述由所述處理器基於所述分散式系統的當前資源狀態資訊和所述多個子任務的所述資源需求資訊,為所述多個子任務確定目標節點,包括:由所述處理器基於所述分散式系統的所述當前資源狀態資訊和所述多個子任務的所述資源需求資訊,確定所述多個子任務中每個子任務的預選節點集合;由所述處理器從所述每個子任務的所述預選節點集合中選擇所述每個子任務的目標節點。 The task scheduling method according to any one of claims 2 to 3, wherein the processor is based on the current resource status information of the distributed system and the resource requirement information of the plurality of subtasks, Determining target nodes for the plurality of subtasks includes: determining, by the processor, one of the plurality of subtasks based on the current resource state information of the distributed system and the resource requirement information of the plurality of subtasks A preselected node set for each subtask; the processor selects a target node for each subtask from the preselected node set for each subtask. 如請求項5所述的任務調度方法,其中,所述由所述處理器確定所述多個子任務中每個子任務的預選節點集合,包括:由所述處理器按照特定順序依次確定所述每個子任務的所述預選節點集合,其中,所述多個子任務中第一子任務的所述預選節點集合中的預選節點是從所述多個子任務中第二子任務的所述預選節點集合選取的,其中,所述第二子任務的順序位於所述第一子任務之前。 The task scheduling method according to claim 5, wherein the determining by the processor the preselected node set of each subtask in the plurality of subtasks includes: sequentially determining the set of nodes for each subtask in a specific order by the processor The preselected node set of subtasks, wherein the preselected node in the preselected node set of the first subtask of the plurality of subtasks is selected from the preselected node set of the second subtask of the plurality of subtasks , wherein the sequence of the second subtask is before the first subtask. 
如請求項5所述的任務調度方法,其中,所述由所述處理器從所述每個子任務的所述預選節點集合中選擇所述每個子任務的目標節點,包括:由所述處理器按照特定順序依次從所述每個子任務的所述預選節點集合中選擇所述每個子任務的目標節點,其中,優先選擇所述多個子任務中第二子任務的目標節點作為所述多個子任務中第一子任務的目標節點,其中,所述第二子任務的順序位於所述第一子任務之前。 The task scheduling method according to claim 5, wherein the selecting the target node of each subtask by the processor from the preselected node set of each subtask includes: by the processor Selecting the target node of each subtask from the preselected node set of each subtask in a specific order, wherein the target node of the second subtask among the plurality of subtasks is preferentially selected as the plurality of subtasks The target node of the first subtask in , wherein the order of the second subtask is before the first subtask. 如請求項5所述的任務調度方法,其中,所述由所述處理器從所述每個子任務的所述預選節點集合中選擇所述每個子任務的目標節點,包括:由所述處理器根據所述任務的所述任務類型,確定所述每個子任務的所述預選節點集合中包含的預選節點的分值;由所述處理器基於所述每個子任務的所述預選節點集合中包 含的預選節點的分值,從所述每個子任務的所述預選節點集合中選擇所述每個子任務的目標節點。 The task scheduling method according to claim 5, wherein the selecting the target node of each subtask by the processor from the preselected node set of each subtask includes: by the processor According to the task type of the task, determine the score of the preselected nodes included in the preselected node set of each subtask; Select the target node of each subtask from the set of preselected nodes of each subtask. 如請求項2或3所述的任務調度方法,其中,所述多個子任務依次進行目標節點的確定的順序是基於子任務的優先級、子任務之間的依賴關係中的至少一項確定的。 The task scheduling method according to claim 2 or 3, wherein the order in which the plurality of subtasks sequentially determine the target node is determined based on at least one of the priority of the subtasks and the dependency relationship between the subtasks . 
如請求項1至3任意一項所述的任務調度方法,更包括:確定所述任務的所述任務類型;由所述處理器基於所述任務的所述任務類型,分別為所述多個子任務分配對應的目標節點。 The task scheduling method according to any one of claim items 1 to 3, further comprising: determining the task type of the task; by the processor based on the task type of the task, respectively The target node corresponding to the task assignment. 如請求項1至3任意一項所述的任務調度方法,其中,所述由所述處理器分別為所述多個子任務分配對應的目標節點,包括:由所述處理器將所述多個子任務劃分為至少一個分組,其中,每個分組包括所述多個子任務中的至少一個子任務;由所述處理器分別為所述至少一個分組中每個分組分配對應的目標節點,其中,同一分組中的各個子任務分配到同一目標節點。 The task scheduling method according to any one of claim items 1 to 3, wherein the assigning corresponding target nodes to the plurality of subtasks by the processor includes: assigning the plurality of subtasks by the processor The task is divided into at least one group, wherein each group includes at least one subtask in the plurality of subtasks; the processor assigns a corresponding target node to each group in the at least one group, wherein the same Each subtask in the group is assigned to the same target node. 如請求項11所述的任務調度方法,其中,所述由所述處理器將所述多個子任務劃分為至少一個分組,包括:由所述處理器根據所述任務的所述任務類型,將所述多個子任務劃分為至少一個分組。 The task scheduling method according to claim 11, wherein said dividing the plurality of subtasks into at least one group by the processor includes: dividing by the processor according to the task type of the task The plurality of subtasks are divided into at least one group. 如請求項10所述的任務調度方法,其中,所述任務的所述任務類型為計算密集型或通信密集型。 The task scheduling method according to claim 10, wherein the task type of the task is computation-intensive or communication-intensive. 
如請求項11所述的任務調度方法,其中,所述由所述處理器將所述多個子任務劃分為至少一個分組,包括:在所述任務的所述任務類型為通信密集型的情況下,由所述處理器根據所述多個子任務的所述資源需求資訊,確定所述多個子任務所需的總節點數量,並根據所述總節點數量,將所述多個子任務劃分為至少一個分組;和/或在所述任務的所述任務類型為計算密集型的情況下,由所述處理器將所述多個子任務中的每個子任務作為一個分組。 The task scheduling method according to claim 11, wherein said dividing said plurality of subtasks into at least one group by said processor includes: when said task type of said task is communication-intensive , the processor determines the total number of nodes required by the multiple subtasks according to the resource requirement information of the multiple subtasks, and divides the multiple subtasks into at least one according to the total number of nodes grouping; and/or in a case where the task type of the task is computationally intensive, the processor treats each of the plurality of subtasks as a group. 如請求項10所述的任務調度方法,其中,所述由所述處理器基於所述任務的所述任務類型,分別為所述多個子任務分配對應的目標節點,包括:由所述處理器基於所述任務的所述任務類型,為所述多個子任務確定目標節點;在為所述多個子任務均成功確定對應的目標節點的情況下,由所述處理器為所述多個子任務分配目標節點。 The task scheduling method according to claim 10, wherein said assigning corresponding target nodes to said plurality of subtasks by said processor based on said task type of said task comprises: said processor Based on the task type of the task, determine target nodes for the plurality of subtasks; when the corresponding target nodes are successfully determined for the plurality of subtasks, allocate the plurality of subtasks by the processor target node. 如請求項15所述的任務調度方法,更包括:在為所述子任務中的至少一個子任務確定對應的目標節點不成功的情況下,由所述處理器對所述多個子任務均進行延遲分配。 The task scheduling method according to claim 15, further comprising: in the case of unsuccessful determination of the corresponding target node for at least one of the subtasks, the processor performs Delayed distribution. 
如請求項1至3任意一項所述的任務調度方法,更包括:在所述分散式系統中的所述當前可用資源不滿足所述多個子任務的所述資源需求資訊對應的所述總資源的情況下,由所述處理器對所述多個子任務均進行延遲分配。 The task scheduling method according to any one of claims 1 to 3, further comprising: the currently available resources in the distributed system do not satisfy the total corresponding to the resource requirement information of the multiple subtasks In the case of resources, the processor performs delayed allocation on the multiple subtasks. 如請求項1至3任意一項所述的任務調度方法,更包括:在分別為所述多個子任務分配對應的目標節點之後,由所述處理器對所述多個子任務進行同步調度。 The task scheduling method according to any one of claims 1 to 3, further comprising: after assigning corresponding target nodes to the multiple subtasks, the processor performs synchronous scheduling on the multiple subtasks. 一種任務調度裝置,包括:獲取模組,用於獲取任務的資訊,所述任務包括多個子任務,所述資訊包括所述任務的任務類型和資源需求資訊的至少一者,所述任務的資源需求資訊為所述任務包括的所述多個子任務的資源需求資訊;第一分配模組,用於在分散式系統中的當前可用資源滿足所述多個子任務的所述資源需求資訊對應的總資源的情況下,分別為所述多個子任務分配對應的目標節點。 A task scheduling device, comprising: an acquisition module for acquiring task information, the task includes a plurality of subtasks, the information includes at least one of the task type and resource requirement information of the task, and the resource of the task The requirement information is the resource requirement information of the plurality of subtasks included in the task; the first allocation module is used for the currently available resources in the distributed system to satisfy the total corresponding to the resource requirement information of the plurality of subtasks In the case of resources, corresponding target nodes are assigned to the multiple subtasks respectively. 一種計算機可讀儲存媒體,其上儲存有計算機程式,該程式被處理器執行時實現請求項1至18任意一項所述的任務調度方法。 A computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the task scheduling method described in any one of claims 1 to 18 is implemented. 
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the task scheduling method according to any one of claims 1 to 18.
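The all-or-nothing behaviour recited in the method claims (assign target nodes only when the available resources cover the total demand and a node is found for every subtask, otherwise delay all subtasks) can be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name `try_assign`, the use of GPU counts as the only resource, and the first-fit node choice are hypothetical and are not taken from the patent.

```python
def try_assign(subtasks, free_gpus):
    """Try to place every subtask; return None to signal delayed assignment.

    subtasks:  dict mapping subtask name -> GPUs required (assumed resource).
    free_gpus: dict mapping node name -> GPUs currently free; it is updated
               only if every subtask is placed.
    """
    # Global check: total demand versus currently available resources.
    if sum(subtasks.values()) > sum(free_gpus.values()):
        return None  # delay all subtasks

    remaining = dict(free_gpus)  # tentative bookkeeping, not yet committed
    placement = {}
    # Assumed heuristic: place the largest requests first, first-fit per node.
    for name, need in sorted(subtasks.items(), key=lambda kv: -kv[1]):
        node = next((n for n, free in remaining.items() if free >= need), None)
        if node is None:
            return None  # one failure delays the whole task
        remaining[node] -= need
        placement[name] = node

    free_gpus.update(remaining)  # commit only after every subtask succeeded
    return placement
```

A caller would re-queue the task whenever `try_assign` returns `None` and retry later; once a placement is returned, all subtasks can be launched together, which corresponds to the synchronous scheduling step.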
TW110108474A 2020-03-11 2021-03-10 Task scheduling method and apparatus, storage media and computer equipment TWI786564B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202010165763.7A CN113391914A (en) 2020-03-11 2020-03-11 Task scheduling method and device
CN202010165763.7 2020-03-11
CN202010165543.4A CN113391886A (en) 2020-03-11 2020-03-11 Task scheduling method and device
CN202010165543.4 2020-03-11

Publications (2)

Publication Number Publication Date
TW202134870A TW202134870A (en) 2021-09-16
TWI786564B TWI786564B (en) 2022-12-11

Family

ID=77671227

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110108474A TWI786564B (en) 2020-03-11 2021-03-10 Task scheduling method and apparatus, storage media and computer equipment

Country Status (4)

Country Link
JP (1) JP2022539955A (en)
KR (1) KR20220002547A (en)
TW (1) TWI786564B (en)
WO (1) WO2021180092A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113961328B (en) * 2021-10-26 2022-07-19 深圳大学 Task processing method and device, storage medium and electronic equipment
CN114020434A (en) * 2021-11-09 2022-02-08 中国建设银行股份有限公司 Task processing method and device, electronic equipment and storage medium
CN114035931A (en) * 2021-12-22 2022-02-11 北京字节跳动网络技术有限公司 Task scheduling processing method and device
CN114546623B (en) * 2022-03-01 2022-12-27 淮安市第二人民医院 Task scheduling method and system based on big data system
WO2024069843A1 (en) * 2022-09-29 2024-04-04 楽天モバイル株式会社 Distributed deployment control for microservices
CN115658271B (en) * 2022-11-01 2023-07-21 中科雨辰科技有限公司 Method for acquiring target task object based on target task list

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201317910A (en) * 2011-10-08 2013-05-01 Broadcom Corp Social device resource management
TWI493481B (en) * 2011-10-08 2015-07-21 美國博通公司 Social device anonymity via full, content only, and functionality access views
CN106502791A (en) * 2016-10-14 2017-03-15 浪潮电子信息产业股份有限公司 A kind of method for allocating tasks and device
CN110187960A (en) * 2019-04-23 2019-08-30 广东省智能制造研究所 A kind of distributed resource scheduling method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104253850A (en) * 2014-01-07 2014-12-31 深圳市华傲数据技术有限公司 Distributed task scheduling method and system
US10439890B2 (en) * 2016-10-19 2019-10-08 Tata Consultancy Services Limited Optimal deployment of fog computations in IoT environments
CN107135257A (en) * 2017-04-28 2017-09-05 东方网力科技股份有限公司 Task is distributed in a kind of node cluster method, node and system
CN107291545B (en) * 2017-08-07 2019-12-10 星环信息科技(上海)有限公司 Task scheduling method and device for multiple users in computing cluster

Also Published As

Publication number Publication date
KR20220002547A (en) 2022-01-06
JP2022539955A (en) 2022-09-14
TW202134870A (en) 2021-09-16
WO2021180092A1 (en) 2021-09-16

Similar Documents

Publication Publication Date Title
TWI786564B (en) Task scheduling method and apparatus, storage media and computer equipment
Polo et al. Performance-driven task co-scheduling for mapreduce environments
CN103593242B (en) Resource sharing control system based on Yarn frameworks
US9092266B2 (en) Scalable scheduling for distributed data processing
US8893148B2 (en) Performing setup operations for receiving different amounts of data while processors are performing message passing interface tasks
CN112416585B (en) Deep learning-oriented GPU resource management and intelligent scheduling method
US20170024251A1 (en) Scheduling method and apparatus for distributed computing system
CN111381950A (en) Task scheduling method and system based on multiple copies for edge computing environment
US20090064165A1 (en) Method for Hardware Based Dynamic Load Balancing of Message Passing Interface Tasks
CN112052068A (en) Method and device for binding CPU (central processing unit) of Kubernetes container platform
CN114356543A (en) Kubernetes-based multi-tenant machine learning task resource scheduling method
CN113342477A (en) Container group deployment method, device, equipment and storage medium
CN113391914A (en) Task scheduling method and device
US20210390405A1 (en) Microservice-based training systems in heterogeneous graphic processor unit (gpu) cluster and operating method thereof
US20090064166A1 (en) System and Method for Hardware Based Dynamic Load Balancing of Message Passing Interface Tasks
WO2024120205A1 (en) Method and apparatus for optimizing application performance, electronic device, and storage medium
CN114386560A (en) Data processing method and device
CN111459684A (en) Cloud computing resource fusion scheduling management method, system and medium for multiprocessor architecture
CN112948113A (en) Cluster resource management scheduling method, device, equipment and readable storage medium
Ungureanu et al. Kubernetes cluster optimization using hybrid shared-state scheduling framework
Pastorelli et al. Practical size-based scheduling for MapReduce workloads
CN116157778A (en) System and method for hybrid centralized and distributed scheduling on shared physical hosts
Nzanywayingoma et al. Task scheduling and virtual resource optimising in Hadoop YARN-based cloud computing environment
Thai et al. Algorithms for optimising heterogeneous Cloud virtual machine clusters
Polo et al. Adaptive task scheduling for multijob mapreduce environments