CN112114950A - Task scheduling method and device and cluster management system - Google Patents


Info

Publication number
CN112114950A
CN112114950A (application CN202010997982.1A)
Authority
CN
China
Prior art keywords
subtask
task
kubernetes
cluster
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010997982.1A
Other languages
Chinese (zh)
Inventor
杨清强
吕文栋
薛佳梅
Current Assignee
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date
Filing date
Publication date
Application filed by China Construction Bank Corp filed Critical China Construction Bank Corp
Priority to CN202010997982.1A
Publication of CN112114950A
Legal status: Pending

Classifications

    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/5038 Allocation of resources to service a request, the resource being a machine, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/505 Allocation of resources to service a request, the resource being a machine, considering the load
    • G06F2209/484 Indexing scheme relating to G06F9/48: Precedence
    • G06F2209/5021 Indexing scheme relating to G06F9/50: Priority


Abstract

The invention discloses a task scheduling method and device and a cluster management system, and relates to the field of computer technology. One embodiment of the method comprises: receiving a task scheduling request, acquiring a big data running task according to the request, and dividing the task into at least one subtask; generating a task schedule corresponding to the at least one subtask, and distributing the subtasks to at least one target Kubernetes cluster according to the schedule, wherein each target Kubernetes cluster is one of a plurality of Kubernetes clusters; and after the target clusters have run the subtasks, obtaining the subtask running results they return. This embodiment turns a big data running task from a single-cluster job into a job run jointly by multiple clusters, which improves the utilization of the overall underlying computing resources and lays the groundwork for further scenarios combining big data and containers.

Description

Task scheduling method and device and cluster management system
Technical Field
The invention relates to the technical field of computers, in particular to a task scheduling method and device and a cluster management system.
Background
With the continuous development of container technology and the ever-deeper integration of big data with container-related applications, a distinct branch of big data applications has formed. Big data components can be containerized by sinking big data tasks to underlying containers with the help of the container management capability of a Kubernetes cluster. Kubernetes is Google's open-source container cluster management system; it provides application deployment, maintenance, and scaling mechanisms, and makes it convenient to manage containerized applications running across clusters.
In the prior art, task issuing and scheduling are implemented in one of two ways: first, on the basis of a single Kubernetes cluster, tasks are scheduled to appropriate nodes by computing the load on each node of the cluster; second, the overall resources required by a big data task are computed from a global perspective, and the task is issued and scheduled only when a single Kubernetes cluster satisfies the overall resource requirement.
In the process of implementing the invention, the inventors found at least the following problems in the prior art: in the first approach, deadlock occurs when cluster resources are relatively tight or several big data tasks compete to run simultaneously; in the second approach, deadlock caused by resource competition is avoided at the cost of tasks with a large resource footprint, whose running time becomes long; and both approaches can only schedule resources within a single cluster.
Disclosure of Invention
In view of this, embodiments of the present invention provide a task scheduling method and apparatus and a cluster management system, which turn a big data running task from a single-cluster job into a job run jointly by multiple clusters, improve the utilization of the overall underlying computing resources, and lay the groundwork for further scenarios combining big data and containers.
To achieve the above object, according to a first aspect of the embodiments of the present invention, a task scheduling method is provided.
The task scheduling method of the embodiment of the invention comprises the following steps: receiving a task scheduling request, acquiring a big data running task according to the request, and dividing the task into at least one subtask; generating a task schedule corresponding to the at least one subtask, and distributing the at least one subtask to at least one target Kubernetes cluster according to the schedule, wherein the at least one target Kubernetes cluster is a cluster among a plurality of Kubernetes clusters; and after the at least one target Kubernetes cluster runs the at least one subtask, obtaining a subtask running result returned by the at least one target Kubernetes cluster.
Optionally, generating a task schedule corresponding to the at least one subtask includes: acquiring resource demand data of each subtask, and sorting the at least one subtask in descending order of resource demand; and determining a target Kubernetes cluster corresponding to each subtask in that order to generate the task schedule corresponding to the at least one subtask.
Optionally, determining a target Kubernetes cluster corresponding to each subtask includes: acquiring resource data of the plurality of Kubernetes clusters; and selecting a target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters according to the resource data of the clusters and the resource demand data of each subtask.
Optionally, the resource data of the plurality of Kubernetes clusters includes available resource data and a load value for each Kubernetes cluster; and selecting a target Kubernetes cluster corresponding to each subtask includes: filtering, from the plurality of Kubernetes clusters, those whose available resource data is not greater than the resource demand data of the subtask, to obtain at least one selectable Kubernetes cluster corresponding to each subtask; and ranking the at least one selectable Kubernetes cluster by priority, and determining the highest-ranked cluster as the target Kubernetes cluster corresponding to the subtask.
Optionally, ranking the at least one selectable Kubernetes cluster by priority includes: calculating a priority score for each selectable Kubernetes cluster from its available resource data and its load value, using a weight value for available resource data and a weight value for load; and sorting the at least one selectable Kubernetes cluster from high to low by priority score.
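The filter-then-rank selection in the two claims above can be sketched as follows; the field names, weight values, and the exact score formula are illustrative assumptions, not the patent's concrete implementation.

```python
def pick_target_cluster(clusters, demand, w_avail=0.7, w_load=0.3):
    """Pick a target cluster for one subtask.

    clusters: list of dicts like {"name": ..., "available": ..., "load": ...}
    demand:   resource demand data of the subtask
    Returns the name of the highest-scoring selectable cluster, or None.
    """
    # Filter: drop clusters whose available resources are not greater
    # than the subtask's demand.
    selectable = [c for c in clusters if c["available"] > demand]
    if not selectable:
        return None

    # Rank: weighted priority score; more available resources raise the
    # score, a higher load lowers it.
    def score(c):
        return w_avail * c["available"] - w_load * c["load"]

    return max(selectable, key=score)["name"]

clusters = [
    {"name": "K1", "available": 30, "load": 0.8},
    {"name": "K2", "available": 25, "load": 0.2},
    {"name": "K3", "available": 10, "load": 0.1},
]
print(pick_target_cluster(clusters, demand=12))  # K3 is filtered out; K1 wins
```

Returning `None` when no cluster qualifies corresponds to the case where the subtask cannot currently be placed and would have to wait or be rescheduled.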
Optionally, after selecting the target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters, the method further includes: updating the resource data of the target Kubernetes cluster.
Optionally, before generating the task schedule corresponding to the at least one subtask, the method further includes: filtering, from the plurality of Kubernetes clusters, any Kubernetes cluster whose working status is unavailable.
Optionally, after acquiring the big data running task, the method further includes: formatting the big data running task to complete format verification of it.
Optionally, dividing the big data running task into at least one subtask includes: acquiring the task result dimension corresponding to the big data running task; and dividing the task according to that dimension to obtain the at least one subtask.
Optionally, after the big data running task is divided according to the task result dimension to obtain the at least one subtask, the method further includes: combining the subtasks having a dependency relationship among the at least one subtask, so that the dependent subtasks are distributed to one target Kubernetes cluster.
Optionally, after distributing the at least one subtask to at least one target Kubernetes cluster, the method further includes: if the subtask running result returned by a target Kubernetes cluster is not obtained within a preset time, reclaiming the subtasks distributed to that cluster, rescheduling the reclaimed subtasks, and sending alarm information for the target Kubernetes cluster.
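A minimal sketch of this reclaim-and-reschedule behavior; the function names, the result-polling interface, and the callback structure are assumptions made for illustration.

```python
import time

def collect_results(dispatched, fetch_result, reschedule, alert, timeout_s=60.0):
    """Poll for subtask results; reclaim subtasks that time out.

    dispatched:   dict mapping subtask id -> (cluster name, dispatch time)
    fetch_result: callable returning the result for a subtask id, or None
    reschedule:   callable invoked with each reclaimed subtask id
    alert:        callable invoked with the name of the unresponsive cluster
    """
    results = {}
    for task_id, (cluster, sent_at) in list(dispatched.items()):
        result = fetch_result(task_id)
        if result is not None:
            results[task_id] = result
        elif time.time() - sent_at > timeout_s:
            # No result within the preset time: reclaim the subtask,
            # reschedule it, and send alarm information for the cluster.
            del dispatched[task_id]
            reschedule(task_id)
            alert(cluster)
    return results
```

In a real system this loop would run periodically, and the rescheduled subtask would re-enter the schedule-generation step described above.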
Optionally, the plurality of Kubernetes clusters share a unified underlying storage.
To achieve the above object, according to a second aspect of the embodiments of the present invention, there is provided a task scheduling apparatus.
The task scheduling device of the embodiment of the invention comprises: a receiving module, configured to receive a task scheduling request, acquire a big data running task according to the request, and divide the task into at least one subtask; a scheduling module, configured to generate a task schedule corresponding to the at least one subtask, and distribute the at least one subtask to at least one target Kubernetes cluster according to the schedule, wherein the at least one target Kubernetes cluster is a cluster among a plurality of Kubernetes clusters; and an obtaining module, configured to obtain, after the at least one target Kubernetes cluster runs the at least one subtask, a subtask running result returned by that cluster.
Optionally, the scheduling module is further configured to: acquire resource demand data of each subtask, and sort the at least one subtask in descending order of resource demand; and determine a target Kubernetes cluster corresponding to each subtask in that order to generate the task schedule corresponding to the at least one subtask.
Optionally, the scheduling module is further configured to: acquire resource data of the plurality of Kubernetes clusters; and select a target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters according to the resource data of the clusters and the resource demand data of each subtask.
Optionally, the resource data of the plurality of Kubernetes clusters includes available resource data and a load value for each Kubernetes cluster; and the scheduling module is further configured to: filter, from the plurality of Kubernetes clusters, those whose available resource data is not greater than the resource demand data of the subtask, to obtain at least one selectable Kubernetes cluster corresponding to each subtask; and rank the at least one selectable Kubernetes cluster by priority, and determine the highest-ranked cluster as the target Kubernetes cluster corresponding to the subtask.
Optionally, the scheduling module is further configured to: calculate a priority score for each selectable Kubernetes cluster from its available resource data and its load value, using a weight value for available resource data and a weight value for load; and sort the at least one selectable Kubernetes cluster from high to low by priority score.
Optionally, the scheduling module is further configured to: update the resource data of the target Kubernetes cluster.
Optionally, the scheduling module is further configured to: filter, from the plurality of Kubernetes clusters, any Kubernetes cluster whose working status is unavailable.
Optionally, the receiving module is further configured to: format the big data running task to complete format verification of it.
Optionally, the receiving module is further configured to: acquire the task result dimension corresponding to the big data running task; and divide the task according to that dimension to obtain the at least one subtask.
Optionally, the receiving module is further configured to: combine the subtasks having a dependency relationship among the at least one subtask, so that the dependent subtasks are distributed to one target Kubernetes cluster.
Optionally, the obtaining module is further configured to: if the subtask running result returned by a target Kubernetes cluster is not obtained within a preset time, reclaim the subtasks distributed to that cluster, reschedule the reclaimed subtasks, and send alarm information for the target Kubernetes cluster.
Optionally, the plurality of Kubernetes clusters share a unified underlying storage.
To achieve the above object, according to a third aspect of the embodiments of the present invention, a cluster management system is provided.
The cluster management system of the embodiment of the invention comprises a big data task management component and a plurality of Kubernetes clusters. The big data task management component is configured to execute the above task scheduling method, dividing a big data running task into at least one subtask and then distributing the at least one subtask to at least one target Kubernetes cluster. The at least one target Kubernetes cluster, which is a cluster among the plurality of Kubernetes clusters, is configured to run the at least one subtask and return a subtask running result to the big data task management component.
To achieve the above object, according to a fourth aspect of embodiments of the present invention, there is provided an electronic apparatus.
An electronic device of an embodiment of the present invention includes: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the task scheduling method of the embodiment of the invention.
To achieve the above object, according to a fifth aspect of embodiments of the present invention, there is provided a computer-readable medium.
A computer-readable medium of an embodiment of the present invention stores a computer program which, when executed by a processor, implements the task scheduling method of an embodiment of the present invention.
One embodiment of the above invention has the following advantage or benefit: a big data running task is divided into one or more subtasks, a task schedule corresponding to the subtasks is generated, the divided subtasks are distributed to at least one target Kubernetes cluster according to the schedule, and finally the subtask running results returned by the target Kubernetes clusters are obtained.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a cluster management system;
FIG. 2 is a schematic diagram of the main steps of a task scheduling method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the main steps of dividing a big data run task into at least one subtask, according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main steps of generating a task schedule corresponding to at least one subtask, according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a main flow of generating a task schedule corresponding to at least one subtask, according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the main modules of a task scheduler according to an embodiment of the invention;
FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Running a big data task requires the support of underlying computing power. Container clusters provide orchestration and scaling capabilities and offer a natural point of combination for scheduling and executing big data tasks on demand, so "big data component on container cluster" will keep evolving in the course of future development and form a branch of big data applications. For example, with Spark on Kubernetes (Spark being a fast general-purpose computing engine designed for large-scale data processing) or Flink on Kubernetes (Flink being an open-source big data computing engine that supports batch and stream processing), the computing work of Spark or Flink can be sunk to underlying containers by virtue of the container management capabilities of the Kubernetes cluster.
At present, scheduling during task issuing depends on the kube-scheduler component of the Kubernetes cluster (a component that exists in plug-in form and acts as the scheduling decision maker of the whole cluster), which schedules tasks to suitable nodes by computing the load on the different nodes of the cluster. This scheduling mode can satisfy big data components running on containers when Kubernetes cluster resource utilization is not high, but deadlock occurs when cluster resources are relatively tight or several big data tasks compete to run simultaneously. For example, suppose the Kubernetes cluster has 20 resource units remaining and big data running tasks A and B are issued to it, where A requires 18 units in total and B requires 9. During scheduling, A is allocated 15 units and B is allocated 5; since A still needs 18 in total and B still needs 9, neither task has enough resources to finish its own run, and each waits for the other to release the resources it holds, resulting in deadlock.
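The arithmetic of the deadlock example can be checked directly. The numbers below are the ones from the text; the bookkeeping is a simplified illustration, not the scheduler's actual accounting.

```python
total = 20                       # resource units left in the cluster
need = {"A": 18, "B": 9}         # total demand of each big data task
granted = {"A": 15, "B": 5}      # what per-subtask scheduling handed out

free = total - sum(granted.values())   # nothing left to grant
# A task is blocked if even taking all remaining free units cannot
# satisfy its total demand.
blocked = [t for t in need if granted[t] + free < need[t]]
print(free, blocked)  # each blocked task waits for the other to release
```

Both tasks end up blocked while holding resources, which is exactly the circular-wait condition of deadlock.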
Alternatively, scheduling subtasks one by one can be replaced by computing, from a global perspective, the overall amount of resources the big data running task requires, and issuing the task only when the Kubernetes cluster satisfies the overall resource requirement, thereby preventing deadlock. Concretely, the Kubernetes cluster first runs the big data task with relatively small resource consumption, and only after it finishes runs the task with relatively large resource consumption.
In addition, the existing task scheduling modes all operate within independent Kubernetes clusters: if a scenario has 10 Kubernetes clusters and resources on one cluster are insufficient, the tasks on that cluster still cannot be scheduled to the other clusters.
In summary, as big data applications and container clusters integrate ever more deeply, the rational scheduling of resources will gradually become the bottleneck for "big data component on container cluster". Against this background, embodiments of the present invention provide a task scheduling method to advance the further application of big data components on container clusters. For ease of understanding, the cluster management system is introduced first, and the task scheduling method based on it is then described in detail.

Fig. 1 is a schematic structural diagram of a cluster management system. As shown in fig. 1, the cluster management system may include a big data task management component and multiple Kubernetes clusters. The Kubernetes clusters are independent of one another; each has its own nodes, and different clusters may run different versions. Each Kubernetes cluster mainly comprises a master node, which is responsible for managing the operation of the cluster, and working nodes, which mainly carry the specific containers that run. It should be noted that the multiple Kubernetes clusters correspond to a unified underlying storage; that is, for the same data request, the data obtained is the same no matter which Kubernetes cluster fetches it from the underlying layer. The big data task management component can execute the task scheduling method of the embodiment of the invention. Fig. 2 is a schematic diagram of the main steps of the task scheduling method according to an embodiment of the present invention. As shown in fig. 2, the main steps of the task scheduling method may include:
step S201, receiving a task scheduling request, acquiring a big data running task according to the task scheduling request, and dividing the big data running task into at least one subtask;
step S202, generating a task schedule corresponding to at least one subtask, and distributing the at least one subtask to at least one target Kubernetes cluster according to the task schedule, wherein the at least one target Kubernetes cluster is a cluster among a plurality of Kubernetes clusters;
step S203, after at least one target Kubernetes cluster runs at least one subtask, obtaining a subtask running result returned by the at least one target Kubernetes cluster.
The task scheduling request is a request to schedule a big data running task, so the big data running task can be acquired according to the received request and divided into at least one subtask. Next, a task schedule corresponding to the subtasks is generated; the schedule records which Kubernetes cluster each subtask is to be distributed to. The subtasks can then be distributed to the specific Kubernetes clusters according to the schedule. Finally, after a specific Kubernetes cluster has run its subtasks, the subtask running result it returns can be obtained.
For example, suppose there are five Kubernetes clusters K1 to K5, and a user submits a big data running task A to the big data task management component. The component divides A into 10 subtasks A1 to A10 and then generates a task schedule for them, which records that A1 to A3 are distributed to Kubernetes cluster K2, A4 to A8 to cluster K3, and A9 and A10 to cluster K5; K2, K3, and K5 are therefore the target Kubernetes clusters. A1 to A10 are distributed to K2, K3, and K5 according to the schedule. After cluster K2 runs subtasks A1 to A3, it returns the running results to the big data task management component; likewise K3 returns the results of A4 to A8, and K5 the results of A9 and A10.
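The schedule in this worked example can be written as a plain mapping from subtask to target cluster. The dict-based representation below is an assumption for illustration, not the patent's concrete table format.

```python
from collections import defaultdict

# Task schedule from the example: A1..A3 -> K2, A4..A8 -> K3, A9..A10 -> K5.
schedule = {}
schedule.update({f"A{i}": "K2" for i in range(1, 4)})
schedule.update({f"A{i}": "K3" for i in range(4, 9)})
schedule.update({f"A{i}": "K5" for i in range(9, 11)})

# Group by target cluster to get the per-cluster distribution lists.
per_cluster = defaultdict(list)
for task, cluster in schedule.items():
    per_cluster[cluster].append(task)

print(dict(per_cluster))
```

Distributing the subtasks then amounts to iterating over `per_cluster` and issuing each list to its target cluster.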
In the prior art, task scheduling is based on a single Kubernetes cluster, so deadlock can occur, or a task with a large resource footprint runs for a long time. In the embodiment of the present invention, by contrast, the big data running task can be divided into one or more subtasks, a task schedule corresponding to the subtasks is generated, the divided subtasks are distributed to at least one target Kubernetes cluster according to the schedule, and the subtask running results returned by the target clusters are finally obtained. The embodiment of the invention thus turns a big data running task from a single-cluster job into a job run jointly by multiple clusters, solves the deadlock and long-running-big-task problems of the prior art, improves the utilization of the overall underlying computing resources, and lays the groundwork for further scenarios combining big data and containers.
In the embodiment of the invention, after the big data operation task is obtained, the big data operation task can be formatted, namely, the format of the big data operation task is checked, so that the format of the big data operation task can be normalized, the task scheduling time is reduced, and the task scheduling efficiency is improved.
After the big data running task is obtained, it needs to be divided into subtasks, which are then distributed to Kubernetes clusters; the clusters run the subtasks and return the running results. In the embodiment of the present invention, dividing the big data running task into at least one subtask may include: acquiring the task result dimension corresponding to the big data running task; and dividing the task according to that dimension to obtain the at least one subtask. That is, in the process of dividing the big data running task, the division is made along the dimension of the task result.
In addition, after the big data running task is divided according to the task result dimension to obtain at least one subtask, the subtasks among them that have dependency relationships can be combined, so that the dependent subtasks are distributed to one target Kubernetes cluster. Considering that running some subtasks depends on the running results of others, the dependent subtasks can be combined and distributed to the same target Kubernetes cluster in order to reduce running time and schedule each subtask reasonably. For example, if big data running task A is divided into 10 subtasks A1 to A10 and running A1 depends on the running result of A3 (i.e., A1 and A3 have a dependency relationship), then A1 and A3 may be combined and distributed to the same target Kubernetes cluster.
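Merging dependent subtasks so that each dependency group lands on one cluster amounts to computing connected components over the dependency pairs. The union-find sketch below is one possible way to do this; all names are illustrative, and the patent does not prescribe this algorithm.

```python
def group_dependent(subtasks, depends_on):
    """Return groups of subtasks; subtasks linked by a dependency share a group.

    subtasks:   iterable of subtask ids
    depends_on: iterable of (a, b) pairs meaning a depends on b
    """
    parent = {t: t for t in subtasks}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    for a, b in depends_on:                 # merge each dependent pair
        parent[find(a)] = find(b)

    groups = {}
    for t in subtasks:
        groups.setdefault(find(t), []).append(t)
    return list(groups.values())

tasks = [f"A{i}" for i in range(1, 6)]
print(group_dependent(tasks, [("A1", "A3")]))  # A1 and A3 share one group
```

Each resulting group can then be treated as a single unit when the task schedule assigns subtasks to target clusters.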
FIG. 3 is a schematic diagram of the main steps of dividing a big data running task into at least one subtask according to an embodiment of the present invention. As shown in fig. 3, the main steps of dividing the big data running task into at least one subtask may include:
step S301, acquiring a big data running task;
step S302, formatting the acquired big data running task;
step S303, acquiring the task result dimensions corresponding to the big data running task, and dividing the big data running task according to the task result dimensions to obtain at least one subtask;
and step S304, merging the subtasks with a dependency relationship among the at least one subtask, so that the subtasks with a dependency relationship are distributed to the same target Kubernetes cluster.
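Steps S301 to S303 can be sketched as follows. The patent does not define a concrete task format, so the dict shape, the field names, and the subtask naming scheme are all assumptions made for illustration:

```python
def divide_task(task):
    """Illustrative sketch of steps S301-S303: format-check a big data
    running task (assumed to be a dict) and split it into one subtask
    per task result dimension."""
    # S302: "formatting" the task, taken here as a structural check.
    if "name" not in task or "dimensions" not in task:
        raise ValueError("malformed big data running task")
    # S303: one subtask per task result dimension.
    return [f'{task["name"]}-{dim}' for dim in task["dimensions"]]
```

Step S304, merging dependent subtasks, would then run over the returned list.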
In the embodiment of the invention, formatting the big data running task normalizes its format, reduces task scheduling time, and improves task scheduling efficiency. In addition, subtasks with a dependency relationship can be merged and distributed to the same target Kubernetes cluster, which reduces subtask running time and allows each subtask to be scheduled reasonably.
The task scheduling table records the target Kubernetes cluster corresponding to each subtask, and each subtask is distributed to its corresponding target Kubernetes cluster according to the task scheduling table so that the target Kubernetes cluster runs the subtask. Since the running time of a subtask depends on its target Kubernetes cluster, generating the task scheduling table is an important part of the embodiment of the invention. It should be noted that unavailable Kubernetes clusters need to be filtered out of the plurality of Kubernetes clusters before the task scheduling table is generated. A Kubernetes cluster may be unavailable because it has failed, needs to execute an emergency task, is being upgraded, and so on; filtering such clusters out of the plurality of Kubernetes clusters when generating the task scheduling table makes the generated table more reasonable.
FIG. 4 is a schematic diagram of the main steps of generating a task schedule corresponding to at least one subtask, according to an embodiment of the present invention. As shown in fig. 4, the main steps of generating a task schedule corresponding to at least one sub-task may include step S401 and step S402.
Step S401: acquiring the resource demand data of each subtask, and sorting the at least one subtask in descending order of the resource demand data of each subtask.
The task scheduling table records which Kubernetes cluster each subtask is to be distributed to, so in the process of generating the task scheduling table each subtask can be analyzed and its target Kubernetes cluster determined. Since the resource demand data of the subtasks differ, to schedule each subtask reasonably the at least one subtask can be sorted in descending order of resource demand data, and the target Kubernetes cluster corresponding to each subtask is then determined in that order.
Step S402: determining the target Kubernetes cluster corresponding to each subtask according to the order of the at least one subtask, to generate the task scheduling table corresponding to the at least one subtask. Specifically, the process of determining the target Kubernetes cluster corresponding to each subtask may include:
(I) acquiring resource data of the plurality of Kubernetes clusters;
and (II) selecting the target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters according to the resource data of the plurality of Kubernetes clusters and the resource demand data of each subtask.
The resource data of the plurality of Kubernetes clusters may include: the available resource data of each Kubernetes cluster and the load value of each Kubernetes cluster. The available resource data is the remaining resource data of a Kubernetes cluster; for example, if the original resource data of the Kubernetes cluster K1 is 100 and the used resource data is 40, the available resource data is 60. The load value may be the ratio of used resource data to available resource data; thus, for two Kubernetes clusters with the same original resource data, the higher the load value of a cluster, the less its available resource data.
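The two quantities defined above can be written out directly. The function names are illustrative; the ratio definition of the load value follows the description in the text:

```python
def available_resource_data(original, used):
    """Remaining resource data of a cluster: original minus used."""
    return original - used


def load_value(used, available):
    """Load value taken as the ratio of used to available resource data,
    per the definition given in the text."""
    return used / available


# Example from the text: cluster K1 has original resource data 100 and
# used resource data 40, so its available resource data is 60.
k1_available = available_resource_data(100, 40)
k1_load = load_value(40, k1_available)
```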
After the available resource data and the load value of each Kubernetes cluster are obtained, the target Kubernetes cluster corresponding to each subtask can be selected from the plurality of Kubernetes clusters in combination with the resource demand data of each subtask. A specific implementation may be: filtering out, from the plurality of Kubernetes clusters, the clusters whose available resource data is not larger than the resource demand data of the subtask, to obtain at least one selectable Kubernetes cluster corresponding to the subtask; and sorting the at least one selectable Kubernetes cluster by priority, and determining the highest-ranked cluster as the target Kubernetes cluster corresponding to the subtask.
For a subtask A1, in the process of selecting its target Kubernetes cluster from the plurality of Kubernetes clusters, the clusters whose available resource data is not larger than the resource demand data of subtask A1 are first filtered out to obtain at least one selectable Kubernetes cluster corresponding to subtask A1. For example, suppose there are 5 Kubernetes clusters K1 to K5, and the available resource data of K1 and K3 is smaller than the resource demand data of subtask A1, that is, K1 and K3 do not have enough available resources to run subtask A1. K1 and K3 are therefore filtered out of the 5 Kubernetes clusters, leaving the 3 selectable Kubernetes clusters K2, K4, and K5. These 3 selectable clusters are then sorted by priority, and the highest-ranked cluster is determined as the target Kubernetes cluster corresponding to subtask A1.
In this embodiment of the present invention, sorting the at least one selectable Kubernetes cluster by priority may include: calculating a priority score for each selectable Kubernetes cluster from its available resource data and its load value, based on a weight value for the available resource data and a weight value for the load value; and sorting the at least one selectable Kubernetes cluster in descending order of priority score.
For example, subtask A1 corresponds to the 3 selectable Kubernetes clusters K2, K4, and K5. The preset weight values for the available resource data and the load value are obtained, the priority score of K2 is calculated from the available resource data and the load value of K2, and the priority scores of K4 and K5 are calculated in the same way. K2, K4, and K5 are then sorted in descending order of priority score, and the top-ranked cluster (i.e., the one with the highest priority score) is determined as the target Kubernetes cluster corresponding to subtask A1.
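The priority ranking can be sketched as below. The text only says that the available resource data and the load value are combined using preset weight values, so the linear scoring formula, the default weights, and the dict-based cluster representation are assumptions:

```python
def priority_score(cluster, w_avail, w_load):
    """Assumed weighted score: more available resources raise it,
    a higher load value lowers it."""
    return w_avail * cluster["available"] - w_load * cluster["load"]


def rank_selectable_clusters(clusters, w_avail=1.0, w_load=10.0):
    """Sort selectable clusters in descending order of priority score;
    the first element is the target cluster for the subtask."""
    return sorted(clusters,
                  key=lambda c: priority_score(c, w_avail, w_load),
                  reverse=True)
```

With three candidates K2, K4, and K5, the cluster with the most available resources and the lightest load ends up first, mirroring the example in the text.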
In addition, in this embodiment of the present invention, after the target Kubernetes cluster corresponding to a subtask has been selected from the plurality of Kubernetes clusters, the task scheduling method may further include: updating the resource data of that target Kubernetes cluster. In the embodiment of the invention, the subtasks are sorted in descending order of resource demand data, and the target Kubernetes cluster corresponding to each subtask is determined in turn. Once the target Kubernetes cluster corresponding to one subtask has been determined, both its available resource data and its load value have changed, so its resource data must be updated before the target Kubernetes cluster corresponding to the next subtask is determined.
FIG. 5 is a schematic diagram of a main flow of generating a task schedule corresponding to at least one subtask according to an embodiment of the present invention. As shown in fig. 5, the main process of generating the task schedule corresponding to at least one sub-task may include:
step S501, filtering out the Kubernetes clusters whose working state is unavailable from the plurality of Kubernetes clusters;
step S502, acquiring the resource demand data of each subtask, and sorting the at least one subtask in descending order of resource demand data;
step S503, selecting the highest-ranked subtask from the subtask sequence;
step S504, filtering out, from the plurality of Kubernetes clusters, the clusters whose available resource data is not larger than the resource demand data of the subtask, to obtain at least one selectable Kubernetes cluster corresponding to the subtask;
step S505, calculating a priority score for each selectable Kubernetes cluster from its available resource data and its load value, based on the weight value of the available resource data and the weight value of the load value;
step S506, sorting the at least one selectable Kubernetes cluster in descending order of priority score, and determining the highest-ranked cluster as the target Kubernetes cluster corresponding to the subtask;
step S507, removing the subtask from the subtask sequence, and updating the resource data of the target Kubernetes cluster corresponding to the subtask;
step S508, determining whether a target Kubernetes cluster has been determined for every subtask in the at least one subtask; if yes, performing step S509, and if not, returning to step S503;
and step S509, generating the task scheduling table corresponding to the at least one subtask.
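The whole loop of steps S501 to S509 can be sketched in one function. The data shapes, the default weights, and the linear scoring formula are illustrative assumptions; the patent does not fix any of them:

```python
def generate_task_schedule(subtasks, clusters, w_avail=1.0, w_load=10.0):
    """Illustrative sketch of steps S501-S509. `subtasks` maps subtask
    name to resource demand; `clusters` maps cluster name to a dict with
    "available", "load", and "up" (working state) keys."""
    # S501: filter out clusters whose working state is unavailable.
    live = {n: dict(c) for n, c in clusters.items() if c.get("up", True)}
    # S502: sort subtasks in descending order of resource demand.
    order = sorted(subtasks, key=subtasks.get, reverse=True)
    schedule = {}
    for task in order:  # S503: take the highest-ranked remaining subtask.
        demand = subtasks[task]
        # S504: keep only clusters with more available resources than the demand.
        candidates = [n for n, c in live.items() if c["available"] > demand]
        if not candidates:
            raise RuntimeError(f"no selectable cluster for subtask {task}")
        # S505/S506: assumed weighted score; pick the highest-scoring cluster.
        target = max(candidates,
                     key=lambda n: w_avail * live[n]["available"]
                                   - w_load * live[n]["load"])
        schedule[task] = target
        # S507: update the chosen cluster's resource data (available data
        # only, for brevity; a fuller sketch would also refresh the load value).
        live[target]["available"] -= demand
    # S508/S509: every subtask now has a target, so emit the schedule.
    return schedule
```

Because the cluster's available resources shrink after each assignment, two large subtasks need not land on the same cluster even if it scored highest initially.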
In the process of generating the task scheduling table shown in steps S501 to S509, unavailable Kubernetes clusters are filtered out of the plurality of Kubernetes clusters, making the generated task scheduling table more reasonable; the at least one subtask is sorted in descending order of resource demand data and the target Kubernetes cluster corresponding to each subtask is determined in that order, so that each subtask can be scheduled reasonably; and in determining the target Kubernetes cluster for each subtask, combining the available resource data and the load value of the Kubernetes clusters improves the accuracy of the target cluster selection and thus the reasonableness of the task scheduling table.
In this embodiment of the present invention, after distributing the at least one subtask to the at least one target Kubernetes cluster, the task scheduling method may further include: if a subtask running result is not returned by a target Kubernetes cluster within a preset time, recovering the subtask distributed to that cluster, rescheduling the recovered subtask, and sending alarm information for that target Kubernetes cluster. That is to say, after the big data task management component distributes a subtask to a target Kubernetes cluster, the cluster runs the subtask; if the cluster does not return the subtask running result to the big data task management component within the preset time, the cluster may have failed or encountered some other problem. Because the target Kubernetes cluster did not return the subtask running result within the preset time, it is regarded as faulty, so alarm information for the target Kubernetes cluster is sent so that the cluster can be maintained in time.
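The timeout handling just described can be sketched as a polling loop. The callables passed in (result fetcher, rescheduler, alerter) are placeholders for the corresponding components in the text:

```python
import time


def collect_result(fetch_result, subtask, timeout_s, reschedule, alert):
    """Poll a target cluster for a subtask result; if no result arrives
    within the preset time, recover and reschedule the subtask and send
    alarm information for the cluster. All callables are illustrative."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch_result(subtask)
        if result is not None:
            return result
        time.sleep(0.01)  # polling interval, chosen arbitrarily
    reschedule(subtask)   # recover the subtask and re-dispatch it
    alert(subtask)        # raise an alarm so the cluster can be maintained
    return None
```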
Next, the operation of each part of the cluster management system will be described in detail with reference to the schematic structural diagram of the cluster management system shown in fig. 1. The cluster management system shown in fig. 1 includes a big data task management component and a plurality of Kubernetes clusters. The big data task management component may include a big data task submitter, a task manager, and a task result processor. Each Kubernetes cluster may contain the following plug-ins: a big data component agent, a big data task driving service, a big data task execution service, and the kube-scheduler. The plug-ins contained in a Kubernetes cluster run within it in the form of pods (the basic unit of a Kubernetes cluster, the smallest component created or deployed by a user, and the resource object in which a containerized application runs on the cluster) or services. The big data task management component is independent of each Kubernetes cluster and acts as an external manager of the clusters.
The big data task submitter is mainly used to submit a big data running task, format it, and submit it to the task manager. The task manager is mainly used to divide the big data running task into a plurality of subtasks; to initiate resource usage queries, consulting the big data component agents running on the different Kubernetes clusters to obtain the current resource usage of each cluster; and to schedule the tasks according to the resource demand data of the subtasks and the current resource usage of each cluster, forming the task scheduling table. The task result processor is mainly used to receive the subtask running results returned by the big data component agents in the different Kubernetes clusters and to combine them into the task running result of the big data running task.
The big data component agent runs in a Kubernetes cluster and mainly provides the following services: receiving the distributed subtasks; feeding back the load condition of the current cluster resources, which gives the task manager the initial data on which subtask scheduling decisions are based; and returning the subtask running results. In addition, the big data component agent of a given Kubernetes cluster receives the list of subtasks dispatched to that cluster and converts it into a format recognized by the cluster's kube-scheduler. After the kube-scheduler receives a subtask, it first selects a node of the Kubernetes cluster on which to create a big data task driving service, which organizes the concrete running of the big data subtask.
The big data task driving service is responsible for pulling base images and packaging big data subtasks, and it monitors the running of the big data task execution services to ensure that the subtasks run normally. Specifically, the big data task driving service obtains the image of the big data task running component, passes the big data running code and the configuration file into the image, packages it into a task-specific image, and places that image in the image repository. After detecting that the image is ready, the big data component agent issues a task execution instruction to the kube-scheduler; the kube-scheduler performs scheduling inside the Kubernetes cluster and uses the big data task execution services on the different worker nodes of the cluster to execute the specific subtasks. A big data task execution service runs in a container, calls the underlying resources, and completes the running of its subtask; when execution finishes, it sets its running state to completed and reports the result back to the big data task driving service. The big data task driving service receives the results returned by all the big data task execution services and sets its own execution state to completed. After observing that the execution state of the big data task driving service is completed, the big data component agent can actively fetch the subtask running results stored in the big data task driving service and return them to the task result processor of the big data task management component.
Fig. 6 is a schematic diagram of the main modules of a task scheduling apparatus according to an embodiment of the present invention. As shown in fig. 6, the main modules of the task scheduling apparatus 600 may include: a receiving module 601, a scheduling module 602, and an obtaining module 603.
The receiving module 601 may be configured to: receive a task scheduling request, acquire a big data running task according to the task scheduling request, and divide the big data running task into at least one subtask. The scheduling module 602 may be configured to: generate a task scheduling table corresponding to the at least one subtask, and distribute the at least one subtask to at least one target Kubernetes cluster according to the task scheduling table. The obtaining module 603 may be configured to: after the at least one target Kubernetes cluster has run the at least one subtask, obtain the subtask running results returned by the at least one target Kubernetes cluster. The at least one target Kubernetes cluster belongs to a plurality of Kubernetes clusters, and the plurality of Kubernetes clusters share unified underlying storage.
In this embodiment of the present invention, the scheduling module 602 may further be configured to: acquire the resource demand data of each subtask, and sort the at least one subtask in descending order of the resource demand data of each subtask; and determine the target Kubernetes cluster corresponding to each subtask according to the order of the at least one subtask, to generate the task scheduling table corresponding to the at least one subtask.
In this embodiment of the present invention, the scheduling module 602 may further be configured to: acquire resource data of the plurality of Kubernetes clusters; and select the target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters according to the resource data of the plurality of Kubernetes clusters and the resource demand data of each subtask.
In this embodiment of the present invention, the resource data of the plurality of Kubernetes clusters may include: the available resource data of each Kubernetes cluster and the load value of each Kubernetes cluster; and the scheduling module 602 may further be configured to: filter out, from the plurality of Kubernetes clusters, the clusters whose available resource data is not larger than the resource demand data of the subtask, to obtain at least one selectable Kubernetes cluster corresponding to the subtask; and sort the at least one selectable Kubernetes cluster by priority, determining the highest-ranked cluster as the target Kubernetes cluster corresponding to the subtask.
In this embodiment of the present invention, the scheduling module 602 may further be configured to: calculate a priority score for each selectable Kubernetes cluster from its available resource data and its load value, based on the weight value of the available resource data and the weight value of the load value; and sort the at least one selectable Kubernetes cluster in descending order of priority score.
In this embodiment of the present invention, the scheduling module 602 may further be configured to: update the resource data of the target Kubernetes cluster.
In this embodiment of the present invention, the scheduling module 602 may further be configured to: filter out the Kubernetes clusters that are not operational from the plurality of Kubernetes clusters.
In this embodiment of the present invention, the receiving module 601 may further be configured to: format the big data running task to complete format verification of the big data running task.
In this embodiment of the present invention, the receiving module 601 may further be configured to: acquire the task result dimensions corresponding to the big data running task; and divide the big data running task according to the task result dimensions to obtain at least one subtask.
In this embodiment of the present invention, the receiving module 601 may further be configured to: merge the subtasks with a dependency relationship among the at least one subtask, so that the subtasks with a dependency relationship are distributed to the same target Kubernetes cluster.
In this embodiment of the present invention, the obtaining module 603 may further be configured to: if a subtask running result is not returned by a target Kubernetes cluster within a preset time, recover the subtask distributed to that cluster, reschedule the recovered subtask, and send alarm information for that target Kubernetes cluster.
The task scheduling apparatus divides a big data running task into one or more subtasks, generates a task scheduling table corresponding to the subtasks, distributes the divided subtasks to at least one target Kubernetes cluster according to the task scheduling table, and finally obtains the subtask running results returned by the target Kubernetes clusters.
Fig. 7 shows an exemplary system architecture 700 to which a task scheduling method or a task scheduling apparatus of an embodiment of the present invention may be applied.
As shown in fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. Network 704 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 701, 702, 703 to interact with a server 705 over a network 704, to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 701, 702, 703. The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like. The server 705 may be a server that provides various services, such as a backend management server that provides service support. The background management server can analyze and process the received data such as the service request and feed back the processing result to the terminal equipment.
It should be noted that the task scheduling method provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the task scheduling apparatus is generally disposed in the server 705.
It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for implementing a server according to an embodiment of the invention. The computer system shown in fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage portion 808 into a Random Access Memory (RAM) 803. The RAM 803 also stores the various programs and data necessary for the operation of the system 800. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage portion 808 including a hard disk and the like; and a communication portion 809 including a network interface card such as a LAN card or a modem. The communication portion 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read out therefrom is installed into the storage portion 808 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication portion 809 and/or installed from the removable medium 811. When executed by the Central Processing Unit (CPU) 801, the computer program performs the above-described functions defined in the system of the present invention.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a receiving module, a scheduling module, and an obtaining module. For example, the receiving module may be further described as a module that receives a task scheduling request, obtains a big data running task according to the task scheduling request, and divides the big data running task into at least one sub-task.
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may stand alone without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to: receive a task scheduling request, acquire a big data running task according to the task scheduling request, and divide the big data running task into at least one subtask; generate a task scheduling table corresponding to the at least one subtask, and distribute the at least one subtask to at least one target Kubernetes cluster according to the task scheduling table, wherein the at least one target Kubernetes cluster belongs to a plurality of Kubernetes clusters; and after the at least one target Kubernetes cluster has run the at least one subtask, obtain the subtask running results returned by the at least one target Kubernetes cluster.
According to the technical scheme of the embodiment of the invention, a big data running task is divided into one or more subtasks, a task scheduling table corresponding to the subtasks is generated, the divided subtasks are distributed to at least one target Kubernetes cluster according to the task scheduling table, and finally the subtask running results returned by the target Kubernetes clusters are obtained. The big data running task thus changes from the previous single-cluster operation to joint multi-cluster operation, which solves the prior-art problems of deadlock and long running times for tasks with a large resource footprint, improves the utilization of the overall underlying computing resources, and lays the groundwork for further scenarios combining big data and containers.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (17)

1. A method for task scheduling, comprising:
receiving a task scheduling request, acquiring a big data running task according to the task scheduling request, and dividing the big data running task into at least one subtask;
generating a task schedule corresponding to the at least one subtask, and distributing the at least one subtask to at least one target Kubernetes cluster according to the task schedule, wherein the at least one target Kubernetes cluster is a cluster among a plurality of Kubernetes clusters;
and after the at least one target Kubernetes cluster runs the at least one subtask, obtaining a subtask running result returned by the at least one target Kubernetes cluster.
2. The method of claim 1, wherein generating the task schedule corresponding to the at least one subtask comprises:
acquiring resource demand data of each subtask, and sorting the at least one subtask in descending order of the resource demand data of each subtask;
and determining a target Kubernetes cluster corresponding to each subtask according to the ordering of the at least one subtask, to generate the task scheduling table corresponding to the at least one subtask.
3. The method of claim 2, wherein the determining the target Kubernetes cluster corresponding to each subtask comprises:
acquiring resource data of the plurality of Kubernetes clusters;
and selecting the target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters according to the resource data of the plurality of Kubernetes clusters and the resource demand data of each subtask.
4. The method of claim 3, wherein the resource data of the plurality of Kubernetes clusters comprises: available resource data of each Kubernetes cluster and a load value of each Kubernetes cluster; and
the selecting the target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters according to the resource data of the plurality of Kubernetes clusters and the resource demand data of each subtask comprises:
filtering, from the plurality of Kubernetes clusters, Kubernetes clusters whose available resource data is not greater than the resource demand data of each subtask, to obtain at least one selectable Kubernetes cluster corresponding to each subtask;
and prioritizing the at least one selectable Kubernetes cluster, and determining the highest-ranked Kubernetes cluster as the target Kubernetes cluster corresponding to each subtask.
5. The method of claim 4, wherein the prioritizing the at least one selectable Kubernetes cluster comprises:
calculating a priority score of each selectable Kubernetes cluster according to the available resource data and the load value of the selectable Kubernetes cluster, based on a weight value for available resource data and a weight value for load values;
and ordering the at least one selectable Kubernetes cluster from high to low according to the priority score of each selectable Kubernetes cluster.
6. The method of claim 3, wherein after selecting the target Kubernetes cluster corresponding to each subtask from the plurality of Kubernetes clusters, the method further comprises:
updating the resource data of the target Kubernetes cluster.
7. The method of claim 1, wherein before generating the task schedule corresponding to the at least one subtask, the method further comprises:
filtering, from the plurality of Kubernetes clusters, any Kubernetes cluster whose working status is unavailable.
8. The method of claim 1, wherein after acquiring the big data running task, the method further comprises:
formatting the big data running task to complete format verification of the big data running task.
9. The method of claim 1, wherein the dividing the big data running task into at least one subtask comprises:
acquiring a task result dimension corresponding to the big data running task;
and dividing the big data running task according to the task result dimension to obtain the at least one subtask.
10. The method of claim 9, wherein after dividing the big data running task into the at least one subtask according to the task result dimension, the method further comprises:
combining the subtasks having a dependency relationship among the at least one subtask, so that the subtasks having a dependency relationship are distributed to the same target Kubernetes cluster.
11. The method according to claim 1, wherein after distributing the at least one subtask to the at least one target Kubernetes cluster, the method further comprises:
if a subtask running result returned by a target Kubernetes cluster is not obtained within a preset time, recovering the subtask distributed to that target Kubernetes cluster, re-scheduling the recovered subtask, and sending alarm information for that target Kubernetes cluster.
12. The method of any of claims 1-11, wherein the plurality of Kubernetes clusters correspond to a unified underlying storage.
13. A task scheduling apparatus, comprising:
the receiving module is configured to receive a task scheduling request, acquire a big data running task according to the task scheduling request, and divide the big data running task into at least one subtask;
the scheduling module is configured to generate a task scheduling table corresponding to the at least one subtask, and distribute the at least one subtask to at least one target Kubernetes cluster according to the task scheduling table, wherein the at least one target Kubernetes cluster is a cluster among a plurality of Kubernetes clusters;
and the obtaining module is configured to obtain a subtask running result returned by the at least one target Kubernetes cluster after the at least one target Kubernetes cluster runs the at least one subtask.
14. The apparatus of claim 13, wherein the scheduling module is further configured to:
acquiring resource demand data of each subtask, and sorting the at least one subtask in descending order of the resource demand data of each subtask;
and determining a target Kubernetes cluster corresponding to each subtask according to the ordering of the at least one subtask, to generate the task scheduling table corresponding to the at least one subtask.
15. A cluster management system, comprising: a big data task management component and a plurality of Kubernetes clusters;
the big data task management component is configured to: execute the task scheduling method of any of claims 1 to 12, so as to divide a big data running task into at least one subtask and then distribute the at least one subtask to at least one target Kubernetes cluster;
and the at least one target Kubernetes cluster is a cluster among the plurality of Kubernetes clusters and is configured to: run the at least one subtask, and return a subtask running result to the big data task management component.
16. An electronic device, comprising:
one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method according to any one of claims 1-12.
17. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-12.
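Claim 11's timeout handling might be sketched as follows. `poll_result` is an illustrative stand-in for querying the target cluster, and the injected `clock` lets the preset time be simulated deterministically; neither name comes from the patent.

```python
# Illustrative sketch of the timeout handling in claim 11, not the
# patented implementation. `poll_result` stands in for querying the
# target cluster for a subtask result; `clock` is an injected time
# source so the preset time can be tested without real waiting.

def run_with_timeout(poll_result, timeout, clock):
    """Poll for a subtask result until `timeout` units of `clock` elapse.

    Returns ("done", result) on success. If the preset time passes with
    no result, returns ("recovered", alarm): the caller should recover
    the subtask, re-schedule it, and send the alarm information for the
    target cluster.
    """
    start = clock()
    while clock() - start < timeout:
        result = poll_result()
        if result is not None:
            return ("done", result)
    return ("recovered", "alarm: no result from target cluster within preset time")
```

In a real deployment `clock` would be something like `time.monotonic` and `poll_result` a non-blocking query against the cluster's API, with a sleep between polls; those details are omitted here.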
CN202010997982.1A 2020-09-21 2020-09-21 Task scheduling method and device and cluster management system Pending CN112114950A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010997982.1A CN112114950A (en) 2020-09-21 2020-09-21 Task scheduling method and device and cluster management system

Publications (1)

Publication Number Publication Date
CN112114950A true CN112114950A (en) 2020-12-22

Family

ID=73800117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010997982.1A Pending CN112114950A (en) 2020-09-21 2020-09-21 Task scheduling method and device and cluster management system

Country Status (1)

Country Link
CN (1) CN112114950A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080027920A1 (en) * 2006-07-26 2008-01-31 Microsoft Corporation Data processing over very large databases
CN109408215A (en) * 2018-11-07 2019-03-01 郑州云海信息技术有限公司 A kind of method for scheduling task and device of calculate node
CN109614211A (en) * 2018-11-28 2019-04-12 新华三技术有限公司合肥分公司 Distributed task scheduling pre-scheduling method and device
CN110012062A (en) * 2019-02-22 2019-07-12 北京奇艺世纪科技有限公司 A kind of multimachine room method for scheduling task, device and storage medium
CN110489200A (en) * 2018-05-14 2019-11-22 中国科学院声学研究所 A kind of method for scheduling task suitable for embedded container cluster
CN110888721A (en) * 2019-10-15 2020-03-17 平安科技(深圳)有限公司 Task scheduling method and related device
CN110888722A (en) * 2019-11-15 2020-03-17 北京奇艺世纪科技有限公司 Task processing method and device, electronic equipment and computer readable storage medium
CN110928653A (en) * 2019-10-24 2020-03-27 浙江大搜车软件技术有限公司 Cross-cluster task execution method and device, computer equipment and storage medium
CN111190718A (en) * 2020-01-07 2020-05-22 第四范式(北京)技术有限公司 Method, device and system for realizing task scheduling
CN111405055A (en) * 2020-03-23 2020-07-10 北京达佳互联信息技术有限公司 Multi-cluster management method, system, server and storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010280A (en) * 2021-02-19 2021-06-22 北京字节跳动网络技术有限公司 Distributed task processing method, system, device, equipment and medium
CN113590256A (en) * 2021-06-03 2021-11-02 新浪网技术(中国)有限公司 Application deployment method and device for multiple Kubernetes clusters
CN114356511A (en) * 2021-08-16 2022-04-15 中电长城网际系统应用有限公司 Task allocation method and system
CN113900837A (en) * 2021-10-18 2022-01-07 中国联合网络通信集团有限公司 Computing power network processing method, device, equipment and storage medium
CN115309548A (en) * 2022-08-03 2022-11-08 北京火山引擎科技有限公司 Cluster resource publishing method and device and electronic equipment
CN117076092A (en) * 2023-10-13 2023-11-17 成都登临科技有限公司 Multi-dimensional data task processing method and device, electronic equipment and storage medium
CN117076092B (en) * 2023-10-13 2024-01-19 成都登临科技有限公司 Multi-dimensional data task processing method and device, electronic equipment and storage medium
CN117435324A (en) * 2023-11-28 2024-01-23 江苏天好富兴数据技术有限公司 Task scheduling method based on containerization
CN117435324B (en) * 2023-11-28 2024-05-28 江苏天好富兴数据技术有限公司 Task scheduling method based on containerization
CN117596303A (en) * 2024-01-18 2024-02-23 腾讯科技(深圳)有限公司 Service access method, device, electronic equipment and storage medium
CN117596303B (en) * 2024-01-18 2024-04-09 腾讯科技(深圳)有限公司 Service access method, device, electronic equipment and storage medium
CN117707794A (en) * 2024-02-05 2024-03-15 之江实验室 Heterogeneous federation-oriented multi-class job distribution management method and system

Similar Documents

Publication Publication Date Title
CN112114950A (en) Task scheduling method and device and cluster management system
US9916183B2 (en) Scheduling mapreduce jobs in a cluster of dynamically available servers
US9262210B2 (en) Light weight workload management server integration
US8819683B2 (en) Scalable distributed compute based on business rules
CN109034396B (en) Method and apparatus for processing deep learning jobs in a distributed cluster
CN109408205B (en) Task scheduling method and device based on hadoop cluster
CN114610474B (en) Multi-strategy job scheduling method and system under heterogeneous supercomputing environment
US20130283097A1 (en) Dynamic network task distribution
CN108021435B (en) Cloud computing task flow scheduling method with fault tolerance capability based on deadline
CN112380020A (en) Computing power resource allocation method, device, equipment and storage medium
CN109614227A (en) Task resource concocting method, device, electronic equipment and computer-readable medium
CN112905342A (en) Resource scheduling method, device, equipment and computer readable storage medium
CN105740085A (en) Fault tolerance processing method and device
CN115292014A (en) Image rendering method and device and server
CN114116173A (en) Method, device and system for dynamically adjusting task allocation
Hung et al. Task scheduling for optimizing recovery time in cloud computing
Czarnul A model, design, and implementation of an efficient multithreaded workflow execution engine with data streaming, caching, and storage constraints
Liu et al. KubFBS: A fine‐grained and balance‐aware scheduling system for deep learning tasks based on kubernetes
CN116483546A (en) Distributed training task scheduling method, device, equipment and storage medium
CN115629853A (en) Task scheduling method and device
CN114237902A (en) Service deployment method and device, electronic equipment and computer readable medium
CN114489978A (en) Resource scheduling method, device, equipment and storage medium
CN115129438A (en) Method and device for task distributed scheduling
CN113254143A (en) Virtual network function network element arranging and scheduling method, device and system
CN113722079B (en) Task scheduling distribution method, device, equipment and medium based on target application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination