CN116302457A - Cloud native workflow engine implementation method, system, medium and electronic device

Cloud native workflow engine implementation method, system, medium and electronic device

Info

Publication number
CN116302457A
Authority
CN
China
Prior art keywords
cluster
subtask
task
scheduling
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310598352.0A
Other languages
Chinese (zh)
Inventor
毛良献
高丰
孙铭鸽
陈旭东
白文媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202310598352.0A
Publication of CN116302457A
Legal status: Pending (current)


Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06F: ELECTRIC DIGITAL DATA PROCESSING
                • G06F 9/00: Arrangements for program control, e.g. control units
                    • G06F 9/06: using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
                        • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
                            • G06F 9/30003: Arrangements for executing specific machine instructions
                                • G06F 9/3005: to perform operations for flow control
                        • G06F 9/46: Multiprogramming arrangements
                            • G06F 9/48: Program initiating; Program switching, e.g. by interrupt
                                • G06F 9/4806: Task transfer initiation or dispatching
                                    • G06F 9/4843: by program, e.g. task dispatcher, supervisor, operating system
                                        • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
                            • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
                                • G06F 9/5005: to service a request
                                    • G06F 9/5027: the resource being a machine, e.g. CPUs, Servers, Terminals
                                        • G06F 9/5038: considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
                                        • G06F 9/505: considering the load
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
        • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
            • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
                • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The specification discloses a cloud native workflow engine implementation method, system, medium and electronic device. The method is applied to multiple clusters and comprises the following steps: first, a scheduling diagram of a task to be executed is determined according to a task file of the task input by a user, and a subtask queue is determined. Then, for each subtask in the subtask queue, a scheduling cluster of the subtask is determined from the multiple clusters. When the subtask is to be executed, the resource information of the scheduling cluster of the subtask in the multiple clusters is determined, a designated cluster is determined from the scheduling clusters meeting a preset load condition according to the resource information, and the subtask is sent to the designated cluster so that the designated cluster executes the subtask. When the execution of every subtask is finished, the execution result of the task to be executed is determined. Because different subtasks can be executed by different clusters, complex and large-scale workflow tasks can be executed better and the execution rate is improved.

Description

Cloud native workflow engine implementation method, system, medium and electronic device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, a system, a medium, and an electronic device for implementing a cloud native workflow engine.
Background
With the continuous development of technology, workflows play an increasingly significant role. A workflow is an abstract, generalized description of a work process and of the business rules that govern the operational steps within it.
One of the primary uses of a workflow is to automatically transfer documents, information or tasks between multiple participants, using a computer, according to predetermined rules, in order to achieve a business objective. A workflow is essentially an abstraction of a business process and must be implemented by a workflow engine. The workflow engine may adopt a cloud-native oriented architecture, where cloud native is a method of building and running applications: "cloud" means that the application lives in the cloud rather than in a traditional data center, and "native" means that the application is designed for the cloud environment from the start, so that it can run on the cloud and fully exploit the elasticity and distribution advantages of the cloud platform. Therefore, how to implement a cloud native workflow engine is very important in the business implementation process.
Based on the above, the specification provides a cloud native workflow engine implementation method.
Disclosure of Invention
The present disclosure provides a method, a system, a medium, and an electronic device for implementing a cloud native workflow engine, so as to partially solve the foregoing problems in the prior art.
The technical scheme adopted in the specification is as follows:
the specification provides a cloud native workflow engine implementation method, which is applied to multiple clusters, and comprises the following steps:
responding to a task file of a task to be executed, which is input by a user, and determining a scheduling diagram of the task to be executed according to the task file;
determining a subtask queue of the task to be executed according to the scheduling diagram;
for each subtask in the subtask queue, determining a scheduling cluster of the subtask from the multiple clusters;
when the subtask is to be executed, determining the resource information of the scheduling cluster of the subtask in the multiple clusters, and determining, according to the determined resource information and a preset load condition, a scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition, as a designated cluster of the subtask;
sending the subtask to the designated cluster so that the designated cluster executes the subtask;
and when the execution of each subtask of the task to be executed is finished, determining the execution result of the task to be executed.
Optionally, determining the scheduling cluster of the subtask from the multiple clusters specifically includes:
basic information of each cluster is obtained, wherein the basic information at least comprises performance information;
according to the basic information of the multi-cluster, determining a candidate set consisting of clusters meeting the scheduling conditions of the subtasks from the multi-cluster;
and determining a ranking of the performance information of each cluster according to the basic information of each cluster in the candidate set, and selecting the scheduling cluster of the subtask according to the ranking.
Optionally, according to the determined resource information and the preset load condition, determining a scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition, wherein the scheduling cluster is used as a designated cluster of the subtask, and specifically comprises:
judging whether the scheduling cluster of the subtask meets a preset load condition according to the determined resource information;
if yes, determining the scheduling cluster of the subtask as the designated cluster of the subtask;
if not, deleting the scheduling cluster of the subtask from the candidate set of the subtask, re-determining the scheduling cluster of the subtask from the deleted candidate set of the subtask, acquiring the resource information of the re-determined scheduling cluster, and continuing to judge whether the re-determined scheduling cluster meets the load condition until the designated cluster of the subtask is determined.
Optionally, the method further comprises:
when the candidate set of the subtask is an empty set, monitoring the resource information of each cluster meeting the scheduling condition of the subtask;
and in response to monitoring that the resource information of any cluster meets the load condition, taking the cluster meeting the load condition as the designated cluster of the subtask.
Optionally, determining the scheduling diagram of the task to be executed according to the task file specifically includes:
determining each subtask corresponding to the task to be executed and the dependency relationship among the subtasks according to the task file;
and determining a scheduling diagram of the task to be executed according to the dependency relationship among the subtasks.
Optionally, determining the subtask queue of the task to be executed according to the scheduling diagram specifically includes:
and decomposing the scheduling diagram according to the topological structure of the scheduling diagram, and determining the subtask queue of the task to be executed.
Optionally, determining the scheduling cluster of the subtask from the multiple clusters specifically includes:
basic information of each cluster is obtained, wherein the basic information at least comprises performance information;
and determining a cluster meeting the scheduling condition of the subtask from the multi-cluster according to the basic information of the multi-cluster, and taking the cluster as the scheduling cluster of the subtask.
Optionally, determining the scheduling cluster of the subtask from the multiple clusters specifically includes:
basic information of each cluster is obtained;
determining a plurality of duplicate tasks corresponding to the subtasks, and determining a cluster meeting the scheduling conditions of the duplicate tasks from the multi-clusters according to the basic information of the multi-clusters for each duplicate task to serve as a scheduling cluster of the duplicate task;
when the subtask is executed, determining the resource information of a scheduling cluster of the subtask in the multi-cluster, and determining the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition, wherein the scheduling cluster is used as a designated cluster of the subtask and specifically comprises the following steps:
when the duplicate task of the subtask is executed, determining the resource information of a scheduling cluster of the duplicate task in the multi-cluster, and determining the scheduling cluster for executing the duplicate task from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition as a designated cluster of the duplicate task;
the sending of the subtask to the designated cluster so that the designated cluster executes the subtask specifically comprises:
sending the duplicate task to the designated cluster so that the designated cluster executes the duplicate task.
Optionally, after sending the subtask to the specified cluster, the method further comprises:
monitoring the execution condition of the subtask;
and when the completion of the execution of the subtask is monitored, sending a resource release instruction to the appointed cluster so as to enable the appointed cluster to delete files required for executing the subtask.
Optionally, before determining the schedule of the task to be performed, the method further includes:
and determining the format of the task file to be a prestored declaration format.
The present specification provides a cloud native workflow engine implementation system, the system including an injection module, an input module, a task scheduling module, and a task execution module, the system being applied to multiple clusters, wherein:
the injection module is used for responding to a task file of a task to be executed, which is input by a user, and determining a scheduling diagram of the task to be executed according to the task file;
the input module is used for determining a subtask queue of the task to be executed according to the scheduling diagram;
the task scheduling module is used for determining a scheduling cluster of each subtask of the subtask queue of the task to be executed, which is determined by the input module, from the multi-cluster, and sending the determined scheduling cluster of the subtask to the task executing module;
The task execution module is used for determining, when receiving the scheduling cluster of the subtask sent by the task scheduling module, the resource information of the scheduling cluster of the subtask in the multiple clusters, determining, according to the determined resource information and a preset load condition, a scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition as the designated cluster of the subtask, sending the subtask to the designated cluster so that the designated cluster executes the subtask, and determining the execution result of the task to be executed when the execution of each subtask of the task to be executed is finished.
Optionally, the system further comprises a cluster management module;
the cluster management module is used for acquiring and storing the information of the multiple clusters, wherein the information comprises resource information and basic information;
the task execution module is specifically configured to, when receiving the scheduling cluster of the subtask sent by the task scheduling module, acquire the resource information of the multiple clusters from the cluster management module, and determine the resource information of the scheduling cluster of the subtask in the multiple clusters.
Optionally, the task scheduling module is specifically configured to obtain basic information of each cluster, determine, from the multiple clusters according to the basic information of the multiple clusters, a candidate set composed of the clusters that meet the scheduling condition of the subtask, determine a ranking of the performance information of each cluster according to the basic information of each cluster in the candidate set, and select the scheduling cluster of the subtask according to the ranking, where the basic information at least includes the performance information.
Optionally, the task execution module is specifically configured to determine whether the scheduling cluster of the subtask meets a preset load condition according to the determined resource information, if yes, determine that the scheduling cluster of the subtask is a designated cluster of the subtask, if not, delete the scheduling cluster of the subtask from the candidate set of the subtask, redetermine the scheduling cluster of the subtask from the deleted candidate set of the subtask, obtain the resource information of the redetermined scheduling cluster, and continuously determine whether the redetermined scheduling cluster meets the load condition until the designated cluster of the subtask is determined.
Optionally, the task execution module is further configured to monitor the resource information of each cluster that satisfies the scheduling condition of the subtask when the candidate set of the subtask is an empty set, and, in response to monitoring that the resource information of any cluster satisfies the load condition, use the cluster that satisfies the load condition as the designated cluster of the subtask.
Optionally, the injection module is specifically configured to determine, according to the task file, each subtask corresponding to the task to be executed and a dependency relationship between the subtasks, and determine, according to the dependency relationship between the subtasks, a scheduling graph of the task to be executed.
Optionally, the input module is specifically configured to decompose the scheduling diagram according to a topology structure of the scheduling diagram, and determine the subtask queue of the task to be executed.
Optionally, the task scheduling module is specifically configured to obtain the basic information of the multiple clusters from the cluster management module, determine a cluster that meets the scheduling condition of the subtask, and use the cluster as the scheduling cluster of the subtask.
Optionally, the task scheduling module is specifically configured to obtain basic information of the multiple clusters from the cluster management module, determine a plurality of duplicate tasks corresponding to the subtasks, and for each duplicate task, determine, according to the basic information of the multiple clusters, a cluster that meets a scheduling condition of the duplicate task from the multiple clusters, as a scheduling cluster of the duplicate task;
the task execution module is specifically configured to determine, when the duplicate task of the subtask is executed, resource information of a scheduling cluster of the duplicate task in the multiple clusters, and determine, according to the determined resource information and a preset load condition, the scheduling cluster for executing the duplicate task from the scheduling clusters meeting the load condition, as a designated cluster of the duplicate task, and send the duplicate task to the designated cluster, so that the designated cluster executes the duplicate task.
Optionally, the system further comprises a task monitoring module and a resource destructing module;
the task monitoring module is used for monitoring the execution condition of each subtask in the subtask queue;
the resource destructing module is used for sending, for each monitored subtask, a resource release instruction to the designated cluster of the subtask when the task monitoring module monitors that the execution of the subtask is completed, so that the designated cluster of the subtask deletes the files required for executing the subtask.
Optionally, the injection module is specifically configured to verify the format of the task file according to a prestored declaration format, and when verification is passed, determine the schedule of the task to be executed according to the task file.
The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the cloud native workflow engine implementation method described above.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the cloud native workflow engine implementation method described above when executing the program.
The at least one technical solution adopted in this specification can achieve the following beneficial effects:
according to the cloud primary workflow engine implementation method, firstly, a task file of a task to be executed, which is input by a user, is responded, and a scheduling diagram of the task to be executed is determined according to the task file. And determining a subtask queue of the task to be executed according to the scheduling diagram. Then, for each sub-task in the sub-task queue, a scheduling cluster for the sub-task is determined from the multiple clusters. When the subtask is executed, determining the resource information of a scheduling cluster of the subtask in the multi-cluster, determining the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition, and sending the subtask to the designated cluster as the designated cluster of the subtask so as to enable the designated cluster to execute the subtask. And when the execution of each subtask of the task to be executed is finished, determining the execution result of the task to be executed.
According to the method, when the cloud native workflow engine is implemented, a scheduling diagram of a task to be executed is determined according to a task file of the task input by a user, and a subtask queue is determined. Then, for each subtask in the subtask queue, a scheduling cluster of the subtask is determined from the multiple clusters. When the subtask is to be executed, the resource information of the scheduling cluster of the subtask in the multiple clusters is determined, a designated cluster is determined from the scheduling clusters meeting the load condition according to the resource information and a preset load condition, and the subtask is sent to the designated cluster so that the designated cluster executes the subtask. When the execution of every subtask is finished, the execution result of the task to be executed is determined. Because different subtasks can be executed by different clusters, complex and large-scale workflow tasks can be executed better and the execution rate is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate and explain the exemplary embodiments of the present specification and their description, are not intended to limit the specification unduly. In the drawings:
fig. 1 is a schematic flow chart of a method for implementing a cloud native workflow engine provided in the present specification;
FIG. 2 is a schematic illustration of a task schedule diagram provided in the present specification;
FIG. 3 is a schematic illustration of another task schedule provided in the present specification;
FIG. 4 is a schematic diagram of a state transition of a lifecycle of a task to be performed provided in the present specification;
FIG. 5 is a schematic diagram of a cloud native workflow engine implementation system provided herein;
FIG. 6 is a schematic diagram of another cloud native workflow engine implementation system provided herein;
fig. 7 is a schematic structural diagram of an electronic device corresponding to fig. 1 provided in the present specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a flow chart of a method for implementing a cloud native workflow engine provided in the present specification, where the method shown in fig. 1 is applied to multiple clusters, and specifically includes the following steps:
s100: and responding to a task file of a task to be executed, which is input by a user, and determining a scheduling diagram of the task to be executed according to the task file.
In this specification, a device for implementing a cloud native workflow engine responds to a task file of a task to be executed input by a user and determines a scheduling diagram of the task to be executed according to the task file. The device for implementing the cloud native workflow engine may be the workflow engine itself, a node server or a cluster where the workflow engine is located, or an electronic device such as a desktop computer or a notebook computer. For convenience of description, the cloud native workflow engine implementation method provided in this specification is described below with a server as the execution body. Each cluster comprises at least one node server, and the cloud native workflow engine implementation method shown in fig. 1 is applied to multiple such clusters. In addition, the clusters are based on a portable container orchestration technology, such as the currently common Kubernetes (k8s) tool. Each cluster runs k8s, and when a cluster receives a task, the k8s of the cluster schedules the task to nodes in the cluster so that the nodes execute the task. Since k8s is a container-based orchestration technology that can manage containers, that is, orchestrate and schedule tasks, and containers are the implementation carrier of a cloud native architecture, this specification provides a cloud native workflow engine implementation method on this basis.
In this specification, a user may input a task file of a task to be performed in a command line interface (Command Line Interface, abbreviated as CLI) or a front end interface. Specifically, the server may determine, according to a task file of a task to be executed, each subtask corresponding to the task to be executed and a dependency relationship between the subtasks, and determine, according to the dependency relationship between the subtasks, a scheduling graph of the task to be executed, where the task file is a related file describing information of the task to be executed. The task to be executed comprises at least one subtask. The scheduling graph is a directed acyclic graph and is used for representing the dependency relationship of each subtask of a task to be executed, the nodes in the scheduling graph represent each subtask of the task to be executed, and the edges between the nodes represent the dependency relationship among the nodes, namely the dependency relationship among the subtasks represented by the nodes. For example, as shown in fig. 2, fig. 2 is a schematic diagram of a task scheduling diagram provided in the present specification, nodes a to D in fig. 2 are subtasks of a task O to be executed, an edge between the nodes indicates that there is a dependency relationship between two nodes, execution of the subtask a is independent of other subtasks, execution of the subtask B or the subtask C is dependent on the subtask a, and execution of the subtask D is dependent on the subtask B and the subtask C.
If the current node depends on other nodes when executed, that is, executing the subtask corresponding to the current node requires the execution results or the data of the subtasks corresponding to the other nodes, then there is a dependency relationship between the current node and the other nodes. For example, if the subtask of the current node is to compute "a+b" while the subtasks of other nodes obtain the data corresponding to a and b respectively, executing the subtask of the current node requires the execution results of the subtasks of those other nodes; therefore the current node depends on the other nodes, and a dependency relationship exists between the current node and the other nodes.
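For ease of understanding, the following Python sketch illustrates one possible way of building such a scheduling diagram from a task file; the task-file layout and the field names (for example "subtasks" and "depends_on") are assumptions made for illustration and are not the format defined by this specification.

# Hypothetical sketch: building a scheduling diagram (directed acyclic graph)
# from a task file. Field names are illustrative assumptions.
from collections import defaultdict

task_file = {
    "name": "task_O",
    "subtasks": [
        {"name": "A", "depends_on": []},
        {"name": "B", "depends_on": ["A"]},
        {"name": "C", "depends_on": ["A"]},
        {"name": "D", "depends_on": ["B", "C"]},
    ],
}

def build_scheduling_graph(task_file):
    # Adjacency map: an edge dep -> sub means that subtask "sub" depends on "dep".
    edges = defaultdict(list)
    for sub in task_file["subtasks"]:
        for dep in sub["depends_on"]:
            edges[dep].append(sub["name"])
    return edges

print(dict(build_scheduling_graph(task_file)))
# {'A': ['B', 'C'], 'B': ['D'], 'C': ['D']}, matching the dependencies of fig. 2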
S102: and determining a subtask queue of the task to be executed according to the scheduling diagram.
The server may determine a subtask queue of the task to be executed according to the scheduling diagram. Specifically, the server may decompose the scheduling diagram according to the topological structure of the scheduling diagram and determine the subtask queue of the task to be executed. When decomposing the scheduling diagram according to its topological structure to determine the subtask queue, if each node (i.e., subtask) in the scheduling diagram depends on at most one other node and is depended on by at most one other node, the scheduling diagram is decomposed directly according to its topological structure to determine the subtask queue. For example, as shown in fig. 3, which is a schematic diagram of another task scheduling diagram provided in this specification, nodes E to H in fig. 3 are the subtasks of a task P to be executed, each node corresponding to one subtask: the execution of subtask E does not depend on any other subtask, the execution of subtask F depends on subtask E, the execution of subtask G depends on subtask F, and the execution of subtask H depends on subtask G. Each of nodes E to H in fig. 3 satisfies this condition, so the scheduling diagram can be decomposed directly according to its topological structure, and the subtask queue of task P, namely subtask E, subtask F, subtask G and subtask H, is determined.
If any node (i.e., subtask) in the scheduling diagram has dependency relationships with more than one other node, the order of some subtasks can be chosen arbitrarily, consistent with the topological structure, when the scheduling diagram is decomposed to determine the subtask queue. Subtask B and subtask C in fig. 2 both depend on subtask A, so they are executed after subtask A but have no dependency relationship with each other, and subtask D depends on subtask B and subtask C. Therefore, when the scheduling diagram is decomposed according to its topological structure, the determined subtask queue may be subtask A, subtask B, subtask C, subtask D, or subtask A, subtask C, subtask B, subtask D.
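A minimal sketch of this decomposition is given below, assuming the task-file layout of the previous illustration; it applies a standard topological ordering (Kahn's algorithm), and the arbitrary order among subtasks that do not depend on each other matches the observation above that either subtask B or subtask C may come first.

# Hypothetical sketch: decomposing the scheduling diagram into a subtask queue
# by topological ordering. The task-file layout follows the earlier illustration.
from collections import defaultdict, deque

def determine_subtask_queue(task_file):
    indegree = {}
    edges = defaultdict(list)
    for sub in task_file["subtasks"]:
        indegree.setdefault(sub["name"], 0)
        for dep in sub["depends_on"]:
            edges[dep].append(sub["name"])
            indegree[sub["name"]] += 1
    ready = deque(name for name, degree in indegree.items() if degree == 0)
    queue = []
    while ready:
        node = ready.popleft()          # order among independent subtasks is arbitrary here
        queue.append(node)
        for nxt in edges[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(queue) != len(indegree):
        raise ValueError("the scheduling diagram contains a cycle")
    return queue

# For the task file of the previous sketch this yields ['A', 'B', 'C', 'D']
# (or ['A', 'C', 'B', 'D']); for the chain of fig. 3 it would yield ['E', 'F', 'G', 'H'].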
S104: for each sub-task in the sub-task queue, determining a scheduling cluster of the sub-task from the multi-cluster.
The server determines, for each subtask in the subtask queue, a scheduling cluster of the subtask from the multiple clusters. Specifically, for each subtask in the subtask queue, basic information of each cluster is acquired, and a cluster meeting the scheduling condition of the subtask is determined from the multiple clusters according to the basic information of the multiple clusters, as the scheduling cluster of the subtask. Acquiring the basic information of each cluster means acquiring the basic information of each cluster in the multiple clusters, where the basic information at least includes performance information, such as the main frequency, bus frequency and cache of the central processing units (CPU) of the nodes in the cluster, and may also include attribute information, such as the number of nodes (i.e., servers) in the cluster and whether a graphics processing unit (GPU) is provided in the cluster.
The scheduling conditions of different subtasks may be the same or different; a scheduling condition can be set according to the execution requirements of the subtask, and the execution requirements of different subtasks may differ. When a cluster meets the execution requirements of a subtask, that is, the cluster can execute the subtask, the cluster meets the scheduling condition of the subtask. When a cluster does not meet the execution requirements of a subtask, that is, the cluster cannot execute the subtask, the cluster does not meet the scheduling condition of the subtask. The scheduling condition describes the resources needed to support execution of the subtask: for example, if executing a subtask requires a 6-core CPU, the scheduling condition is that the cluster can provide a 6-core CPU; for another example, if executing a subtask requires a graphics processing unit (GPU), the scheduling condition of the subtask is that the cluster can provide a GPU.
When determining a cluster meeting the scheduling condition of the subtask from the multiple clusters and taking the cluster as the scheduling cluster of the subtask, the server can determine the cluster meeting the scheduling condition of the subtask from the multiple clusters and randomly select at least one cluster from the determined clusters as the scheduling cluster of the subtask.
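As an illustration of this step, the sketch below matches a subtask's scheduling condition against assumed basic information of the multiple clusters; the fields (cpu_cores, has_gpu, node_count) and the condition keys are invented placeholders rather than the specification's data model.

# Hypothetical sketch: selecting clusters that satisfy a subtask's scheduling
# condition from the basic information of the multiple clusters.
import random

clusters_basic_info = {
    "cluster-1": {"cpu_cores": 16, "has_gpu": True,  "node_count": 8},
    "cluster-2": {"cpu_cores": 4,  "has_gpu": False, "node_count": 2},
    "cluster-3": {"cpu_cores": 32, "has_gpu": True,  "node_count": 12},
}

def matches_scheduling_condition(info, condition):
    # A cluster satisfies the condition if it can provide every required resource.
    return (info["cpu_cores"] >= condition.get("min_cpu_cores", 0)
            and (not condition.get("needs_gpu", False) or info["has_gpu"]))

def pick_scheduling_clusters(condition, how_many=1):
    qualified = [name for name, info in clusters_basic_info.items()
                 if matches_scheduling_condition(info, condition)]
    # Step S104: randomly select at least one qualified cluster as the scheduling cluster(s).
    return random.sample(qualified, min(how_many, len(qualified)))

print(pick_scheduling_clusters({"min_cpu_cores": 6, "needs_gpu": True}))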
S106: when the subtask is executed, determining the resource information of a scheduling cluster of the subtask in the multi-cluster, and determining the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition, wherein the scheduling cluster is used as a designated cluster of the subtask.
When the subtask is to be executed, the server determines the resource information of the scheduling cluster of the subtask in the multiple clusters, and determines, according to the determined resource information and a preset load condition, a scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition, as the designated cluster of the subtask. The resource information is the available resource information of a cluster, such as the remaining memory of the nodes in the cluster. The load condition is the condition that must be satisfied by a cluster for it to execute the subtask: when the load of a cluster satisfies the load condition, the cluster can execute the subtask. The load condition may be a specified threshold; when the load rate of a cluster reaches the specified threshold, the cluster does not satisfy the load condition, and when the load rate of the cluster does not reach the specified threshold, the cluster satisfies the load condition.
Specifically, when the subtask is executed, the resource information of the scheduling cluster of the subtask in the multi-cluster can be determined, and one scheduling cluster is randomly selected as the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition.
Because the server may, in step S104 above, randomly select at least one cluster satisfying the scheduling condition as a scheduling cluster of the subtask, it is possible that none of the scheduling clusters determined in step S104 meets the load condition when the subtask is to be executed. In that case the server may reselect a scheduling cluster from the clusters that satisfy the scheduling condition of the subtask in step S104, and continue to judge whether the reselected scheduling cluster meets the load condition until a designated cluster for executing the subtask is determined.
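The load check itself might look like the following sketch, assuming the load condition is a simple threshold on a memory-based load rate; the threshold value and the resource fields are assumptions made only for the example.

# Hypothetical sketch: choosing a designated cluster from the subtask's scheduling
# clusters by checking a preset load condition against current resource information.
import random

LOAD_THRESHOLD = 0.8   # assumed "specified threshold" for the load condition

def load_rate(resource_info):
    # Fraction of memory already in use, as one possible load measure.
    return 1.0 - resource_info["free_memory_mb"] / resource_info["total_memory_mb"]

def pick_designated_cluster(scheduling_clusters, resource_info_by_cluster):
    qualified = [c for c in scheduling_clusters
                 if load_rate(resource_info_by_cluster[c]) < LOAD_THRESHOLD]
    if not qualified:
        return None          # fall back to re-selection from the candidate set
    return random.choice(qualified)

resource_info = {
    "cluster-1": {"free_memory_mb": 8192, "total_memory_mb": 16384},
    "cluster-3": {"free_memory_mb": 512,  "total_memory_mb": 8192},
}
print(pick_designated_cluster(["cluster-1", "cluster-3"], resource_info))   # cluster-1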
S108: and sending the subtask to the appointed cluster so that the appointed cluster executes the subtask.
S110: and when the execution of each subtask of the task to be executed is finished, determining the execution result of the task to be executed.
The server may first send the subtask to the designated cluster of the subtask so that the designated cluster executes the subtask. Then, when the execution of each subtask of the task to be executed is finished, the execution result of the task to be executed is determined. The designated clusters corresponding to different subtasks may be the same or different. When the execution of every subtask of the task to be executed is finished, the execution of the task to be executed is finished, and the server can determine the execution result of the task to be executed.
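Because each cluster is managed by k8s, one plausible way for the server to send a subtask to its designated cluster is to submit it as a Kubernetes Job through that cluster's API. The sketch below uses the official Kubernetes Python client; the container image, namespace and kubeconfig context names are placeholders, and this specification does not prescribe this exact form.

# Hypothetical sketch: dispatching a subtask to its designated cluster as a
# Kubernetes Job. Image, namespace and context names are illustrative assumptions.
from kubernetes import client, config

def send_subtask(subtask_name, command, designated_cluster_context):
    # Each designated cluster is assumed to be reachable as a kubeconfig context.
    config.load_kube_config(context=designated_cluster_context)
    job = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"subtask-{subtask_name.lower()}"},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": "worker",
                        "image": "example.com/workflow-worker:latest",  # placeholder image
                        "command": command,
                    }],
                    "restartPolicy": "Never",
                }
            },
            "backoffLimit": 2,
        },
    }
    # The cluster's own k8s scheduler then places the Job's pod on a node.
    client.BatchV1Api().create_namespaced_job(namespace="default", body=job)

# send_subtask("A", ["python", "run_step_a.py"], designated_cluster_context="cluster-1")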
According to this method, when the workflow engine is implemented, a scheduling diagram of a task to be executed is determined according to a task file of the task input by a user, and a subtask queue is determined. Then, for each subtask in the subtask queue, a scheduling cluster of the subtask is determined from the multiple clusters. When the subtask is to be executed, the resource information of the scheduling cluster of the subtask in the multiple clusters is determined, a designated cluster is determined from the scheduling clusters meeting the load condition according to the resource information and a preset load condition, and the subtask is sent to the designated cluster so that the designated cluster executes the subtask. When the execution of every subtask is finished, the execution result of the task to be executed is determined. Different subtasks can be executed by different clusters, so complex and large-scale workflow tasks can be executed better and the execution rate is improved. Meanwhile, clusters meeting the scheduling condition are determined as the scheduling clusters of a subtask according to the basic information of the multiple clusters, and a scheduling cluster meeting the load condition is then determined from those scheduling clusters as the designated cluster of the subtask according to the resource information of the multiple clusters; in this way the scheduling cluster and the designated cluster of each subtask are determined, the subtask is executed, and the execution of the task to be executed can be accelerated.
For each subtask, the server first executes step S104 above to obtain the scheduling cluster of the subtask, then executes steps S106 and S108 above to determine the designated cluster of the subtask and send the subtask to the determined designated cluster. However, for a task to be executed, while a certain subtask of the task is executing, the server may determine the scheduling cluster of the next subtask in advance according to the subtask queue, without waiting for the execution of the current subtask to end, so as to accelerate the execution of the task to be executed and reduce waiting time. Following the example shown in fig. 3, the subtask queue of task P in fig. 3 is subtask E, subtask F, subtask G and subtask H; the server may determine the scheduling cluster of the next subtask after subtask F, that is, the scheduling cluster of subtask G, while subtask F is executing.
Of course, the server may also determine the scheduling clusters of several subtasks in advance, and the specific number may be set as needed. For example, the number of subtasks for which scheduling clusters are determined in advance may be inversely related to the load of the server: when the load of the server is low, scheduling clusters may be determined for all subtasks that have not yet been executed, and when the load of the server is high, scheduling clusters may be determined in advance for only a specified number of subtasks. This specification does not limit this.
In addition, when there are subtasks of the task to be executed that have no dependency relationship with each other, those subtasks may be handled outside the determined subtask-queue order, and steps S104 to S108 may be executed for them simultaneously, so as to accelerate the execution of the task to be executed and reduce waiting time. Following the example shown in fig. 2 above, assume the subtask queue of task O is subtask A, subtask B, subtask C, subtask D. When subtask A has finished executing (steps S106 and S108 have finished) or is still executing (steps S106 and S108 are being performed), since subtask B and subtask C have no dependency relationship with each other, steps S104 to S108 can be executed for subtask B and subtask C simultaneously. After subtask B and subtask C have finished executing (steps S106 and S108 have finished) or while they are executing (steps S106 and S108 are being performed), subtask D is then processed in the same way.
In the step S104, when determining the scheduling cluster of the subtask from the multiple clusters, the server may acquire the basic information of each cluster, and determine, from the multiple clusters, a candidate set composed of each cluster satisfying the scheduling condition of the subtask according to the basic information of the multiple clusters. And then, according to the basic information of each cluster in the to-be-selected set, determining the sequence of the performance information of each cluster, and selecting the scheduling cluster of the subtask according to the sequence. When determining the ordering of the performance information of each cluster, each cluster in the candidate set may be ordered according to the order of the performance from low to high or from high to low. Therefore, when the scheduling cluster of the subtask is selected according to the order, the cluster with the highest performance can be selected as the scheduling cluster according to the order, or the designated number of clusters with high performance can be selected as the scheduling clusters according to the order, for example, three clusters with higher performance are selected as the scheduling clusters according to the order, and the specification is not limited specifically.
Based on this, in step S106, according to the determined resource information and the preset load condition, the scheduling cluster for executing the subtask is determined from the scheduling clusters satisfying the load condition, and when the scheduling cluster is used as the designated cluster of the subtask, the server may determine whether the scheduling cluster of the subtask satisfies the preset load condition according to the determined resource information, and if yes, determine that the scheduling cluster of the subtask is the designated cluster of the subtask. If not, deleting the scheduling cluster of the subtask from the candidate set of the subtask, re-determining the scheduling cluster of the subtask from the deleted candidate set of the subtask, acquiring the resource information of the re-determined scheduling cluster, and continuously judging whether the re-determined scheduling cluster meets the load condition or not until the appointed cluster of the subtask is determined.
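Combining the two refinements above, the following sketch ranks the candidate set by an assumed performance score, tries the best-ranked scheduling cluster first, and deletes clusters that fail the load condition until a designated cluster is found; the scoring formula is invented for illustration.

# Hypothetical sketch: ranking the candidate set by performance and falling back
# to the next cluster whenever the current scheduling cluster fails the load condition.
def performance_score(basic_info):
    return basic_info["cpu_cores"] * basic_info.get("cpu_ghz", 1.0)

def designate_with_fallback(candidate_set, basic_info_by_cluster,
                            get_resource_info, satisfies_load_condition):
    # Order candidates from highest to lowest performance.
    remaining = sorted(candidate_set,
                       key=lambda c: performance_score(basic_info_by_cluster[c]),
                       reverse=True)
    while remaining:
        scheduling_cluster = remaining[0]
        if satisfies_load_condition(get_resource_info(scheduling_cluster)):
            return scheduling_cluster            # designated cluster found
        remaining.pop(0)                         # delete it from the candidate set and retry
    return None                                  # candidate set exhausted (empty-set case)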
Because a scheduling cluster that meets the scheduling condition of a subtask may not meet the load condition when the subtask is to be executed, the server needs to monitor the resource information of each cluster that meets the scheduling condition of the subtask, and when any cluster is monitored to meet the load condition, that cluster is taken as the designated cluster of the subtask. Therefore, in step S106 above, when the candidate set of the subtask is an empty set, the resource information of each cluster meeting the scheduling condition of the subtask is monitored, and in response to monitoring that the resource information of any cluster meets the load condition, the cluster meeting the load condition is taken as the designated cluster of the subtask.
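The waiting behaviour described here can be sketched as a simple polling loop; the polling interval is an arbitrary assumption, and a real implementation might instead subscribe to cluster resource events.

# Hypothetical sketch: when the candidate set is empty, poll the resource information
# of every cluster that satisfies the scheduling condition until one meets the load condition.
import time

def wait_for_designated_cluster(qualified_clusters, get_resource_info,
                                satisfies_load_condition, poll_seconds=5):
    while True:
        for cluster in qualified_clusters:
            if satisfies_load_condition(get_resource_info(cluster)):
                return cluster       # first cluster observed to satisfy the load condition
        time.sleep(poll_seconds)     # resources may be released by other finishing tasks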
In this specification, each subtask may be a single-instance subtask or a multi-instance subtask. A single-instance subtask is a task with only one data source. A multi-instance subtask corresponds to a plurality of duplicate tasks; the task content of each duplicate task is the same, but the data source corresponding to each duplicate task is different. That is, when a subtask has a single data source, the subtask is a single-instance subtask. When a subtask corresponds to at least two data sources, the subtask is a multi-instance subtask, at least two duplicate tasks of the subtask are determined according to the data sources corresponding to the subtask, and the task content of each duplicate task is the same but the corresponding data sources are different.
Continuing the above example, assume that subtask A is a multi-instance subtask whose task content is to acquire text information from a data source, and that its data sources are a, b and c. The three duplicate tasks corresponding to subtask A are duplicate task a, duplicate task b and duplicate task c, each corresponding to one data source: the task content of duplicate task a is to acquire text information from data source a, and the task content of the other duplicate tasks is similar and is not repeated here.
Based on this, in step S104, when determining the scheduling cluster of the subtask from the multiple clusters, the server may acquire the basic information of each cluster, determine a plurality of duplicate tasks corresponding to the subtask, and for each duplicate task, determine, from the multiple clusters, a cluster that meets the scheduling condition of the duplicate task according to the basic information of the multiple clusters, as the scheduling cluster of the duplicate task. Then, in the step S106, the server may determine, when executing the duplicate task of the subtask, resource information of a scheduling cluster of the duplicate task in the multiple clusters, and determine, according to the determined resource information and a preset load condition, the scheduling cluster for executing the duplicate task from the scheduling clusters meeting the load condition, as a designated cluster of the duplicate task. Thereafter, in the step S108 described above, the server may send the duplicate task to the designated cluster, so that the designated cluster executes the duplicate task.
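As an illustration only, a multi-instance subtask might be expanded into duplicate tasks as in the sketch below, one duplicate per data source; the record layout is an assumption made for the example.

# Hypothetical sketch: expanding a multi-instance subtask into duplicate tasks,
# one per data source; each duplicate has the same content but a different data source.
def expand_to_duplicates(subtask):
    sources = subtask["data_sources"]
    if len(sources) <= 1:
        return [subtask]                      # single-instance subtask: no duplicates needed
    return [{"name": f'{subtask["name"]}-copy-{i}',
             "content": subtask["content"],   # same task content for every duplicate
             "data_sources": [src]}           # but a different data source each
            for i, src in enumerate(sources)]

subtask_a = {"name": "A", "content": "fetch text", "data_sources": ["a", "b", "c"]}
for dup in expand_to_duplicates(subtask_a):
    print(dup["name"], dup["data_sources"])   # A-copy-0 ['a'], A-copy-1 ['b'], A-copy-2 ['c']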
In order to reduce the storage space of each cluster and increase the execution speed of the task, after the server sends the subtask to the designated cluster in step S108, the server may monitor the execution situation of the subtask, and when it is monitored that the execution of the subtask is completed, send a resource release instruction to the designated cluster, so that the designated cluster deletes the files required for executing the subtask. The files required for executing the subtasks include files acquired when the subtasks are executed and files generated in the process of executing the subtasks.
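A sketch of this clean-up step is given below; it assumes the designated cluster exposes some way to report subtask completion and to delete the subtask's working files, and both calls (get_subtask_status and release_subtask_resources) are placeholder assumptions rather than a real API.

# Hypothetical sketch: monitor a subtask's execution and, once it completes, send a
# resource release instruction so the designated cluster deletes the subtask's files.
import time

def watch_and_release(cluster_client, subtask_name, poll_seconds=10):
    while True:
        status = cluster_client.get_subtask_status(subtask_name)       # placeholder call
        if status in ("succeeded", "failed"):
            # Release instruction: delete the files fetched for the subtask and the
            # intermediate files generated while it ran, freeing cluster storage.
            cluster_client.release_subtask_resources(subtask_name)     # placeholder call
            return status
        time.sleep(poll_seconds)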
Before determining the scheduling diagram of the task to be executed in step S100 above, the server also needs to determine that the format of the task file is the prestored declaration format. That is, before determining the scheduling diagram of the task to be executed, the server may verify, against the prestored declaration format, whether the format of the task file of the task to be executed conforms to it, and when the verification passes, determine the scheduling diagram of the task to be executed according to the task file. When the verification does not pass, the server may send prompt information to the user to prompt the user that the task file has errors.
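The pre-check might be sketched as follows, assuming the prestored declaration format can be expressed as a set of required fields; a real implementation could equally use a schema validator, and the field names are illustrative assumptions.

# Hypothetical sketch: verifying that a task file follows the prestored declaration
# format before building the scheduling diagram. Required fields are assumptions.
REQUIRED_TOP_LEVEL = {"name", "subtasks"}
REQUIRED_PER_SUBTASK = {"name", "depends_on"}

def validate_task_file(task_file):
    problems = []
    missing = REQUIRED_TOP_LEVEL - task_file.keys()
    if missing:
        problems.append(f"missing top-level fields: {sorted(missing)}")
    for sub in task_file.get("subtasks", []):
        missing = REQUIRED_PER_SUBTASK - sub.keys()
        if missing:
            problems.append(f"subtask {sub.get('name', '?')}: missing {sorted(missing)}")
    return problems            # empty list means verification passed

# If problems are found, the server would prompt the user that the task file has errors.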
In this specification, a user can view the execution state of a task to be executed, pause the task, run the task and end the task. Therefore, the server can determine an operation instruction for the task to be executed in response to a click operation of the user, and operate on the task according to the operation instruction. The operation instruction may be a query instruction, a pause instruction, a run instruction or an end instruction. If, in response to the click operation of the user, the determined operation instruction is a query instruction, the server may query the execution state of the task to be executed according to the query instruction, such as a running state, a paused state or an ended state; the execution state of the task to be executed may also be the execution state of each subtask of the task, and this specification is not specifically limited in this respect.
If, in response to the click operation of the user, the operation instruction for the task to be executed is a pause instruction, the server can pause the task to be executed according to the pause instruction. Pausing the execution of the task to be executed may mean pausing the execution of the entire task, or pausing the execution of a subtask of the task. If the operation instruction for the task to be executed is a run instruction, the server can run the task to be executed according to the run instruction. If the operation instruction for the task to be executed is an end instruction, the server can end the task to be executed according to the end instruction.
In this specification, as shown in fig. 4, fig. 4 is a schematic diagram of the state transitions of the life cycle of a task to be executed provided in this specification. Tasks to be executed can be divided into ordinary tasks and distributed tasks, where an ordinary task is a task that does not need to be decomposed, and the bracketed terms in the following description denote task states. For a distributed task, the initial state is [not running]; when the distributed task is decomposed, the state changes from [not running] to [decomposing]. If an abnormality occurs during decomposition or the number of decompositions exceeds a threshold, the state changes from [decomposing] to [abnormal]; the abnormality is processed, that is, the abnormal condition is recorded, the state changes from [abnormal] to [running failure], and the flow ends. After the decomposition is completed, the decomposition state of the distributed task is judged: when the state is [decomposition failure], the distributed task is decomposed again; when the state is [decomposition success], the decomposition result is determined, and after confirmation the state changes from [decomposition success] to [decomposition confirmed].
Thereafter, the distributed task starts to run, and the state changes from [decomposition confirmed] to [running]. If an abnormality occurs, the state changes from [running] to [abnormal]; the abnormality is processed, that is, the abnormal condition is recorded, the state changes to [running failure], and the flow ends. If the user clicks to pause the distributed task (i.e., the operation corresponding to the pause instruction described above), the state of the distributed task changes from [running] to [paused]; if the user then clicks to run the distributed task (i.e., the operation corresponding to the run instruction described above), the state changes from [paused] to [running]. If the user clicks to end the distributed task (i.e., the operation corresponding to the end instruction described above), the state changes from [running] to [finished], and the flow ends. After the distributed task has been executed, the execution result is judged, and according to the execution result the state changes from [running] to [running successfully] or to [running failure], and the flow ends.
For an ordinary task, the initial state is [not running]. After the ordinary task starts to execute, its state is [running]. If an abnormality occurs, the state changes from [running] to [abnormal]; the abnormality is processed, that is, the abnormal condition is recorded, the state changes to [running failure], and the flow ends. If the user clicks to pause the ordinary task (i.e., the operation corresponding to the pause instruction described above), the state changes from [running] to [paused]; if the user then clicks to run the ordinary task (i.e., the operation corresponding to the run instruction described above), the state changes from [paused] to [running]. If the user clicks to end the ordinary task (i.e., the operation corresponding to the end instruction described above), the state changes from [running] to [finished], and the flow ends. After the execution of the ordinary task is finished, the state changes from [running] to [running successfully] or to [running failure] according to the execution result, and the flow ends.
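The allowed transitions described above can be summarised as a small state machine. The sketch below encodes the distributed-task life cycle only, with the bracketed state names carried over from the description; the event labels are paraphrases introduced for illustration.

# Hypothetical sketch: the life cycle of a distributed task as a transition table.
# Guard conditions (exceptions, user clicks, results) appear only as event labels.
DISTRIBUTED_TASK_TRANSITIONS = {
    "not running":             {"decompose": "decomposing"},
    "decomposing":             {"exception or too many attempts": "abnormal",
                                "decomposition failed": "decomposition failure",
                                "decomposition succeeded": "decomposition success"},
    "decomposition failure":   {"decompose again": "decomposing"},
    "decomposition success":   {"result confirmed": "decomposition confirmed"},
    "decomposition confirmed": {"start running": "running"},
    "running":                 {"exception": "abnormal",
                                "user clicks pause": "paused",
                                "user clicks end": "finished",
                                "result judged good": "running successfully",
                                "result judged bad": "running failure"},
    "paused":                  {"user clicks run": "running"},
    "abnormal":                {"abnormality recorded": "running failure"},
    "running successfully": {},
    "running failure": {},
    "finished": {},
}

def next_state(state, event):
    # Unknown events leave the state unchanged in this simplified sketch.
    return DISTRIBUTED_TASK_TRANSITIONS[state].get(event, state)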
The foregoing is a method implemented by one or more embodiments of the present specification, and based on the same concept, the present specification further provides a cloud native workflow engine implementation system, where the system is applied to multiple clusters, as shown in fig. 5.
Fig. 5 is a schematic diagram of a cloud native workflow engine implementation system provided in the present specification, where the system shown in fig. 5 includes an injection module 200, an input module 201, a task scheduling module 202, and a task execution module 203, where:
the injection module 200 is configured to determine a scheduling diagram of a task to be executed according to a task file of the task to be executed input by a user. Specifically, in response to the task file input by the user, the injection module 200 may determine, according to the task file, each subtask corresponding to the task to be executed and the dependency relationships among the subtasks, and determine the scheduling diagram of the task to be executed according to those dependency relationships.
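Purely as an illustration of the dependency extraction described above, a sketch follows; the task-file layout assumed here (a `tasks` list whose entries carry `name` and `depends_on` fields) is hypothetical, since the specification does not fix a concrete file format at this point.

```python
# Hypothetical task file: {"tasks": [{"name": "a", "depends_on": []}, ...]}
def build_schedule_graph(task_file: dict) -> dict:
    """Return an adjacency map: subtask name -> list of subtasks it depends on."""
    graph = {}
    for sub in task_file["tasks"]:
        graph[sub["name"]] = list(sub.get("depends_on", []))
    return graph


example = {"tasks": [
    {"name": "preprocess", "depends_on": []},
    {"name": "train", "depends_on": ["preprocess"]},
    {"name": "evaluate", "depends_on": ["train"]},
]}
print(build_schedule_graph(example))
# {'preprocess': [], 'train': ['preprocess'], 'evaluate': ['train']}
```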
The input module 201 is configured to determine a subtask queue of the task to be executed according to the scheduling diagram. Specifically, the input module 201 may decompose the scheduling diagram according to its topological structure and determine the subtask queue of the task to be executed.
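One plausible way to decompose the scheduling diagram by its topological structure is a topological sort, sketched below under the assumption that the diagram is represented as the adjacency map from the previous example; Kahn's algorithm is used here only as one possible ordering method.

```python
from collections import deque


def to_subtask_queue(graph: dict) -> list:
    """Kahn's algorithm: order subtasks so that dependencies run first."""
    indegree = {name: len(deps) for name, deps in graph.items()}
    dependents = {name: [] for name in graph}
    for name, deps in graph.items():
        for dep in deps:
            dependents[dep].append(name)
    queue = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in dependents[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(graph):
        raise ValueError("cycle detected in scheduling diagram")
    return order


print(to_subtask_queue({"preprocess": [], "train": ["preprocess"], "evaluate": ["train"]}))
# ['preprocess', 'train', 'evaluate']
```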
The task scheduling module 202 is configured to determine, for each subtask in the subtask queue of the task to be executed determined by the input module, a scheduling cluster of the subtask from the multiple clusters, and to send the determined scheduling cluster of the subtask to the task execution module. Specifically, for each subtask in the subtask queue, the task scheduling module 202 may obtain basic information of each cluster and, according to the basic information of the multiple clusters, determine from the multiple clusters a cluster that meets the scheduling condition of the subtask as the scheduling cluster of the subtask. The basic information at least includes performance information.
When determining, from the multiple clusters, a cluster that meets the scheduling condition of the subtask as the scheduling cluster of the subtask, the task scheduling module 202 may determine all clusters that meet the scheduling condition of the subtask and randomly select at least one of them as the scheduling cluster of the subtask.
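A minimal sketch of the filter-then-randomly-select behaviour described above; the `cpu_cores`/`gpu_count` scheduling condition is an assumed placeholder rather than the scheduling condition actually used by the engine.

```python
import random


def pick_scheduling_clusters(clusters: dict, condition: dict, k: int = 1) -> list:
    """clusters: name -> basic info; condition: minimum required values per field."""
    qualified = [
        name for name, info in clusters.items()
        if all(info.get(key, 0) >= value for key, value in condition.items())
    ]
    return random.sample(qualified, min(k, len(qualified))) if qualified else []


clusters = {
    "cluster-a": {"cpu_cores": 64, "gpu_count": 8},
    "cluster-b": {"cpu_cores": 16, "gpu_count": 0},
}
print(pick_scheduling_clusters(clusters, {"cpu_cores": 32, "gpu_count": 4}))
# ['cluster-a']
```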
The task execution module 203 is configured to, upon receiving the scheduling cluster of the subtask sent by the task scheduling module, determine resource information of that scheduling cluster in the multiple clusters, determine, according to the determined resource information and a preset load condition, a scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition as the designated cluster of the subtask, send the subtask to the designated cluster so that the designated cluster executes the subtask, and determine the execution result of the task to be executed when every subtask of the task to be executed has finished executing.
Specifically, upon receiving the scheduling cluster of the subtask sent by the task scheduling module, the task execution module 203 may first determine the resource information of the scheduling cluster of the subtask in the multiple clusters, and then, according to the determined resource information and the preset load condition, randomly select one cluster from the scheduling clusters meeting the load condition as the scheduling cluster for executing the subtask.
Since the task scheduling module 202 randomly selects at least one scheduling cluster of the subtask from the clusters that satisfy the scheduling condition, it may happen that, when the task execution module 203 receives the scheduling cluster of the subtask sent by the task scheduling module, none of the scheduling clusters determined by the task scheduling module 202 satisfies the load condition. In that case the task execution module 203 may reselect a scheduling cluster from the clusters that satisfy the scheduling condition and continue to judge whether the reselected scheduling cluster satisfies the load condition, until the designated cluster for executing the subtask is determined.
The task execution module 203 may then send the subtask to the designated cluster so that the designated cluster executes the subtask. When every subtask of the task to be executed has finished executing, the task execution module 203 determines the execution result of the task to be executed.
As can be seen from the above cloud native workflow engine implementation system, when implementing the workflow engine, the injection module 200 first responds to the task file of the task to be executed input by the user and determines the scheduling diagram of the task to be executed according to the task file. The input module 201 determines the subtask queue of the task to be executed according to the scheduling diagram. Then, for each subtask in the subtask queue determined by the input module, the task scheduling module 202 determines a scheduling cluster of the subtask from the multiple clusters and sends it to the task execution module 203. Upon receiving the scheduling cluster of the subtask sent by the task scheduling module, the task execution module 203 determines the resource information of that scheduling cluster in the multiple clusters, determines, according to the determined resource information and the preset load condition, the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition as the designated cluster of the subtask, and sends the subtask to the designated cluster so that the designated cluster executes it; when every subtask of the task to be executed has finished executing, the execution result of the task to be executed is determined. In this way different subtasks can be executed by different clusters, complex and large-scale workflow tasks can be executed better, and execution efficiency is improved.
In addition, the task scheduling module 202 may determine, according to the basic information of the multiple clusters, the clusters satisfying the scheduling condition as the scheduling clusters of the subtask. Upon receiving the scheduling cluster of the subtask sent by the task scheduling module, the task execution module 203 determines, from the scheduling clusters of the subtask and according to the resource information of the multiple clusters, a scheduling cluster meeting the load condition as the designated cluster of the subtask. As a result, after sending the scheduling cluster of a subtask to the task execution module 203, the task scheduling module 202 may continue to determine the scheduling cluster of the next subtask, which speeds up the execution of the task to be executed.
For each subtask, the task scheduling module 202 determines a scheduling cluster of the subtask and sends it to the task execution module 203; the task execution module 203 then determines the designated cluster of the subtask and sends the subtask to that designated cluster. Moreover, for a task to be executed, while one of its subtasks is executing, the task scheduling module 202 may determine the scheduling cluster of the next subtask in advance according to the subtask queue, without waiting for the current subtask to finish, which speeds up the execution of the task to be executed and reduces waiting time.
Of course, the task scheduling module 202 may also determine the scheduling clusters of several subtasks in advance, and the specific number may be set as needed. For example, the number of subtasks whose scheduling clusters are determined in advance may be inversely related to the load of the task scheduling module 202: when the load of the task scheduling module 202 is low, the scheduling clusters of all subtasks that have not yet been executed may be determined, and when the load is high, the scheduling clusters of only a specified number of subtasks may be determined. This specification does not limit this.
In addition, when some subtasks of the task to be executed have no dependency relationship with one another, the task scheduling module 202 may handle these independent subtasks out of the determined subtask queue order: it may determine the scheduling clusters of these subtasks at the same time and send them to the task execution module 203. The task execution module 203 then determines the designated clusters of these subtasks and sends each subtask to its designated cluster, which speeds up the execution of the task to be executed and reduces waiting time.
In the above cloud native workflow engine implementation system, the system further includes a cluster management module 204, and the cluster management module 204 is configured to obtain and store information of the multiple clusters, where the information includes resource information and basic information. The task scheduling module 202 may obtain the basic information of the multiple clusters from the cluster management module and determine a cluster that meets the scheduling condition of the subtask as the scheduling cluster of the subtask. The task execution module 203 may, upon receiving the scheduling cluster of the subtask sent by the task scheduling module, obtain the resource information of the multiple clusters from the cluster management module and determine the resource information of the scheduling cluster of the subtask in the multiple clusters.
In the above cloud native workflow engine implementation system, the task scheduling module 202 may be configured to obtain the basic information of each cluster, determine from the multiple clusters, according to that basic information, a candidate set composed of the clusters that meet the scheduling condition of the subtask, determine a ranking of the performance information of the clusters according to the basic information of each cluster in the candidate set, and select the scheduling cluster of the subtask according to the ranking, where the basic information at least includes the performance information. When determining the ranking of the performance information, the clusters in the candidate set may be ordered from low performance to high or from high to low. Accordingly, when the scheduling cluster of the subtask is selected according to the ranking, the cluster with the highest performance may be selected as the scheduling cluster, or a specified number of high-performance clusters may be selected as scheduling clusters, for example the three clusters with the highest performance; this specification does not specifically limit this.
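The ranking-based selection could be written, for example, as follows, assuming each cluster's performance information is condensed into a single numeric `performance` score (in practice it may be multi-dimensional):

```python
def rank_and_select(candidate_set: dict, top_k: int = 3) -> list:
    """Sort the candidate clusters by an assumed 'performance' score, highest first,
    and keep the top_k as scheduling clusters of the subtask."""
    ranked = sorted(candidate_set.items(), key=lambda kv: kv[1]["performance"], reverse=True)
    return [name for name, _ in ranked[:top_k]]


candidates = {"cluster-a": {"performance": 0.92},
              "cluster-b": {"performance": 0.71},
              "cluster-c": {"performance": 0.85}}
print(rank_and_select(candidates, top_k=2))   # ['cluster-a', 'cluster-c']
```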
Based on this, the task execution module 203 may judge, according to the determined resource information, whether the scheduling cluster of the subtask meets the preset load condition; if so, the scheduling cluster of the subtask is taken as the designated cluster of the subtask. If not, the scheduling cluster of the subtask is deleted from the candidate set of the subtask, a scheduling cluster of the subtask is re-determined from the remaining candidate set, the resource information of the re-determined scheduling cluster is obtained, and whether the re-determined scheduling cluster meets the load condition continues to be judged, until the designated cluster of the subtask is determined.
When the subtask is to be executed, that is, when the task execution module 203 receives the scheduling cluster of the subtask sent by the task scheduling module, it may happen that none of the scheduling clusters meeting the scheduling condition of the subtask meets the load condition. The task execution module 203 then needs to monitor the resource information of each cluster that meets the scheduling condition of the subtask, and when any such cluster is monitored to meet the load condition, that cluster is taken as the designated cluster of the subtask. Therefore, the task execution module 203 is further configured to monitor the resource information of each cluster meeting the scheduling condition of the subtask when the candidate set of the subtask is an empty set, and, in response to the monitored resource information of any cluster meeting the load condition, to take the cluster meeting the load condition as the designated cluster of the subtask.
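The delete-and-retry loop and the fall-back to monitoring when the candidate set becomes empty might look like the sketch below; `get_resource_info` and `meets_load_condition` are stand-ins for whatever resource query and load threshold a concrete deployment uses.

```python
import time


def choose_designated_cluster(candidate_set, scheduling_cluster,
                              get_resource_info, meets_load_condition,
                              poll_interval=5.0):
    """Return a cluster satisfying the load condition: delete candidates that fail,
    and fall back to polling once the candidate set runs empty."""
    remaining = set(candidate_set)
    current = scheduling_cluster
    while remaining:
        if meets_load_condition(get_resource_info(current)):
            return current
        remaining.discard(current)          # delete the failed scheduling cluster
        if not remaining:
            break
        current = next(iter(remaining))     # re-determine a scheduling cluster
    # Candidate set is empty: monitor all clusters that met the scheduling condition.
    while True:
        for name in candidate_set:
            if meets_load_condition(get_resource_info(name)):
                return name
        time.sleep(poll_interval)
```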
In this specification, each subtask may be single-instance or multi-instance. A multi-instance subtask corresponds to a plurality of duplicate tasks; the task content of each duplicate task is the same, and only the data sources to which they correspond differ. Therefore, the task scheduling module 202 may obtain the basic information of the multiple clusters from the cluster management module, determine the plurality of duplicate tasks corresponding to the subtask, and, for each duplicate task, determine from the multiple clusters, according to the basic information of the multiple clusters, a cluster that meets the scheduling condition of the duplicate task as the scheduling cluster of the duplicate task. Correspondingly, when executing a duplicate task of the subtask, the task execution module 203 may determine the resource information of the scheduling cluster of the duplicate task in the multiple clusters, determine, according to the determined resource information and the preset load condition, the scheduling cluster for executing the duplicate task from the scheduling clusters meeting the load condition as the designated cluster of the duplicate task, and send the duplicate task to the designated cluster so that the designated cluster executes it.
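As an illustration of the multi-instance case, a sketch follows; the assumption that duplicate tasks differ only by a `data_source` field is made for the example.

```python
import random


def schedule_duplicates(subtask: dict, clusters: dict, condition: dict) -> dict:
    """Assign each duplicate task of a multi-instance subtask its own scheduling cluster."""
    assignment = {}
    for source in subtask["data_sources"]:
        qualified = [n for n, info in clusters.items()
                     if all(info.get(k, 0) >= v for k, v in condition.items())]
        name = f'{subtask["name"]}-{source}'
        assignment[name] = random.choice(qualified) if qualified else None
    return assignment


clusters = {"cluster-a": {"cpu_cores": 64}, "cluster-b": {"cpu_cores": 32}}
print(schedule_duplicates({"name": "ingest", "data_sources": ["s1", "s2"]},
                          clusters, {"cpu_cores": 16}))
# e.g. {'ingest-s1': 'cluster-a', 'ingest-s2': 'cluster-b'}
```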
In order to reduce the storage space occupied on each cluster and increase the execution speed of tasks, the above cloud native workflow engine implementation system further includes a task monitoring module 205 and a resource destructing module 206. The task monitoring module 205 is configured to monitor the execution of each subtask in the subtask queue. The resource destructing module 206 is configured to, for each monitored subtask, send a resource release instruction to the designated cluster of the subtask when the task monitoring module 205 monitors that execution of the subtask is completed, so that the designated cluster deletes the files required for executing the subtask. The files required for executing the subtask include the files acquired when the subtask is executed and the files generated in the process of executing the subtask.
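A sketch of the interaction between monitoring and resource destruction; `send_release_instruction` is a stand-in for however the engine actually notifies the designated cluster.

```python
def release_completed_subtasks(executions: dict, send_release_instruction) -> list:
    """executions: subtask name -> {'cluster': ..., 'status': ...}.
    For every finished subtask, ask its designated cluster to delete the files
    fetched for, and produced by, that subtask."""
    released = []
    for name, info in executions.items():
        if info["status"] == "succeeded":
            send_release_instruction(cluster=info["cluster"], subtask=name)
            released.append(name)
    return released


log = []
release_completed_subtasks(
    {"train": {"cluster": "cluster-a", "status": "succeeded"},
     "evaluate": {"cluster": "cluster-b", "status": "running"}},
    send_release_instruction=lambda cluster, subtask: log.append((cluster, subtask)))
print(log)   # [('cluster-a', 'train')]
```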
In the above cloud native workflow engine implementation system, the injection module 200 may be configured to verify the format of the task file according to a prestored declaration format and, when the verification passes, determine the scheduling diagram of the task to be executed according to the task file. Specifically, the injection module 200 may verify whether the format of the task file of the task to be executed conforms to the prestored declaration format; when the verification passes, the scheduling diagram of the task to be executed is determined according to the task file, and when the verification fails, the injection module 200 may send a prompt message to the user to indicate that the task file contains an error.
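A toy version of the declaration-format check, reusing the hypothetical task-file layout assumed in the earlier examples:

```python
def validate_task_file(task_file: dict) -> list:
    """Return a list of problems; an empty list means the format check passes."""
    problems = []
    tasks = task_file.get("tasks")
    if not isinstance(tasks, list):
        return ["missing or non-list 'tasks' field"]
    all_names = {t.get("name") for t in tasks}
    seen = set()
    for i, sub in enumerate(tasks):
        name = sub.get("name")
        if name is None:
            problems.append(f"task #{i} has no 'name'")
        elif name in seen:
            problems.append(f"duplicate task name {name!r}")
        else:
            seen.add(name)
        for dep in sub.get("depends_on", []):
            if dep not in all_names:
                problems.append(f"unknown dependency {dep!r}")
    return problems


print(validate_task_file({"tasks": [{"name": "a"}, {"name": "b", "depends_on": ["a"]}]}))  # []
```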
In this specification, a user can view the execution state of a task to be executed, pause the task, run the task, and end the task. Therefore, the cloud native workflow engine implementation system further includes a task management module 207, which is configured to determine an operation instruction for the task to be executed in response to a click operation of the user and to operate on the task to be executed according to the operation instruction. The operation instruction may be a query instruction, a pause instruction, a run instruction, or an end instruction.
The task management module 207 mainly communicates with the task execution module 203 to obtain the execution state of the task to be executed and to operate on the task to be executed. When the operation instruction is a pause instruction, the task management module 207 sends the pause instruction to the task execution module 203 so that the task execution module 203 pauses the execution of the subtasks of the task to be executed. Other operation instructions are handled similarly to the pause instruction and are not described in detail here. Of course, the task management module 207 may also communicate with other modules in the system and send operation instructions to them to operate on the task to be executed, which is not specifically limited in this specification.
In the above cloud native workflow engine implementation system, an agent is deployed in each cluster and is configured to collect information about the cluster in which it is located, and the cluster management module 204 may communicate with the agent of each cluster to obtain the information of that cluster. In addition, each cluster runs k8s (Kubernetes) for scheduling the tasks received by the cluster: when a cluster receives a subtask, the subtask is scheduled by the cluster's k8s to be executed by a node in the cluster. The execution status of the subtask can be collected by the cluster's agent and sent to the cluster management module 204, and the task monitoring module 205 may monitor the execution of the subtask by communicating with the cluster management module 204.
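The agent-to-cluster-management reporting could be as simple as the sketch below; the report fields and the in-memory store are assumptions, and a real agent would obtain the numbers from its own cluster's Kubernetes API.

```python
import time


class ClusterManagementStore:
    """Minimal stand-in for the cluster management module's information store."""
    def __init__(self):
        self.info = {}

    def report(self, cluster: str, payload: dict):
        payload["reported_at"] = time.time()
        self.info[cluster] = payload


def agent_report(store: ClusterManagementStore, cluster_name: str):
    # In a real agent these numbers would come from the cluster's k8s API.
    payload = {"cpu_free": 12, "memory_free_gb": 48, "running_pods": 310}
    store.report(cluster_name, payload)


store = ClusterManagementStore()
agent_report(store, "cluster-a")
print(store.info["cluster-a"]["cpu_free"])   # 12
```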
The present specification also provides a schematic diagram of a cloud native workflow engine implementation system, as shown in fig. 6; fig. 6 is a schematic diagram of another cloud native workflow engine implementation system provided in the present specification. This cloud native workflow engine implementation system further includes the cluster management module 204, the task monitoring module 205, the resource destructing module 206, and the task management module 207; fig. 6 only shows the communication between the task management module 207 and the task execution module 203, and the communication involving the task monitoring module 205 and the resource destructing module 206.
The present specification also provides a computer readable storage medium storing a computer program operable to perform a method of implementing a cloud native workflow engine as provided in fig. 1 above.
The present specification also provides a schematic structural diagram of an electronic device corresponding to fig. 1, as shown in fig. 7. At the hardware level, as shown in fig. 7, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, so as to implement the cloud native workflow engine implementation method described in fig. 1.
Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present description, that is, the execution subject of the following processing flows is not limited to each logic unit, but may be hardware or logic devices.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (such as a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer "integrates" a digital system onto a PLD by programming, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually manufacturing integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code to be compiled must also be written in a specific programming language called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logic method flow can easily be obtained simply by slightly logically programming the method flow into an integrated circuit using one of the above hardware description languages.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller implements the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can also be regarded as structures within the hardware component. Or even the means for implementing various functions can be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (23)

1. A method for implementing a cloud native workflow engine, wherein the method is applied to multiple clusters, the method comprising:
responding to a task file of a task to be executed, which is input by a user, and determining a scheduling diagram of the task to be executed according to the task file;
determining a subtask queue of the task to be executed according to the scheduling diagram;
determining a scheduling cluster of each subtask in the subtask queue from the multi-cluster aiming at the subtask;
when the subtask is executed, determining the resource information of a scheduling cluster of the subtask in the multi-cluster, and determining the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition, wherein the scheduling cluster is used as a designated cluster of the subtask;
transmitting the subtask to the appointed cluster so that the appointed cluster executes the subtask;
and when the execution of each subtask of the task to be executed is finished, determining the execution result of the task to be executed.
2. The method according to claim 1, wherein determining the scheduling cluster of the subtask from the multiple clusters, in particular comprises:
obtaining basic information of each cluster, wherein the basic information at least comprises performance information;
according to the basic information of the multi-cluster, determining a candidate set consisting of clusters meeting the scheduling conditions of the subtasks from the multi-cluster;
and determining a ranking of the performance information of each cluster according to the basic information of each cluster in the candidate set, and selecting the scheduling cluster of the subtask according to the ranking.
3. The method according to claim 2, wherein determining, from among the scheduling clusters satisfying the load condition, the scheduling cluster performing the subtask as the designated cluster of the subtask according to the determined resource information and the preset load condition, specifically includes:
judging whether the scheduling cluster of the subtask meets a preset load condition according to the determined resource information;
if yes, determining the scheduling cluster of the subtask as the appointed cluster of the subtask;
if not, deleting the scheduling cluster of the subtask from the candidate set of the subtask, re-determining the scheduling cluster of the subtask from the deleted candidate set of the subtask, acquiring the resource information of the re-determined scheduling cluster, and continuously judging whether the re-determined scheduling cluster meets the load condition or not until the appointed cluster of the subtask is determined.
4. A method as claimed in claim 3, wherein the method further comprises:
when the candidate set of the subtask is an empty set, monitoring resource information of each cluster meeting the scheduling condition of the subtask;
and in response to the monitored resource information of any cluster meeting the load condition, taking the cluster meeting the load condition as the designated cluster of the subtask.
5. The method of claim 1, wherein determining the schedule of the task to be performed based on the task file, specifically comprises:
determining each subtask corresponding to the task to be executed and the dependency relationship among the subtasks according to the task file;
and determining a scheduling diagram of the task to be executed according to the dependency relationship among the subtasks.
6. The method of claim 1, wherein determining the subtask queue of the task to be performed according to the schedule map specifically comprises:
and decomposing the scheduling diagram according to the topological structure of the scheduling diagram, and determining the subtask queue of the task to be executed.
7. The method according to claim 1, wherein determining the scheduling cluster of the subtask from the multiple clusters, in particular comprises:
obtaining basic information of each cluster, wherein the basic information at least comprises performance information;
and determining a cluster meeting the scheduling condition of the subtask from the multi-cluster according to the basic information of the multi-cluster, and taking the cluster as the scheduling cluster of the subtask.
8. The method according to claim 1, wherein determining the scheduling cluster of the subtask from the multiple clusters, in particular comprises:
obtaining basic information of each cluster;
determining a plurality of duplicate tasks corresponding to the subtasks, and determining a cluster meeting the scheduling conditions of the duplicate tasks from the multi-clusters according to the basic information of the multi-clusters for each duplicate task to serve as a scheduling cluster of the duplicate task;
when the subtask is executed, determining the resource information of a scheduling cluster of the subtask in the multi-cluster, and determining the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition, wherein the scheduling cluster is used as a designated cluster of the subtask and specifically comprises the following steps:
when the duplicate task of the subtask is executed, determining the resource information of a scheduling cluster of the duplicate task in the multi-cluster, and determining the scheduling cluster for executing the duplicate task from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition as a designated cluster of the duplicate task;
sending the subtask to the appointed cluster so that the appointed cluster executes the subtask specifically comprises:
and sending the duplicate task to the appointed cluster so that the appointed cluster executes the duplicate task.
9. The method of claim 1, wherein after sending the subtask to the designated cluster, the method further comprises:
monitoring the execution condition of the subtask;
and when the completion of the execution of the subtask is monitored, sending a resource release instruction to the appointed cluster so as to enable the appointed cluster to delete files required for executing the subtask.
10. The method of claim 1, wherein prior to determining the schedule of tasks to be performed, the method further comprises:
and determining the format of the task file to be a prestored declaration format.
11. A cloud native workflow engine implementation system, the system comprising an injection module, an input module, a task scheduling module, and a task execution module, the system being applied to multiple clusters, wherein:
the injection module is used for responding to a task file of a task to be executed, which is input by a user, and determining a scheduling diagram of the task to be executed according to the task file;
The input module is used for determining a subtask queue of the task to be executed according to the scheduling diagram;
the task scheduling module is used for determining a scheduling cluster of each subtask of the subtask queue of the task to be executed, which is determined by the input module, from the multi-cluster, and sending the determined scheduling cluster of the subtask to the task executing module;
the task execution module is used for determining the resource information of the scheduling cluster of the subtask in the multi-cluster when receiving the scheduling cluster of the subtask sent by the task scheduling module, determining the scheduling cluster for executing the subtask from the scheduling clusters meeting the load condition according to the determined resource information and the preset load condition, and sending the subtask to the appointed cluster as the appointed cluster of the subtask so as to enable the appointed cluster to execute the subtask, and determining the execution result of the to-be-executed task when the execution of each subtask of the to-be-executed task is finished.
12. The system of claim 11, wherein the system further comprises a cluster management module;
The cluster management module is used for acquiring and storing the information of the multiple clusters, wherein the information comprises resource information and basic information;
the task execution module is specifically configured to, when receiving the scheduling cluster of the subtask sent by the task scheduling module, acquire resource information of the multiple clusters from the cluster management module, and determine resource information of the scheduling cluster of the subtask in the multiple clusters.
13. The system of claim 11, wherein the task scheduling module is specifically configured to obtain basic information of each cluster, determine, from the multiple clusters according to the basic information of each cluster, a candidate set composed of the clusters satisfying a scheduling condition of the subtask, determine a ranking of performance information of each cluster according to the basic information of each cluster in the candidate set, and select a scheduling cluster of the subtask according to the ranking, where the basic information at least includes the performance information.
14. The system of claim 13, wherein the task execution module is specifically configured to determine whether the scheduling cluster of the subtask satisfies a preset load condition according to the determined resource information, if yes, determine that the scheduling cluster of the subtask is a designated cluster of the subtask, if not, delete the scheduling cluster of the subtask from the candidate set of the subtask, redetermine the scheduling cluster of the subtask from the deleted candidate set of the subtask, obtain the resource information of the redetermined scheduling cluster, and continue to determine whether the redetermined scheduling cluster satisfies the load condition until the designated cluster of the subtask is determined.
15. The system of claim 14, wherein the task execution module is further configured to monitor resource information of each cluster that satisfies the scheduling condition of the subtask when the candidate set of the subtask is an empty set, and to take the cluster that satisfies the load condition as the designated cluster of the subtask in response to the monitored resource information of any cluster satisfying the load condition.
16. The system of claim 11, wherein the injection module is specifically configured to determine, according to the task file, each subtask corresponding to the task to be executed and a dependency relationship between the subtasks, and determine, according to the dependency relationship between the subtasks, a schedule of the task to be executed.
17. The system of claim 11, wherein the input module is specifically configured to decompose the schedule according to a topology of the schedule, and determine a subtask queue of the task to be performed.
18. The system of claim 12, wherein the task scheduling module is specifically configured to obtain the basic information of the multiple clusters from the cluster management module, and determine a cluster that meets a scheduling condition of the subtask as the scheduling cluster of the subtask.
19. The system of claim 12, wherein the task scheduling module is specifically configured to obtain basic information of the multiple clusters from the cluster management module, determine a plurality of duplicate tasks corresponding to the subtasks, and for each duplicate task, determine, according to the basic information of the multiple clusters, a cluster that meets a scheduling condition of the duplicate task from the multiple clusters, as a scheduling cluster of the duplicate task;
the task execution module is specifically configured to determine, when the duplicate task of the subtask is executed, resource information of a scheduling cluster of the duplicate task in the multiple clusters, and determine, according to the determined resource information and a preset load condition, the scheduling cluster for executing the duplicate task from the scheduling clusters meeting the load condition, as a designated cluster of the duplicate task, and send the duplicate task to the designated cluster, so that the designated cluster executes the duplicate task.
20. The system of claim 11, wherein the system further comprises a task monitoring module and a resource destructor module;
the task monitoring module is used for monitoring the execution condition of each subtask in the subtask queue;
The resource destructing module is used for sending a resource releasing instruction to the appointed cluster of the subtask when the task monitoring module monitors that the execution of the subtask is completed aiming at each monitored subtask, so that the appointed cluster of the subtask deletes files required by the execution of the subtask.
21. The system of claim 11, wherein the injection module is specifically configured to verify the format of the task file according to a pre-stored declaration format, and when the verification is passed, determine the schedule of the task to be performed according to the task file.
22. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-10.
23. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-10 when executing the program.
CN202310598352.0A 2023-05-25 2023-05-25 Cloud primary workflow engine implementation method, system, medium and electronic equipment Pending CN116302457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310598352.0A CN116302457A (en) 2023-05-25 2023-05-25 Cloud primary workflow engine implementation method, system, medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310598352.0A CN116302457A (en) 2023-05-25 2023-05-25 Cloud primary workflow engine implementation method, system, medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN116302457A true CN116302457A (en) 2023-06-23

Family

ID=86785503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310598352.0A Pending CN116302457A (en) 2023-05-25 2023-05-25 Cloud primary workflow engine implementation method, system, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN116302457A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548262A (en) * 2015-09-21 2017-03-29 阿里巴巴集团控股有限公司 For the dispatching method of the resource of process task, device and system
US20180144251A1 (en) * 2016-11-23 2018-05-24 Institute For Information Industry Server and cloud computing resource optimization method thereof for cloud big data computing architecture
CN109634748A (en) * 2018-12-12 2019-04-16 深圳前海微众银行股份有限公司 Cluster resource dispatching method, device, equipment and computer readable storage medium
CN114237858A (en) * 2022-02-22 2022-03-25 北京云歌科技有限责任公司 Task scheduling method and system based on multi-cluster network
CN115982230A (en) * 2022-12-09 2023-04-18 中国平安财产保险股份有限公司 Cross-data-source query method, system, equipment and storage medium of database

Similar Documents

Publication Publication Date Title
US8763012B2 (en) Scalable, parallel processing of messages while enforcing custom sequencing criteria
CN107391279B (en) Message queue container creation method and device and message queue container
CN110597614B (en) Resource adjustment method and device
CN116126365B (en) Model deployment method, system, storage medium and electronic equipment
CN110427258B (en) Resource scheduling control method and device based on cloud platform
CN117075930B (en) Computing framework management system
CN110888736A (en) Application management method and system based on container cloud platform and related components
CN111767009A (en) Disk cleaning method, device and equipment
JP2021121921A (en) Method and apparatus for management of artificial intelligence development platform, and medium
CN116483859A (en) Data query method and device
CN111782409B (en) Task processing method, device and electronic equipment, and risk identification task processing method and device
CN113032119A (en) Task scheduling method and device, storage medium and electronic equipment
WO2023231342A1 (en) Method and apparatus for automatically executing contract on the basis of variable state
CN116737345A (en) Distributed task processing system, distributed task processing method, distributed task processing device, storage medium and storage device
CN116594752A (en) Flow scheduling method, device, equipment, medium and program product
CN116302457A (en) Cloud primary workflow engine implementation method, system, medium and electronic equipment
CN110032433B (en) Task execution method, device, equipment and medium
CN117421129B (en) Service execution method and device based on heterogeneous storage cluster and electronic equipment
CN116501474B (en) System, method and device for processing batch homogeneous tasks
CN116382877B (en) Task execution method and device, storage medium and electronic equipment
CN111966479B (en) Service processing and risk identification service processing method and device and electronic equipment
CN110908792A (en) Data processing method and device
CN117041980B (en) Network element management method and device, storage medium and electronic equipment
CN114443255A (en) Thread calling method and device
CN116089434B (en) Data storage method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination