CN113225269B - Container-based workflow scheduling method, device and system and storage medium

Info

Publication number: CN113225269B
Application number: CN202110417260.9A
Authority: CN (China)
Prior art keywords: workflow, target, scheduler, scheduling, container
Legal status: Active (application granted)
Other languages: Chinese (zh)
Other versions: CN113225269A
Inventors: 李宗哲, 符永铨, 韩伟红, 黄珺, 向文丽, 余宜诚, 吉青利
Current Assignee: Peng Cheng Laboratory
Original Assignee: Peng Cheng Laboratory
Application filed by Peng Cheng Laboratory; priority to CN202110417260.9A.

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/58 Changing or combining different scheduling modes, e.g. multimode scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management

Abstract

The invention discloses a container-based workflow scheduling method, device, system and storage medium, applied to a container-based workflow scheduling system that comprises a primary scheduler and a secondary scheduler. The method comprises the following steps: when a workflow execution request is received, determining a target workflow corresponding to the workflow execution request, and distributing the target workflow to a preset workflow queue; monitoring the preset workflow queue through the primary scheduler to obtain the target workflow from the queue, and determining a computing resource node corresponding to the target workflow through the primary scheduler; and starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler, and scheduling the target workflow through the target secondary scheduler. The invention adopts a multi-level scheduling architecture to realize distributed scheduling on different computing resource nodes, thereby achieving extensible scheduling and improving the scheduling performance of the system.

Description

Container-based workflow scheduling method, device and system and storage medium
Technical Field
The present invention relates to the field of network communication technologies, and in particular, to a method, an apparatus, a system, and a storage medium for container-based workflow scheduling.
Background
A workflow (Work Flow) is the automation of part or all of a business process in a computer environment; a workflow model is a computational model of that process, giving an abstract and general description of the workflow and of the business rules among its operation steps. With the development of information technology, container technology has become one of the core technologies of new-generation cloud computing, and the concept of workflow management provides a new business view for the development of information systems, so how to schedule workflows based on containers is of great significance to the deployment, operation and maintenance of workflow systems.
Traditional workflow scheduling methods focus on scheduling on a single computing resource and offer low scheduling performance; with the growth of data scale and the increasing complexity of workflows, scheduling on a single computing resource can no longer meet current workflow management requirements.
Disclosure of Invention
The invention mainly aims to provide a container-based workflow scheduling method, device, system and storage medium, aiming to improve the extensibility of a workflow scheduling system.
In order to achieve the above object, the present invention provides a method for scheduling a container-based workflow, which is applied to a container-based workflow scheduling system, wherein the container-based workflow scheduling system includes a primary scheduler and a secondary scheduler, and the method includes the following steps:
when a workflow execution request is received, determining a target workflow corresponding to the workflow execution request, and distributing the target workflow to a preset workflow queue;
monitoring the preset workflow queue through the primary scheduler to obtain the target workflow from the preset workflow queue, and determining a computing resource node corresponding to the target workflow through the primary scheduler;
and starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler, and scheduling the target workflow through the target secondary scheduler.
Preferably, the step of determining, by the primary scheduler, a computing resource node corresponding to the target workflow comprises:
determining an initial computing resource node corresponding to the target workflow according to the workflow execution request;
and acquiring the load condition corresponding to the initial computing resource node, and determining a target computing resource node corresponding to the target workflow according to the load condition through the primary scheduler.
Preferably, the step of determining, by the primary scheduler, a target computing resource node corresponding to the target workflow according to the load condition includes:
when the load condition is in a preset load range, determining the initial computing resource node as a target computing resource node corresponding to the target workflow through the primary scheduler;
when the load condition is not in a preset load range, detecting the node load condition of other computing resource nodes through the primary scheduler, and determining a target computing resource node corresponding to the target workflow from the other computing resource nodes of which the node load condition is in the preset load range.
Preferably, the step of scheduling the target workflow by the target secondary scheduler comprises:
analyzing the target workflow through the target secondary scheduler to determine a task node corresponding to the target workflow;
and performing concurrent execution processing on the task nodes through the target secondary scheduler so as to schedule the target workflow.
Preferably, after the step of performing concurrent execution processing on the task node by the target secondary scheduler to schedule the target workflow, the method further includes:
receiving an execution processing result returned by each task node through the target secondary scheduler;
and determining whether the target workflow is scheduled to be completed or not according to the execution processing result.
Preferably, the step of monitoring the preset workflow queue through the primary scheduler to obtain the target workflow from the preset workflow queue, and determining the computing resource node corresponding to the target workflow through the primary scheduler includes:
monitoring the preset workflow queue through the primary scheduler, and actively pulling the target workflow from the preset workflow queue through the primary scheduler to acquire workflow information in the target workflow;
and determining a computing resource node corresponding to the target workflow according to the workflow information through the primary scheduler.
Preferably, before the step of determining the target workflow corresponding to the workflow execution request, the method further includes:
determining a workflow to be executed, and describing the workflow to be executed according to a preset description mode to obtain a corresponding description model;
a workflow execution request is received based on the description model.
In addition, to achieve the above object, the present invention further provides a container-based workflow scheduling apparatus applied to a container-based workflow scheduling system, where the container-based workflow scheduling system includes a primary scheduler and a secondary scheduler, and the container-based workflow scheduling apparatus includes:
the system comprises a receiving and determining module, a processing module and a processing module, wherein the receiving and determining module is used for determining a target workflow corresponding to a workflow execution request when the workflow execution request is received, and distributing the target workflow to a preset workflow queue;
the monitoring determining module is used for monitoring the preset workflow queue through the primary scheduler to acquire the target workflow from the preset workflow queue and determining a computing resource node corresponding to the target workflow through the primary scheduler;
and the starting scheduling module is used for starting the target secondary scheduler corresponding to the computing resource node through the primary scheduler and scheduling the target workflow through the target secondary scheduler.
Preferably, the listening determining module further comprises a node determining unit, and the node determining unit is configured to:
determining an initial computing resource node corresponding to the target workflow according to the workflow execution request;
and acquiring the load condition corresponding to the initial computing resource node, and determining a target computing resource node corresponding to the target workflow according to the load condition through the primary scheduler.
Preferably, the node determination unit is further configured to:
when the load condition is within a preset load range, determining the initial computing resource node as a target computing resource node corresponding to the target workflow through the primary scheduler;
when the load condition is not in a preset load range, detecting the node load condition of other computing resource nodes through the primary scheduler, and determining a target computing resource node corresponding to the target workflow from the other computing resource nodes of which the node load condition is in the preset load range.
Preferably, the start scheduling module is further configured to:
analyzing the target workflow through the target secondary scheduler to determine a task node corresponding to the target workflow;
and performing concurrent execution processing on the task nodes through the target secondary scheduler so as to schedule the target workflow.
Preferably, the container-based workflow scheduling apparatus further comprises an execution feedback unit, and the execution feedback unit is configured to:
receiving an execution processing result returned by each task node through the target secondary scheduler;
and determining whether the target workflow is scheduled to be finished or not according to the execution processing result.
Preferably, the listening determination module is further configured to:
monitoring the preset workflow queue through the primary scheduler, and actively pulling the target workflow from the preset workflow queue through the primary scheduler to acquire workflow information in the target workflow;
and determining a computing resource node corresponding to the target workflow according to the workflow information through the primary scheduler.
Preferably, the container-based workflow scheduling apparatus further includes a description triggering unit, and the description triggering unit is configured to:
determining a workflow to be executed, and describing the workflow to be executed according to a preset description mode to obtain a corresponding description model;
a workflow execution request is received based on the description model.
In addition, to achieve the above object, the present invention further provides a container-based workflow scheduling system, including: a memory, a processor, and a container-based workflow scheduler stored on the memory and executable on the processor, the container-based workflow scheduler when executed by the processor implementing the steps of the container-based workflow scheduling method as described above.
In addition, to achieve the above object, the present invention further provides a storage medium having a container-based workflow scheduler stored thereon, wherein the container-based workflow scheduler implements the steps of the container-based workflow scheduling method when being executed by a processor.
The invention provides a container-based workflow scheduling method, which is applied to a container-based workflow scheduling system, wherein the system comprises a primary scheduler and a secondary scheduler, and the method comprises the following steps: when a workflow execution request is received, determining a target workflow corresponding to the workflow execution request, and distributing the target workflow to a preset workflow queue; monitoring a preset workflow queue through a primary scheduler to obtain a target workflow from the preset workflow queue, and determining a computing resource node corresponding to the target workflow through the primary scheduler; and starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler, and scheduling the target workflow through the target secondary scheduler. The invention adopts a multi-stage scheduling framework to realize distributed scheduling on different computing resource nodes, thereby achieving the purpose of extensible scheduling and improving the scheduling performance of the system.
Drawings
FIG. 1 is a system diagram of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a container-based workflow scheduling method according to the present invention;
FIG. 3 is a diagram of a multi-level scheduling architecture of a preferred embodiment of the container-based workflow scheduling method of the present invention;
FIG. 4 is a functional block diagram of a container-based workflow scheduling method according to a preferred embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a system structural diagram of a hardware operating environment according to an embodiment of the present invention.
The system of the embodiment of the invention comprises a primary scheduler, a secondary scheduler and the like.
As shown in fig. 1, the system may include: a processor 1001 such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the system architecture shown in FIG. 1 is not intended to be limiting of the system, and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a type of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a container-based workflow scheduler.
The operating system is a program for managing and controlling the container-based workflow scheduling system and software resources, and supports the operation of the network communication module, the user interface module, the container-based workflow scheduling program and other programs or software; the network communication module is used for managing and controlling the network interface 1004; the user interface module is used to manage and control the user interface 1003.
In the container-based workflow scheduling system shown in fig. 1, the container-based workflow scheduling system calls a container-based workflow scheduling program stored in a memory 1005 through a processor 1001 and performs operations in various embodiments of a container-based workflow scheduling method described below.
Based on the hardware structure, the embodiment of the workflow scheduling method based on the container is provided.
Referring to fig. 2, fig. 2 is a schematic flowchart of a first embodiment of a container-based workflow scheduling method according to the present invention, where the method includes:
step S10, when a workflow execution request is received, determining a target workflow corresponding to the workflow execution request, and distributing the target workflow to a preset workflow queue;
the container-based workflow scheduling method is applied to container-based workflow scheduling systems of large enterprises and groups. For convenience of description, the container-based workflow scheduling system is referred to as a scheduling system for short. In this embodiment, the scheduling system includes a primary scheduler and a secondary scheduler, where the primary scheduler is responsible for performing distribution processing on a workflow, and when a workflow execution request is received, the workflow is distributed according to the workflow request to determine an operating environment corresponding to the workflow; the secondary scheduler is responsible for analyzing the workflow, then executing the logic scheduling of the workflow, and scheduling the node tasks of the workflow.
With the development of information technology, a container technology has become one of core technologies of new generation cloud computing, and a workflow management concept also provides a new business view for the development of an information system, so how to schedule workflows based on containers is of great significance to the deployment, operation and maintenance of the workflow system. The traditional workflow scheduling method focuses on scheduling on a single computing resource, has low scheduling performance, and cannot meet the current workflow management requirement by the scheduling method on the single computing resource along with the increase of data scale and the complication of workflow.
In this embodiment, since the workflow execution request carries the target workflow to be executed and the corresponding workflow information, when the scheduling system receives the workflow execution request, the target workflow to be executed can be determined by reading the request, and the target workflow is then allocated to the preset workflow queue. Before the target workflow is allocated to the preset workflow queue, the queue may be established using a preset middleware technology, preferably an efficient middleware such as Redis (Remote Dictionary Server), so that the workflow queue executes target workflows according to a first-in-first-out rule. A preset workflow queue built on Redis supports several queue models, including the producer-consumer model and the publisher-subscriber model. In the producer-consumer model, producers generate workflows and distribute them to the queue while consumers schedule and process them: multiple consumers listen on the same workflow queue, whichever consumer preempts a message first takes it out of the queue, and if the queue contains no message the consumers keep listening, so this model is generally used to handle high-concurrency situations. With the message publish-subscribe mechanism, workflows can be processed asynchronously and efficiently. Using multiple queue modes satisfies the different scenario requirements of workflows, and combining an efficient middleware technology with queue modes such as producer-consumer enables highly efficient scheduling.
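As an illustrative sketch only (not part of the patent text), the producer-consumer queue described above could be realized with the redis-py client roughly as follows; the queue name, payload layout and handler are assumptions.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
QUEUE = "workflow:queue:default"   # hypothetical queue name, e.g. one queue per operating environment

def produce(workflow: dict) -> None:
    """Producer side: push a workflow description onto the preset queue (FIFO)."""
    r.rpush(QUEUE, json.dumps(workflow))

def handle(workflow: dict) -> None:
    """Placeholder for the scheduling logic that processes a pulled workflow."""
    print("scheduling workflow", workflow.get("id"))

def consume_forever() -> None:
    """Consumer side: several schedulers may run this loop and compete for messages.
    BLPOP blocks while the queue is empty, so idle consumers simply keep listening."""
    while True:
        _, payload = r.blpop(QUEUE)    # whichever consumer pops first takes the message
        handle(json.loads(payload))
```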
In addition, when the target workflows are distributed to the preset workflow queues, the target queue corresponding to each target workflow can be determined by acquiring the workflow operating environment ID corresponding to that workflow, and the target workflow is then written into the corresponding target queue. The workflow operating environment ID is the unique identifier of the workflow's operating environment and can be obtained through an active-mode strategy or a passive adaptive strategy. With the active-mode strategy, the operating environment ID is obtained when the user actively selects one of the workflow queues while executing the target workflow through the client; with the passive adaptive strategy, the scheduling system monitors the congestion of each preset workflow queue and allocates adaptively to determine the operating environment ID. After the target queue corresponding to each target workflow is determined, the unique run identifier (ID) of the target workflow is acquired, and a key is generated for each target workflow from the run ID and the corresponding operating environment; the workflow information of the target workflow is then used as the value, and the key and value of each target workflow are written into the preset workflow queue as a key-value pair, which effectively avoids collisions during workflow execution.
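A minimal sketch of the key-value write described above, under the assumption that the key simply concatenates the operating environment ID with the unique run ID; the Redis key names are illustrative, not prescribed by the patent.

```python
import json
import uuid
import redis

r = redis.Redis()

def enqueue_workflow(env_id: str, workflow_info: dict) -> str:
    """Write one target workflow into the queue of its operating environment as a key-value pair."""
    run_id = str(uuid.uuid4())                    # unique run identifier of this execution
    key = f"{env_id}:{run_id}"                    # assumed key layout: <operating environment>:<run id>
    r.hset(f"workflow:info:{env_id}", key, json.dumps(workflow_info))  # value = workflow information
    r.rpush(f"workflow:queue:{env_id}", key)      # enqueue the key for FIFO consumption
    return key
```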
Step S20, monitoring the preset workflow queue through the primary scheduler to obtain the target workflow from the preset workflow queue, and determining a computing resource node corresponding to the target workflow through the primary scheduler;
In this embodiment, different workflow queues are monitored in real time by the primary schedulers, and the workflow information of each target workflow can be read from the workflow queue by using the key generated for that workflow. Once a primary scheduler acquires a target workflow from the workflow queue, it distributes the acquired target workflow to determine the computing resource node on which the secondary scheduler will run. The computing resource node corresponding to the target workflow is the workflow's operating environment; determining the computing resource node therefore means deciding which computing resource node the workflow will run on. For example, if it is determined that the target workflow needs to run on virtual machine 1, then virtual machine 1 is the computing resource node corresponding to the target workflow.
Further, step S20 further includes:
step a1, monitoring the preset workflow queue through the primary scheduler, and actively pulling the target workflow from the preset workflow queue through the primary scheduler to acquire workflow information in the target workflow;
and a2, determining a computing resource node corresponding to the target workflow according to the workflow information through the primary scheduler.
In this embodiment, there is at least one primary scheduler. If there are multiple primary schedulers, the primary schedulers distributed on different computing resource nodes monitor different preset workflow queues in real time; which queues a primary scheduler monitors is configured by the user at the front end and can be expanded by actively or passively adding queues, and automatic expansion can also be carried out by monitoring the congestion degree of the preset workflow queues. Each primary scheduler then actively pulls workflow information from the preset workflow queue by competitively consuming the queue, and the primary scheduler that pulls a piece of workflow information determines the corresponding computing resource node according to that information. The competitive-consumption approach gives the primary-scheduler side an adaptive capability, reduces the complexity of the algorithm's configuration parameters, and makes the scheduling structure easier to extend.
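The congestion-based expansion mentioned above might, for example, be driven by queue length; the threshold and the scale-out action below are assumptions used purely for illustration.

```python
import redis

r = redis.Redis()
CONGESTION_THRESHOLD = 100   # assumed queue length beyond which another scheduler is wanted

def congested_queues(queue_names: list[str]) -> list[str]:
    """Return the preset workflow queues whose backlog exceeds the threshold."""
    return [q for q in queue_names if r.llen(q) > CONGESTION_THRESHOLD]

def maybe_scale_out(queue_names: list[str]) -> None:
    for q in congested_queues(queue_names):
        # A real deployment would start an additional primary scheduler instance
        # (for example another container) listening on queue q; here we only report it.
        print(f"queue {q} is congested, an additional primary scheduler should be started")
```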
And step S30, starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler, and scheduling the target workflow through the target secondary scheduler.
In this embodiment, since different tasks correspond to different application programs and different application programs correspond to different computing resource nodes, the task nodes in a designated queue are scheduled onto the computing resource nodes associated with that queue, and those computing resource nodes only run task nodes from their designated queue. Therefore, after it is detected that the primary scheduler has finished distributing the target workflow, the primary scheduler starts a secondary scheduler, namely the target secondary scheduler, on the computing resource node corresponding to the target workflow, and the target secondary scheduler then schedules the target workflow according to a configurable scheduling algorithm. The multi-level scheduling architecture, combined with distributed techniques, realizes an extensible scheduling function.
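The division of labour between the two scheduler tiers can be pictured with the following simplified sketch; the class names and the way a secondary scheduler is started on a node are assumptions, since the patent leaves the concrete mechanism (e.g. launching a container on the node) to the implementation.

```python
from dataclasses import dataclass

@dataclass
class ComputeNode:
    name: str       # e.g. "virtual-machine-1"
    address: str    # where a secondary scheduler can be started or reached

class SecondaryScheduler:
    """Parses the workflow and schedules its task nodes on its own computing resource node."""
    def __init__(self, node: ComputeNode) -> None:
        self.node = node

    def schedule(self, workflow: dict) -> None:
        print(f"scheduling workflow {workflow.get('id')} on {self.node.name}")

class PrimaryScheduler:
    """Distributes workflows: picks the node and starts the target secondary scheduler there."""
    def start_secondary_scheduler(self, node: ComputeNode) -> SecondaryScheduler:
        # Placeholder: in practice this might launch a container or pod on the chosen node.
        return SecondaryScheduler(node)

    def dispatch(self, workflow: dict, node: ComputeNode) -> None:
        secondary = self.start_secondary_scheduler(node)
        secondary.schedule(workflow)
```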
Further, the step of scheduling the target workflow by the target secondary scheduler comprises:
b1, analyzing the target workflow through the target secondary scheduler to determine a task node corresponding to the target workflow;
in this embodiment, before triggering a workflow execution request, a workflow model needs to be described through a directed acyclic graph, and workflow information stored in the model is related to a modeling language, and if a json structure is adopted to model a workflow to be executed, an element stored in a preset workflow queue is workflow information in a json format. Therefore, when the target secondary scheduler is used for scheduling the workflow, the target secondary scheduler needs to analyze the target workflow first, for example, workflow information of a json structure is analyzed into a directed acyclic graph, so as to determine a task node corresponding to the target workflow. It can be understood that in the directed acyclic graph, the circles are task nodes and represent scheduled tasks; the edge connecting the two circles is a communication edge representing communication between the scheduled tasks; the direction of the edge is represented as the sequence of task execution, so that the target secondary scheduler can analyze the target workflow and determine the task node corresponding to the target workflow.
And b2, performing concurrent execution processing on the task nodes through the target secondary scheduler to schedule the target workflow.
In this embodiment, since the task nodes in the directed acyclic graph contain the task information to be executed, the task nodes in each layer of the graph are traversed through a preset traversal algorithm, and the target secondary scheduler performs concurrent execution processing on each task node, so that the target secondary scheduler schedules the target workflow according to the task information in the target workflow, and concurrent execution of child task nodes in the same layer is realized with concurrency techniques.
Specifically, the preset traversal algorithm is preferably the breadth-first traversal algorithm (BFS). Breadth-first traversal means that, starting from an unvisited node of the graph, the node's neighbors are visited first, and then the neighbors of each of those neighbors are visited in turn. Because the value associated with each task node is its traversal order, breadth-first traversal is also called level-order traversal. When BFS traverses the task nodes, each task node is reached once along a shortest path, which helps improve the execution efficiency of the workflow. In addition, parallel processing means that a computer runs several programs, or several processes or threads of the same program, at the same time; its main purpose is to save time when solving large and complex problems. As shown in fig. 3, fig. 3 is a multi-level scheduling structure diagram of a preferred embodiment of the container-based workflow scheduling method of the present invention: the front-end APP sends a workflow execution request to the back-end APP, the back-end APP allocates workflows to different primary schedulers according to the execution environment of each workflow in the request, one primary scheduler supports concurrently starting multiple secondary schedulers, and the secondary schedulers on different computing resource nodes can in turn concurrently launch multiple APPs on the same node, so that the workflow is scheduled through multiple specific APPs. Concurrent execution at each scheduler level is realized with multi-threading or multi-process techniques, which improves the scheduling capability of the scheduling system.
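A minimal sketch of the level-order (breadth-first) traversal with concurrent execution of the task nodes in the same level; the thread-pool size and the run_task callable are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def run_dag(tasks: dict, edges: dict, run_task) -> None:
    """Level-order (BFS) execution of a DAG: task nodes in the same level run concurrently."""
    indegree = {t: 0 for t in tasks}
    for src in edges:
        for dst in edges[src]:
            indegree[dst] += 1

    level = [t for t, d in indegree.items() if d == 0]      # roots of the DAG
    with ThreadPoolExecutor(max_workers=8) as pool:
        while level:
            list(pool.map(run_task, level))                 # concurrent execution of one level
            next_level = []
            for t in level:
                for dst in edges.get(t, []):
                    indegree[dst] -= 1
                    if indegree[dst] == 0:
                        next_level.append(dst)
            level = next_level
```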
It should be noted that the "primary scheduler" and the "secondary scheduler" according to the present invention only serve to distinguish different schedulers, and the number of the primary scheduler and/or the secondary scheduler may be multiple according to task requirements.
The container-based workflow scheduling method of the embodiment is applied to a container-based workflow scheduling system, the system comprises a primary scheduler and a secondary scheduler, and the method comprises the following steps: when a workflow execution request is received, determining a target workflow corresponding to the workflow execution request, and distributing the target workflow to a preset workflow queue; monitoring a preset workflow queue through a primary scheduler to obtain a target workflow from the preset workflow queue, and determining a computing resource node corresponding to the target workflow through the primary scheduler; and starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler, and scheduling the target workflow through the target secondary scheduler. The invention adopts a multi-stage scheduling framework to realize distributed scheduling on different computing resource nodes, thereby achieving the purpose of extensible scheduling and improving the scheduling performance of the system.
Further, based on the first embodiment of the container-based workflow scheduling method of the present invention, a second embodiment of the container-based workflow scheduling method of the present invention is proposed.
The second embodiment of the container-based workflow scheduling method differs from the first embodiment of the container-based workflow scheduling method in that, before the step of determining the target workflow corresponding to the workflow execution request, the method further includes:
step c1, determining a workflow to be executed, and describing the workflow to be executed according to a preset description mode to obtain a corresponding description model;
in this embodiment, first, the application program required by the workflow is encapsulated by a container technology, and then the workflow to be executed is determined according to the various application programs required in the workflow scheduling and the precedence logic relationship called by the application programs. It will be appreciated that the encapsulation of applications required by the workflow by container technology ensures that the applications possess the necessary libraries, dependencies and files and, more importantly, that the user can migrate these applications freely in production without any negative impact. Therefore, the container technology provides guarantee for cross-platform operation of the application program. In addition, by adopting a container cluster technology, distributed operation of the application program can be realized, the purpose of load balancing is achieved while the stability of the application program is enhanced, and the convenience of large-scale container cluster management is improved, wherein the container cluster technology can be k8s (Kubernets, an open source container cluster management system), and the k8s can provide a series of complete functions of deployment operation, resource scheduling, service discovery, dynamic scaling and the like for the containerized application program on the basis of a Docker (open source application container engine) technology.
After the workflow to be executed is determined, if the preset description mode is a directed acyclic graph (DAG), the corresponding description model, namely the workflow model, is obtained. In graph theory, a directed graph is a directed acyclic graph if it is impossible to start from a vertex and return to it by following a sequence of directed edges. A workflow is composed of workflow nodes, and each workflow node may contain one or more applications; the DAG is a data structure, and describing the workflow to be executed through a DAG means storing, in that data structure, the calling relationships of the workflow's nodes and node applications, among other information. When modeling the DAG, a modeling language in json or yaml format may be used, so that the workflow information corresponding to the workflow to be executed is saved in the workflow model. Because a directed acyclic graph supports asynchronous concurrency and can form a topological tree structure, describing the workflow to be executed through a DAG greatly improves the extensibility of the workflow scheduling system.
Specifically, the description model comprises a workflow vertex set and a workflow edge set. Each vertex in the workflow vertex set is represented by an application interface description array, which includes an application interface unique identifier (ID), the application interface name, the application interface parameters and so on; the application interface unique ID is a unique string generated according to the UUID (Universally Unique Identifier) specification, the application interface name is expressed as a string, and the application interface parameters are represented by an array of key-value pairs. Each edge in the workflow edge set is composed of an edge description array, which includes an edge unique identifier (ID), a source vertex, a destination vertex, conditions and so on; the edge unique ID is a unique string generated according to the UUID specification, the source vertex and the destination vertex are both represented by application interface unique IDs, and the conditions are the set of conditions under which the edge is executed.
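To make the vertex set and edge set concrete, a hypothetical instance of such a description model could be built as follows; all field names ("vertices", "edges", "target" for the destination vertex, and so on) are illustrative choices rather than the patent's required schema.

```python
import uuid

def new_id() -> str:
    return str(uuid.uuid4())    # UUID-based unique identifier string

preprocess_id, train_id = new_id(), new_id()

description_model = {
    "vertices": [     # workflow vertex set: one application interface description array per vertex
        {"id": preprocess_id, "name": "preprocess",
         "params": [{"key": "input", "value": "/data/raw"}]},
        {"id": train_id, "name": "train",
         "params": [{"key": "epochs", "value": "10"}]},
    ],
    "edges": [        # workflow edge set: each edge is an edge description array
        {"id": new_id(),
         "source": preprocess_id,                   # source vertex (application interface ID)
         "target": train_id,                        # destination vertex (application interface ID)
         "conditions": ["preprocess_succeeded"]},   # conditions under which the edge is executed
    ],
}
```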
It should be noted that UUID is a standard for software construction; it allows every element in a distributed system to carry unique identification information without a central control end having to assign that information. In this way, each workflow can be given a unique UUID that does not conflict with other workflows, and the problem of duplicate names does not need to be considered when creating database entries.
And c2, receiving a workflow execution request based on the description model.
In this embodiment, if the workflow to be executed is described through a DAG, after the corresponding description model is obtained, the description model in DAG format may be scanned and stored in a preset database. A user can then retrieve the relevant workflow information from the preset database through a client and trigger a workflow execution request according to actual application requirements; alternatively, a start time for each workflow to be executed may be preset in the scheduling system, and when that time is reached the client automatically triggers the workflow execution request. When a workflow execution request is received, the scheduling system can schedule the workflow stably and in an orderly manner according to the request. For example, if a customer submits order information in the scheduling system, i.e. order information is generated, the order processing flow in the description model is activated, and the scheduling system performs workflow scheduling according to that order processing flow.
According to the container-based workflow scheduling method, the workflow execution request is triggered by constructing the workflow model, so that the execution efficiency of the workflow can be improved, and the stability of a scheduling system can be improved.
Further, based on the first and second embodiments of the container-based workflow scheduling method of the present invention, a third embodiment of the container-based workflow scheduling method of the present invention is proposed.
The third embodiment of the container-based workflow scheduling method differs from the first and second embodiments of the container-based workflow scheduling method in that the step of determining, by the primary scheduler, a computing resource node corresponding to the target workflow comprises:
step d1, determining an initial computing resource node corresponding to the target workflow according to the workflow execution request;
and d2, acquiring the load condition corresponding to the initial computing resource node, and determining a target computing resource node corresponding to the target workflow according to the load condition through the primary scheduler.
In this embodiment, when a user executes a target workflow through a client, the computing resource node corresponding to the workflow is selected by the user, that is, the workflow execution request contains the relevant information of the initial computing resource node corresponding to the target workflow. After the scheduling system receives the workflow execution request, the load condition on the initial computing resource node is considered first, and the primary scheduler then distributes the target workflow according to the specific load condition, thereby determining the computing resource node corresponding to the target workflow.
Further, the step of determining, by the primary scheduler, a target computing resource node corresponding to the target workflow according to the load condition includes:
step e1, when the load condition is in a preset load range, determining the initial computing resource node as a target computing resource node corresponding to the target workflow through the primary scheduler;
and e2, when the load condition is not in a preset load range, detecting the node load condition of other computing resource nodes through the primary scheduler, and determining a target computing resource node corresponding to the target workflow from the other computing resource nodes of which the node load condition is in the preset load range.
In this embodiment, in order to ensure efficient execution of the workflow, a load range on the computing resource nodes, namely the preset load range, is set in advance. When the load condition on the initial computing resource node is detected to be within the preset load range, the initial computing resource node has sufficient resources, and the primary scheduler preferentially distributes the target workflow to the initial computing resource node; when the load condition on the initial computing resource node is detected not to be within the preset load range, i.e. when the resources on the initial computing resource node are limited, the primary scheduler re-arranges the target workflow and allocates a suitable running environment for it. Specifically, the primary scheduler detects the load conditions of the other computing resource nodes, namely the node load conditions, and selects, from the other computing resource nodes whose node load condition is within the preset load range, one computing resource node as the target computing resource node corresponding to the target workflow.
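A sketch of this load check, with an assumed load metric (fraction of node capacity) and an assumed upper bound of the preset load range:

```python
from typing import Optional

MAX_LOAD = 0.8   # assumed upper bound of the preset load range (fraction of node capacity)

def pick_target_node(initial: str, loads: dict[str, float]) -> Optional[str]:
    """Prefer the user-specified initial node; otherwise fall back to another node within the load range."""
    if loads.get(initial, 1.0) <= MAX_LOAD:
        return initial
    for node, load in loads.items():
        if node != initial and load <= MAX_LOAD:
            return node
    return None   # no computing resource node is currently within the preset load range
```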
It can be understood that, since a given primary scheduler managed to preempt the workflow from the workflow queue under the competitive-consumption scheme, its load is lighter than that of the other primary schedulers, which is exactly why it was able to acquire the workflow information through contention; that primary scheduler therefore preferentially uses the initial computing resource node to process the target workflow.
In the container-based workflow scheduling method of this embodiment, the target computing resource node corresponding to the target workflow is determined according to the load condition of the initial computing resource node, and the target workflow is preferentially processed on the initial computing resource node specified by default in the workflow execution request, which improves the workflow scheduling efficiency of the scheduling system.
Further, based on the first, second and third embodiments of the container-based workflow scheduling method of the present invention, a fourth embodiment of the container-based workflow scheduling method of the present invention is proposed.
The fourth embodiment of the container-based workflow scheduling method differs from the first, second, and third embodiments of the container-based workflow scheduling method in that, after the step of concurrently executing, by the target secondary scheduler, the task node to schedule the target workflow, the method further includes:
step f1, receiving execution processing results returned by the task nodes through the target secondary scheduler;
and f2, determining whether the target workflow is scheduled completely according to the execution processing result.
In this embodiment, after each task node finishes executing, its execution processing result may be returned by calling a Restful API; the target secondary scheduler then receives the execution processing result returned by each task node and feeds it back to the user. After the execution processing results of all task nodes have been returned, the target secondary scheduler collects them and feeds back the workflow scheduling result of each level according to those results, so that whether the scheduling of the workflow to be executed is complete is determined from the workflow scheduling result. Specifically, if a result indicating successful DAG scheduling is fed back to the user, the target workflow can be mapped onto the computing resources for scheduling after the user confirms; if a result indicating DAG scheduling failure is fed back, the user can choose the next action according to the returned execution processing results, until the scheduling of the workflow is completed.
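An illustrative sketch of collecting the per-task execution processing results and deciding whether scheduling of the workflow is complete; the result fields are assumptions.

```python
def collect_results(results: list[dict]) -> dict:
    """Aggregate the execution processing results returned by the task nodes
    and report whether scheduling of the whole workflow is complete."""
    failed = [r["task_id"] for r in results if r.get("status") != "success"]
    return {
        "workflow_done": not failed,   # complete only if every task node succeeded
        "failed_tasks": failed,        # fed back to the user to choose the next action
    }
```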
In the container-based workflow scheduling method of this embodiment, after the target secondary scheduler performs concurrent execution processing on each task node, whether scheduling of the target workflow is complete is determined by receiving the execution processing result returned by each task node, which helps the user keep track of the progress of workflow scheduling in time and improves the stability of the scheduling system to a certain extent.
The invention also provides a container-based workflow scheduling device. Referring to fig. 4, the container-based workflow scheduling apparatus of the present invention is applied to a container-based workflow scheduling system including a primary scheduler and a secondary scheduler, the apparatus including:
the receiving and determining module 10 is configured to, when a workflow execution request is received, determine a target workflow corresponding to the workflow execution request, and allocate the target workflow to a preset workflow queue;
a monitoring determining module 20, configured to monitor the preset workflow queue through the primary scheduler to obtain the target workflow from the preset workflow queue, and determine a computing resource node corresponding to the target workflow through the primary scheduler;
and the starting scheduling module 30 is configured to start the target secondary scheduler corresponding to the computing resource node through the primary scheduler, and schedule the target workflow through the target secondary scheduler.
Preferably, the listening determining module further comprises a node determining unit, and the node determining unit is configured to:
determining an initial computing resource node corresponding to the target workflow according to the workflow execution request;
and acquiring the load condition corresponding to the initial computing resource node, and determining a target computing resource node corresponding to the target workflow according to the load condition through the primary scheduler.
Preferably, the node determination unit is further configured to:
when the load condition is in a preset load range, determining the initial computing resource node as a target computing resource node corresponding to the target workflow through the primary scheduler;
when the load condition is not in a preset load range, detecting the node load condition of other computing resource nodes through the primary scheduler, and determining a target computing resource node corresponding to the target workflow from the other computing resource nodes of which the node load condition is in the preset load range.
Preferably, the start scheduling module is further configured to:
analyzing the target workflow through the target secondary scheduler to determine a task node corresponding to the target workflow;
and performing concurrent execution processing on the task nodes through the target secondary scheduler so as to schedule the target workflow.
Preferably, the container-based workflow scheduling apparatus further comprises an execution feedback unit, and the execution feedback unit is configured to:
receiving an execution processing result returned by each task node through the target secondary scheduler;
and determining whether the target workflow is scheduled to be finished or not according to the execution processing result.
Preferably, the listening determination module is further configured to:
monitoring the preset workflow queue through the primary scheduler, and actively pulling the target workflow from the preset workflow queue through the primary scheduler to acquire workflow information in the target workflow;
and determining a computing resource node corresponding to the target workflow according to the workflow information through the primary scheduler.
Preferably, the container-based workflow scheduling apparatus further includes a description triggering unit, and the description triggering unit is configured to:
determining a workflow to be executed, and describing the workflow to be executed according to a preset description mode to obtain a corresponding description model;
a workflow execution request is received based on the description model.
The invention also provides a storage medium.
The storage medium of the present invention has stored thereon a container-based workflow scheduler which, when executed by a processor, implements the steps of the container-based workflow scheduling method as described above.
For the method implemented when the container-based workflow scheduling program running on the processor is executed, reference may be made to the embodiments of the container-based workflow scheduling method of the present invention, and details are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in the process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal system (e.g., a mobile phone, a computer, a server, an air conditioner, or a network system) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes performed by the present specification and the attached drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A method for scheduling a container-based workflow is applied to a container-based workflow scheduling system, wherein the container-based workflow scheduling system comprises a primary scheduler and a secondary scheduler, and the method comprises the following steps:
when a workflow execution request is received, determining a target workflow corresponding to the workflow execution request, and distributing the target workflow to a preset workflow queue;
monitoring the preset workflow queue through the primary scheduler to obtain the target workflow from the preset workflow queue, and determining a computing resource node corresponding to the target workflow through the primary scheduler;
starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler, and scheduling the target workflow through the target secondary scheduler;
before the step of determining the target workflow corresponding to the workflow execution request, the method further includes:
determining a workflow to be executed, and describing the workflow to be executed according to a preset description mode to obtain a corresponding description model, wherein the description model comprises a workflow vertex set and a workflow edge set, and each vertex in the workflow vertex set represents an application interface array description; each edge in the workflow edge set consists of an edge description array;
a workflow execution request is received based on the description model.
2. The method for container-based workflow scheduling of claim 1 wherein said step of determining by said primary scheduler the computational resource node to which said target workflow corresponds comprises:
determining an initial computing resource node corresponding to the target workflow according to the workflow execution request;
and acquiring the load condition corresponding to the initial computing resource node, and determining a target computing resource node corresponding to the target workflow according to the load condition through the primary scheduler.
3. The method for container-based workflow scheduling of claim 2 wherein said step of determining, by said primary scheduler, a target computing resource node corresponding to said target workflow based on said load condition comprises:
when the load condition is within a preset load range, determining the initial computing resource node as a target computing resource node corresponding to the target workflow through the primary scheduler;
when the load condition is not in a preset load range, detecting the node load condition of other computing resource nodes through the primary scheduler, and determining a target computing resource node corresponding to the target workflow from the other computing resource nodes of which the node load condition is in the preset load range.
4. The container-based workflow scheduling method of claim 1 wherein said step of scheduling said target workflow by said target secondary scheduler comprises:
analyzing the target workflow through the target secondary scheduler to determine the task nodes corresponding to the target workflow;
and performing concurrent execution processing on the task nodes through the target secondary scheduler so as to schedule the target workflow.
5. The container-based workflow scheduling method of claim 4, wherein after the step of performing concurrent execution processing on the task nodes through the target secondary scheduler to schedule the target workflow, the method further comprises:
receiving execution processing results returned by the task nodes through the target secondary scheduler;
and determining whether the target workflow is scheduled to be completed or not according to the execution processing result.
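The following hypothetical sketch corresponds to claims 4 and 5: the target secondary scheduler parses the target workflow into task nodes, executes them concurrently, collects the execution processing results, and treats the workflow as scheduled only if every task node succeeds. Dependency ordering between task nodes (the DAG edges) is omitted for brevity, and all names are illustrative.

    from concurrent.futures import ThreadPoolExecutor

    def run_task(task_id):
        # Stand-in for launching one task node, e.g. one container, and waiting for it.
        print(f"running task node {task_id}")
        return True

    def schedule_workflow(task_ids):
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(run_task, task_ids))   # concurrent execution processing
        return all(results)                                # every task node reported success

    if schedule_workflow(["A", "B", "C"]):
        print("target workflow scheduling completed")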
6. The container-based workflow scheduling method of claim 1, wherein the step of monitoring the preset workflow queue through the primary scheduler to obtain the target workflow from the preset workflow queue and determining the computing resource node corresponding to the target workflow through the primary scheduler comprises:
monitoring the preset workflow queue through the primary scheduler, and actively pulling the target workflow from the preset workflow queue through the primary scheduler to acquire workflow information in the target workflow;
and determining a computing resource node corresponding to the target workflow according to the workflow information through the primary scheduler.
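A minimal sketch of the queue handling in claim 6, assuming a simple in-process queue: the primary scheduler blocks on the preset workflow queue, actively pulls the next target workflow, reads its workflow information, and maps it to a computing resource node. The queue type, the workflow dictionary layout, and the node-picking callback are assumptions for illustration only.

    import queue
    import threading

    def listen(workflow_queue, pick_node):
        # Block on the preset workflow queue and dispatch each pulled workflow.
        while True:
            wf = workflow_queue.get()            # actively pull the target workflow
            node = pick_node(wf["info"])         # choose a node from the workflow information
            print(f"dispatching {wf['name']} to {node}")
            workflow_queue.task_done()

    q = queue.Queue()
    threading.Thread(target=listen, args=(q, lambda info: "node-1"), daemon=True).start()
    q.put({"name": "wf-1", "info": {"vertices": ["A", "B"], "edges": [("A", "B")]}})
    q.join()   # returns once the workflow above has been dispatched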
7. A container-based workflow scheduling apparatus applied to a container-based workflow scheduling system including a primary scheduler and a secondary scheduler, the container-based workflow scheduling apparatus comprising:
a receiving determination module, a monitoring determining module and a starting scheduling module, wherein the receiving determination module is used for determining a target workflow corresponding to a workflow execution request when the workflow execution request is received, and distributing the target workflow to a preset workflow queue;
the monitoring determining module is used for monitoring the preset workflow queue through the primary scheduler to acquire the target workflow from the preset workflow queue and determining a computing resource node corresponding to the target workflow through the primary scheduler;
the starting scheduling module is used for starting a target secondary scheduler corresponding to the computing resource node through the primary scheduler and scheduling the target workflow through the target secondary scheduler;
wherein the apparatus further comprises:
the monitoring determining module is further configured to determine a workflow to be executed, and describe the workflow to be executed according to a preset description mode to obtain a corresponding description model, where the description model includes a workflow vertex set and a workflow edge set, and each vertex in the workflow vertex set represents an application interface array description; each edge in the workflow edge set consists of an edge description array;
the receiving determination module is further configured to receive a workflow execution request based on the description model.
8. A container-based workflow scheduling system, the container-based workflow scheduling system comprising: a memory, a processor, and a container-based workflow scheduler stored on the memory and executable on the processor, wherein the container-based workflow scheduler, when executed by the processor, implements the steps of the container-based workflow scheduling method according to any one of claims 1 to 6.
9. A storage medium having a container-based workflow scheduler stored thereon, wherein the container-based workflow scheduler, when executed by a processor, implements the steps of the container-based workflow scheduling method according to any one of claims 1 to 6.
CN202110417260.9A 2021-04-16 2021-04-16 Container-based workflow scheduling method, device and system and storage medium Active CN113225269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110417260.9A CN113225269B (en) 2021-04-16 2021-04-16 Container-based workflow scheduling method, device and system and storage medium

Publications (2)

Publication Number Publication Date
CN113225269A (en) 2021-08-06
CN113225269B (en) 2022-11-22

Family

ID=77087887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110417260.9A Active CN113225269B (en) 2021-04-16 2021-04-16 Container-based workflow scheduling method, device and system and storage medium

Country Status (1)

Country Link
CN (1) CN113225269B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023102869A1 (en) * 2021-12-10 2023-06-15 上海智药科技有限公司 Task management system, method and apparatus, device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2532442A1 (en) * 2005-01-10 2006-07-10 Hill-Rom Services, Inc. System and method for managing workflow
CN109639791A (en) * 2018-12-06 2019-04-16 广东石油化工学院 Cloud workflow schedule method and system under a kind of container environment
CN111522730A (en) * 2020-03-09 2020-08-11 平安科技(深圳)有限公司 Program testing method and device, computer device and computer readable medium
CN112202899A (en) * 2020-09-30 2021-01-08 北京百度网讯科技有限公司 Workflow processing method and device, intelligent workstation and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1770515A1 (en) * 2005-09-27 2007-04-04 Sap Ag Enabling pervasive execution of workflows
CN109471727B (en) * 2018-10-29 2021-01-22 北京金山云网络技术有限公司 Task processing method, device and system
CN111506406A (en) * 2020-04-10 2020-08-07 深圳前海微众银行股份有限公司 Workflow scheduling method, device and system and computer readable storage medium
CN111897622B (en) * 2020-06-10 2022-09-30 中国科学院计算机网络信息中心 High-throughput computing method and system based on container technology
CN111861412B (en) * 2020-07-27 2024-03-15 上海交通大学 Completion time optimization-oriented scientific workflow scheduling method and system
CN112486648A (en) * 2020-11-30 2021-03-12 北京百度网讯科技有限公司 Task scheduling method, device, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant