CN111338774A - Distributed timing task scheduling system and computing device


Info

Publication number
CN111338774A
Authority
CN
China
Prior art keywords
task
service
scheduling
node
timing
Prior art date
Legal status
Granted
Application number
CN202010107648.4A
Other languages
Chinese (zh)
Other versions
CN111338774B (en)
Inventor
Li Hong (李红)
Current Assignee
Huayun Data Co., Ltd.
Original Assignee
Huayun Data Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huayun Data Co., Ltd.
Priority to CN202010107648.4A
Publication of CN111338774A
Application granted
Publication of CN111338774B
Legal status: Active
Anticipated expiration


Classifications

    • G06F 9/4843 — Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 — Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/4887 — Scheduling strategies for dispatcher involving deadlines, e.g. rate based, periodic
    • G06F 9/451 — Execution arrangements for user interfaces
    • G06F 9/45558 — Hypervisor-specific management and integration aspects
    • G06F 9/5083 — Techniques for rebalancing the load in a distributed system

Abstract

The invention provides a distributed timed task scheduling system and a computing device. The distributed timed task scheduling system comprises a task scheduler, a service scheduler and a task timer configured in a declarative manner in at least two nodes, together with a message queue, a scheduling center and a timed task issuing component. The task scheduler issues the timed task to the message queue, and the remaining nodes listen to the message queue and create a timed task identical to the one issued to the message queue; the message queue issues at least one timed job contained in the timed task to the scheduling center, and the scheduling center detects the response capability of the nodes so as to determine, according to the response capability of each node, the service scheduler that executes the timed job. The invention ensures the fine granularity of the timed-task scheduling process, improves the capability of processing multiple timed tasks, and realizes adaptive orchestration of business data.

Description

Distributed timing task scheduling system and computing device
Technical Field
The invention relates to the technical field of computers, in particular to a distributed timed task scheduling system and a computing device.
Background
A cloud platform or server cluster environment runs one or more projects (Project) that require configured timed tasks (Task). In a large-scale cloud platform or server cluster environment there are many projects, and a single project may contain multiple tasks, so task handling is necessary: for example, timeout-state determination in an order system, timed refreshing of cached data, timed sending of mails to users, and even periodically calculated reports. To ensure high availability and high fault tolerance, a large-scale cloud platform or server cluster usually adopts a distributed architecture, which in turn gives rise to distributed timed tasks.
Currently, mainstream distributed timed task scheduling is based on the Quartz framework and relies on Zookeeper. Zookeeper can realize data sharding, ensuring that data is not processed repeatedly and improving processing speed, and it plays a significant role in fields such as finance and mobile payment. However, the technical route using Zookeeper puts very heavy pressure on the database, which is then at risk of downtime; the technical route using Quartz depends strongly on the database, whose deployment is complex. In particular, the Quartz framework over-occupies resources when scheduling distributed timed tasks and cannot guarantee uniqueness during task scheduling, and relying solely on a distributed database, as the prior art does, imposes computational overhead and pressure on that database.
More importantly, because user requests and issued tasks are diverse in specification and massive in quantity, the distributed timed task scheduling systems and methods of the prior art have the following defects. First, in a complex environment, or a scenario requiring frequent scheduling of timed tasks, there are usually many services; a service usually includes at least one business process, and different business processes often include at least one piece of business logic. This requires a developer to know which logic a particular service contains, which is obviously unrealistic in practice, because a developer cannot prepare in advance every business process meeting users' specific needs; it also increases the difficulty, for front-end and back-end developers, of developing the application/program running the distributed timed task scheduling system. Meanwhile, in the prior art, a distributed timed task scheduling system usually runs on computing nodes of identical scale or configuration in a cloud platform, yet the resources required for scheduling timed tasks differ; thus the systems disclosed in the prior art provide no reference basis for individually adding or deleting computing nodes in computing clusters such as cloud platforms, causing resource waste.
In view of the above, the distributed timed task scheduling methods and related technical solutions in the prior art need to be improved to solve the above problems.
Disclosure of Invention
The invention aims to disclose a distributed timed task scheduling system and a computing device based on that scheduling system, which improve the fault tolerance of the nodes running timed tasks, achieve consistency of task scheduling, and achieve fine granularity of the timed-task scheduling process, so that a concrete execution subject can be reasonably determined for the timed tasks contained in an entity service, the overall scheduling capability in multi-timed-task scenarios is improved, and the difficulty of service scheduling is reduced.
To achieve the first object, the present invention provides a distributed timed task scheduling system, including:
a task scheduler, a service scheduler and a task timer configured in a declarative manner in at least two nodes,
a message queue,
at least two scheduling centers, disposed between the message queue and the service scheduler, for detecting node response capability,
and a timed task issuing component that selects at least one node as the designated node to respond to the timed task, and a first storage device for storing the execution result corresponding to the timed task;
the task scheduler issues the timing task to the message queue, and the rest nodes monitor the message queue and establish the timing task which is the same as the timing task issued to the message queue; and the message queue issues at least one timing job contained in the timing task to a dispatching center, and the dispatching center detects the response capability of the nodes so as to determine a service dispatcher corresponding to the execution of the timing job according to the response capability of each node.
As a further improvement of the present invention, the task scheduler includes a flow scheduling process and the service scheduler includes a service scheduling process; based on the distributed lock corresponding to the timed task among the nodes, the flow scheduling process included in the task scheduler determines, according to the time limit set by the task timer, whether to notify the scheduling center through the polling mechanism of the message queue, and the scheduling center determines at least one service scheduler to execute the timed job according to the response capability of each node.
As a further improvement of the invention, at least two scheduling centers detect the response capability of the nodes in a period of time set by a task timer deployed in at least one node selected by the timed task issuing component.
As a further improvement of the invention, at least one pair of a flow scheduling process and a service scheduling process is configured in a declarative manner in at least two nodes, and the flow scheduling process and the service scheduling process are decoupled using the message queue.
As a further improvement of the present invention, after the first storage device stores the execution result corresponding to the responded timed task, the node executing the timed task and the service scheduler deployed in that node are notified through the message queue;
the first storage device is selected from a JVM memory, a distributed storage component or a database.
As a further improvement of the invention, the timing task issuing component comprises: a user interface and load balancer;
the user interface receives the timing task and issues the timing task to the load balancer, and at least one node is selected among the nodes as a designated node through a load balancing strategy built in the load balancer based on the distributed lock corresponding to the timing task to respond to the timing task.
As a further improvement of the present invention, the scheduling center responds to the timed task issued from the message queue, splits the timed task according to the configuration on which at least one timed job contained in the timed task depends, and determines, for the several sub-timed tasks formed after splitting, at least one service scheduler to execute each sub-timed task;
the nodes are computing nodes, storage nodes or network nodes.
As a further improvement of the present invention, the distributed timed task scheduling system is used to respond to multi-timed-task scenarios;
the task scheduler and the service scheduler configured by the node comprise one or more pairs of flow scheduling processes and service scheduling processes, and the flow scheduling processes and the service scheduling processes are decoupled by using a message queue.
As a further improvement of the invention, the message queue is maintained on a second storage device logically separate from the first storage device, the second storage device being selected from a physical server or a virtual server or a non-volatile storage device, the message queue being a RabbitMQ.
As a further improvement of the invention, the service scheduling process is configured as one or more service units for responding to the timing task, the service scheduling process and the service units have a mapping relationship, and the service units are deployed at any one node.
As a further improvement of the present invention, the flow scheduling process defines at least one business process, and the service scheduling process defines at least one piece of business logic, so as to configure different distributed timed task scheduling services through the mapping relationship between the business process and the business logic;
the business process comprises the business data issued by the user to the timed task issuing component, and the business process is not integrated into the executable code of the actual business;
the business logic is integrated into the service unit;
wherein the service unit is a container, a virtual machine or a microservice.
Based on the same inventive concept, the invention further discloses a computing device that deploys a distributed timed task scheduling system as disclosed in any of the above.
As a further improvement of the invention, the computing device is configured as a cloud computing platform, a server cluster or a data center, and the nodes are computing nodes, storage nodes or network nodes.
Compared with the prior art, the invention has the beneficial effects that:
(1) With at least two scheduling centers disposed between the message queue and the service scheduler, a timed task can be reasonably issued through the message queue to a service scheduler for execution, which ensures fine granularity of the timed-task scheduling process, improves the overall scheduling capability of the distributed timed task scheduling system in multi-timed-task scenarios, makes the use of physical and/or virtual resources more reasonable while a computing device or cloud computing platform configured with the system responds to timed tasks, and effectively prevents those physical and/or virtual resources from being wasted or left idle;
(2) The invention overcomes the defect that a timed task cannot be effectively responded to when a node fails during scheduling, and realizes horizontal scaling of timed task scheduling.
(3) In the present application, different distributed timed task scheduling services are configured through the mapping relationship between business processes and business logic, so that users, administrators and other personnel without low-level code development skills can flexibly schedule tasks according to business needs and package a business process, adapted to the business, that comprises several pieces of business logic; this avoids interfering with business execution by manually modifying low-level code. The user or administrator only needs to attend to the business data corresponding to the business; the internal association between business and logic is thoroughly stripped away, and adaptive scheduling of business data is realized.
Drawings
FIG. 1 is a topology diagram of a distributed timed task scheduling system according to the present invention;
FIG. 2 is a schematic diagram of the scheduling center detecting the response capability formed by all nodes and storing the nodes' response capability in the message queue;
FIG. 3 is a diagram of a scenario in which a node selected by the timed task issuing component fails to respond to a timed task containing a timed job, and the scheduling center reselects a new node to respond to the timed task;
FIG. 4 is a diagram of a scenario in which a node selected by the timed task issuing component fails to respond to a timed task containing a timed job, and two new nodes are reselected by the scheduling center to respond to the timed task;
FIG. 5 shows the service units formed in a node for responding to a timed task based on different response capabilities, the several service units shown in FIG. 5 responding, individually or in arbitrary combination, to a timed job or a sub-timed task associated with a service scheduling process;
FIG. 6 is a schematic diagram of business processes and business logic, separated from each other and deployable in different nodes;
FIG. 7 is a schematic diagram of a business process associating business logic to form a scheduling service 301a;
FIG. 8 is a schematic diagram of a business process associating business logic to form a scheduling service 301b;
FIG. 9 is a schematic diagram of a business process associating business logic to form a scheduling service 301c;
FIG. 10 is a schematic diagram of a business process associating business logic to form a scheduling service 301d;
FIG. 11 is a service architecture diagram of the timed task issuing component;
FIG. 12 is a service architecture diagram in which the message queue responds to several timed tasks issued by a designated node;
FIG. 13 is a flowchart of a distributed timed task scheduling system according to the present invention configured to execute an inspection task on a cloud computing platform;
FIG. 14 is a schematic diagram of splitting a timed task initially issued to a designated node into several sub-timed tasks, each sub-timed task containing at least one timed job;
FIG. 15 is a schematic illustration of a timed task containing at least one timed job;
FIG. 16 is a topology diagram of an example computing device configured as a cloud computing platform according to the present invention.
Detailed Description
The present invention is described in detail with reference to the embodiments shown in the drawings, but it should be understood that these embodiments are not intended to limit the present invention, and those skilled in the art should understand that functional, methodological, or structural equivalents or substitutions made by these embodiments are within the scope of the present invention.
Before describing in detail various embodiments of the present invention, technical terms and meanings referred to in the specification are summarized as follows.
Term "Logic"includes any physical and tangible functions for performing a task. For example, each operation illustrated in the flowcharts corresponds to a logical component for performing the operation. Operations may be performed using, for example, software running on a computer device, hardware (e.g., chip-implemented logic functions), etc., and/or any combination thereof. When implemented by a computing device, the logical components represent electrical components that are physical parts of the computer system, regardless of the manner in which they are implemented.
Term "Task"in this application and"Timed tasks'or'Task"has equivalent meaning and can be used in the actual code programming"Job"replace it. The scheduling of the timed task can be expressed in actual scenes as that a payment system runs for batch at 1 and half a hour each day, one-day clearing is carried out, and the number 1 of each month is carried out for the last month clearing, or in various business scenes such as that delivery information and logistics information are reminded to a customer in the form of short messages or mails after goods are delivered successfully, or that forced recovery is carried out on cloud hosts distributed to the user according to a term set by a lease and the like.
Term "Appointed sectionDot"and term"New designated node"(or"Reselected designated node") refers to a Node or a computing Node responding to a timing task formed based on different time points in the scheduling process of the timing task, as shown in fig. 1, if Node-1 is selected as a responding Node when the first issued timing task is issued, then Node-1 is a designated Node, and when Node-1 cannot execute the scheduling processing of the timing task due to special reasons such as downtime, response timeout, etc., the scheduling processing of the timing task is transferred to Node-2 and/or Node-3 through the message queue 10 and the scheduling centers 301 and 302 for processing, then Node-2 and/or Node-3 can be understood as the specific Node in this application"New Is assigned to a node'or'Reselected designated node". Of course, in this embodiment, the nodes that can be used to respond to the timed task and/or the sub-timed task or the timed job must be alive, and the overhead and load conditions of the nodes do not have to be considered.
Term "Node point"and term"Computing node"has technical meaning equivalent to that of a term"Scheduling services"and term"Distributed type Timed task scheduling service"has the technical meaning of equivalent.
Phrase "Is configured as"or a phrase"Is configured to"includes any manner in which any kind of physical and tangible functionality may be constructed to perform the identified operations. The functions may be configured to perform operations using, for example, software running on a computer device, hardware (e.g., chip-implemented logic functions), and/or the like, and/or any combination thereof.
Embodiment 1:
referring to fig. 1-15, an embodiment of a distributed timed task scheduling system 100 (hereinafter referred to as "scheduling system") according to the present invention is disclosed. The distributed timed task scheduling system 100 is most preferably used in response to multiple timed task scenarios, although it may be applied to a single timed task scenario. The distributed timing task scheduling system 100 is implemented based on golang, and the programs forming the distributed timing task scheduling system 100 are decompressed under linux servers/opt/directories.
To facilitate describing the core concepts of the present invention, the applicant uses a general scenario to exemplify and outline the invention. For example, the timed task issuing component 500 receives a command issued by a user or administrator to create a timed task, and Node-1 is selected as the designated node to respond to the timed task. When the message queue 10 (RabbitMQ Cluster) determines from Node-1's response capability that it can execute the one or more timed jobs contained in the timed task 90, the scheduling center 301 has the service scheduler (Service-scheduler) 901 process those timed jobs through its service scheduling process or through a service execution mechanism (e.g., a virtual CPU or an application) mounted to the service scheduler 901.
A service scheduling process and a service execution mechanism (not shown) may also be understood as a software-and-hardware relationship (or physical system and software system) having some computer-executable function. When part of the timed jobs or sub-timed tasks in the timed task 90 cannot be responded to because Node-1 loses power, goes down, or suffers a program failure, the scheduling center 301 (and/or the scheduling center 302) reselects one or more service schedulers 902, 903 deployed in Node-2 and/or Node-3 (i.e., the remaining nodes) and forwards one or more of those timed jobs or sub-timed tasks of the timed task 90 to the service scheduler (Service-scheduler) 902 and/or the service scheduler (Service-scheduler) 903 for execution.
After the timed task 90 has been executed by a service scheduling process or service execution mechanism, one or more of the task schedulers of Node-1 to Node-3 store the execution result in the first storage device 21. The task scheduler 801 determines, based on the timed task and the task timer 811, the points in time at which a specific timed task 90 starts and finishes executing, thereby determining the execution period of the timed task 90; in embodiments, therefore, the period or execution period may be a specific length of time.
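As a rough illustration of how a task timer such as 811 might pace a task scheduler such as 801 through an execution period, the following Go sketch drives a callback with time.Ticker; the period and the runJobs callback are assumptions for demonstration, not the patent's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// startTaskTimer fires the scheduler's callback once per period until stop is
// closed, loosely modeling a task timer (e.g., 811) pacing a task scheduler (e.g., 801).
func startTaskTimer(period time.Duration, runJobs func(), stop <-chan struct{}) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			runJobs()
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	go startTaskTimer(200*time.Millisecond, func() { fmt.Println("timed task fired") }, stop)
	time.Sleep(time.Second) // let the timer fire a few times
	close(stop)
}
```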
In a scenario where several timed tasks are processed in succession, when the first timed task has been executed, the task scheduler 801 stores the execution result in the first storage device 21. A user or administrator may access the first storage device 21 to determine whether the timed task 90 was reliably executed. Referring to FIG. 1, in the present embodiment only Node-1 to Node-3 are shown in the computing cluster 200: the task timer (Timer) 811, task scheduler (Task-Scheduler) 801 and service scheduler (Service-scheduler) 901 in Node-1, the task timer (Timer) 812 and task scheduler (Task-Scheduler) 802 in Node-2, and the task timer (Timer) 813 and task scheduler (Task-Scheduler) 803 in Node-3.
Next, the applicant describes in detail a specific technical solution of the distributed timed task scheduling system 100 and a computing apparatus based on the distributed timed task scheduling system 100 based on the core idea of the present invention. The example shown in fig. 1 is applied to a cloud computing platform, and the number of nodes is not limited to three, and may be one, two, or more than three.
Specifically, the distributed timed task scheduling system 100 includes:
task schedulers 801-803, service schedulers 901-903 and task timers 811-813 configured in a declarative manner in at least two nodes; a message queue 10; and at least two scheduling centers, namely the scheduling center 301 and the scheduling center 302, disposed between the message queue 10 and the service schedulers (i.e., the Service-schedulers 901-903 shown in FIG. 1) for detecting node response capability. The timed task issuing component 500 selects at least one node as the designated node to respond to the timed task, and the execution result corresponding to the timed task is stored in the first storage device 21.
Referring to FIG. 11, the timed task issuing component 500 includes a user interface 501 and a load balancer 502. The user interface 501 receives the timed task and issues it to the load balancer 502; based on the distributed lock corresponding to the timed task among the nodes, at least one node is selected as the designated node, through the load balancing policy built into the load balancer 502, to respond to the timed task.
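The patent does not name a backing store for the distributed lock, so the following Go sketch only illustrates the selection idea: each candidate node tries to take the per-task lock, and the one that wins becomes the designated node. The DistributedLock interface and the in-process localLock stand-in are assumptions made for the sake of a self-contained example.

```go
package main

import (
	"fmt"
	"sync"
)

// DistributedLock abstracts the per-task lock; a real deployment would back
// this with shared storage, which the patent leaves unspecified.
type DistributedLock interface {
	TryLock(taskID string) bool
}

// localLock is an in-process stand-in so the sketch is self-contained.
type localLock struct {
	mu   sync.Mutex
	held map[string]bool
}

func (l *localLock) TryLock(taskID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.held[taskID] {
		return false
	}
	l.held[taskID] = true
	return true
}

func main() {
	var lock DistributedLock = &localLock{held: map[string]bool{}}
	for _, node := range []string{"Node-1", "Node-2", "Node-3"} {
		if lock.TryLock("task-90") {
			fmt.Println(node, "wins the lock and becomes the designated node")
		} else {
			fmt.Println(node, "lost the lock; it only listens on the message queue")
		}
	}
}
```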
Meanwhile, after the first storage device 21 stores the execution result corresponding to the responded timed task, the node (e.g., Node-1) that executed the timed task and the service scheduler (e.g., service scheduler 901) deployed in that node are notified through the message queue. Specifically, the first storage device 21 is selected from JVM memory, a distributed storage component or a database, preferably a database. The distributed storage component may be block storage or file storage. In scenarios such as retrieving, inserting, modifying or deleting a certain object based on the timed task, file storage is preferred; in scenarios of accessing and downloading streaming media files such as video based on the timed task, block storage is preferred. In this embodiment, because the first storage device 21 storing the timed task's execution results is separated from the nodes, the computational overhead and pressure are reduced when the first storage device 21 is a database.
The task scheduler 801 issues the timed task to the message queue 10, and the remaining nodes listen to the message queue 10 and create a timed task identical to the one issued to the message queue 10. The message queue 10 issues at least one timed job contained in the timed task to the scheduling center 301, and the scheduling center 301 detects the nodes' response capability to determine, according to the response capability of each node, the service scheduler that executes the timed job. The reason two scheduling centers are configured in the distributed timed task scheduling system 100 (the number is of course not limited to two; more may be configured) is high availability: when the scheduling center 301 fails, since the scheduling center 302 synchronously receives from the message queue 10 the timed task with its at least one timed job, a reliable scheduling center can be selected between the two, based on the load balancing algorithm of the message queue 10, to perform the subsequent operations. If the scheduling center 301 cannot detect the nodes' response capability or cannot select a corresponding service scheduler for the timed task/timed job, the message queue 10 issues the at least one timed job contained in the timed task to the scheduling center 302, thereby realizing disaster-recovery switchover.
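A hedged sketch of how a task scheduler might issue a timed task to the RabbitMQ-based message queue, assuming the community amqp091-go client; the broker URL, the exchange name "timed.tasks" and the JSON payload are placeholders, not values from the patent.

```go
package main

import (
	"context"
	"log"
	"time"

	amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/") // placeholder broker URL
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// Declare a durable exchange for timed tasks; "timed.tasks" is an assumed name.
	if err := ch.ExchangeDeclare("timed.tasks", "direct", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Publish the timed task; persistent delivery so a broker restart does not lose it.
	err = ch.PublishWithContext(ctx, "timed.tasks", "task-90", false, false, amqp.Publishing{
		ContentType:  "application/json",
		DeliveryMode: amqp.Persistent,
		Body:         []byte(`{"task":"task-90","jobs":["job-911","job-912"]}`),
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("timed task 90 published to the message queue")
}
```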
In the present embodiment, the scheduling center 301 and the scheduling center 302 are logically independent of every node, and logically independent of the timed task issuing component 500. The scheduling centers 301 and 302 play two main roles.
The first is to detect the response capability of Node-1 to Node-3. Response capability here may be understood as the processing capability of a specific node in responding to a timed task, or as the processing capability of the service scheduling process configured in that node with respect to a complete timed task or a sub-timed task, where the complete timed task (for example, the aforementioned "forcibly reclaiming the cloud hosts allocated to a user according to the term set by a lease") or the sub-timed task may contain one or more timed jobs. A reasonable execution subject is thereby provided for the final completion of the timed task or sub-timed task, preventing the situation in which no response can be made. The meaning of sub-timed tasks is described below.
The second is to match, in one or more nodes, a corresponding service scheduler for the timed task, sub-timed task or timed job. In the present application, matching a corresponding service scheduler for a timed job covers the process of matching one or more corresponding service schedulers for a timed task or sub-timed task containing one or more timed jobs. In addition, the roles of the scheduling centers 301 and 302 differ from that of the load balancer 502 in the timed task issuing component 500: in the logical architecture, the scheduling centers 301 and 302 as a whole and the load balancer 502 sit on opposite sides of the message queue 10, and no session or interaction occurs between them. More importantly, in the present application, between the message queue 10 and the timed task issuing component 500 only the business processes 201-20N of the timed task are of concern, i.e., the execution of a specific function (for example, starting a cloud host); the business logic 401-40N on which that function depends need not be attended to or loaded. Different distributed timed task scheduling services (or simply scheduling services) 301a-301d are configured through the mapping relationship between business processes and business logic, so that users, administrators and other personnel without low-level code development skills can flexibly schedule tasks according to business needs and package a business process, adapted to the business, that contains several pieces of business logic; this avoids interfering with business execution by manually modifying low-level code. The user or administrator only needs to attend to the business data corresponding to the business; the internal association between business (i.e., business process) and logic (i.e., business logic) is thoroughly stripped away, and adaptive scheduling of business data is realized. Meanwhile, more finely differentiated business logic can be configured in batches and modularly by low-level developers to suit diverse business scenarios, thereby meeting users' individual requirements.
As shown in FIG. 5, the service scheduling process is configured as one or more service units for responding to the timed task; the service scheduling process and the service units have a mapping relationship, and the service units are deployed in any node. Meanwhile, the flow scheduling process defines at least one business process and the service scheduling process defines at least one piece of business logic, so that different distributed timed task scheduling services are configured through the mapping relationship between business process and business logic. The business process comprises the business data issued by the user to the timed task issuing component 500, and is not integrated into the executable code of the actual business. The business logic is integrated into the service unit; the service unit is a Container, a Virtual Machine or a Microservice, or another component with independent functionality, and may even be a service middleware, a plug-in, a system patch, a program patch or a software upgrade package file, so as to achieve high availability and extensibility of the service. Meanwhile, the mapping relationship formed between business process and business logic can be customized, according to the instructions of the user or administrator, through the timed task issuing component 500, so as to package different scheduling services.
A service unit is a basic processing unit that can independently complete a certain timed task, sub-timed task or timed job. Service units can be developed and deployed in batches by low-level developers, so that resources can be used rationally in the later timed-task scheduling process, iterative development is possible, and the low-level code forming the service units can be reused, reducing the development intensity and difficulty for software/system low-level developers and avoiding repeated work. A low-level developer only needs to attend to the business logic realizing a certain function, not to the individual requirements of front-end developers, administrators or users regarding the business process, so the separation of business and logic, and the flexible packaging of specific timed task scheduling services, are thoroughly realized.
For example, in the example shown in FIG. 5, Node-1 deploys service units 991, 992, 993 through 99z (the parameter z is generic and takes a positive integer greater than or equal to 4). Any one service unit can independently respond to a timed task (or sub-timed task or timed job) as a scheduling process, and one or more service units taken as a whole can also form a piece of business logic. For example, service unit 991 and service unit 992 constitute business logic 91a, service unit 992 and service unit 993 constitute business logic 92a, and all the service units in Node-1 constitute business logic 93a. Business logic 91a-93a can be detected by the scheduling center 301 to determine which service unit or units are required to respond to a particular timed task. Because the resources in each node have upper limits, this technical scheme allows the resources in the nodes, especially the remaining resources, to be used to the greatest extent, avoiding the situation in which a timed task cannot be responded to within a single node.
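The composition of service units into business logic described above can be sketched in Go as follows; ServiceUnit, BusinessLogic and the Respond method are assumed names used only to illustrate how units 991 and 992 might be combined into logic 91a.

```go
package main

import "fmt"

// ServiceUnit is the basic processing unit described above; the Respond
// method is an illustrative assumption.
type ServiceUnit interface {
	Respond(job string) error
}

type unit struct{ name string }

func (u unit) Respond(job string) error {
	fmt.Printf("service unit %s handled %s\n", u.name, job)
	return nil
}

// BusinessLogic composes one or more service units into a whole, as units
// 991 and 992 form logic 91a in FIG. 5; it responds by delegating to each member.
type BusinessLogic struct {
	Name  string
	Units []ServiceUnit
}

func (b BusinessLogic) Respond(job string) error {
	for _, u := range b.Units {
		if err := u.Respond(job); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	logic91a := BusinessLogic{Name: "91a", Units: []ServiceUnit{unit{"991"}, unit{"992"}}}
	_ = logic91a.Respond("timed-job-911")
}
```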
Referring to FIG. 6, the business processes and business logic deployed in the nodes need not correspond one to one, as long as a mapping relationship is established between them. Node-1 forms business process 201 and business logic 401-402; Node-2 forms business processes 202-203 and business logic 403-40N; Node-P may form only one or more business processes 204-20N. It should be noted that, in this embodiment, the business processes and business logic respectively corresponding to the task scheduling processes and service scheduling processes in each node may be in a one-to-one, many-to-one or one-to-many relationship, and some nodes may even form no task scheduling process and/or service scheduling process; the scheduling center 301 matches a dependent scheduling service 301a-301d for a specific real-time task, as shown in FIGS. 7-10. The business processes 201-20N reside in the user business layer 2, the scheduling services 301a-301d in the service scheduling layer 3, and the business logic 401-40N in the business logic layer 4. The user business layer 2 resides in the timed task issuing component 500 and forms a user interface (UI) 501 through a UI designer, usually presented as a client, an APP or an interactive software interface. The business logic layer 4 sits at the bottom of the computing device on which the distributed timed task scheduling system 100 is configured. The service scheduling layer 3 sits between the business logic layer 4 and the user business layer 2; it performs logic judgment and execution, data connection and instruction transmission, can logically process received data to realize functions such as data modification, acquisition and deletion, and feeds the processing results back to the user business layer 2, thereby realizing the software functions.
Referring to FIG. 7, Node-1 forms business processes 201-202 and business logic 401-402, and Node-N forms business logic 403-40N. A mapping relationship is established between business process 201 and business logic 403-40N to form the scheduling service 301a, satisfying the scheduling of and response to a customized timed task. In this scenario, although Node-1 was initially selected to execute the timed task, the scheduling of the timed task or timed job does not rely entirely on Node-1 but is instead participated in by multiple pieces of business logic in Node-N. The parameter M is smaller than the parameter N.
Referring to FIG. 8, Node-1 forms business logic 401-402, Node-2 forms business logic 403, Node-M contains several pieces of business logic including business logic 40M, and Node-N forms business logic 40N. The business processes 201-203 in Node-1 and Node-2 are jointly mapped to the business logic 401 in Node-1 to form the scheduling service 301b, satisfying the scheduling of and response to a customized timed task. In this scenario, a timed task comprising several business processes can be scheduled using only the business logic 401 in Node-1, so the whole scheduling process is actually completed by Node-1 alone.
Referring to FIG. 9, in an example scenario similar to that of FIG. 8, a mapping relationship is established only between business process 20M and business logic 40M to form the scheduling service 301c. The business process 20M and the business logic 40M belong to Node-N and Node-M respectively. In this scenario, an event similar to the execution-subject role migration of the timed task in FIG. 3 can occur: the initially selected node is Node-N, but over the whole scheduling process the timed task is actually completed by Node-M.
Referring to FIG. 10, in an example scenario similar to that of FIG. 8, a mapping relationship is established only between business process 20N and business logic 403 to form the scheduling service 301d. Thus, over the whole scheduling process, although the initially selected node is Node-N, the timed task is actually completed by Node-2.
The business logic is realized by executing executable code corresponding to the actual business. When determining to execute a timed job or sub-timed task contained in a timed task, the scheduling center 301 or the scheduling center 302 attends only to the specific business logic. Therefore, through the distributed timed task scheduling system disclosed by the invention, the business processes 201-20N and the business logic 401-40N can be separated and then associated through the scheduling center 301 or the scheduling center 302, so that services are scheduled by various distributed timed task scheduling services in arbitrary combination, such as the scheduling services 301a-301d shown in FIGS. 7-10.
This technical scheme improves the simplicity of page layout and of distributed timed task scheduling operation for front-end developers, remarkably improves the fine granularity of the timed-task scheduling process, improves the capability of processing multiple timed tasks, and realizes adaptive orchestration of business data. In this embodiment, the fine granularity embodied in the timed-task scheduling process refers to the descriptive dimension along which the timed task, sub-timed task or timed job is divided into jobs during scheduling.
In this embodiment, the task scheduler 801 includes a flow scheduling process and the service scheduler 901 includes a service scheduling process. Based on the distributed lock corresponding to the timed task among the nodes, the flow scheduling process included in the task scheduler 801 determines, according to the time limit set by the task timer 811, whether to notify the scheduling center 301 through the polling mechanism of the message queue 10, and the scheduling center 301 determines at least one service scheduler 901 to execute the timed job according to the response capability of each node. When Node-1 issues the timed task 90, Node-2 issues the timed task 90 to the message queue 10 as shown by arrow task2, and Node-3 issues the timed task to the message queue 10 as shown by arrow task3. Meanwhile, listeners are established between Node-2, Node-3 and the message queue 10; the task scheduler 802 and task timer 812 are configured synchronously in Node-2, and the task scheduler 803 and task timer 813 in Node-3.
As shown in FIGS. 3 and 4, when the scheduling center 301 determines, based on its detection of each node's response capability, that the timed task is to be executed by the service scheduling process included in the service scheduler 901 in Node-1, it issues the timed task (comprising one or more timed jobs) to the service scheduler 901; after the timed task has been executed, the result is returned to the scheduling center 301 and finally notified to the task scheduler 801 through the message queue 10, so that the task scheduler 801 stores the execution result produced by the service scheduling process in the first storage device 21 for the user or administrator to call or access. At this point a complete timed task scheduling operation is finished. The three nodes are each provided with an independent flow scheduling process, service scheduling process and task timer 811-813. After the service scheduling process in the selected node responds to the timed task, the execution result corresponding to the responded timed task is stored in the first storage device 21, and the message queue 10 notifies Node-1. Preferably, the first storage device 21 is selected from JVM memory or a database, most preferably a distributed database, to improve CRUD operation efficiency.
In this embodiment, the applicant takes Node-1 as the designated node responding to the timed task. Node-1 is configured with a pair of a flow scheduling process and a service scheduling process, decoupled using the message queue 10. The message queue 10 is a RabbitMQ. Node-2 and Node-3 are configured with reference to Node-1.
If, in the above process, the scheduling center 301 fails to call the service scheduler 901 of Node-1, or the service scheduler 901 in its current state cannot satisfy a specific timed job, the execution-subject role of the timed task migrates. If the scheduling center 301 detects Node-2 and determines that its response capability (computing capability, storage capability, etc.) can satisfy the timed job, the scheduling center 301 reselects the node, re-issues the timed job to the service scheduler 902 in Node-2, and has it processed by the built-in service scheduling process or by a service execution mechanism (e.g., a virtual CPU or an application) associated with the service scheduler 902. The execution-subject role of the timed task may migrate from one node to another node of the computing cluster 200 (see FIG. 3), or from one node to two other nodes of the computing cluster 200 (see FIG. 4).
After the timed job is completed, Node-2 feeds back an Ack response to the scheduling center 301 to inform it that the timed job is finished; finally, the execution result is written into the message queue 10 by the scheduling center 301. As shown in FIG. 4, the number of nodes reselected by the scheduling center may also be two or more, and which service unit or units in Node-2 and Node-3 respond to the timed task, or to the timed task(s) containing at least one timed job (likewise for sub-timed tasks), is determined according to the response capability required by the timed job and the configured mapping relationship between business process and business logic.
Meanwhile, in the present embodiment, the scheduling center 301 and the scheduling center 302 detect the response capability of the nodes according to the time limit set by the task timer deployed in the at least one node selected by the timed task issuing component 500. Although the node initially selected to respond to the timed task 90 is Node-1, the timed task issued to the message queue 10 and the timed jobs it contains are still monitored by Node-2 and Node-3. If, before the time point set by the task timer 811 in Node-1 arrives, the service scheduler 901 cannot respond to the timed task and the timed jobs it contains, the scheduling center 301 quickly switches the execution-subject role for the timed task and its timed jobs to a service scheduler deployed by Node-2 and/or Node-3; see in particular the procedures shown by arrows task2 and task3 above.
The message queue 10 is stored in a second storage device 22 logically independent of the first storage device 21; the second storage device 22 is selected from a physical server, a virtual server or a non-volatile storage device, and the message queue 10 is a RabbitMQ. Temporarily storing the timed task, or the timed jobs it contains, in the second storage device 22 not only relieves the read-write pressure and computational overhead of the first storage device 21, but also ensures the stability and consistency of the timed task (or its timed jobs) during scheduling, preventing the various applications or services formed on the basis of the timed task from being interrupted or erring.
Specifically, as depicted in FIG. 12, RabbitMQ builds on the distributed characteristics of Erlang (the RabbitMQ bottom layer is implemented on the Erlang architecture, so rabbitmqctl starts an Erlang node and uses the Erlang system to connect to the RabbitMQ node; the connection requires a correct Erlang cookie and node name, and Erlang nodes authenticate by exchanging the Erlang cookie). Therefore, when deploying a RabbitMQ distributed cluster, Erlang is installed first, and the cookie of one service is copied to the other nodes.
In the RabbitMQ cluster, every RabbitMQ instance is a peer node, i.e., every node can accept client connections to receive and send messages. The nodes are divided into memory nodes and disk nodes; generally all of them are established as disk nodes, so that messages do not disappear after a machine restart. Exchange 601 is the key component that accepts producer messages and routes them to the message queue 10; the exchange type and the Binding determine a message's routing rules. So before a producer can send a message, it must first declare an Exchange 601 and the Binding 602 corresponding to that Exchange 601, which can be done through ExchangeDeclare and BindingDeclare. In RabbitMQ, declaring an Exchange 601 requires three parameters: ExchangeName, ExchangeType and Durable. ExchangeName is the name of the exchange, which must be specified when creating a Binding and when the producer pushes a message via publish. ExchangeType refers to the type of the exchange; RabbitMQ's exchange types include the Direct, Fanout and Topic types, and different exchange types exhibit different routing behavior. Durable is the persistence attribute of the Exchange 601. Declaring a Binding requires providing a QueueName, an ExchangeName and a BindingKey. The routing rules exhibited by the different exchange types are set forth below.
When a producer sends a message, it must specify a RoutingKey and an Exchange; upon receiving the message, the Exchange routes it according to its exchange type and the RoutingKey.
a) If the type is Direct, the RoutingKey in the message is compared with the BindingKey of every Binding associated with the Exchange; if they are equal, the message is sent to the Queue corresponding to that Binding.
b) If the type is Fanout, the message is sent to every Queue bound to the Exchange; this is in effect a broadcast.
c) If the type is Topic, the RoutingKey is matched against the BindingKey as a pattern; if the match succeeds, the message is sent to the corresponding Queue.
The RabbitMQ cluster sends messages to each consumer in turn, so on average each consumer receives an equal number of messages; this manner of sending messages is called round-robin. Referring to FIG. 12, after a timed task (task) is issued to Exchange 601, several Queues Q1-Qn are formed through the binding process; these Queues constitute Queues 603, and Q1-Qn are issued one by one to the service scheduling processes and executed.
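For illustration, here is a Go sketch of declaring a direct exchange with the three parameters discussed above (name, type, durable) and binding one of the queues Q1-Qn with a BindingKey, again assuming the amqp091-go client; all names and the broker URL are placeholders.

```go
package main

import (
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/") // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// ExchangeDeclare(name, type, durable, autoDelete, internal, noWait, args):
	// the three parameters discussed above are the name, the type and durable.
	if err := ch.ExchangeDeclare("timed.tasks", "direct", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	// Declare a queue (one of Q1..Qn) and bind it with a BindingKey; with a
	// direct exchange, messages whose RoutingKey equals "task-90" land here.
	q, err := ch.QueueDeclare("Q1", true, false, false, false, nil)
	if err != nil {
		log.Fatal(err)
	}
	if err := ch.QueueBind(q.Name, "task-90", "timed.tasks", false, nil); err != nil {
		log.Fatal(err)
	}
	log.Println("exchange declared and Q1 bound")
}
```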
In this embodiment, at least one pair of a flow scheduling process and a service scheduling process is configured in a declarative manner in at least two nodes, and the flow scheduling process and the service scheduling process are decoupled using the message queue 10. Based on the polling (round-robin) distribution mechanism of the message queue 10, load-balancing capability can be provided for the task messages corresponding to timed tasks.
As shown in FIG. 1, the three listening arrows represent the service scheduling processes in the three nodes listening for timed task messages in the message queue 10. Because the service scheduling processes configured in the three nodes listen to the same timed task message of the same exchange type, the timed task can be distributed in turn to a selected service scheduling process through the load balancing policy of the message queue 10. After a service scheduling process finishes executing the timed task, it sends a confirmation receipt to the message queue 10 along the direction indicated by the "execution result" arrow from the scheduling center 301 to the message queue 10 in FIG. 1; upon receiving the confirmation receipt, the message queue 10 confirms that the timed task has been executed. If the service scheduling process in Node-1 does not send the acknowledgement to the message queue 10, then after a set time period (e.g., 0.5 seconds) the message queue 10 sends the already-sent timed task to the service scheduling process in Node-2, and so on until the message queue 10 receives an acknowledgement. Because the service scheduling processes and flow scheduling processes in the nodes are decoupled through the message queue 10, the uniqueness of the timed task scheduling process can be realized through the message queue 10, solving the HA and load-balancing problems in distributed timed task scheduling.
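The confirmation-receipt behavior described above corresponds to RabbitMQ's manual acknowledgement. A hedged Go consumer sketch follows, assuming the amqp091-go client; the queue and consumer names are placeholders, and executeTimedTask stands in for the service scheduling process's actual work.

```go
package main

import (
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

// executeTimedTask stands in for the service scheduling process's real work.
func executeTimedTask(body []byte) bool {
	log.Printf("executing timed task: %s", body)
	return true
}

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/") // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// autoAck=false: the broker keeps the timed task until we confirm it, so an
	// unacknowledged task is redelivered to another service scheduling process.
	deliveries, err := ch.Consume("Q1", "service-scheduler-901", false, false, false, false, nil)
	if err != nil {
		log.Fatal(err)
	}
	for d := range deliveries {
		if executeTimedTask(d.Body) {
			d.Ack(false) // confirmation receipt: the queue marks the task as executed
		} else {
			d.Nack(false, true) // requeue so another node's scheduler can pick it up
		}
	}
}
```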
In particular, in a cloud platform example with several computing nodes, the message queue 10 has horizontal scaling capability and can therefore accommodate computing nodes of different scales simultaneously. This helps improve logical stability and reliability while computing nodes are scaled out or in, and matches a timed task with a service scheduling process in a suitable node, thereby simplifying computing-node expansion for a cloud computing platform applying this distributed timed task scheduling method.
The distributed timed task scheduling system 100 disclosed in this basic embodiment can serve the timed-task needs of cloud platforms with from two to any number of computing nodes. Meanwhile, the distributed timed task scheduling system 100 disclosed in this embodiment does not need to rely on the traditional Quartz + Zookeeper framework, which resolves the technical problem of excessive resource occupation in distributed timed task scheduling, reduces the computational overhead and pressure on the database (a species of the first storage device 21), and reduces the difficulty of database deployment.
The applicant has further optimized the disclosed distributed timed task scheduling system 100. The scheduling center 301 responds to the timed task issued from the message queue 10, splits the timed task according to the configuration on which at least one timed job contained in the timed task depends, and determines, for the several sub-timed tasks formed after splitting, at least one service scheduler to execute each sub-timed task. The nodes are computing nodes, storage nodes or network nodes.
As shown in fig. 14, a timed task 90 issued to the distributed timed task scheduling system 100 can, for example, be split into a sub-timed task 91, a sub-timed task 92, a sub-timed task 93 and a sub-timed task 94. The granularity and basis of splitting the timed task 90 can be derived from the response capability of the nodes as detected by the scheduling centers 301 and 302, and the different scheduling services formed by the mapping relationships between business processes and business logic serve as the basis for performing the split. The sub-timed task 91 may include a timed job 911 and a timed job 912; the sub-timed task 92 includes a timed job 921 and a timed job 922; the sub-timed tasks 93 and 94 may likewise each comprise one or more timed jobs.
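Purely as an illustration of such a split (the data model below is invented for the example and is not taken from the disclosure), grouping the timed jobs of task 90 into sub-timed tasks by the scheduling service each job depends on might look like this:

    from collections import defaultdict
    from typing import Dict, List

    # Hypothetical representation: each timed job names the scheduling
    # service (business process -> business logic mapping) it depends on.
    timed_task_90: List[Dict[str, str]] = [
        {"job": "911", "service": "svc-A"},
        {"job": "912", "service": "svc-A"},
        {"job": "921", "service": "svc-B"},
        {"job": "922", "service": "svc-B"},
    ]

    def split_by_service(jobs: List[Dict[str, str]]) -> Dict[str, List[str]]:
        """Group timed jobs into sub-timed tasks by scheduling service."""
        sub_tasks: Dict[str, List[str]] = defaultdict(list)
        for job in jobs:
            sub_tasks[job["service"]].append(job["job"])
        return dict(sub_tasks)

    print(split_by_service(timed_task_90))
    # -> {'svc-A': ['911', '912'], 'svc-B': ['921', '922']}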
As shown in fig. 15, the timed task 90 may also be left unsplit by the scheduling center 301, in which case it directly comprises the timed job 911, the timed job 912, the timed job 921 and the timed job 922. These timed jobs correspond to different pieces of business logic, and the one or more pieces of business logic associated with each timed job are determined at the scheduling center 301.
Referring to fig. 13, the applicant shows the detailed steps of running a platform-inspection timed task on a cloud platform based on the distributed timed task scheduling system 100.
Step 701, configure the node inspection parameters: the user may select one, several, or all of the computing/network/storage nodes in the node inspection parameters.
Step 702, judge the node type and select the inspection to be executed according to the node types configured by the user. If a computing node is configured, execute step 703, physical node cluster inspection; if a network node is configured, execute step 704, switch node cluster inspection; if a storage node is configured, execute step 705, Ceph storage node cluster inspection.
After the physical node cluster inspection finishes, execute judging step 706 to determine whether it succeeded. If yes, jump to step 707, cloud host node cluster inspection; if not, jump to step 710 and go directly to cluster resource usage inspection. After the switch node inspection finishes, jump to step 710 and enter cluster resource usage inspection regardless of the result. After the Ceph storage cluster inspection finishes, execute judging step 708; if it succeeded, jump to step 709, distributed storage cluster inspection; if not, go to step 710.
Step 711, judge whether the cluster resource usage inspection succeeded. If not, jump to step 712 and generate the inspection report; if yes, execute step 713, cluster alarm inspection, and then proceed to step 712.
After the inspection report is generated, execution proceeds to step 714, mail notification, in which the report is sent to the target user by e-mail.
The physical node cluster inspection, switch node inspection, Ceph storage cluster inspection, distributed storage cluster inspection, cloud host node cluster inspection, cluster resource usage inspection, cluster alarm inspection, inspection report generation and mail notification mentioned above are all timed jobs within the platform-inspection task; some execute in sequence with dependencies on one another, some in parallel, and some independently. Together they form the business process of the platform-inspection timed task.
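As an illustrative reconstruction of the branching in fig. 13 only (step numbers follow the text above; the job bodies are placeholders rather than the actual inspection code), the control flow can be sketched as:

    def run_job(name: str) -> bool:
        # Placeholder: run one inspection timed job, return success or failure.
        print("running:", name)
        return True

    def platform_inspection(node_types: set) -> None:
        # Steps 702-709: inspections selected by configured node type.
        if "compute" in node_types:
            if run_job("physical node cluster inspection"):        # steps 703/706
                run_job("cloud host node cluster inspection")      # step 707
        if "network" in node_types:
            run_job("switch node cluster inspection")              # step 704, result ignored
        if "storage" in node_types:
            if run_job("Ceph storage node cluster inspection"):    # steps 705/708
                run_job("distributed storage cluster inspection")  # step 709
        # Steps 710-714: cluster-wide inspections and reporting.
        if run_job("cluster resource usage inspection"):           # steps 710/711
            run_job("cluster alarm inspection")                    # step 713
        run_job("generate inspection report")                      # step 712
        run_job("mail notification")                               # step 714

    platform_inspection({"compute", "network", "storage"})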
Following the business process, the user only needs to design the jobs in the UI designer of the user interface 501 and submit them to the distributed timed task scheduling system 100, declaring in the user interface 501 the business process and the business logic that the whole scheduling flow depends on. In other words, once the design of the declarative scheduling policy is complete, the timed task can be parsed and registered.
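Such a declarative scheduling policy could be expressed as plain data before being parsed and registered; the field names below are invented for illustration and are not prescribed by the disclosure:

    # Hypothetical declarative definition of the platform-inspection timed task.
    platform_inspection_task = {
        "name": "platform-inspection",
        "cron": "0 2 * * *",  # task timer: run daily at 02:00 (example value)
        "jobs": [
            {"id": "physical-node-cluster",   "after": []},
            {"id": "cloud-host-node-cluster", "after": ["physical-node-cluster"]},
            {"id": "switch-node-cluster",     "after": []},
            {"id": "cluster-resource-usage",  "after": ["cloud-host-node-cluster",
                                                        "switch-node-cluster"]},
            {"id": "cluster-alarm",           "after": ["cluster-resource-usage"]},
            {"id": "inspection-report",       "after": ["cluster-alarm"]},
            {"id": "mail-notification",       "after": ["inspection-report"]},
        ],
    }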
When the period set by the task timer for the timed task arrives, the timed task is distributed through the message queue 10 to a service scheduling process; the service scheduling process sends a message to the scheduling center 301, and the scheduling center 301 selects a suitable node to execute the job, completing the job execution.
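On the publishing side, the hand-off from the task timer to the message queue 10 might look like the following sketch (again assuming RabbitMQ and the pika client; host, queue name and payload format are placeholders):

    import json
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="mq-host"))
    channel = connection.channel()
    channel.queue_declare(queue="timed_tasks", durable=True)

    # Called when the period configured in the task timer arrives.
    channel.basic_publish(
        exchange="",
        routing_key="timed_tasks",
        body=json.dumps({"task": "platform-inspection"}),
        # Persistent message, so the timed task survives a broker restart.
        properties=pika.BasicProperties(delivery_mode=2),
    )
    connection.close()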
Example two:
An embodiment of a computing device is also disclosed in the present application, based on the distributed timed task scheduling system 100 disclosed in the first embodiment.
In this embodiment, the computing device deploys the distributed timed task scheduling system 100 disclosed in the first embodiment. The computing cluster is configured as a cloud computing platform 400 (or simply "cloud platform 400"), a server cluster, or a data center, and the nodes are computing nodes. The specific technical solution of the distributed timed task scheduling system 100 is shown in the first embodiment and is not described herein again.
Referring to FIG. 16, the applicant takes as an example a computing device selected to be the cloud platform 400. The cloud platform 400 may be a cloud computing platform based on the OpenStack architecture that deploys a control node 441, a network node 442, a computing node 443 and a storage node 444; these functional nodes may each be understood as a "node" in the first embodiment.
The cloud platform 400 deploys the distributed timed task scheduling system 100 and holds a session with the user through it, so as to execute the timed tasks the user issues through the cloud platform 400 and to schedule those timed tasks continuously. The specific technical solution of the distributed timed task scheduling system 100 is shown in the first embodiment and is not described herein again.
Meanwhile, the applicant points out that in the cloud platform 400 of this embodiment, the response capability of every node can be detected by the scheduling centers 301 and 302, so that the service scheduler that executes a timed job is determined by each node's response capability. Therefore, when the cloud platform 400 deploying the distributed timed task scheduling system 100 is expanded (or shrunk), the performance and specific configuration parameters of the added (or removed) computing nodes 443, storage nodes 444 and other functional nodes can be ignored, which makes capacity expansion (or reduction) of the cloud platform 400 easier.
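The disclosure does not specify how response capability is measured; one plausible reading is a heartbeat-latency probe, sketched below purely as an assumption (the probe targets are stubs, not real node endpoints):

    import time
    from typing import Callable, Dict

    def probe_latency(ping: Callable[[], None]) -> float:
        """Measure one round trip to a node's service scheduling process."""
        start = time.monotonic()
        ping()
        return time.monotonic() - start

    def pick_node(pings: Dict[str, Callable[[], None]]) -> str:
        """Pick the node whose service scheduler responds fastest."""
        return min(pings, key=lambda node: probe_latency(pings[node]))

    # Stub pings standing in for probes of Node-1..3.
    nodes = {"Node-1": lambda: time.sleep(0.003),
             "Node-2": lambda: time.sleep(0.001),
             "Node-3": lambda: time.sleep(0.002)}
    print(pick_node(nodes))  # -> Node-2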
Example three:
Based on the specific implementation of the computing device disclosed in the second embodiment, a computer is also disclosed in this embodiment. The computer comprises a CPU, a bus and a storage medium; the storage medium and the CPU are mounted on the bus, and the distributed timed task scheduling system 100 disclosed in the first embodiment is configured in the storage medium. The specific technical solution of the distributed timed task scheduling system 100 is shown in the first embodiment and is not described herein again.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above-listed detailed description is only a specific description of a possible embodiment of the present invention, and they are not intended to limit the scope of the present invention, and equivalent embodiments or modifications made without departing from the technical spirit of the present invention should be included in the scope of the present invention.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (13)

1. A distributed timed task scheduling system, comprising:
a task scheduler, a service scheduler and a task timer configured in a declarative manner in at least two nodes,
a message queue,
at least two scheduling centers disposed between the message queue and the service scheduler for detecting node response capability, and
a timed task issuing component that selects at least one node as a designated node to respond to a timed task, and a first storage device for storing an execution result corresponding to the timed task;
wherein the task scheduler issues the timed task to the message queue, and the remaining nodes listen to the message queue and create a timed task identical to the one issued to the message queue; the message queue issues at least one timed job contained in the timed task to a scheduling center, and the scheduling center detects the response capability of the nodes so as to determine, according to the response capability of each node, the service scheduler that executes the timed job.
2. The distributed timed task scheduling system according to claim 1, wherein the task scheduler includes a flow scheduling process and the service scheduler includes a service scheduling process; based on the distributed lock corresponding to the timed task among the nodes, the flow scheduling process included in the task scheduler determines, according to the time limit set by the task timer, whether to notify the scheduling center through the polling mechanism of the message queue; and the scheduling center determines, according to the response capability of each node, at least one service scheduler to execute the timed job.
3. The distributed timed task scheduling system according to claim 2, wherein at least two scheduling centers detect the responsiveness of nodes at a time limit set by a task timer deployed in at least one node selected by the timed task issuing component.
4. The distributed timed task scheduling system according to claim 2, wherein at least one pair of flow scheduling process and service scheduling process is configured in a declarative manner in at least two nodes, and the flow scheduling process and the service scheduling process are decoupled using a message queue.
5. The distributed timed task scheduling system according to claim 1, wherein after the first storage device stores the execution result corresponding to the timed task, the first storage device notifies, through the message queue, the node executing the timed task and the service scheduler deployed in that node;
the first storage device is selected from a JVM memory, a distributed storage component or a database.
6. The distributed timed task scheduling system according to claim 1, wherein the timed task issuing component comprises: a user interface and load balancer;
the user interface receives the timed task and issues it to the load balancer, and, based on the distributed lock corresponding to the timed task, at least one node is selected among the nodes as the designated node to respond to the timed task through the load-balancing policy built into the load balancer.
7. The distributed timed task scheduling system according to any one of claims 2 to 6, wherein the scheduling center responds to the timed task issued from the message queue, splits the timed task according to a configuration on which at least one timed job included in the timed task depends, and determines at least one service scheduler corresponding to the executed sub-timed task for a plurality of sub-timed tasks formed after splitting the timed task;
the nodes are computing nodes, storage nodes or network nodes.
8. The distributed timed task scheduling system according to claim 7, wherein the distributed timed task scheduling system is configured to respond to a multi-timed task scenario;
the task scheduler and the service scheduler configured by the node comprise one or more pairs of flow scheduling processes and service scheduling processes, and the flow scheduling processes and the service scheduling processes are decoupled by using a message queue.
9. The distributed timed task scheduling system according to claim 8, wherein said message queue is maintained on a second storage device logically separate from the first storage device, said second storage device being selected from a physical server or a virtual server or a non-volatile storage device, said message queue being a RabbitMQ.
10. The distributed timed task scheduling system according to claim 7, wherein the service scheduling process is configured as one or more service units for responding to the timed task, and the service scheduling process has a mapping relationship with the service units, and the service units are deployed at any one node.
11. The distributed timed task scheduling system according to claim 10, wherein the flow scheduling process defines at least one business process and the service scheduling process defines at least one business logic, so that different distributed timed task scheduling services are configured according to the mapping relationship between business processes and business logic;
the service process comprises service data issued from a user to the timing task issuing component, and the service process is not integrated into an executable code of an actual service;
the business logic is integrated into the service unit;
wherein the service unit is a container, a virtual machine or a microservice.
12. A computing device deploying a distributed timed task scheduling system as claimed in any one of claims 1 to 11.
13. The computing device of claim 12, wherein the computing device is configured as a cloud computing platform, a server cluster, or a data center;
the nodes are computing nodes, storage nodes or network nodes.
CN202010107648.4A 2020-02-21 2020-02-21 Distributed timing task scheduling system and computing device Active CN111338774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107648.4A CN111338774B (en) 2020-02-21 2020-02-21 Distributed timing task scheduling system and computing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010107648.4A CN111338774B (en) 2020-02-21 2020-02-21 Distributed timing task scheduling system and computing device

Publications (2)

Publication Number Publication Date
CN111338774A true CN111338774A (en) 2020-06-26
CN111338774B CN111338774B (en) 2023-09-19

Family

ID=71185339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010107648.4A Active CN111338774B (en) 2020-02-21 2020-02-21 Distributed timing task scheduling system and computing device

Country Status (1)

Country Link
CN (1) CN111338774B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100077403A1 (en) * 2008-09-23 2010-03-25 Chaowei Yang Middleware for Fine-Grained Near Real-Time Applications
CN110262910A (en) * 2019-06-24 2019-09-20 四川新网银行股份有限公司 The method of distributed task scheduling poll based on delay queue

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QU Zhijian; WANG Qunfeng; WANG Hanlin: "CQB parallel balancing control method for the queuing model of a stream-computing cluster in a dispatching center" *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784185A (en) * 2020-07-14 2020-10-16 广东电网有限责任公司电力调度控制中心 Distributed power distribution communication network timed task scheduling system
CN111858007A (en) * 2020-07-29 2020-10-30 广州海鹚网络科技有限公司 Task scheduling method and device based on message middleware
WO2021189857A1 (en) * 2020-09-23 2021-09-30 平安科技(深圳)有限公司 Task state detection method and apparatus, and computer device and storage medium
CN112162841A (en) * 2020-09-30 2021-01-01 重庆长安汽车股份有限公司 Distributed scheduling system, method and storage medium for big data processing
CN112416581B (en) * 2020-11-13 2022-02-18 五八同城信息技术有限公司 Distributed calling system for timed tasks
CN112416581A (en) * 2020-11-13 2021-02-26 五八同城信息技术有限公司 Distributed calling system for timed tasks
CN112910952A (en) * 2021-01-13 2021-06-04 叮当快药科技集团有限公司 Distributed task scheduling method and device, storage medium and electronic device
CN113537937A (en) * 2021-07-16 2021-10-22 重庆富民银行股份有限公司 Task arrangement method, device and equipment based on topological sorting and storage medium
CN113515363A (en) * 2021-08-10 2021-10-19 中国人民解放军61646部队 Special-shaped task high-concurrency multi-level data processing system dynamic scheduling platform
CN113515363B (en) * 2021-08-10 2022-02-15 中国人民解放军61646部队 Special-shaped task high-concurrency multi-level data processing system dynamic scheduling platform
CN113794740A (en) * 2021-11-16 2021-12-14 亿海蓝(北京)数据技术股份公司 Timer control method and system and readable storage medium
CN114564249A (en) * 2022-02-24 2022-05-31 北京百度网讯科技有限公司 Recommendation scheduling engine, recommendation scheduling method, and computer-readable storage medium
CN114564249B (en) * 2022-02-24 2023-07-25 北京百度网讯科技有限公司 Recommendation scheduling engine, recommendation scheduling method and computer readable storage medium

Also Published As

Publication number Publication date
CN111338774B (en) 2023-09-19

Similar Documents

Publication Publication Date Title
CN111338774B (en) Distributed timing task scheduling system and computing device
CN111338773B (en) Distributed timing task scheduling method, scheduling system and server cluster
CN112162865B (en) Scheduling method and device of server and server
US11252220B2 (en) Distributed code execution involving a serverless computing infrastructure
CN113169952B (en) Container cloud management system based on block chain technology
CN106919445B (en) Method and device for scheduling containers in cluster in parallel
TWI559153B (en) Distributed computing framework
US11502972B2 (en) Capacity optimization in an automated resource-exchange system
CN109582459A (en) The method and device that the trustship process of application is migrated
CN112231108A (en) Task processing method and device, computer readable storage medium and server
CN112099917B (en) Regulation and control system containerized application operation management method, system, equipment and medium
WO2023174037A1 (en) Resource scheduling method, apparatus and system, device, medium, and program product
CN113206877A (en) Session keeping method and device
CN111309440A (en) Method and equipment for managing and scheduling multiple types of GPUs
CN113239118A (en) Block chain training system and method
CN112698838A (en) Multi-cloud container deployment system and container deployment method thereof
CN112187864A (en) Load balancing method and device, storage medium and electronic equipment
CN114625533A (en) Distributed task scheduling method and device, electronic equipment and storage medium
CN114565502A (en) GPU resource management method, scheduling method, device, electronic equipment and storage medium
CN111835809B (en) Work order message distribution method, work order message distribution device, server and storage medium
CN104657240B (en) The Failure Control method and device of more kernel operating systems
CN108243205A (en) A kind of method, equipment and system for being used to control cloud platform resource allocation
CN108984105B (en) Method and device for distributing replication tasks in network storage device
CN114780232A (en) Cloud application scheduling method and device, electronic equipment and storage medium
CN112015515A (en) Virtual network function instantiation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant