CN111338774B - Distributed timing task scheduling system and computing device - Google Patents


Info

Publication number
CN111338774B
CN111338774B
Authority
CN
China
Prior art keywords
task
node
service
timing
scheduling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010107648.4A
Other languages
Chinese (zh)
Other versions
CN111338774A (en)
Inventor
李红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huayun Data Co ltd
Original Assignee
Huayun Data Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huayun Data Co ltd
Priority to CN202010107648.4A
Publication of CN111338774A
Application granted
Publication of CN111338774B
Current legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/4887: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues involving deadlines, e.g. rate based, periodic
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/451: Execution arrangements for user interfaces
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533: Hypervisors; Virtual machine monitors
    • G06F 9/45558: Hypervisor-specific management and integration aspects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083: Techniques for rebalancing the load in a distributed system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a distributed timing task scheduling system and a computing device. The distributed timing task scheduling system comprises a task scheduler, a service scheduler, and a task timer that are declaratively configured in at least two nodes, a message queue, a scheduling center, and a timing task issuing component. The task scheduler issues timing tasks to the message queue, and the remaining nodes listen to the message queue and create timing tasks identical to those issued to the message queue; the message queue delivers at least one timing job contained in the timing task to a scheduling center, and the scheduling center detects the response capability of the nodes so as to determine the service scheduler corresponding to the timing job according to each node's response capability. The invention ensures fine granularity of the timing task scheduling process, improves multi-timing-task processing capacity, and realizes adaptive orchestration of business data.

Description

Distributed timing task scheduling system and computing device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a distributed timing task scheduling system and a computing device.
Background
One or more projects run in a cloud platform or server cluster environment, and these projects need timing tasks (tasks) to be configured. A large-scale cloud platform or server cluster typically hosts many projects, each containing many tasks: judging the timeout state of an order system, periodically updating cached data, sending scheduled mail to users, and even computing certain reports on a schedule. To ensure high availability and high fault tolerance, large-scale cloud platforms or server clusters typically adopt a distributed architecture, and distributed timing tasks have evolved accordingly.
Currently, mainstream distributed timing task scheduling is based on the Quartz framework and relies on Zookeeper. Zookeeper can shard data, thereby ensuring that data is not processed repeatedly and improving data processing speed, which matters greatly in the financial industry, mobile payment, and similar domains. However, the Zookeeper route puts great pressure on the database, which then risks downtime; the Quartz route depends strongly on the database, whose deployment is complex. In particular, when the Quartz framework is used to implement distributed timing task scheduling, resource occupation is too heavy and uniqueness in the task scheduling process cannot be guaranteed, and the prior-art distributed database techniques impose computational overhead and pressure on the distributed database.
More importantly, because user requests and issued tasks are diverse in form and massive in quantity, prior-art distributed timing task scheduling systems and methods have the following defects. First, in complex environments or scenarios that require frequent scheduling of timing tasks, a business typically includes at least one business process, and different business processes often include at least one piece of business logic. This requires the developer to know the logic contained in a specific business, which is practically impossible: a developer cannot prepare in advance every business process that meets a user's particular needs. It also makes it harder for front-end and back-end developers to develop the applications/programs that run the distributed timing task scheduling system. Meanwhile, in the prior art a distributed timing task scheduling system usually runs on computing nodes of identical scale and configuration in the cloud platform, while the resources required to schedule different timing tasks differ; prior-art systems therefore provide no reference basis for individually adding or deleting computing nodes in computing clusters such as cloud platforms, which wastes resources.
In view of the foregoing, there is a need for an improved distributed timing task scheduling method and related similar technical solutions in the prior art to solve the above-mentioned problems.
Disclosure of Invention
The invention aims to disclose a distributed timing task scheduling system, and a computing device based on the scheduling system, that improve the fault tolerance of the nodes running timing tasks, achieve consistency of task scheduling, achieve fine granularity of the timing task scheduling process, allow the timing tasks contained in an entity business to be reasonably assigned to specific execution subjects, improve overall scheduling capacity in multi-timing-task scenarios, and reduce the difficulty of business scheduling.
To achieve the first object, the present invention provides a distributed timing task scheduling system, including:
a task scheduler, a service scheduler, and a task timer, which are declaratively configured in at least two nodes,
a message queue for storing messages,
at least two dispatch centers, disposed between the message queue and the service schedulers, for detecting node response capability,
a timing task issuing component that selects at least one node as the designated node for responding to the timing task, and a first storage device that stores the execution result corresponding to the timing task;
The task scheduler issues timing tasks to the message queue, and the remaining nodes listen to the message queue and create timing tasks identical to those issued to the message queue; the message queue delivers at least one timing job contained in the timing task to a dispatch center, and the dispatch center detects node response capability so as to determine the service scheduler corresponding to the timing job according to each node's response capability.
As a further improvement of the present invention, the task scheduler includes a flow scheduling process and the service scheduler includes a service scheduling process. Based on the distributed lock corresponding to the timing task, a node uses the flow scheduling process included in its task scheduler to determine, according to the time limit set by its task timer, whether to notify a scheduling center through the polling mechanism of the message queue; the scheduling center then determines, according to each node's response capability, at least one service scheduler corresponding to executing the timing job.
As a further improvement of the invention, at least two dispatching centers detect the response capability of the nodes by taking the time limit set by the task timer arranged in at least one node selected by the timing task issuing component as a period.
As a further improvement of the invention, at least one pair of a flow scheduling process and a service scheduling process are configured in a declarative manner in at least two nodes, the flow scheduling process and the service scheduling process being decoupled using message queues.
As a further improvement of the present invention, after the first storage device stores the execution result corresponding to the timing task, the node that executed the timing task and the service scheduler deployed in that node are notified through the message queue;
the first storage device is selected from a JVM memory, a distributed storage component, or a database.
As a further improvement of the present invention, the timed task issuing component includes: a user interface and a load balancer;
the user interface receives the timing task and delivers it to the load balancer, and, based on the distributed lock corresponding to the timing task, at least one node is selected as the designated node to respond to the timing task through a load-balancing strategy built into the load balancer.
As a further improvement of the invention, the scheduling center responds to the timing task issued from the message queue, splits the timing task according to the configuration on which at least one timing job contained in the timing task depends, and determines at least one service scheduler corresponding to each of the several sub-timing tasks formed after the split;
The node is a computing node, a storage node or a network node.
As a further improvement of the present invention, the distributed timed task scheduling system is configured to respond to a multi-timed task scenario;
the task scheduler and the service scheduler configured by the nodes comprise one or more pairs of flow scheduling processes and service scheduling processes, and the flow scheduling processes and the service scheduling processes are decoupled by using message queues.
As a further improvement of the invention, the message queue is stored in a second storage device logically independent of the first storage device, the second storage device being selected from a physical server, a virtual server, or a non-volatile storage device, and the message queue being a RabbitMQ.
As a further improvement of the invention, the service scheduling process is configured as one or more service units for responding to the timing tasks, the service scheduling process and the service units have a mapping relation, and the service units are deployed on any node.
As a further improvement of the present invention, the flow scheduling process defines at least one business process, and the service scheduling process defines at least one piece of business logic, so that different distributed timing task scheduling services are configured through the mapping relationship between business process and business logic;
The business process comprises the business data issued by the user to the timing task issuing component, and the business process is not integrated into the executable code of the actual business;
the business logic is integrated into the service unit;
the service unit is a container, a virtual machine or a micro service.
Based on the same inventive concept described above, the present invention also discloses a computing device deploying a distributed timed task scheduling system as disclosed in any of the above.
As a further refinement of the invention, the computing device is configured as a cloud computing platform, a server cluster or a data center, the node being a computing node, a storage node or a network node.
Compared with the prior art, the invention has the beneficial effects that:
(1) Through the at least two scheduling centers arranged between the message queue and the service schedulers, timing tasks can be reasonably issued through the message queue to a service scheduler for execution. This ensures fine granularity of the timing task scheduling process and improves the overall scheduling capacity of the distributed timing task scheduling system in multi-timing-task scenarios, so that a computing device or cloud computing platform configured with the system uses physical and/or virtual resources more reasonably when responding to timing tasks, effectively preventing those resources from being wasted or sitting idle;
(2) The system overcomes the defect that a timing task cannot be effectively responded to when a node fails during scheduling, and achieves horizontal scaling of timing task scheduling;
(3) In this application, different distributed timing task scheduling services are configured according to the mapping relationship between business process and business logic, so that users, administrators, and other personnel without low-level code development skills can flexibly orchestrate tasks according to business needs and encapsulate a business process adapted to the business that comprises several pieces of business logic. This avoids interfering with business execution by manually modifying low-level code; users or administrators need only the business data corresponding to the business, the internal association of business and logic is thoroughly stripped away, and adaptive orchestration of business data is realized.
Drawings
FIG. 1 is a topology diagram of a distributed timed task scheduling system of the present application;
FIG. 2 is a schematic diagram of a dispatch center detecting response capability formed by all nodes and storing the response capability of the nodes in a message queue;
FIG. 3 is a schematic diagram of a timing task issuing component selecting a node that cannot respond to a timing task that includes a timing job, and a scheduling center reselecting a new node to respond to the timing task;
FIG. 4 is a schematic diagram of a timing task issuing component selecting a node that cannot respond to a timing task that includes a timing job, and a scheduling center reselecting two new nodes to respond to the timing task;
FIG. 5 is a diagram of service units formed in a node based on different response capabilities for responding to timed tasks, wherein the service units shown in FIG. 5 respond, individually or in any combination, to timed jobs or sub-timed tasks associated with a service scheduling process;
FIG. 6 is a schematic diagram of a business process and business logic separated from each other and deployed at different nodes, respectively;
FIG. 7 is a schematic diagram of business processes associating business logic to form dispatch service 301a;
FIG. 8 is a schematic diagram of business processes associating business logic to form dispatch service 301b;
FIG. 9 is a schematic diagram of business processes associating business logic to form dispatch service 301c;
FIG. 10 is a schematic diagram of business logic associated with business processes to form dispatch service 301d;
FIG. 11 is a business architecture diagram of a timed task issuing assembly;
FIG. 12 is a business architecture diagram in which the message queue responds to multiple timing tasks issued by a designated node;
FIG. 13 is a flow chart of configuring a distributed timed task scheduling system of the present invention to perform a patrol task on a cloud computing platform;
FIG. 14 is a schematic diagram of splitting a timed task initially issued to a designated node into a plurality of sub-timed tasks, wherein each sub-timed task contains at least one timed job;
FIG. 15 is a schematic diagram of a timed task including at least one timed job;
FIG. 16 is a topology of an example of a computing device configured as a cloud computing platform in accordance with the present invention.
Detailed Description
The present invention will be described in detail below with reference to the embodiments shown in the drawings, but it should be understood that the invention is not limited to these embodiments, and that functional, methodological, or structural equivalents and alternatives made by those skilled in the art according to these embodiments fall within the scope of protection of the present invention.
Before explaining the various embodiments of the invention in detail, technical terms and meanings referred to in the specification are summarized as follows.
The term "logic" includes any physical and tangible functionality for performing a task. For example, each operation shown in the flowcharts corresponds to a logic component for performing that operation. Operations may be performed using, for example, software running on a computer device, hardware (e.g., chip-implemented logic functions), etc., and/or any combination thereof. When implemented by a computing device, the logic components represent electrical components that are physical parts of the computer system, however implemented.
In the present application, the term "task" has the same meaning as "timing task" or "Task", and can be replaced by "Job" in actual code programming. In a real scenario, the scheduling of timing tasks can be expressed as "the payment system runs a batch at 1 a.m. every day to perform the day's clearing, and performs the previous month's clearing on the 1st of each month", or "after goods are successfully shipped, remind customers of shipping and logistics information by short message or mail", or "execute forced recovery on cloud hosts allocated to users according to the lease's set time limit".
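Schedules of this kind map naturally onto cron expressions. The following is a minimal sketch of the two payment-system schedules quoted above, assuming the widely used robfig/cron Go library (the patent names golang but no particular scheduling library); all job names are illustrative.

```go
package main

import (
	"log"

	"github.com/robfig/cron/v3"
)

func main() {
	c := cron.New() // standard 5-field cron spec: minute hour dom month dow

	// "run the daily clearing batch at 1 a.m. every day"
	c.AddFunc("0 1 * * *", func() { log.Println("daily clearing job") })

	// "perform last month's clearing on the 1st of each month"
	c.AddFunc("0 1 1 * *", func() { log.Println("monthly clearing job") })

	c.Start()
	select {} // block forever so the scheduler keeps running
}
```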
The terms "designated node" and "new designated node" (or "reselected designated node") refer to the node, or computing node, that responds to the timed task at different points in the timed task scheduling process. As shown in FIG. 1, if Node-1 is selected as the responding node when the timed task is first issued, Node-1 is the designated node; when Node-1 cannot execute the scheduling of the timed task for special reasons such as downtime or response timeout, the scheduling of the timed task is migrated to Node-2 and/or Node-3 for processing through the message queue 10 and the scheduling centers 301 and 302, and Node-2 and/or Node-3 can then be understood as the "new designated node" or "reselected designated node" of this application. Of course, in the present embodiment, a node used to respond to a timed task, sub-timed task, or timed job must be alive, and node overhead and load need not be taken into account.
The terms "node" and "computing node" have equivalent technical meanings, as do the terms "scheduling service" and "distributed timed task scheduling service".
The phrases "is configured as" and "is configured to" include any manner in which any kind of physical and tangible functionality may be constructed to perform the identified operations. The functionality may be configured to perform operations using, for example, software running on a computer device, hardware (e.g., chip-implemented logic functions), etc., and/or any combination thereof.
Embodiment one:
one embodiment of a distributed timed task scheduling system 100 (hereinafter "scheduling system") of the present invention is disclosed with reference to fig. 1-15. The distributed timed task scheduling system 100 is most preferably used to respond to a multi-timed task scenario, although it is applicable to a single timed task scenario as well. The distributed timed task scheduling system 100 is implemented based on golang, and the programs constituting the distributed timed task scheduling system 100 are decompressed under a linux server/opt/directory.
For purposes of describing the core gist of the present invention, applicant presents a general scenario as an exemplary illustration and overview. For example, the timed task issuing component 500 receives a create-timed-task instruction issued by a user or administrator and selects Node-1 as the designated node for responding to the timed task. When the message queue 10 (RabbitMQ cluster) allows one or more timed jobs included in the timed task 90 to be executed within Node-1's response capability, the scheduling center 301 has the one or more timed jobs processed by the Service Scheduler 901 deployed in Node-1, through its service scheduling process or a service execution mechanism (e.g., a virtual CPU or an application program) mounted to the Service Scheduler 901.
A service scheduling process and a service execution mechanism (not shown) can be understood as the relationship of software having some computer-executable function to hardware (or a physical or software system). If part of the timed task 90, or its sub-timed tasks, cannot be responded to because Node-1 powers down, goes down, or a program fails, the scheduling center 301 (and/or the scheduling center 302) reselects one or more of the service schedulers 902, 903 deployed in Node-2 and/or Node-3 (i.e., the remaining nodes), and forwards the affected part of the timed task 90, or its sub-timed tasks, to the service scheduler 902 and/or the service scheduler 903 for execution.
After the timing task 90 is executed by the service scheduling process or the service execution mechanism, one or more task schedulers in Node-1 to Node-3 save the execution result in the first storage device 21. The task scheduler 801 determines a specific time point at which the timed task 90 starts and ends execution based on the timed task and the task timer 811 to determine an execution period of the timed task 90, and thus the period or the execution period may be a specific time length in an embodiment.
In a scenario in which a plurality of timed tasks are continuously processed, when the first timed task is completed, the task scheduler 801 saves the execution result to the first storage device 21. The user or administrator may access the first storage device 21 to determine whether the timed task 90 was reliably performed. Referring to FIG. 1, in the present embodiment only Node-1 to Node-3 are shown in the computing cluster 200: a Task Timer 811, a Task Scheduler 801, and a Service Scheduler 901 are shown in Node-1; a Task Timer 812 and a Task Scheduler 802 are shown in Node-2; and a Task Timer 813 and a Task Scheduler 803 are shown in Node-3.
Next, the applicant describes in detail the specific solution of the distributed timed task scheduling system 100 and a computing device based on the distributed timed task scheduling system 100 based on the core idea of the present invention. The example shown in fig. 1 is applied to a cloud computing platform, and the number of nodes is not limited to three, but may be one, two or more than three.
Specifically, the distributed timed task scheduling system 100 includes:
the task schedulers 801 to 803, the Service schedulers 901 to 903, and the task timers 811 to 813, which are declaratively configured in at least two nodes, the message queue 10, and at least two scheduling centers for detecting node response capability, namely, the scheduling center 301 and the scheduling center 302, which are disposed between the message queue 10 and the Service schedulers (namely, the Service schedulers 901 to 903 shown in fig. 1). The timing task issuing component 500 selects at least one node as a designated node responding to the timing task, and the first storage device 21 stores the execution result corresponding to the timing task.
Referring to FIG. 11, the timed task issuing component 500 includes: a user interface 501 and a load balancer 502. The user interface 501 receives the timed task and issues it to the load balancer 502; based on the distributed lock corresponding to the timed task, at least one node is selected as the designated node to respond to the timed task through a load-balancing policy built into the load balancer 502.
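As an illustration only, selecting a designated node through a built-in load-balancing policy guarded by a per-task distributed lock might be sketched as follows in Go; the patent does not disclose the policy's implementation, and every type and function name here is an assumption.

```go
package scheduling

import (
	"errors"
	"sync/atomic"
)

// LoadBalancer picks a designated node for a timed task. Round-robin is
// one illustrative built-in policy; the patent does not fix the policy.
type LoadBalancer struct {
	nodes []string
	next  uint64
}

// PickDesignatedNode walks the node list round-robin; only the node that
// wins the distributed lock for this timed task becomes the designated node.
// tryLock is a hypothetical hook onto the per-task distributed lock.
func (lb *LoadBalancer) PickDesignatedNode(taskID string, tryLock func(taskID, node string) bool) (string, error) {
	for range lb.nodes {
		n := lb.nodes[atomic.AddUint64(&lb.next, 1)%uint64(len(lb.nodes))]
		if tryLock(taskID, n) {
			return n, nil
		}
	}
	return "", errors.New("no node available for timed task " + taskID)
}
```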
Meanwhile, after the first storage device 21 stores the execution result corresponding to the response timing task, the Node (e.g., node-1) that executes the timing task and the service scheduler (e.g., service scheduler 901) disposed in the Node are notified by the message queue. Specifically, the first storage device 21 is selected from JVM memory, a distributed storage component, or a database, and is preferably a database. Meanwhile, the distributed storage component may be a block storage or a file storage. The file storage is optimized in the scenes of searching, inserting, modifying, deleting and the like of a certain object based on the timing task; in a scene such as accessing and downloading a streaming media file such as a video based on a timing task, block storage is preferable. In this embodiment, since the first storage device 21 storing the execution result of the timing task is independent from each node, the calculation overhead and the calculation pressure in the database instance selected by the first storage device 21 are reduced.
The task scheduler 801 issues timed tasks into the message queue 10, and the remaining nodes listen to the message queue 10 and create timed tasks identical to those issued into it. The message queue 10 issues at least one timed job included in the timed task to the dispatch center 301, and the dispatch center 301 detects node response capability to determine the service scheduler corresponding to executing the timed job according to each node's response capability. Two scheduling centers (of course not limited to two; a greater number may be configured) are provided in the distributed timed task scheduling system 100 to address high availability: when the scheduling center 301 fails, since the scheduling center 302 can synchronize from the message queue 10 the timed tasks containing at least one timed job, a reliable scheduling center can be selected for subsequent operations based on the load-balancing algorithm of the message queue 10. If the dispatch center 301 cannot detect node response capability or cannot select a corresponding service scheduler for the timed task/timed job, the message queue 10 issues the at least one timed job contained in the timed task to the dispatch center 302 to implement disaster-recovery switching.
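A minimal Go sketch of this publish/listen pattern, assuming RabbitMQ with the streadway/amqp client (the patent specifies RabbitMQ but no client library); the exchange name and message body are illustrative. Publishing to a fanout exchange lets every listening node receive a copy of the timed task, mirroring the "remaining nodes create identical timed tasks" behavior.

```go
package main

import (
	"log"

	"github.com/streadway/amqp"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, _ := conn.Channel()
	defer ch.Close()

	// Fanout exchange: every node's listener receives a copy of the
	// timed task, so the remaining nodes can create identical tasks.
	ch.ExchangeDeclare("timed.tasks", "fanout", true, false, false, false, nil)

	// The task scheduler on the designated node publishes the timed task.
	body := []byte(`{"task":"timed-task-90"}`)
	ch.Publish("timed.tasks", "", false, false, amqp.Publishing{
		ContentType: "application/json",
		Body:        body,
	})
}
```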
In this embodiment, the dispatch center 301 and the dispatch center 302 are logically independent of any node and also logically independent of the timed task issuing component 500. The dispatch centers 301 and 302 serve two main functions.
First, they detect the response capability of Node-1 to Node-3. Response capability can be understood as a specific node's processing capability for responding to a timed task, or as the processing capability of the service scheduling process configured in a specific node for a complete timed task or a sub-timed task, where a complete timed task (e.g., the aforementioned "execute forced recovery on cloud hosts allocated to users according to the lease's set time limit") or a sub-timed task may each include one or more timed jobs. This provides a reasonable task execution subject for the final completion of the timed task or sub-timed task and prevents tasks from going unanswered. The meaning of sub-timed tasks is described below.
Second, they match, in one or more nodes, the service scheduler corresponding to the timed job for the timed task, sub-timed task, or timed job. In this application, matching a corresponding service scheduler for a timed job covers matching one or more corresponding service schedulers for a timed task or sub-timed task that includes one or more timed jobs. In addition, the roles of the dispatch center 301, the dispatch center 302, and the load balancer 502 in the timed task issuing component 500 differ: the dispatch centers 301 and 302 as a whole and the load balancer 502 sit on opposite sides of the message queue 10 in the logical architecture, and no session or interaction occurs between them. More importantly, in this application, only the business processes 201 to 20N of the timed task are of concern between the message queue 10 and the timed task issuing component 500, i.e., the specific functions to be performed (for example, starting a cloud host); the business logic 401 to 40N on which those functions depend is neither of concern nor loaded there. Different distributed timed task scheduling services (or simply "scheduling services") 301a to 301d are configured according to the mapping relationship between business process and business logic, so that users, administrators, and other personnel without low-level code development skills can flexibly orchestrate tasks according to business needs and encapsulate a business process adapted to the business that comprises several pieces of business logic; this avoids interfering with business execution by manually modifying low-level code. Users or administrators need only attend to the business data corresponding to the business; the internal association between business (i.e., business process) and logic (i.e., business logic) is thoroughly stripped away, and adaptive orchestration of business data is achieved. Meanwhile, low-level developers can configure more finely differentiated business logic in batches and in a modular way, adapting to diversified business scenarios and meeting users' individual requirements.
Referring to fig. 5, the service scheduling process is configured as one or more service units for responding to the timing task, the service scheduling process and the service units have a mapping relationship, and the service units are deployed at any node. Meanwhile, the flow scheduling process defines at least one business flow, and the service scheduling process defines at least one business logic, so that different distributed timing task scheduling services are configured through the mapping relation between the business flow and the business logic. The business process includes business data issued from the user to the timed task issuing component 500, and is not integrated into the executable code of the actual business. The business logic is integrated into the service unit; the service unit is a Container (Container), a Virtual Machine (Virtual Machine), or a micro service (micro service), or other components with independent functions, and may even be a service middleware, a plug-in unit, a system patch, a program patch, or a software upgrade package file, so as to achieve high availability and extensibility of the service. Meanwhile, the mapping relationship formed between the business process and the business logic can be customized based on the instruction of the user or the administrator through the timing task issuing component 500, so as to package different scheduling services.
A service unit is a basic processing unit capable of independently completing a timed task, sub-timed task, or timed job. Service units can be developed and deployed in batches by low-level developers, so that resources are used sensibly in later timed task scheduling, and the low-level code constituting the service units can be iteratively developed and reused, reducing the development intensity and difficulty for software/system low-level developers and avoiding repeated work. A low-level developer need only attend to the business logic realizing a given function, not to the individual requirements of front-end developers, administrators, or users regarding the business process, thereby thoroughly realizing the separation of business from logic and the flexible encapsulation of specific timed task scheduling services.
For example, in the example shown in fig. 5, node-1 deploys service unit 991, service unit 992, and service units 993 through 99z (parameter z is a general term and takes a positive integer greater than or equal to 4). Any one service unit can be independently used as a scheduling process to respond to a timing task (or a sub-timing task or a timing job), and can also be used for forming a business logic as a whole based on one or more service units. For example, service unit 991 and service unit 992 constitute one service logic 91a, or service unit 992 and service unit 993 are regarded as one service logic 92a, or all service units in Node-1 constitute one service logic 93a. The service logic 91 a-93 a may be detected by the dispatch center 301 to determine which service unit or units are required to respond to a particular timing task. Because the resources in each node are limited, the resources in the nodes, especially the residual resources, can be utilized to the greatest extent through the technical scheme, and the situation that the single node cannot respond to the timing task is avoided.
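A sketch of how service units and their composition into business logic might be modeled; the interface and type names are assumptions for illustration, not taken from the patent.

```go
package scheduling

import "context"

// ServiceUnit is a hypothetical basic processing unit (container, VM,
// or microservice) that can independently complete a timed job.
type ServiceUnit interface {
	Execute(ctx context.Context, jobID string) error
}

// BusinessLogic composes one or more service units into a whole, e.g.
// units 991 and 992 forming logic 91a in the FIG. 5 example.
type BusinessLogic struct {
	Units []ServiceUnit
}

// Run executes the composed units in order; a real implementation could
// also run them concurrently or route each timed job to a single unit.
func (b *BusinessLogic) Run(ctx context.Context, jobID string) error {
	for _, u := range b.Units {
		if err := u.Execute(ctx, jobID); err != nil {
			return err
		}
	}
	return nil
}
```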
Referring to FIG. 6, the business processes and business logic deployed in the respective nodes need not correspond one-to-one, as long as the mapping relationship between business process and business logic can be established. Business process 201 and business logic 401-402 are formed in Node-1; business processes 202-203 and business logic 403-40N are formed in Node-2; as for Node-P, only one or more of the business processes 204-20N may be formed. In this embodiment, the business processes and business logic corresponding to the task scheduling processes and service scheduling processes in each node may be in a one-to-one, many-to-one, or one-to-many relationship, and some nodes may even form no task scheduling process and/or service scheduling process, the scheduling center 301 matching the dependent scheduling services 301a to 301d for a specific real-time task, as shown in FIGS. 7 to 10. The business processes 201-20N sit in the user business layer 2, the scheduling services 301a-301d in the service scheduling layer 3, and the business logic 401-40N in the business logic layer 4. The user business layer 2 is located in the timed task issuing component 500 and forms a user interface (UI) 501 through a UI designer, typically presented as a client, an APP, or a software interactive interface. The business logic layer 4 sits at the bottom layer of the computing devices configured with the distributed timed task scheduling system 100. The service scheduling layer 3 sits between the business logic layer 4 and the user business layer 2; it performs logical judgment and execution operations, data connection and instruction transmission, can logically process received data to modify, obtain, or delete it, and feeds the processing result back to the user business layer 2, realizing the software's functions.
Referring to FIG. 7, Node-1 forms business processes 201-202 and business logic 401-402, and Node-N forms business logic 403-40N. Business process 201 and business logic 403-40N are mapped to form scheduling service 301a, thereby satisfying the scheduling of, and response to, a customized timed task. In this scenario, although Node-1 was originally selected to perform the timed task, the scheduling does not rely entirely on Node-1: multiple pieces of business logic in Node-N jointly participate in the timed task or timed job scheduling. Here, parameter M is less than parameter N.
Referring to FIG. 8, Node-1 forms business processes 201-202 and business logic 401-402, Node-2 forms business process 203 and business logic 403, Node-M contains several pieces of business logic including business logic 40M, and Node-N forms business logic 40N. The business processes 201-203 in Node-1 and Node-2 together establish a mapping relationship with business logic 401 in Node-1 to form scheduling service 301b, thereby satisfying the scheduling of, and response to, a customized timed task. In this scenario, the timed task comprising multiple business processes needs only business logic 401 in Node-1 to implement the scheduling operation for the timed tasks or timed jobs, so the entire scheduling process in fact depends solely on Node-1.
Referring to fig. 9, in the example scenario illustrated in fig. 8, a mapping relationship is established only between business process 20M and business logic 40M to form dispatch service 301c. And the business process 20M and the business logic 40M belong to the Node-N and the Node-M respectively. In this scenario, an event similar to the timing task execution role migration in FIG. 3 can be implemented, so that throughout the scheduling process, the timing task is actually performed by the Node-M, although the originally selected Node is Node-N.
Referring to fig. 10, in the example scenario illustrated in fig. 8, a mapping relationship is established between only business process 20N and business logic 403 to form dispatch service 301d. Thus throughout the scheduling process, although the Node initially selected is Node-N, in practice the timing task is done by Node-2.
The business logic is implemented by executing the executable code corresponding to the actual business. The dispatch center 301 or 302 attends only to the specific business logic when determining how to execute a timed job or sub-timed task contained in a timed task. It can be seen that the distributed timed task scheduling system disclosed by the present invention separates the business processes 201-20N from the business logic 401-40N and associates them through the scheduling center 301 or the scheduling center 302, enabling distributed timed task scheduling services in any combination; see the scheduling services 301a-301d shown in FIGS. 7 to 10.
By the technical scheme, the simplicity of page arrangement and distributed timing task scheduling operation of front-end developers is improved, the fine granularity of the timing task scheduling process is remarkably improved, the processing capacity of multi-timing tasks is improved, and self-adaptive arrangement processing of service data is realized. In this embodiment, the fine granularity embodied in the scheduling process of the timed task refers to a description dimension for performing job division on the timed task, the sub-timed task or the timed job in the scheduling process.
In the present embodiment, the task scheduler 801 includes a flow scheduling process and the service scheduler 901 includes a service scheduling process. Based on the distributed lock corresponding to the timed task, a node uses the flow scheduling process included in the task scheduler 801 to determine, according to the time limit set by the task timer 811, whether to notify the scheduling center 301 through the polling mechanism of the message queue 10, and the scheduling center 301 determines at least one service scheduler 901 corresponding to executing the timed task according to each node's response capability. When Node-1 issues the timed task 90, Node-2 issues the timed task 90 to the message queue 10 as shown by arrow task2, and Node-3 issues the timed task to the message queue 10 as shown by arrow task3. Meanwhile, listening is established between Node-2, Node-3, and the message queue 10; a task scheduler 802 and task timer 812 are correspondingly configured in Node-2, and a task scheduler 803 and task timer 813 in Node-3.
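A sketch of this flow scheduling behavior: on each tick of the task timer's period, only the node that wins the distributed lock for the timed task notifies the scheduling center. The lock interface and function signature are assumptions; the patent does not specify the lock's backing store.

```go
package scheduling

import (
	"context"
	"time"
)

// DistributedLock is a hypothetical lock keyed by the timed task, e.g.
// backed by a database row or a key-value store.
type DistributedLock interface {
	TryAcquire(ctx context.Context, taskID string) (bool, error)
	Release(ctx context.Context, taskID string) error
}

// pollAndNotify runs the flow scheduling process: on each tick of the
// task timer's period, only the node holding the lock enqueues a
// notification for the dispatch center via the message queue.
func pollAndNotify(ctx context.Context, lock DistributedLock, taskID string,
	period time.Duration, notify func(taskID string) error) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			ok, err := lock.TryAcquire(ctx, taskID)
			if err != nil || !ok {
				continue // another node owns this timed task
			}
			notify(taskID) // notification travels through the message queue
			lock.Release(ctx, taskID)
		}
	}
}
```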
As shown in FIGS. 3 and 4, when the scheduling center 301 determines, based on its detection of each node's response capability, that the timed task is to be executed by the service scheduling process included in the service scheduler 901 in Node-1, it issues the timed task (comprising one or more timed jobs) to the service scheduler 901; after the timed task is executed, the result is returned to the scheduling center 301 and finally notified to the task scheduler 801 through the message queue 10, so that the execution result produced by the timed task's service scheduling process is saved by the task scheduler 801 in the first storage device 21 for the user or administrator to call or access. At this point a complete timed task scheduling operation has been performed. Independent flow scheduling processes, service scheduling processes, and task timers 811-813 are configured in the three nodes respectively. After the service scheduling process in the selected node responds to the timed task, the execution result corresponding to the timed task is saved in the first storage device 21, and the message queue 10 notifies Node-1. Preferably, the first storage device 21 is a JVM memory or a database, most preferably a distributed database, to improve CRUD operation efficiency.
In this embodiment, the applicant has Node-1 as the designated Node for the responsive timing task. Node-1 configures a pair of flow scheduling process and service scheduling process and decouples the flow scheduling process and service scheduling process using message queue 10. Message queue 10 is a RabbitMQ. Node-2 and Node-3 are configured with reference Node-1.
If, in the above process, the dispatch center 301 fails to call Node-1's service scheduler 901, or the service scheduler 901 in its current state cannot satisfy the specific timed job, the timed task's execution subject role migrates. If the dispatch center 301 detects that Node-2's response capability (computing capability, storage capability, etc.) can satisfy the timed job, the dispatch center 301 reselects the node and issues the timed job to the service scheduler 902 in Node-2, which processes it through its built-in service scheduling process or a service execution mechanism (e.g., a virtual CPU or application program) associated with the service scheduler 902. The timed task execution subject role may migrate from one node of the computing cluster 200 to another (see FIG. 3) or from one node to two other nodes (see FIG. 4).
After the timed job is executed, Node-2 feeds back an Ack response to the dispatch center 301 to inform it that the timed job is completed; finally, the execution result is written into the message queue 10 by the dispatch center 301. As shown in FIG. 4, the number of nodes reselected by the scheduling center may also be two or more, and which service unit or units of Node-2 and Node-3 respond to the timed task, or to the sub-timed task comprising at least one timed job, is determined according to the response capability required by the timed task and the mapping configuration between business process and business logic.
Meanwhile, in this embodiment, the dispatch center 301 and the dispatch center 302 detect the response capability of the node with the time limit set by the task timer set in at least one node selected by the timing task issuing component 500 as a period. Although Node-1 is initially selected as the Node for responding to the timing task 90, node-2 and Node-3 still monitor the timing task and the timing job contained in the timing task issuing component 500. Before the time point set by the task timer 811 in the Node-1 arrives, if the service scheduler 901 cannot respond to the timed task and the timed job contained therein, the dispatch center 301 will quickly switch the role of the execution subject for executing the timed task and the timed job contained therein to the service scheduler deployed in the Node-2 and/or the Node-3, specifically referring to the procedures shown by the arrow task2 and the arrow task3 above.
The message queue 10 is stored in a second storage device 22 logically independent from the first storage device 21, the second storage device 22 is selected from a physical server or a virtual server or a nonvolatile storage device, and the message queue 10 is a RabbitMQ. The temporary storage of the timing task or the timing job contained therein in the second storage device 22 can ensure the stability and consistency of the timing task or the timing job contained therein in the scheduling process, and prevent various applications or services formed based on the timing task from being interrupted, erroneous, and the like.
Specifically, as described in connection with FIG. 12, RabbitMQ builds on Erlang's distribution features: the lower layer of RabbitMQ is implemented on the Erlang architecture, so rabbitmqctl starts an Erlang node and uses the Erlang system to connect to the RabbitMQ node; the connection requires the correct Erlang cookie and node name, and Erlang nodes authenticate each other by exchanging the Erlang cookie. Therefore, when deploying a RabbitMQ distributed cluster, Erlang is installed first, and the cookie of one service is copied to the other nodes.
In the RabbitMQ cluster, every RabbitMQ instance is a peer node, i.e., each node provides client connections to receive and send messages. Nodes are divided into memory nodes and disk nodes; generally all nodes are established as disk nodes so that messages do not disappear after a machine restart. Exchange 601 is the key component that accepts producer messages and routes messages to the message queue 10. The Exchange type and the Binding determine the routing rules of a message. So before a producer can send a message, it must declare an Exchange 601 and a Binding 602 corresponding to that Exchange 601. This is done via exchange.declare and binding declarations. In RabbitMQ, declaring an Exchange 601 requires three parameters: ExchangeName, ExchangeType, and Durable. ExchangeName is the name of the Exchange; this property must be specified when creating a Binding and when the producer pushes a message via publishing. ExchangeType is the type of the Exchange; RabbitMQ has four Exchange types (Direct, Fanout, Topic, and Headers), and different types exhibit different routing behavior. Durable is the persistence attribute of the Exchange 601. Declaring a Binding requires providing a QueueName, an ExchangeName, and a BindingKey. The routing rules exhibited by the different Exchange types are set forth below.
When a producer sends a message, it must specify a RoutingKey and an Exchange; after receiving the message, the Exchange routes it according to the Exchange type.
a) For the Direct type, the RoutingKey in the message is compared with the BindingKeys of all Bindings associated with the Exchange; if they are equal, the message is sent to the Queue corresponding to that Binding.
b) For the Fanout type, the message is sent to all Queues that have declared a Binding with the Exchange; this is a broadcast behavior.
c) For the Topic type, the RoutingKey is matched against the BindingKey pattern (wildcard matching); if the match succeeds, the message is sent to the corresponding Queue.
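The declaration parameters described above (ExchangeName, ExchangeType, Durable; QueueName, ExchangeName, BindingKey) correspond directly to calls in a Go AMQP client. A minimal sketch assuming the streadway/amqp package, with all exchange, queue, and key names illustrative:

```go
package main

import (
	"log"

	"github.com/streadway/amqp"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	ch, _ := conn.Channel()
	defer ch.Close()

	// ExchangeName="tasks", ExchangeType="direct", Durable=true.
	ch.ExchangeDeclare("tasks", "direct", true, false, false, false, nil)

	// Declare a queue and bind it: QueueName, BindingKey, ExchangeName.
	q, _ := ch.QueueDeclare("task-queue-1", true, false, false, false, nil)
	ch.QueueBind(q.Name, "task.create", "tasks", false, nil)
}
```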
The RabbitMQ cluster sends messages to the consumers in sequence, so that each consumer receives an equal number of messages on average; this way of dispatching messages is called round-robin. Referring to FIG. 12, after a timed task (task) is issued to Exchange 601, a plurality of queues Q1 to Qn are formed based on the binding process; these queues constitute Queues 603, and Q1 to Qn are issued one by one to a service scheduling process and executed.
In this embodiment, at least one pair of a flow scheduling process and a service scheduling process are declaratively configured in at least two nodes, and the flow scheduling process and the service scheduling process are decoupled using the message queue 10. Based on the poll dispatch mechanism of the message queue 10, load balancing capability may be provided for task messages corresponding to timed tasks.
As shown in FIG. 1, three listening arrows represent the service scheduling processes in the three nodes listening for timed task messages in the message queue 10. Because the service scheduling processes configured across the three nodes listen for timed task messages of the same exchange type, the timed tasks can be distributed in sequence to the selected service scheduling processes through the load-balancing strategy of the message queue 10. After a service scheduling process finishes executing a timed task, the scheduling center 301 in FIG. 1 returns the execution result to the message queue 10 in the direction shown by the corresponding arrow and sends a confirmation receipt; once the message queue 10 receives the confirmation receipt, the timed task is confirmed as executed. If the service scheduling process in Node-1 does not send a confirmation receipt to the message queue 10, then after a set period of time (e.g., 0.5 seconds) the message queue 10 re-issues the already-sent timed task to the service scheduling process in Node-2, until the message queue 10 receives a confirmation receipt. Because the service scheduling processes and flow scheduling processes in the several nodes are decoupled through the message queue 10, the uniqueness of the timed task scheduling process can be realized through the message queue 10, solving the HA and load-balancing problems in distributed timed task scheduling.
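A sketch of this acknowledgement flow in Go, again assuming the streadway/amqp client: consuming with manual acknowledgements means an unacknowledged timed task is requeued and redelivered to another node's service scheduling process, and a prefetch of one gives the fair round-robin dispatch described above. The queue name and handler are illustrative.

```go
package main

import (
	"log"

	"github.com/streadway/amqp"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	ch, _ := conn.Channel()
	defer ch.Close()

	// Fair dispatch: each service scheduling process takes one
	// unacknowledged task at a time (round-robin across consumers).
	ch.Qos(1, 0, false)

	msgs, _ := ch.Consume("task-queue-1", "", false /* autoAck=false */, false, false, false, nil)
	for d := range msgs {
		if err := runTimedTask(d.Body); err != nil {
			d.Nack(false, true) // no ack: requeue so another node can run it
			continue
		}
		d.Ack(false) // confirmation receipt: task confirmed executed
	}
}

// runTimedTask is a hypothetical stand-in for the service scheduling process.
func runTimedTask(body []byte) error {
	log.Printf("executing timed task: %s", body)
	return nil
}
```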
In particular, in cloud platform instances with multiple computing nodes, the message queue 10 can scale horizontally, so computing nodes of different scales can be connected simultaneously. This improves logical stability and reliability while computing nodes are scaled out or in, and allows a service scheduling process in the most suitable node to be matched to a timed task, simplifying computing-node capacity expansion for a cloud computing platform applying this distributed timed task scheduling method.
The distributed timed task scheduling system 100 disclosed in this basic embodiment can serve the timed task needs of anywhere from two to any number of computing nodes of a cloud platform. Meanwhile, the system does not need to rely on the traditional Quartz + Zookeeper framework, which resolves the excessive resource occupation of prior distributed timed task scheduling, reduces the computational overhead and pressure on the database (a species of the first storage device 21), and reduces database deployment difficulty.
Applicant has further optimized the distributed timed task scheduling system 100 disclosed in the embodiment. The scheduling center 301, responding to a timed task issued from the message queue 10, splits the timed task according to the configuration on which at least one timed job contained in it depends, and determines at least one service scheduler corresponding to each of the several sub-timed tasks formed by the split. A node is a computing node, a storage node, or a network node.
As shown in FIG. 14, a timed task 90 issued to the distributed timed task scheduling system 100 may, for example, be split into sub-timed task 91, sub-timed task 92, sub-timed task 93, and sub-timed task 94. The split granularity and split basis for the timed task 90 are known once the scheduling centers 301 and 302 have detected the nodes' response capability; the different scheduling services formed from the mapping relationships established between business processes and business logic serve as the basis for the split operation. Sub-timed task 91 may include timed job 911 and timed job 912; sub-timed task 92 includes timed job 921 and timed job 922; sub-timed tasks 93 and 94 may likewise each include one or more timed jobs.
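A sketch of the task/sub-task/job structure of FIG. 14, with a size-based split standing in for the real criterion (node response capability and the business process/business logic mapping); all types are illustrative.

```go
package scheduling

// TimedJob is the smallest schedulable unit (e.g. job 911 in FIG. 14).
type TimedJob struct {
	ID string
}

// SubTimedTask groups one or more timed jobs (e.g. sub-task 91
// containing jobs 911 and 912).
type SubTimedTask struct {
	ID   string
	Jobs []TimedJob
}

// TimedTask is the task as issued, e.g. timed task 90.
type TimedTask struct {
	ID   string
	Jobs []TimedJob
}

// split partitions a timed task into sub-tasks of at most n jobs each;
// the real split basis (response capability and the business process /
// business logic mapping) is abstracted away here.
func split(t TimedTask, n int) []SubTimedTask {
	var subs []SubTimedTask
	for i := 0; i < len(t.Jobs); i += n {
		end := i + n
		if end > len(t.Jobs) {
			end = len(t.Jobs)
		}
		subs = append(subs, SubTimedTask{ID: t.ID, Jobs: t.Jobs[i:end]})
	}
	return subs
}
```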
As shown in FIG. 15, the timed task 90 may instead be left unsplit by the scheduling center 301 and may include timed jobs 911, 912, 921 and 922. These timed jobs correspond to different business logic, and the scheduling center 301 determines the one or more pieces of business logic associated with each timed job.
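As an illustrative sketch of the splitting shown in FIG. 14 (the grouping key and all names are assumptions for illustration, not the patent's identifiers), the scheduling center could group a task's timed jobs by the business logic each job depends on:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class TimedJob:
    job_id: str
    business_logic: str  # the scheduling service / business logic the job maps to

def split_timed_task(jobs: list) -> dict:
    """Group the timed jobs of one timed task into sub-timed tasks, one per
    business logic, so each sub-task can be matched to a service scheduler."""
    sub_tasks = defaultdict(list)
    for job in jobs:
        sub_tasks[job.business_logic].append(job)
    return dict(sub_tasks)

task_90 = [
    TimedJob("911", "patrol/physical"), TimedJob("912", "patrol/physical"),
    TimedJob("921", "patrol/storage"),  TimedJob("922", "patrol/storage"),
]
print(split_timed_task(task_90))  # two groups, like sub-tasks 91 and 92 in FIG. 14
```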
Referring to FIG. 13, the applicant describes the detailed steps of executing a patrol timed task on a cloud platform based on the distributed timed task scheduling system 100.
Step 701, configure the node patrol parameters: among the node patrol parameters, the user may select one, several, or all of the computing/network/storage nodes to configure.
Step 702, judge the node type and select the patrol to execute according to the node type configured by the user. If a computing node is configured, execute step 703, physical node cluster patrol; if a network node is configured, execute step 704, switch node cluster patrol; if a storage node is configured, execute step 705, Ceph storage node cluster patrol.
After the physical node cluster patrol finishes, execute judging step 706 to determine whether the patrol succeeded; if yes, jump to step 707, cloud host node cluster patrol; if not, jump to step 710 and enter the cluster resource usage patrol directly. After the switch node patrol finishes, jump to step 710 and enter the cluster resource usage patrol regardless of the outcome. After the Ceph storage cluster patrol finishes, execute judging step 708; if yes, jump to step 709, distributed storage cluster patrol; if not, proceed to step 710.
Step 711, judge whether the cluster resource usage patrol succeeded; if not, jump to step 712 and generate the patrol report; if yes, execute step 713, cluster alert patrol, and then execute step 712.
After the patrol report is generated, jump to step 714: the notification is sent to the target user by mail.
The physical node cluster patrol, switch node patrol, Ceph storage cluster patrol, distributed storage cluster patrol, cloud host node cluster patrol, cluster resource usage patrol, cluster alert patrol, patrol report generation, and mail notification described above are all timed jobs within the platform patrol task. These timed jobs are executed either serially or in parallel and are mutually independent; together they form the business process of the platform patrol timed task.
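For illustration, the flow of FIG. 13 can be written down as a dependency graph over timed jobs, so that independent jobs run in parallel while dependent ones run serially. The step numbers follow the description above; the dependency edges are simplified assumptions (the conditional skip-ahead branches of steps 706, 708 and 711 are omitted):

```python
# Each timed job lists the jobs that must finish before it may run.
PATROL_FLOW = {
    "physical_node_patrol":  [],                                      # step 703
    "switch_node_patrol":    [],                                      # step 704
    "ceph_node_patrol":      [],                                      # step 705
    "cloud_host_patrol":     ["physical_node_patrol"],                # step 707
    "dist_storage_patrol":   ["ceph_node_patrol"],                    # step 709
    "resource_usage_patrol": ["cloud_host_patrol", "switch_node_patrol",
                              "dist_storage_patrol"],                 # step 710
    "cluster_alert_patrol":  ["resource_usage_patrol"],               # step 713
    "patrol_report":         ["cluster_alert_patrol"],                # step 712
    "mail_notification":     ["patrol_report"],                       # step 714
}

def runnable_jobs(done: set) -> list:
    """Jobs whose dependencies have all finished may be dispatched now."""
    return [job for job, deps in PATROL_FLOW.items()
            if job not in done and all(d in done for d in deps)]

print(runnable_jobs(set()))  # the three node patrols can start in parallel
```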
The user only needs to design the jobs in the user interface 501 through the UI designer according to the business process above and submit them to the distributed timed task scheduling system 100. The timed jobs contained in the timed task, together with the business processes and business logic on which the whole scheduling flow depends, are submitted through the user interface 501 of the system; with that, the declarative scheduling strategy is complete, and registration of the timed task is finished after parsing.
When the period set by the task timer corresponding to the timed task arrives, the timed task is distributed to a service scheduling process through the message queue 10; the service scheduling process sends a message to the scheduling center 301, and the scheduling center 301 selects a suitable node for execution, completing the job.
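A minimal sketch of this dispatch path, under the assumption that the declarative submission reduces to a simple spec carrying the timer period (the structure, field names, and the stub publisher below are illustrative, not the patent's code):

```python
import json
import threading
import time

# Hypothetical declarative spec produced by the UI designer in user
# interface 501; in a real deployment the period would come from the
# registered task timer rather than a one-second demo value.
TASK_SPEC = {"task_id": "platform_patrol", "period_s": 1, "jobs": ["patrol"]}

def publish_to_queue(message: str) -> None:
    # Stand-in for publishing to message queue 10 (e.g. RabbitMQ).
    print("published to queue:", message)

def task_timer(spec: dict, publish, ticks: int = 3) -> None:
    """Fire once per period, handing the timed task to the message queue;
    a consuming service scheduling process then notifies the scheduling
    center, which picks the node that executes the job."""
    for _ in range(ticks):  # bounded here so the demo terminates
        time.sleep(spec["period_s"])
        publish(json.dumps({"task": spec["task_id"]}))

threading.Thread(target=task_timer, args=(TASK_SPEC, publish_to_queue)).start()
```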
Embodiment Two:
Based on the distributed timed task scheduling system 100 disclosed in Embodiment One, this embodiment also discloses a specific implementation of a computing device.
In this embodiment, the computing device deploys the distributed timed task scheduling system 100 disclosed in Embodiment One. The computing cluster is configured as a cloud computing platform 400 (or simply "cloud platform 400"), a server cluster, or a data center, and the nodes are computing nodes. The specific technical solution of the distributed timed task scheduling system 100 is described in Embodiment One and is not repeated here.
Referring to FIG. 16, the applicant gives a brief description taking the cloud platform 400 as an example of the computing device. The cloud platform 400 may be a cloud computing platform based on the OpenStack architecture, configured with a control node 441, a network node 442, a computing node 443 and a storage node 444; these functional nodes may be understood as the "node" of Embodiment One.
The cloud platform 400 deploys the distributed timed task scheduling system 100 and holds a session with the user through it, so as to execute the timed tasks issued by the user through the cloud platform 400 and to continuously schedule those timed tasks based on the distributed timed task scheduling system 100. The specific technical solution of the distributed timed task scheduling system 100 is described in Embodiment One and is not repeated here.
Meanwhile, the applicant points out that in the cloud platform 400 disclosed in this embodiment, the response capability of the nodes can be detected by the scheduling centers 301 and 302, so that the most reasonable and best-matched scheduling operation for different timed tasks, such as the service scheduler that executes a given timed job, is determined according to the response capability of each node. Therefore, when the cloud platform 400 deploying the distributed timed task scheduling system 100 is expanded (or contracted), the performance and specific configuration parameters of the added (or removed) functional nodes, such as the computing node 443 and the storage node 444, can be ignored, making expansion (or contraction) of a cloud platform 400 that includes the distributed timed task scheduling system 100 simpler and more convenient.
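The patent does not spell out the probing algorithm; as one hedged possibility, a scheduling center could measure each node's response latency and hand the timed job to the fastest responder. Everything below (the health-check call, the scoring by latency, the node records) is an assumption for illustration:

```python
import time

def probe(node: dict) -> float:
    """Return the node's response latency in seconds (simulated here by a
    sleep; in practice this could be an RPC or HTTP health check)."""
    start = time.monotonic()
    node["ping"]()
    return time.monotonic() - start

def pick_service_scheduler(nodes: list) -> str:
    # Choose the service scheduler of the most responsive node.
    return min(nodes, key=probe)["scheduler"]

nodes = [
    {"name": "Node-1", "ping": lambda: time.sleep(0.010), "scheduler": "sched-1"},
    {"name": "Node-2", "ping": lambda: time.sleep(0.002), "scheduler": "sched-2"},
]
print(pick_service_scheduler(nodes))  # -> "sched-2", the faster node
```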
Embodiment Three:
Based on the specific implementation of the computing device disclosed in Embodiment Two, this embodiment also discloses a computer. The computer includes a CPU, a bus and a storage medium, with the storage medium and the CPU mounted on the bus, wherein the storage medium is configured with the distributed timed task scheduling system 100 disclosed in Embodiment One. The specific technical solution of the distributed timed task scheduling system 100 is described in Embodiment One and is not repeated here.
In addition, each functional unit in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or as a software functional unit.
If implemented as a software functional unit and sold or used as a stand-alone product, the integrated unit may be stored in a computer-readable medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.
The detailed descriptions above are only specific to practicable embodiments of the present invention; they do not limit the scope of protection of the present invention, and all equivalent embodiments or modifications that do not depart from the spirit of the present invention shall be included within its scope.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. This manner of description is adopted for clarity only; the specification should be taken as a whole, and the technical solutions in the respective embodiments may be combined as appropriate to form other implementations apparent to those skilled in the art.

Claims (13)

1. A distributed timed task scheduling system, comprising:
a task scheduler including a flow scheduling process, a service scheduler including a service scheduling process, and a task timer for determining the execution period of a timed task, all declaratively configured in at least two nodes,
a message queue for storing messages,
at least two scheduling centers disposed between the message queue and the service scheduler for detecting node response capability,
a timed task issuing component that selects at least one node as the designated node for responding to the timed task, and a first storage device for storing the execution result corresponding to the timed task;
wherein the task scheduler issues the timed task to the message queue, and the remaining nodes listen to the message queue and create a timed task identical to the one issued to the message queue; the message queue transmits at least one timed job contained in the timed task to a scheduling center, and the scheduling center detects the response capability of the nodes so as to determine, according to the response capability of each node, the service scheduler corresponding to the timed job.
2. The distributed timed task scheduling system according to claim 1, wherein each node uses a flow scheduling process included in a task scheduler to determine whether to notify a scheduling center by a polling mechanism in a message queue according to a time limit set by the task timer based on a distributed lock corresponding to the timed task, and determines at least one service scheduler corresponding to executing the timed job according to a response capability of each node through the scheduling center.
3. The distributed timed task scheduling system according to claim 2, wherein the at least two scheduling centers detect the response capability of the nodes, taking as the period the time limit set by the task timer disposed in the at least one node selected by the timed task issuing component.
4. The distributed timed task scheduling system according to claim 2, wherein at least one pair of a flow scheduling process and a service scheduling process is declaratively configured in the at least two nodes, the flow scheduling process and the service scheduling process being decoupled using the message queue.
5. The distributed timed task scheduling system according to claim 1, wherein after the first storage device stores the execution result corresponding to the timed task, the message queue notifies the node executing the timed task and the service scheduler deployed in that node;
the first storage device is selected from a JVM memory, a distributed storage component, or a database.
6. The distributed timed task scheduling system of claim 1, wherein the timed task issuing component comprises: a user interface and a load balancer;
the user interface receives the timed task and transmits it to the load balancer; based on the distributed lock corresponding to the timed task, at least one node is selected as the designated node to respond to the timed task through the load-balancing strategy built into the load balancer.
7. The distributed timed task scheduling system according to any one of claims 2-6, wherein the scheduling center is responsive to the timed task issued from the message queue to split the timed task according to a configuration on which at least one timed job included in the timed task depends, and to determine at least one service scheduler corresponding to executing a sub-timed task for a plurality of sub-timed tasks formed after splitting the timed task;
the node is a computing node, a storage node or a network node.
8. The distributed timed task scheduling system according to claim 7, characterized in that the distributed timed task scheduling system is arranged to respond to multi-timed-task scenarios;
the task scheduler and the service scheduler configured by the nodes comprise one or more pairs of flow scheduling processes and service scheduling processes, and the flow scheduling processes and the service scheduling processes are decoupled by using message queues.
9. The distributed timed task scheduling system according to claim 8, wherein the message queue is stored in a second storage device logically independent of the first storage device, the second storage device being selected from a physical server, a virtual server, or a non-volatile storage device, and the message queue is RabbitMQ.
10. A distributed timed task scheduling system according to claim 7, wherein the service scheduling process is configured as one or more service units for responding to timed tasks, the service scheduling process and the service units have a mapping relationship, and the service units are deployed at any node.
11. The distributed timed task scheduling system according to claim 10, wherein the flow scheduling process defines at least one business process and the service scheduling process defines at least one business logic, so as to configure different distributed timed-task scheduling services through the mapping relationship between the business process and the business logic;
the business process comprises business data issued by the user to the timed task issuing component, and the business process is not integrated into the executable code of the actual business;
the business logic is integrated into the service unit;
the service unit is a container, a virtual machine or a micro service.
12. A computing device deploying a distributed timed task scheduling system according to any one of claims 1 to 11.
13. The computing device of claim 12, wherein the computing device is configured as a cloud computing platform, a server cluster, or a data center;
the node is a computing node, a storage node or a network node.
CN202010107648.4A 2020-02-21 2020-02-21 Distributed timing task scheduling system and computing device Active CN111338774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107648.4A CN111338774B (en) 2020-02-21 2020-02-21 Distributed timing task scheduling system and computing device


Publications (2)

Publication Number Publication Date
CN111338774A CN111338774A (en) 2020-06-26
CN111338774B true CN111338774B (en) 2023-09-19


Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784185A * 2020-07-14 2020-10-16 Guangdong Power Grid Co., Ltd. Power Dispatching Control Center Distributed power distribution communication network timed task scheduling system
CN111858007A * 2020-07-29 2020-10-30 Guangzhou Haici Network Technology Co., Ltd. Task scheduling method and device based on message middleware
CN112131179B * 2020-09-23 2023-11-21 Ping An Technology (Shenzhen) Co., Ltd. Task state detection method, device, computer equipment and storage medium
CN112162841A * 2020-09-30 2021-01-01 Chongqing Changan Automobile Co., Ltd. Distributed scheduling system, method and storage medium for big data processing
CN112416581B * 2020-11-13 2022-02-18 Wuba Tongcheng Information Technology Co., Ltd. Distributed calling system for timed tasks
CN112506631A * 2020-12-04 2021-03-16 Tianmu Data (Fujian) Technology Co., Ltd. Task scheduling method, platform and storage medium based on electric power big data model
CN112910952B * 2021-01-13 2022-08-23 Dingdang Kuaiyao Technology Group Co., Ltd. Distributed task scheduling method and device, storage medium and electronic device
CN113537937A * 2021-07-16 2021-10-22 Chongqing Fumin Bank Co., Ltd. Task arrangement method, device and equipment based on topological sorting and storage medium
CN113515363B * 2021-08-10 2022-02-15 Unit 61646 of the Chinese People's Liberation Army Special-shaped task high-concurrency multi-level data processing system dynamic scheduling platform
CN113672367A * 2021-08-16 2021-11-19 Shenzhen OneConnect Smart Technology Co., Ltd. Distributed timing task scheduling method and system
CN113794740B * 2021-11-16 2022-03-11 Elane (Beijing) Data Technology Co., Ltd. Timer control method and system and readable storage medium
CN114531442A * 2022-02-17 2022-05-24 Shenzhen OneConnect Smart Technology Co., Ltd. Distributed timed task management method, device, equipment and storage medium
CN114564249B * 2022-02-24 2023-07-25 Beijing Baidu Netcom Science and Technology Co., Ltd. Recommendation scheduling engine, recommendation scheduling method and computer readable storage medium


Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20100077403A1 (en) * 2008-09-23 2010-03-25 Chaowei Yang Middleware for Fine-Grained Near Real-Time Applications

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN110262910A * 2019-06-24 2019-09-20 Sichuan XW Bank Co., Ltd. The method of distributed task scheduling poll based on delay queue

Non-Patent Citations (1)

Title
Qu Zhijian; Wang Qunfeng; Wang Hanlin. CQB parallel balancing control method for the stream-computing cluster queuing model of a dispatching center. Proceedings of the CSEE, 2018, (19), full text. *

Also Published As

Publication number Publication date
CN111338774A (en) 2020-06-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant