CN112527488A

CN112527488A - Distributed high-availability task scheduling method and system

Info

Publication number: CN112527488A
Application number: CN202011514535.2A
Authority: CN
Inventors: 魏少龙
Original assignee: Zhejiang Baiying Technology Co Ltd
Current assignee: Zhejiang Baiying Technology Co Ltd
Priority date: 2020-12-21
Filing date: 2020-12-21
Publication date: 2021-03-19

Abstract

The invention discloses a distributed high-availability task scheduling method and system. Scheduling tasks are refined into task data units that are isolated from one another and are distributed for processing in a distributed environment. This greatly improves the processing efficiency and fault-handling capability of the scheduling tasks, and allows a retry strategy to be applied according to the circumstances of each failed task.

Description

Distributed high-availability task scheduling method and system

Technical Field

The invention belongs to the technical field of online service task scheduling, and in particular relates to a distributed high-availability task scheduling method and system.

Background

When adopting task scheduling, most companies rely on existing open-source scheduling tools. Doing so requires importing the relevant jar packages, adding configuration files, developing the corresponding code and similar work, which is time-consuming, labor-intensive and costly. In addition, the scheduling methods popular on the market today basically distribute work to distributed nodes with the task as the base unit, so if a distributed node fails or a task becomes abnormal, the whole task batch fails and the scope of impact is large.

Current scheduling schemes also make limited use of distributed resources: their data granularity is coarse, so distributed resources are not fully used and processing performance is low.

Disclosure of Invention

In view of this, the present invention provides a distributed high-availability task scheduling method and system.

In order to achieve the purpose, the technical scheme provided by the invention is as follows:

The invention relates to a distributed high-availability task scheduling method, which comprises the following steps:

(1) the thread manager calculates the number of threads to allocate to the scheduling tasks, based on the number of scheduling tasks configured in the scheduling center and the historical execution of those tasks, and starts an isolated thread pool;

(2) the coordination processor obtains a thread from the thread pool, acquires task data units in batches through the exposed batch-acquisition interface for task data units, and sends them to the task distributor;

(3) the task distributor places the task data units in its queue on a first-in, first-out basis;

(4) the task distributor dynamically dispatches tasks to the coordination processor for execution, according to the scheduling strategy and the processing capacity of each node;

(5) the coordination processor obtains a thread from the thread manager to process the task data unit;

(6) the coordination processor sends the task execution status to the task analyzer for analysis, and sends each failed task data unit to the tail of the task distributor's queue;

(7) the task analyzer analyzes the task execution status to calculate the supporting capacity of each node, and notifies the task distributor of it;

(8) the task analyzer summarizes and analyzes the number of tasks that failed processing and raises an alarm through the monitoring center.
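
Steps (3) and (6) above can be sketched as follows. This is a minimal, single-process illustration and not the patented implementation: the function name `run_queue`, the `process` callback and the `max_attempts` cap are assumptions introduced here, since the text does not say how many times a failed task data unit may be re-queued.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_queue(
    units: List[str],
    process: Callable[[str], bool],
    max_attempts: int = 3,  # assumed cap; the source does not specify one
) -> Tuple[List[str], List[str]]:
    """Process task data units in FIFO order.

    A unit whose processing fails is appended to the tail of the queue,
    so the units behind it are not blocked by the failure.
    """
    queue: Deque[Tuple[str, int]] = deque((unit, 1) for unit in units)
    succeeded: List[str] = []
    abandoned: List[str] = []
    while queue:
        unit, attempt = queue.popleft()  # first in, first out
        if process(unit):
            succeeded.append(unit)
        elif attempt < max_attempts:
            queue.append((unit, attempt + 1))  # failed unit goes to the tail
        else:
            abandoned.append(unit)  # would be reported to the task analyzer
    return succeeded, abandoned
```

With a handler that fails unit "b" once and then succeeds, `run_queue(["a", "b", "c"], handler)` completes the units in the order a, c, b: the failed unit is retried only after the rest of the queue has been served.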

Preferably, the scheduling task data of the scheduling center is stored in a ZooKeeper cluster, and the ZooKeeper cluster coordinates and recovers the scheduling task data.

Preferably, the task data unit is executed by the service processing module, and the execution status is fed back to the coordination processor.

Preferably, a retry strategy is planned according to the supporting capacity of each node and the reason the task data unit failed, and the failed task data unit is redistributed and reprocessed.
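
One possible reading of the retry planning described above is sketched below. The failure categories in `RETRYABLE`, the `RetryPlan` type and the rule "retry on the highest-capacity node other than the one that failed" are all assumptions made for illustration; the text only states that the plan depends on node capacity and failure reason.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical failure reasons that are worth retrying on another node.
RETRYABLE = {"timeout", "node_down"}


@dataclass
class RetryPlan:
    unit_id: str
    target_node: Optional[str]  # None means the unit is not retried


def plan_retry(
    unit_id: str,
    reason: str,
    failed_node: str,
    capacity: Dict[str, float],
) -> RetryPlan:
    """Choose where to retry a failed task data unit, if at all."""
    if reason not in RETRYABLE:
        # For example, malformed data would fail again on any node.
        return RetryPlan(unit_id, None)
    # Exclude the node that just failed and nodes with no capacity left.
    candidates = {
        node: cap
        for node, cap in capacity.items()
        if node != failed_node and cap > 0
    }
    if not candidates:
        return RetryPlan(unit_id, None)
    best = max(candidates, key=lambda node: candidates[node])
    return RetryPlan(unit_id, best)
```

Keeping the decision in one pure function makes the strategy easy to swap, which fits the configurable scheduling strategy the document describes.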

A distributed high-availability task scheduling system comprises a scheduling center system, a service system and a ZooKeeper cluster;

the scheduling center system comprises a task distributor, a coordination processor, a thread manager and a task analyzer;

the service system comprises a plurality of service processing modules, which acquire the task data units to be processed from the scheduling center system through the exposed task data unit processing interfaces and execute them;

the ZooKeeper cluster is used for storing the files of the scheduling center system, to ensure the consistency of the scheduling center system's task data;

the thread manager is used for configuring the number of scheduling tasks, calculating the number of threads allocated to the scheduling tasks and starting an isolated thread pool;

the coordination processor is used for acquiring the task data units, sending them to the task distributor and sending the task execution status to the task analyzer;

the task distributor is used for dispatching tasks to the coordination processor for execution;

and the task analyzer is used for analyzing and calculating the supporting capacity of each node, notifying the task distributor of it, and summarizing and analyzing the number of tasks that failed processing.
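
The thread manager's role can be sketched as below. The sizing formula (enough threads to finish a run within a target time, clamped to a maximum) and the statistics keys `avg_seconds_per_unit` and `units_per_run` are assumptions; the document only says the thread count is derived from the configured tasks and their execution history. Giving each scheduling task its own `ThreadPoolExecutor` is one way to realise an isolated thread pool.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def thread_count(
    avg_seconds_per_unit: float,
    units_per_run: int,
    target_seconds: float,
    max_threads: int = 16,  # assumed upper bound
) -> int:
    """Estimate threads needed to finish one run within target_seconds."""
    needed = math.ceil(avg_seconds_per_unit * units_per_run / target_seconds)
    return max(1, min(max_threads, needed))


def start_isolated_pools(
    history: Dict[str, Dict[str, float]],
    target_seconds: float = 60.0,
) -> Dict[str, ThreadPoolExecutor]:
    """Start one pool per scheduling task so tasks cannot starve each other."""
    pools: Dict[str, ThreadPoolExecutor] = {}
    for task_name, stats in history.items():
        workers = thread_count(
            stats["avg_seconds_per_unit"],
            int(stats["units_per_run"]),
            target_seconds,
        )
        pools[task_name] = ThreadPoolExecutor(
            max_workers=workers, thread_name_prefix=task_name
        )
    return pools
```

Because each task has its own pool, a slow or failing task can exhaust only its own threads.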

Preferably, the task analyzer is connected to the monitoring center, and feeds back the number of tasks that fail to be processed to the monitoring center, and the monitoring center sends out alarm information.
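
A minimal sketch of the task analyzer follows. Using the per-node success ratio as the "supporting capacity" and a fixed `alarm_threshold` are assumptions, as is the `notify_alarm` callback that stands in for the monitoring center; the document does not define how capacity is computed or when an alarm is raised.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple


def analyze(
    results: Iterable[Tuple[str, bool]],
    alarm_threshold: int,
    notify_alarm: Callable[[str], None],
) -> Dict[str, float]:
    """Summarize execution results reported by the coordination processor.

    Each result is a (node, success) pair. Returns a capacity score per
    node and calls notify_alarm when enough units have failed.
    """
    succeeded: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    failures = 0
    for node, success in results:
        total[node] += 1
        if success:
            succeeded[node] += 1
        else:
            failures += 1
    # Assumed capacity metric: fraction of units the node completed.
    capacity = {node: succeeded[node] / total[node] for node in total}
    if failures >= alarm_threshold:
        notify_alarm(f"{failures} task data units failed")
    return capacity
```

The returned dictionary is what would be passed on to the task distributor, and the callback is the hook for the monitoring center.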

Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:

1. The invention greatly reduces the workload of developing system scheduling, lowers enterprise development costs and improves enterprise productivity.

2. The scheduling tasks are refined to the task data units, the task data units are isolated from one another, and the task data units are distributed and processed in a distributed environment, so that the processing efficiency and the fault processing capability of the scheduling tasks are greatly improved.

3. In the invention, if a machine in the executor cluster fails or the processing of a task data unit fails, the scheduling center system triggers the failover and retry strategies. Because the scheduling center system distributes work at the level of task data units, a failed data unit does not affect the other data units in the batch, and switching is more flexible and reliable.

4. The invention offers flexible rule configuration and a complete set of scheduling strategies. The execution strategy and the interruption strategy of task data units can be adjusted dynamically, which better ensures system stability and improves overall scheduling efficiency.

5. The service system in the invention is unaware of the scheduling, requires zero code development and is loosely coupled. A scheduling task can be completed simply by configuring the scheduling strategy in the scheduling center system, which is convenient, fast and easy to use.
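
The failure isolation claimed in effect 3 can be illustrated with a small sketch. The function `process_batch` and its `handler` argument are hypothetical names; the point is only that each task data unit is submitted and checked on its own, so one exception does not fail the batch. The sketch assumes the units are hashable and unique.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def process_batch(
    units: List[int],
    handler: Callable[[int], object],
    workers: int = 4,
) -> Dict[str, List[int]]:
    """Run every task data unit independently and record its outcome."""
    outcome: Dict[str, List[int]] = {"ok": [], "failed": []}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One future per unit, so failures are tracked per unit.
        futures = {unit: pool.submit(handler, unit) for unit in units}
        for unit, future in futures.items():
            try:
                future.result()
                outcome["ok"].append(unit)
            except Exception:
                # Only this unit is marked failed; the others are unaffected.
                outcome["failed"].append(unit)
    return outcome
```

In a task-granular scheduler the same exception would typically abort the whole batch; here the failed unit alone becomes a candidate for retry.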

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a block diagram of the system of the present invention;

Description of the reference numerals in the drawings:

1-scheduling center system; 2-service system; 3-ZooKeeper cluster; 11-task distributor; 12-coordination processor; 13-thread manager; 14-task analyzer; 15-monitoring center.

Detailed Description

To aid understanding, the present invention is described in detail below with reference to examples. The examples illustrate the invention and are not intended to limit its scope.

Example 1

Referring to FIG. 1, the present embodiment relates to a distributed high-availability task scheduling method, which follows steps (1) to (8) set out in the Disclosure of Invention above and has the following features.

The scheduling task data of the scheduling center is stored in the ZooKeeper cluster, and the scheduling task data is coordinated and recovered through the ZooKeeper cluster. This ensures the consistency of the scheduling center system's task data in the distributed environment, as well as coordination and crash recovery in that environment.

The task data unit is executed by the service processing module, and the execution status is fed back to the coordination processor.

A retry strategy is planned according to the supporting capacity of each node and the reason the task data unit failed, and the failed task data unit is redistributed and reprocessed.

The scheduling tasks are refined into task data units that are isolated from one another and are distributed for processing in a distributed environment, which greatly improves the processing efficiency and fault-handling capability of the scheduling tasks. If a machine in the executor cluster fails or the processing of a task data unit fails, the scheduling center system triggers the failover and retry strategies. Because the scheduling center system distributes work at the level of task data units, a failed data unit does not affect the other data units in the batch, and switching is more flexible and reliable.

Example 2

Referring to FIG. 2, the present embodiment relates to a distributed high-availability task scheduling system, which includes a scheduling center system 1, a service system 2 and a ZooKeeper cluster 3;

the scheduling center system 1 comprises a task distributor 11, a coordination processor 12, a thread manager 13 and a task analyzer 14;

the service system 2 comprises a plurality of service processing modules, which acquire the task data units to be processed from the scheduling center system 1 through the exposed task data unit processing interfaces and execute them;

the ZooKeeper cluster 3 is used for storing the files of the scheduling center system 1, to ensure the consistency of the scheduling center system's task data;

the thread manager 13 is used for configuring the number of scheduling tasks, calculating the number of threads allocated to the scheduling tasks and starting an isolated thread pool;

the coordination processor 12 is used for acquiring the task data units, sending them to the task distributor 11 and sending the task execution status to the task analyzer 14;

the task distributor 11 is used for dispatching tasks to the coordination processor 12 for execution;

the task analyzer 14 is used for analyzing and calculating the supporting capacity of each node, notifying the task distributor 11 of it, and summarizing and analyzing the number of tasks that failed processing.

The task analyzer 14 is connected to the monitoring center 15, and feeds back the number of the tasks failing to be processed to the monitoring center 15, and the monitoring center 15 sends out alarm information.
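
How the task distributor 11 might use the capacity figures reported by the task analyzer 14 is sketched below. The proportional rule used here (each unit goes to the node with the lowest load relative to its capacity) is an assumption chosen for illustration; the document leaves the scheduling strategy configurable and does not prescribe a formula.

```python
from typing import Dict, List


def dispatch(units: List[str], capacity: Dict[str, float]) -> Dict[str, List[str]]:
    """Assign task data units to nodes in proportion to node capacity."""
    # Nodes reported with zero or negative capacity receive no work.
    usable = {node: cap for node, cap in capacity.items() if cap > 0}
    if not usable:
        raise ValueError("no node with positive capacity")
    assignment: Dict[str, List[str]] = {node: [] for node in usable}
    for unit in units:
        # Pick the node whose load after taking this unit, relative to
        # its capacity, would be smallest.
        node = min(
            usable,
            key=lambda n: (len(assignment[n]) + 1) / usable[n],
        )
        assignment[node].append(unit)
    return assignment
```

For six units and capacities of 2.0 and 1.0, the first node receives four units and the second receives two, matching the 2:1 capacity ratio.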

The present invention and its embodiments have been described above schematically and without limitation. The drawings show only embodiments of the present invention, and the actual structure is not limited to them. Those skilled in the art will therefore understand that the structure and embodiments can be designed and modified without departing from the spirit and scope of the present invention.

Claims

1. A distributed high-availability task scheduling method is characterized by comprising the following steps:

2. The distributed high-availability task scheduling method of claim 1, wherein the scheduling task data of the scheduling center is stored in a ZooKeeper cluster, and the scheduling task data is coordinated and recovered by the ZooKeeper cluster.

3. The distributed high-availability task scheduling method of claim 1, wherein the task data unit is executed by the service processing module, and the execution status is fed back to the coordination processor.

4. The distributed high-availability task scheduling method of claim 1, wherein a retry strategy is planned according to the supporting capacity of each node and the reason the task data unit failed, and the failed task data unit is redistributed.

5. A distributed high-availability task scheduling system, characterized by comprising a scheduling center system, a service system and a ZooKeeper cluster; the scheduling center system comprises a task distributor, a coordination processor, a thread manager and a task analyzer; the service system comprises a plurality of service processing modules, which acquire the task data units to be processed from the scheduling center system through the exposed task data unit processing interfaces and execute them; the ZooKeeper cluster is used for storing the files of the scheduling center system, to ensure the consistency of the scheduling center system's task data;

the coordination processor is used for acquiring the task data units, sending them to the task distributor and sending the task execution status to the task analyzer; the task distributor is used for dispatching tasks to the coordination processor for execution; and the task analyzer is used for analyzing and calculating the supporting capacity of each node, notifying the task distributor of it, and summarizing and analyzing the number of tasks that failed processing.

6. The distributed high-availability task scheduling system of claim 1, wherein the task analyzer is connected to the monitoring center and feeds back the number of tasks that failed processing to the monitoring center, and the monitoring center issues alarm information.