CN112527488A - Distributed high-availability task scheduling method and system - Google Patents
- Publication number
- CN112527488A (application CN202011514535.2A)
- Authority
- CN
- China
- Prior art keywords
- task
- scheduling
- task data
- tasks
- data unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5011—Pool
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5018—Thread allocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention discloses a distributed high-availability task scheduling method and system. Scheduling tasks are refined into task data units that are isolated from one another and distributed for processing across a distributed environment. This greatly improves the processing efficiency and fault-handling capability of scheduling tasks and allows a retry strategy to be applied according to the circumstances of each failed task.
Description
Technical Field
The invention belongs to the technical field of online service task scheduling, and particularly relates to a distributed high-availability task scheduling method and system.
Background
Most companies rely on existing open-source scheduling tools for their scheduling tasks. Doing so requires introducing the relevant jar packages, adding configuration files, and developing corresponding code, which is time-consuming, labor-intensive, and costly. In addition, the scheduling methods currently popular on the market basically distribute work to distributed nodes with the task as the basic unit; if a distributed node fails or a task behaves abnormally, the whole task batch fails, so the scope of impact is large.
Current scheduling schemes also make limited use of distributed resources: the data granularity is coarse, the distributed resources cannot be fully utilized, and processing performance is low.
Disclosure of Invention
In view of this, the present invention provides a distributed high-availability task scheduling method and system.
To achieve this purpose, the invention provides the following technical solution:
The invention relates to a distributed high-availability task scheduling method comprising the following steps:
(1) the thread manager calculates the number of threads to allocate to the scheduling tasks according to the number of scheduling tasks configured in the scheduling center and the historical execution of the tasks, and starts an isolation thread pool;
(2) the coordination processor obtains a thread from the thread pool, obtains task data units in batches through the exposed task data unit batch acquisition interface, and sends the task data units to the task distributor;
(3) the task distributor places the task data units in the task distributor queue in first-in, first-out order;
(4) the task distributor dynamically assigns tasks to the coordination processor for execution according to the scheduling strategy and the processing capacity of each node;
(5) the coordination processor obtains a thread from the thread manager to process the task data unit;
(6) the coordination processor sends the task execution results to the task analyzer for analysis, and sends each failed task data unit to the tail of the task distributor queue;
(7) the task analyzer analyzes and calculates the supporting capacity of each node according to the task execution results and notifies the task distributor of it;
(8) the task analyzer aggregates and analyzes the number of tasks that failed processing and raises an alarm through the monitoring center.
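Step (1) can be illustrated with a minimal Python sketch. The source does not give a sizing formula, so the rule below (enough threads to finish the configured tasks within a target window, given the historical average duration) is an assumption, as are the names `compute_thread_count`, `start_isolated_pools`, and the configuration keys. Isolation is modeled here as one separate pool per scheduled task.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def compute_thread_count(task_count, avg_seconds_per_task, target_seconds,
                         min_threads=1, max_threads=32):
    """Hypothetical sizing rule (not from the source): choose enough threads
    to finish task_count tasks within target_seconds, based on the historical
    average duration, clamped to [min_threads, max_threads]."""
    if task_count <= 0:
        return min_threads
    needed = math.ceil(task_count * avg_seconds_per_task / target_seconds)
    return max(min_threads, min(max_threads, needed))


def start_isolated_pools(task_configs):
    """Start one separate thread pool per scheduled task, so that a slow or
    failing task cannot exhaust the threads of another task."""
    pools = {}
    for name, cfg in task_configs.items():
        size = compute_thread_count(
            cfg["count"], cfg["avg_seconds"], cfg["target_seconds"])
        pools[name] = ThreadPoolExecutor(
            max_workers=size, thread_name_prefix=name)
    return pools
```

A caller would shut each pool down with `pool.shutdown()` once its scheduling task has finished.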
Preferably, the scheduling task data of the scheduling center is stored in a zookeeper cluster, and the scheduling task data is coordinated and recovered through the zookeeper cluster.
Preferably, the task data unit is executed by the service processing module, which feeds the execution result back to the coordination processor.
Preferably, a retry strategy is planned according to the supporting capacity of each node and the failure reason of each failed task data unit, and the failed task data unit is redistributed for processing.
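One way to read the retry planning described above is as a function of the failure reason and the per-node supporting capacity. The following Python sketch assumes three hypothetical failure reasons (`"timeout"`, `"node_down"`, `"bad_data"`) and a hypothetical attempt limit; none of these details appear in the source.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FailedUnit:
    unit_id: str
    node: str      # node on which the unit failed
    reason: str    # hypothetical values: "timeout", "node_down", "bad_data"
    attempts: int  # number of attempts made so far


def plan_retry(unit: FailedUnit, node_capability: Dict[str, float],
               max_attempts: int = 3) -> Optional[str]:
    """Return the node on which to retry the unit, or None to give up.

    Assumed policy: bad input data is never retried, a unit that failed
    because its node went down avoids that node, and otherwise the node
    with the highest supporting capacity is chosen.
    """
    if unit.reason == "bad_data" or unit.attempts >= max_attempts:
        return None
    candidates = {n: c for n, c in node_capability.items() if c > 0}
    if unit.reason == "node_down":
        candidates.pop(unit.node, None)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

Returning `None` corresponds to handing the unit over to failure counting and alarming rather than redistributing it.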
A distributed high-availability task scheduling system comprises a scheduling center system, a service system and a zookeeper cluster;
the scheduling center system comprises a task distributor, a coordination processor, a thread manager and a task analyzer;
the service system comprises a plurality of service processing modules, which obtain task data units to be processed from the scheduling center system through the exposed task data unit processing interface and execute them;
the zookeeper cluster is used for storing the files of the scheduling center system to ensure the consistency of the scheduling center system's task data;
the thread manager is used for configuring the number of scheduling tasks, calculating the number of threads allocated to the scheduling tasks, and starting an isolation thread pool;
the coordination processor is used for obtaining the task data units, sending the task data units to the task distributor, and sending the task execution results to the task analyzer;
the task distributor is used for assigning tasks to the coordination processor for execution;
and the task analyzer is used for analyzing and calculating the supporting capacity of each node, notifying the task distributor, and aggregating and analyzing the number of tasks that failed processing.
Preferably, the task analyzer is connected to the monitoring center and reports the number of tasks that failed processing to it, and the monitoring center issues the alarm information.
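The feedback from the task analyzer to the task distributor suggests that work is shared out in proportion to each node's supporting capacity. The source does not specify an allocation rule; the Python sketch below assumes a proportional split using the largest-remainder method, and the function name `split_batch` is hypothetical.

```python
def split_batch(units, capability):
    """Split a batch of task data units across nodes in proportion to each
    node's supporting capacity (assumed rule: largest-remainder method)."""
    total = sum(capability.values())
    if total <= 0:
        raise ValueError("at least one node must have positive capacity")
    # Most capable nodes receive the units at the front of the batch.
    nodes = sorted(capability, key=capability.get, reverse=True)
    quotas = {n: len(units) * capability[n] / total for n in nodes}
    counts = {n: int(quotas[n]) for n in nodes}
    # Hand the units lost to rounding to the largest fractional remainders.
    leftover = len(units) - sum(counts.values())
    by_remainder = sorted(
        nodes, key=lambda n: quotas[n] - counts[n], reverse=True)
    for n in by_remainder[:leftover]:
        counts[n] += 1
    it = iter(units)
    return {n: [next(it) for _ in range(counts[n])] for n in nodes}
```

For example, ten units split over capacities 3, 1 and 1 yield batches of six, two and two units.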
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:
1. The invention greatly reduces the workload of scheduling-system development, lowers enterprise development costs, and improves enterprise production efficiency.
2. The scheduling tasks are refined into task data units that are isolated from one another and distributed for processing in a distributed environment, so that the processing efficiency and fault-handling capability of the scheduling tasks are greatly improved.
3. If a machine in the executor cluster fails or the processing of a task data unit fails, the scheduling center system triggers the failover and retry strategies. Because the scheduling center system distributes work at the level of task data units, a failed data unit does not affect the other data units in the batch, and failover is more flexible and reliable.
4. The invention offers flexible rule configuration and a complete set of scheduling strategies; the execution strategy and the interruption strategy of the task data units can be adjusted dynamically, which better ensures system stability and improves overall scheduling efficiency.
5. The service system is unaware of the scheduling, requires zero code development, and is loosely coupled: a scheduling task can be completed merely by configuring the scheduling strategy in the scheduling center system, which is convenient and fast.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of the system of the present invention;
Description of the reference labels in the drawings:
1-scheduling center system; 2-service system; 3-zookeeper cluster; 11-task distributor; 12-coordination processor; 13-thread manager; 14-task analyzer; 15-monitoring center.
Detailed Description
For a further understanding of the present invention, it is described in detail below with reference to examples, which are provided to illustrate the invention and are not intended to limit its scope.
Example 1
Referring to FIG. 1, the present embodiment relates to a distributed high-availability task scheduling method, which includes the following steps:
(1) the thread manager calculates the number of threads to allocate to the scheduling tasks according to the number of scheduling tasks configured in the scheduling center and the historical execution of the tasks, and starts an isolation thread pool;
(2) the coordination processor obtains a thread from the thread pool, obtains task data units in batches through the exposed task data unit batch acquisition interface, and sends the task data units to the task distributor;
(3) the task distributor places the task data units in the task distributor queue in first-in, first-out order;
(4) the task distributor dynamically assigns tasks to the coordination processor for execution according to the scheduling strategy and the processing capacity of each node;
(5) the coordination processor obtains a thread from the thread manager to process the task data unit;
(6) the coordination processor sends the task execution results to the task analyzer for analysis, and sends each failed task data unit to the tail of the task distributor queue;
(7) the task analyzer analyzes and calculates the supporting capacity of each node according to the task execution results and notifies the task distributor of it;
(8) the task analyzer aggregates and analyzes the number of tasks that failed processing and raises an alarm through the monitoring center.
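The queue behavior in steps (3) and (6) can be sketched in a few lines of Python: units are taken from the head of a first-in, first-out queue, and a unit that fails is appended to the tail so that the units behind it are not blocked. The attempt limit `max_attempts` and the names `run_queue`, `done` and `dead` are assumptions added for illustration; the source does not bound the number of retries.

```python
from collections import deque


def run_queue(units, process, max_attempts=3):
    """Process task data units in FIFO order; a failed unit is re-queued at
    the tail until it succeeds or reaches max_attempts (assumed limit)."""
    queue = deque((unit, 1) for unit in units)
    done, dead = [], []
    while queue:
        unit, attempt = queue.popleft()  # take from the head (FIFO)
        try:
            process(unit)
            done.append(unit)
        except Exception:
            if attempt < max_attempts:
                # Failed unit goes to the tail, behind all waiting units.
                queue.append((unit, attempt + 1))
            else:
                dead.append(unit)  # would be counted for the alarm
    return done, dead
```

With units `a`, `b`, `c` where `b` fails once, the completion order is `a`, `c`, `b`: the retry of `b` happens only after `c` has been processed.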
The scheduling task data of the scheduling center is stored in the zookeeper cluster, through which it is coordinated and recovered. This ensures the consistency of the scheduling center system's task data in the distributed environment and provides coordination and crash-recovery capability in that environment.
The task data unit is executed by the service processing module, which feeds the execution result back to the coordination processor.
A retry strategy is planned according to the supporting capacity of each node and the failure reason of each failed task data unit, and the failed task data unit is redistributed for processing.
The scheduling tasks are refined into task data units that are isolated from one another and distributed for processing in a distributed environment, so that the processing efficiency and fault-handling capability of the scheduling tasks are greatly improved. If a machine in the executor cluster fails or the processing of a task data unit fails, the scheduling center system triggers the failover and retry strategies. Because the scheduling center system distributes work at the level of task data units, a failed data unit does not affect the other data units in the batch, and failover is more flexible and reliable.
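The isolation property described above, where one failed data unit leaves the rest of the batch unaffected, can be shown with Python's standard thread pool. This is only a sketch under the assumption that each task data unit is submitted as its own unit of work; `process_units` and `handler` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_units(units, handler, workers=4):
    """Run each task data unit as an independent future and record the
    outcome per unit, so one failure does not abort the batch."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {unit: pool.submit(handler, unit) for unit in units}
        for unit, future in futures.items():
            error = future.exception()  # blocks until this unit finishes
            if error is not None:
                results[unit] = ("failed", repr(error))
            else:
                results[unit] = ("ok", future.result())
    return results
```

The units marked `"failed"` are the ones a scheduler would hand to its retry strategy, while the `"ok"` results are already final.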
Example 2
Referring to FIG. 2, the present embodiment relates to a distributed high-availability task scheduling system, which includes a scheduling center system 1, a service system 2, and a zookeeper cluster 3;
the scheduling center system 1 comprises a task distributor 11, a coordination processor 12, a thread manager 13 and a task analyzer 14;
the service system 2 comprises a plurality of service processing modules, which obtain task data units to be processed from the scheduling center system 1 through the exposed task data unit processing interface and execute them;
the zookeeper cluster 3 is used for storing the files of the scheduling center system 1 to ensure the consistency of the scheduling center system's task data;
the thread manager 13 is used for configuring the number of scheduling tasks, calculating the number of threads allocated to the scheduling tasks, and starting an isolation thread pool;
the coordination processor 12 is used for obtaining the task data units, sending the task data units to the task distributor 11, and sending the task execution results to the task analyzer 14;
the task distributor 11 is used for assigning tasks to the coordination processor 12 for execution;
the task analyzer 14 is used for analyzing and calculating the supporting capacity of each node, notifying the task distributor 11, and aggregating and analyzing the number of tasks that failed processing.
The task analyzer 14 is connected to the monitoring center 15 and reports the number of tasks that failed processing to the monitoring center 15, which issues the alarm information.
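The role of the task analyzer 14 can be illustrated with a small Python class. The source does not define how supporting capacity is computed or when the alarm fires, so this sketch assumes the per-node success rate as the capacity metric and a fixed failure-count threshold; the class and method names are hypothetical, and `notify` stands in for the monitoring center 15.

```python
from collections import defaultdict


class TaskAnalyzer:
    """Collects execution results per node, derives a supporting capacity
    for each node, and raises an alarm when too many units have failed."""

    def __init__(self, alarm_threshold, notify):
        self.alarm_threshold = alarm_threshold  # assumed fixed threshold
        self.notify = notify                    # stand-in for the monitoring center
        self.stats = defaultdict(lambda: [0, 0])  # node -> [successes, failures]

    def record(self, node, succeeded):
        self.stats[node][0 if succeeded else 1] += 1

    def capabilities(self):
        # Assumed metric: fraction of units the node processed successfully.
        return {n: s / (s + f) for n, (s, f) in self.stats.items()}

    def failed_total(self):
        return sum(f for _, f in self.stats.values())

    def check_alarm(self):
        total = self.failed_total()
        if total >= self.alarm_threshold:
            self.notify(f"{total} task data units failed")
            return True
        return False
```

The dictionary returned by `capabilities()` is the kind of value that would be passed back to the task distributor 11 to steer later assignments.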
The present invention and its embodiments have been described above schematically and without limitation; the drawings show only some embodiments of the invention, and the actual structure is not limited to them. Those skilled in the art should therefore understand that the structure and embodiments of the present invention can be designed and modified without departing from its spirit and scope.
Claims (6)
1. A distributed high-availability task scheduling method, characterized by comprising the following steps:
(1) the thread manager calculates the number of threads to allocate to the scheduling tasks according to the number of scheduling tasks configured in the scheduling center and the historical execution of the tasks, and starts an isolation thread pool;
(2) the coordination processor obtains a thread from the thread pool, obtains task data units in batches through the exposed task data unit batch acquisition interface, and sends the task data units to the task distributor;
(3) the task distributor places the task data units in the task distributor queue in first-in, first-out order;
(4) the task distributor dynamically assigns tasks to the coordination processor for execution according to the scheduling strategy and the processing capacity of each node;
(5) the coordination processor obtains a thread from the thread manager to process the task data unit;
(6) the coordination processor sends the task execution results to the task analyzer for analysis, and sends each failed task data unit to the tail of the task distributor queue;
(7) the task analyzer analyzes and calculates the supporting capacity of each node according to the task execution results and notifies the task distributor of it;
(8) the task analyzer aggregates and analyzes the number of tasks that failed processing and raises an alarm through the monitoring center.
2. The distributed high-availability task scheduling method of claim 1, wherein the scheduling task data of the scheduling center is stored in a zookeeper cluster, and the scheduling task data is coordinated and recovered through the zookeeper cluster.
3. The distributed high-availability task scheduling method of claim 1, wherein the task data unit is executed by the service processing module, which feeds the execution result back to the coordination processor.
4. The distributed high-availability task scheduling method of claim 1, wherein a retry strategy is planned according to the supporting capacity of each node and the failure reason of each failed task data unit, and the failed task data unit is redistributed for processing.
5. A distributed high-availability task scheduling system, characterized by comprising a scheduling center system, a service system and a zookeeper cluster; the scheduling center system comprises a task distributor, a coordination processor, a thread manager and a task analyzer; the service system comprises a plurality of service processing modules, which obtain task data units to be processed from the scheduling center system through the exposed task data unit processing interface and execute them; the zookeeper cluster is used for storing the files of the scheduling center system to ensure the consistency of the scheduling center system's task data;
the thread manager is used for configuring the number of scheduling tasks, calculating the number of threads allocated to the scheduling tasks, and starting an isolation thread pool;
the coordination processor is used for obtaining the task data units, sending the task data units to the task distributor, and sending the task execution results to the task analyzer; the task distributor is used for assigning tasks to the coordination processor for execution; and the task analyzer is used for analyzing and calculating the supporting capacity of each node, notifying the task distributor, and aggregating and analyzing the number of tasks that failed processing.
6. The distributed highly available task scheduling system of claim 1, wherein the task analyzer is connected to the monitoring center, and feeds back the number of tasks that fail to be processed to the monitoring center, and the monitoring center sends out alarm information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011514535.2A CN112527488A (en) | 2020-12-21 | 2020-12-21 | Distributed high-availability task scheduling method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011514535.2A CN112527488A (en) | 2020-12-21 | 2020-12-21 | Distributed high-availability task scheduling method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112527488A true CN112527488A (en) | 2021-03-19 |
Family
ID=75001967
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011514535.2A Pending CN112527488A (en) | 2020-12-21 | 2020-12-21 | Distributed high-availability task scheduling method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112527488A (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102455933A (en) * | 2010-10-22 | 2012-05-16 | 深圳市科陆电子科技股份有限公司 | Method for increasing multi-tasking efficiency through thread management |
CN103473129A (en) * | 2013-09-18 | 2013-12-25 | 柳州市博源环科科技有限公司 | Multi-task queue scheduling system with scalable number of threads and implementation method thereof |
CN107832129A (en) * | 2017-10-24 | 2018-03-23 | 华中科技大学 | A kind of dynamic task scheduling optimization method of Based on Distributed stream calculation system |
CN108268314A (en) * | 2016-12-31 | 2018-07-10 | 北京亿阳信通科技有限公司 | A kind of method of multithreading task concurrent processing |
CN109933611A (en) * | 2019-02-22 | 2019-06-25 | 深圳达普信科技有限公司 | A kind of adaptive collecting method and system |
CN110362390A (en) * | 2019-06-06 | 2019-10-22 | 银江股份有限公司 | A kind of distributed data integrated operations dispatching method and device |
CN110798339A (en) * | 2019-10-09 | 2020-02-14 | 国电南瑞科技股份有限公司 | Task disaster tolerance method based on distributed task scheduling framework |
CN110865798A (en) * | 2018-08-28 | 2020-03-06 | 中国移动通信集团浙江有限公司 | Thread pool optimization method and system |
CN111124806A (en) * | 2019-11-25 | 2020-05-08 | 山东鲁能软件技术有限公司 | Equipment state real-time monitoring method and system based on distributed scheduling task |
US20200250042A1 (en) * | 2019-01-31 | 2020-08-06 | Rubrik, Inc. | Distributed streaming database restores |
CN111858062A (en) * | 2020-07-27 | 2020-10-30 | 中国平安财产保险股份有限公司 | Evaluation rule optimization method, service evaluation method and related equipment |
Non-Patent Citations (1)
Title |
---|
Yu Xuanjie et al.: "Banking Big Data Applications" (《银行大数据应用》), China Machine Press, pages: 82 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102521044B (en) | Distributed task scheduling method and system based on messaging middleware | |
CN109857558A (en) | A kind of data flow processing method and system | |
CN107193539B (en) | Multithreading concurrent processing method and multithreading concurrent processing system | |
CN102262564A (en) | Thread pool structure of video monitoring platform system and realizing method | |
CN112559159A (en) | Task scheduling method based on distributed deployment | |
CN110611707B (en) | Task scheduling method and device | |
CN104461752A (en) | Two-level fault-tolerant multimedia distributed task processing method | |
CN110928655A (en) | Task processing method and device | |
CN102479113A (en) | Abnormal self-adapting processing method and system | |
CN109343939A (en) | A kind of distributed type assemblies and parallel computation method for scheduling task | |
CN111459642B (en) | Fault processing and task processing method and device in distributed system | |
CN101424941B (en) | Control implementing method and system | |
CN110727508A (en) | Task scheduling system and scheduling method | |
CN112631764A (en) | Task scheduling method and device, computer equipment and computer readable medium | |
CN111459641A (en) | Cross-machine-room task scheduling and task processing method and device | |
CN102088719A (en) | Method, system and device for service scheduling | |
CN111443720A (en) | Robot scheduling method and device | |
CN104484228A (en) | Distributed parallel task processing system based on Intelli-DSC (Intelligence-Data Service Center) | |
CN110798339A (en) | Task disaster tolerance method based on distributed task scheduling framework | |
CN112527488A (en) | Distributed high-availability task scheduling method and system | |
CN103514036A (en) | Scheduling system and method for event trigger and batch processing | |
CN103326880A (en) | Genesys calling system high-availability cloud computing system and method | |
CN109829005A (en) | A kind of big data processing method and processing device | |
CN111651278B (en) | Dynamic reconstruction method and platform based on software radar | |
CN117112121A (en) | Distributed task processing system, method, apparatus and computer program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||