CN111459639A - Distributed task management platform and method supporting global multi-machine-room deployment - Google Patents
Distributed task management platform and method supporting global multi-machine-room deployment
- Publication number
- CN111459639A (application number CN202010257724.XA)
- Authority
- CN
- China
- Prior art keywords
- task
- task management
- machine
- machine room
- tasks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a distributed task management platform and method supporting global multi-machine-room deployment. The platform comprises a task executor, a task scheduler, a registration center, a task management server and a task control foreground. The task executor executes batch processing tasks, runs inside the application service process on a plurality of nodes, and reports its task execution state to the task management server through messages while the tasks run. The task scheduler performs fragment scheduling of the tasks. The registration center records the online and offline states of the task executor nodes and notifies other components of changes, triggers fragment scheduling, and stores various information. The task management server is responsible for task information management and for manually triggering task execution. The task control foreground is deployed in the central machine room and calls the task management server of each machine room to manage that machine room's tasks, so that a single console manages distributed tasks across multiple machine rooms. The beneficial effect of the invention is that the load on client nodes is reduced.
Description
Technical Field
The invention relates to the field of Internet technology, and in particular to a distributed task management platform and method supporting global multi-machine-room deployment.
Background
Business systems often need timed batch processing, similar to crontab on Linux. These batch processing tasks depend heavily on the data of the business system. In addition, for scenarios that require strong data consistency, the execution state of the tasks needs to be fed back and displayed in real time, and high availability and failover of the tasks must be guaranteed.
Several open-source frameworks address this need, the best known being elastic-job-lite, open-sourced by Dangdang. The existing problems are as follows. Current open-source middleware for distributed task management struggles to meet the need to manage globally deployed multi-machine-room tasks from a single platform. To support multiple machine rooms, a separate task management system has to be deployed in each cluster, which complicates global multi-machine-room task management, monitoring, task rule distribution and so on. Moreover, the mainstream open-source solution elastic-job-lite implements job time scheduling, fragment scheduling, job execution, log recording and so on in a rich-client form, which overloads the client and significantly affects the stability and scalability of the business system.
Disclosure of Invention
To overcome the above defects of the prior art, the invention provides a distributed task management platform and method supporting global multi-machine-room deployment that reduce the load on client nodes.
To achieve this purpose, the invention adopts the following technical solution:
a distributed task management platform supporting global multi-machine-room deployment comprises a task executor, a task scheduler, a registration center, a task management server and a task control foreground;
the task executor executes batch processing tasks, runs inside the application service process on a plurality of nodes, and reports its task execution state to the task management server through messages while executing the batch processing tasks;
the task scheduler performs fragment scheduling of the task, namely determining which executor executes each fragment;
the registration center is responsible for recording the online and offline states of the task executor nodes and notifying other components of changes, triggering fragment scheduling, and storing various information;
the task management server is responsible for task information management and for manually triggering task execution; in addition, the task management server consumes the messages sent by the task executor nodes through the message middleware Kafka, records task execution states and events to the database, and raises alarms for failed tasks through the alarm center;
the task control foreground is deployed in the central machine room and calls the task management server of each machine room to manage that machine room's tasks, so that a single console manages distributed tasks across multiple machine rooms.
The platform manages distributed tasks deployed in multiple machine rooms and separates task fragment scheduling from task execution. The task client concentrates on executing tasks and reporting execution state, while fragment scheduling, alarming, log recording and so on are decoupled from the client through messages and handed asynchronously to the server, which greatly reduces the load on client nodes.
Preferably, the task scheduler is deployed together with the task management server and also monitors the online and offline states of the task executor nodes on the registration center, and the fragment scheduling policy is based on an average allocation algorithm.
The invention also provides an implementation method of the distributed task management platform supporting global multi-machine room deployment, which specifically comprises the following steps:
(1) in each machine room worldwide where the platform is to be deployed, deploying the basic components the platform depends on;
(2) in each machine room worldwide where the platform is to be deployed, deploying a highly available registration center;
(3) in each machine room worldwide where the platform is to be deployed, deploying a task scheduler and a task management server;
(4) deploying a task control foreground in the central machine room, wherein the task control foreground is a UI (user interface) for managing and operating tasks, is configured with the addresses of the task management servers of the different machine rooms, and calls the service of each machine room to manage and control the tasks;
(5) integrating tasks into a Java application through the provided Java client SDK using Spring Boot annotations, with developers implementing their own batch processing business logic by extension.
Preferably, in step (1), the basic components on which the whole platform depends are the high-performance message middleware Kafka and a database for recording task execution states and events.
Preferably, in step (2), the registration center is implemented with etcd, ZooKeeper, or a custom implementation as required; if ZooKeeper is used, at least 3 nodes are deployed, in an odd number.
Preferably, in step (3), the two components may optionally be deployed in the same JVM process on the same server, and a corresponding alarm center is selected according to the company's circumstances; when a job executes abnormally, an alarm is raised immediately by telephone, SMS, DingTalk or other means.
Preferably, in step (5), after the service has integrated a task, the task control foreground can be used to view job information and fragment scheduling status, modify the job's scheduled time in real time, and query execution information and state.
The invention has the following beneficial effects: the task client concentrates on executing tasks and reporting execution state, while fragment scheduling, alarming, log recording and so on are decoupled from the client through messages and handed asynchronously to the server, which greatly reduces the load on client nodes.
Drawings
FIG. 1 is a system framework diagram of the present invention;
FIG. 2 is a schematic diagram of the fragment scheduling policy.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
In the embodiment shown in FIG. 1, a distributed task management platform supporting global multi-machine-room deployment includes a task executor, a task scheduler, a registration center, a task management server, and a task control foreground.
The task executor is the component that actually executes batch processing tasks. A service integrates it through the Java client SDK provided by the invention, and it runs inside the application service process. There are a plurality of task executor nodes, which are the nodes on which the application service is deployed. While executing batch processing tasks, the task executor reports its task execution state to the task management server through messages, using Kafka as the message middleware.
The task scheduler performs fragment scheduling of the task, namely determining which executor executes each fragment. It can be deployed together with the task management server, and it also monitors the online and offline states of the task executor nodes on the registration center. The fragment scheduling policy is based on an average allocation algorithm, as shown in FIG. 2: if there are three execution nodes and the whole task is divided into 7 fragments, fragment scheduling divides the 7 fragments by the number of nodes, 3, so each node is allocated 2 fragments in sequence; the remainder is 1, and the remaining fragment is allocated to node 1. The final scheduling result is: node 1 (1, 2, 7), node 2 (3, 4), node 3 (5, 6).
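The average allocation described above can be illustrated with a short sketch. This is not the patented implementation, only a minimal Python rendering of the worked example; the function name `allocate_fragments` is hypothetical, and giving leftover fragments one each to the lowest-numbered nodes is an assumption that extends the single case the text describes (a remainder of 1 going to node 1).

```python
def allocate_fragments(fragment_count: int, node_count: int) -> dict[int, list[int]]:
    """Evenly assign fragments 1..fragment_count to nodes 1..node_count."""
    if node_count <= 0:
        raise ValueError("at least one executor node is required")

    base, remainder = divmod(fragment_count, node_count)
    allocation: dict[int, list[int]] = {}
    next_fragment = 1

    # First pass: every node receives `base` consecutive fragments.
    for node in range(1, node_count + 1):
        allocation[node] = list(range(next_fragment, next_fragment + base))
        next_fragment += base

    # Second pass (assumption): leftover fragments go to the
    # lowest-numbered nodes, one fragment each.
    for offset in range(remainder):
        allocation[offset + 1].append(next_fragment + offset)

    return allocation


print(allocate_fragments(7, 3))  # {1: [1, 2, 7], 2: [3, 4], 3: [5, 6]}
```

With 7 fragments and 3 nodes the sketch reproduces the result given for FIG. 2.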
The registration center is responsible for recording the online and offline states of the task executor nodes and notifying other components of changes, triggering fragment scheduling, and storing various information. Almost all other modules interact with the registration center, which is implemented with ZooKeeper in this embodiment.
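How a node going online or offline could trigger a new round of fragment scheduling can be sketched as follows. The in-memory `InMemoryRegistry` is a hypothetical stand-in for ZooKeeper and does not use any ZooKeeper API; the class names, node identifiers, and the callback shape are all assumptions made for illustration.

```python
from typing import Callable


class InMemoryRegistry:
    """Hypothetical stand-in for the registration center."""

    def __init__(self) -> None:
        self._online: set[str] = set()
        self._listeners: list[Callable[[list[str]], None]] = []

    def watch(self, listener: Callable[[list[str]], None]) -> None:
        self._listeners.append(listener)

    def node_online(self, node_id: str) -> None:
        self._online.add(node_id)
        self._notify()

    def node_offline(self, node_id: str) -> None:
        self._online.discard(node_id)
        self._notify()

    def _notify(self) -> None:
        # Every membership change is pushed to all watchers.
        snapshot = sorted(self._online)
        for listener in self._listeners:
            listener(snapshot)


class Scheduler:
    """Recomputes an even fragment assignment on each membership change."""

    def __init__(self, fragment_count: int) -> None:
        self.fragment_count = fragment_count
        self.assignment: dict[str, list[int]] = {}

    def on_membership_change(self, online_nodes: list[str]) -> None:
        self.assignment = {node: [] for node in online_nodes}
        if not online_nodes:
            return
        fragments = iter(range(1, self.fragment_count + 1))
        base, remainder = divmod(self.fragment_count, len(online_nodes))
        for node in online_nodes:
            self.assignment[node] = [next(fragments) for _ in range(base)]
        for node in online_nodes[:remainder]:
            self.assignment[node].append(next(fragments))


registry = InMemoryRegistry()
scheduler = Scheduler(fragment_count=7)
registry.watch(scheduler.on_membership_change)
for name in ("node-1", "node-2", "node-3"):
    registry.node_online(name)
registry.node_offline("node-3")
print(scheduler.assignment)
```

In a real deployment the notification would arrive from the registration center over the network; the point of the sketch is only that rescheduling is driven by membership events rather than by the executors themselves.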
The task management server is responsible for task information management and for manually triggering task execution. In addition, it consumes the messages sent by the task executor nodes through the message middleware Kafka, records task execution states and events to the database, and raises alarms for failed tasks through the alarm center. Because the tasks of every machine room need to be managed, this module is deployed in each machine room.
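The reporting path described above (executor publishes a state message, server consumes it, records it, and alarms on failure) can be sketched with a standard-library queue. The queue stands in for a Kafka topic, and the lists stand in for the database and the alarm center; none of this uses a Kafka client, and the event fields and status names are assumptions.

```python
import json
import queue
from dataclasses import asdict, dataclass


@dataclass
class TaskEvent:
    task: str
    fragment: int
    node: str
    status: str  # assumed values: "STARTED", "SUCCEEDED", "FAILED"


def report(channel: queue.Queue, event: TaskEvent) -> None:
    """Executor side: serialize the event and hand it off without waiting."""
    channel.put(json.dumps(asdict(event)))


def drain(channel: queue.Queue, store: list, alerts: list) -> None:
    """Server side: consume pending messages, record them, flag failures."""
    while True:
        try:
            raw = channel.get_nowait()
        except queue.Empty:
            return
        event = TaskEvent(**json.loads(raw))
        store.append(event)  # stands in for the database write
        if event.status == "FAILED":
            # Stands in for a call to the alarm center.
            alerts.append(
                f"task {event.task} fragment {event.fragment} failed on {event.node}"
            )


topic: queue.Queue = queue.Queue()
records: list = []
alarms: list = []
report(topic, TaskEvent("settlement", 1, "node-1", "SUCCEEDED"))
report(topic, TaskEvent("settlement", 2, "node-2", "FAILED"))
drain(topic, records, alarms)
print(alarms)
```

Because the executor only enqueues a message, the cost of persistence and alarming falls on the server side, which is the decoupling the description attributes to the platform.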
The task control foreground is deployed in the central machine room and calls the task management server of each machine room to manage that machine room's tasks, so that a single console manages distributed tasks across multiple machine rooms.
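The single-console arrangement amounts to a lookup from machine room to that room's task management server, followed by a delegated call. The sketch below assumes in-process stubs in place of remote servers; the class names, room names, and the `trigger` operation are hypothetical and only illustrate the routing.

```python
class ManagementServerStub:
    """Hypothetical per-room task management server; a real one is remote."""

    def __init__(self, room: str) -> None:
        self.room = room
        self.triggered: list[str] = []

    def trigger(self, task: str) -> str:
        self.triggered.append(task)
        return f"{self.room}: triggered {task}"


class Console:
    """Central console that routes operations to the chosen machine room."""

    def __init__(self, servers: dict[str, ManagementServerStub]) -> None:
        self._servers = servers

    def trigger(self, room: str, task: str) -> str:
        if room not in self._servers:
            raise KeyError(f"unknown machine room: {room}")
        return self._servers[room].trigger(task)

    def trigger_everywhere(self, task: str) -> list[str]:
        # Fan the same operation out to every configured machine room.
        return [server.trigger(task) for server in self._servers.values()]


rooms = {
    "room-a": ManagementServerStub("room-a"),
    "room-b": ManagementServerStub("room-b"),
}
console = Console(rooms)
print(console.trigger("room-b", "daily-report"))
print(console.trigger_everywhere("cleanup"))
```

Adding a machine room then only requires registering one more server address with the console, without deploying another console.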
The invention also provides an implementation method of the distributed task management platform supporting global multi-machine room deployment, which specifically comprises the following steps:
(1) In each machine room worldwide where the platform is to be deployed, deploy the basic components the platform depends on. The whole platform depends on the high-performance message middleware Kafka and a database for recording task execution states and events.
(2) In each machine room worldwide where the platform is to be deployed, deploy a highly available registration center. The registration center is implemented with etcd, ZooKeeper, or a custom implementation as required; if ZooKeeper is used, at least 3 nodes are deployed, in an odd number.
(3) In each machine room worldwide where the platform is to be deployed, deploy a task scheduler and a task management server. The two components may optionally be deployed in the same JVM process on the same server. A corresponding alarm center is selected according to the company's circumstances; when a job executes abnormally, an alarm is raised immediately by telephone, SMS, DingTalk or other means.
(4) Deploy a task control foreground in the central machine room. The task control foreground is a UI (user interface) for managing and operating tasks; it is configured with the addresses of the task management servers of the different machine rooms and calls the service of each machine room to manage and control the tasks.
(5) Integrate tasks into a Java application through the provided Java client SDK using Spring Boot annotations; developers implement their own batch processing business logic by extension. After the service has integrated a task, the task control foreground can be used to view job information and fragment scheduling status, modify the job's scheduled time in real time, and query execution information and state.
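The recommendation in step (2) to deploy an odd number of ZooKeeper nodes follows from majority-quorum arithmetic: an ensemble stays available only while more than half of its nodes are up. The short sketch below works through the numbers; the function names are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


for size in (3, 4, 5, 6, 7):
    print(size, quorum_size(size), tolerated_failures(size))
```

A 4-node ensemble tolerates one failure, the same as a 3-node ensemble, and a 6-node ensemble tolerates two, the same as a 5-node ensemble, so the extra even-numbered node adds cost without adding fault tolerance.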
The platform manages distributed tasks deployed in multiple machine rooms and separates task fragment scheduling from task execution. The task client concentrates on executing tasks and reporting execution state, while fragment scheduling, alarming, log recording and so on are decoupled from the client through messages and handed asynchronously to the server, which greatly reduces the load on client nodes.
Claims (7)
1. A distributed task management platform supporting global multi-machine-room deployment is characterized by comprising a task executor, a task scheduler, a registration center, a task management server and a task control foreground;
the task executor executes batch processing tasks, runs inside the application service process on a plurality of nodes, and reports its task execution state to the task management server through messages while executing the batch processing tasks;
the task scheduler performs fragment scheduling of the task, namely determining which executor executes each fragment;
the registration center is responsible for recording the online and offline states of the task executor nodes and notifying other components of changes, triggering fragment scheduling, and storing various information;
the task management server is responsible for task information management and for manually triggering task execution; in addition, the task management server consumes the messages sent by the task executor nodes through the message middleware Kafka, records task execution states and events to the database, and raises alarms for failed tasks through the alarm center;
the task control foreground is deployed in the central machine room and calls the task management server of each machine room to manage that machine room's tasks, so that a single console manages distributed tasks across multiple machine rooms.
2. The distributed task management platform supporting global multi-machine-room deployment according to claim 1, wherein the task scheduler is deployed together with the task management server and also monitors the online and offline states of the task executor nodes on the registration center, and the fragment scheduling policy is based on an average allocation algorithm.
3. An implementation method of a distributed task management platform supporting global multi-machine room deployment is characterized by specifically comprising the following steps:
(1) in each machine room worldwide where the platform is to be deployed, deploying the basic components the platform depends on;
(2) in each machine room worldwide where the platform is to be deployed, deploying a highly available registration center;
(3) in each machine room worldwide where the platform is to be deployed, deploying a task scheduler and a task management server;
(4) deploying a task control foreground in the central machine room, wherein the task control foreground is a UI (user interface) for managing and operating tasks, is configured with the addresses of the task management servers of the different machine rooms, and calls the service of each machine room to manage and control the tasks;
(5) integrating tasks into a Java application through the provided Java client SDK using Spring Boot annotations, with developers implementing their own batch processing business logic by extension.
4. The implementation method of the distributed task management platform supporting global multi-machine-room deployment according to claim 3, wherein in step (1), the basic components on which the whole platform depends are the high-performance message middleware Kafka and a database for recording task execution states and events.
5. The implementation method of the distributed task management platform supporting global multi-machine-room deployment according to claim 3, wherein in step (2), the registration center is implemented with etcd, ZooKeeper, or a custom implementation as required, and if ZooKeeper is used, at least 3 nodes are deployed, in an odd number.
6. The implementation method of the distributed task management platform supporting global multi-machine-room deployment according to claim 3, wherein in step (3), the two components may optionally be deployed in the same JVM process on the same server, a corresponding alarm center is selected according to the company's circumstances, and when a job executes abnormally, an alarm is raised immediately by telephone, SMS, DingTalk or other means.
7. The implementation method of the distributed task management platform supporting global multi-machine-room deployment according to claim 3, wherein in step (5), after the service has integrated a task, the task control foreground is used to view job information and fragment scheduling status, modify the job's scheduled time in real time, and query execution information and state.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010257724.XA CN111459639B (en) | 2020-04-03 | 2020-04-03 | Distributed task management platform and method supporting global multi-machine room deployment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010257724.XA CN111459639B (en) | 2020-04-03 | 2020-04-03 | Distributed task management platform and method supporting global multi-machine room deployment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111459639A (en) | 2020-07-28 |
CN111459639B (en) | 2023-10-20 |
Family
ID=71681077
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010257724.XA Active CN111459639B (en) | 2020-04-03 | 2020-04-03 | Distributed task management platform and method supporting global multi-machine room deployment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111459639B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111858007A (en) * | 2020-07-29 | 2020-10-30 | 广州海鹚网络科技有限公司 | Task scheduling method and device based on message middleware |
CN112527478A (en) * | 2020-11-30 | 2021-03-19 | 成都中科大旗软件股份有限公司 | Method and system for realizing automatic task registration and asynchronous scheduling based on distribution |
CN113485812A (en) * | 2021-07-23 | 2021-10-08 | 重庆富民银行股份有限公司 | Partition parallel processing method and system based on large data volume task |
CN114968504A (en) * | 2021-02-26 | 2022-08-30 | 中国联合网络通信集团有限公司 | Distributed task scheduling method and device and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102710554A (en) * | 2012-06-25 | 2012-10-03 | 深圳中兴网信科技有限公司 | Distributed message system and service status detection method thereof |
CN103324539A (en) * | 2013-06-24 | 2013-09-25 | 浪潮电子信息产业股份有限公司 | Job scheduling management system and method |
US20140298350A1 (en) * | 2013-03-27 | 2014-10-02 | Nec Corporation | Distributed processing system |
CN104536809A (en) * | 2014-11-26 | 2015-04-22 | 上海瀚之友信息技术服务有限公司 | Distributed timing task scheduling system based on client and server system |
CN106201694A (en) * | 2016-07-13 | 2016-12-07 | 北京农信互联科技有限公司 | Configuration method and system for executing timing task under distributed system |
CN106874090A (en) * | 2017-01-23 | 2017-06-20 | 北京思特奇信息技术股份有限公司 | Job scheduling method and system based on cloud system |
CN107766136A (en) * | 2017-09-30 | 2018-03-06 | 南威软件股份有限公司 | A kind of method of task cluster management and running |
CN108958920A (en) * | 2018-07-13 | 2018-12-07 | 众安在线财产保险股份有限公司 | A kind of distributed task dispatching method and system |
CN110569113A (en) * | 2018-06-06 | 2019-12-13 | 海通证券股份有限公司 | Method and system for scheduling distributed tasks and computer readable storage medium |
CN110780869A (en) * | 2019-10-31 | 2020-02-11 | 辽宁振兴银行股份有限公司 | Distributed batch scheduling |
CN110928662A (en) * | 2019-11-28 | 2020-03-27 | 国网信息通信产业集团有限公司 | Distributed timing task scheduler facing micro-service architecture |
- 2020-04-03: CN application CN202010257724.XA, patent CN111459639B (en), status: Active
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111858007A (en) * | 2020-07-29 | 2020-10-30 | 广州海鹚网络科技有限公司 | Task scheduling method and device based on message middleware |
CN112527478A (en) * | 2020-11-30 | 2021-03-19 | 成都中科大旗软件股份有限公司 | Method and system for realizing automatic task registration and asynchronous scheduling based on distribution |
CN112527478B (en) * | 2020-11-30 | 2023-11-07 | 成都中科大旗软件股份有限公司 | Method and system for realizing automatic registration and asynchronous scheduling of tasks based on distribution |
CN114968504A (en) * | 2021-02-26 | 2022-08-30 | 中国联合网络通信集团有限公司 | Distributed task scheduling method and device and storage medium |
CN113485812A (en) * | 2021-07-23 | 2021-10-08 | 重庆富民银行股份有限公司 | Partition parallel processing method and system based on large data volume task |
CN113485812B (en) * | 2021-07-23 | 2023-12-12 | 重庆富民银行股份有限公司 | Partition parallel processing method and system based on large-data-volume task |
Also Published As
Publication number | Publication date |
---|---|
CN111459639B (en) | 2023-10-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111459639A (en) | Distributed task management platform and method supporting global multi-machine-room deployment | |
CN111290834B (en) | Method, device and equipment for realizing high service availability based on cloud management platform | |
CA3168286A1 (en) | Data flow processing method and system | |
CN113742031B (en) | Node state information acquisition method and device, electronic equipment and readable storage medium | |
CN104935672A (en) | High available realizing method and equipment of load balancing service | |
CN109656742B (en) | Node exception handling method and device and storage medium | |
CN111274052A (en) | Data distribution method, server, and computer-readable storage medium | |
CN111209110B (en) | Task scheduling management method, system and storage medium for realizing load balancing | |
CN105630589A (en) | Distributed process scheduling system and process scheduling and execution method | |
CN109802986B (en) | Equipment management method, system, device and server | |
CN112003728B (en) | Kubernetes cluster-based application master and standby implementation method and device | |
CN112910937B (en) | Object scheduling method and device in container cluster, server and container cluster | |
US20150186489A1 (en) | System and method for supporting asynchronous invocation in a distributed data grid | |
CN111427670A (en) | Task scheduling method and system | |
CN111414241A (en) | Batch data processing method, device and system, computer equipment and computer readable storage medium | |
CN114138434A (en) | Big data task scheduling system | |
CN111897643A (en) | Thread pool configuration system, method, device and storage medium | |
JP2006285443A (en) | Object relief system and method | |
CN113765690A (en) | Cluster switching method, system, device, terminal, server and storage medium | |
CN113238849A (en) | Timed task processing method and device, storage medium and electronic equipment | |
CN115391058B (en) | SDN-based resource event processing method, resource creation method and system | |
CN117201278A (en) | Method for realizing disaster recovery high-availability scene of primary and backup cloud primary application in information creation environment | |
CN115426356A (en) | Distributed timed task lock update control execution method and device | |
CN113434316A (en) | Function integration method, device, equipment and storage medium based on redis plug-in | |
CN115550371B (en) | Pod scheduling method and system based on Kubernetes and cloud platform |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 22nd floor, block a, Huaxing Times Square, 478 Wensan Road, Xihu District, Hangzhou, Zhejiang 310000. Applicant after: Hangzhou Xiaoying Innovation Technology Co.,Ltd. Address before: 16 / F, HANGGANG Metallurgical Science and technology building, 294 Tianmushan Road, Xihu District, Hangzhou City, Zhejiang Province, 310012. Applicant before: HANGZHOU QUWEI SCIENCE & TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | |