CN112162841A - Distributed scheduling system, method and storage medium for big data processing - Google Patents
- Publication number
- CN112162841A (application CN202011069582.0A)
- Authority
- CN
- China
- Prior art keywords
- workflow
- leader
- task
- follower
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F16/24568—Data stream processing; Continuous queries
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/284—Relational databases
- G06F9/5016—Allocation of resources to service a request, the resource being the memory
- G06F9/505—Allocation of resources to service a request, the resource being a machine, considering the load
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G06F9/546—Message passing systems or structures, e.g. queues
- G06F2209/548—Queue (indexing scheme relating to G06F9/54)
Abstract
The invention discloses a distributed scheduling system, method and storage medium for big data processing. The system comprises: a scheduling center module, responsible for the dependency configuration and job development of workflows; a leader module, which serves as the task-flow segmentation and distribution node in the cluster, segments the workflow configured by the scheduling center according to its dependency relationships, and sends the segmented task nodes to the follower nodes; a follower module, which executes the specific computing tasks distributed by the leader module, submits task results and stores task execution logs; a coordinator module, which periodically takes tasks to be executed out of the database and load-balances the leader modules with a Round-Robin algorithm according to the current load of each leader module; a task queue module; and a metadata module. The invention takes dependencies between tasks into account, avoids downstream tasks running on empty data when an upstream task times out or runs empty, and benefits the data flow as a whole.
Description
Technical Field
The invention belongs to the technical field of big data computing task scheduling, and particularly relates to a distributed scheduling system and method for big data processing and a storage medium.
Background
With the rapid development of data technology, modern enterprises are moving from the IT era to the DT era, and whether they choose a public cloud or a self-built data center, a big data platform has become part of their infrastructure. Big data platforms have gradually iterated from the initial single execution engine, MapReduce, to an era of multiple execution engines such as MapReduce, Spark and Flink. As enterprises mine the value of their data, thousands of computing tasks are generated, and how to orchestrate and schedule these tasks, and how to construct the intricate network of dependencies between them, has become very important.
For example, patent document CN107506381A discloses a big data distributed scheduling and analysis method, system, device and storage medium: it describes a self-built big data distributed scheduling and analysis system whose core function is to encapsulate the technical process of big data handling, with some built-in scheduling capability. However, it provides no method for handling the dependencies and orchestration of complicated task flows in big data scenarios, the system as a whole has single points of failure, and no highly available fault-tolerance strategy is considered, so the following problems arise:
(1) The task distribution mode does not consider dependencies between tasks. Once an upstream task times out or runs empty, downstream tasks are likely to run on empty data, which harms the data flow as a whole and increases the burden on developers.
(2) The server that submits or executes computing tasks is a single point of failure; once it goes down, computing tasks cannot be triggered and the computing logic is affected.
Therefore, it is necessary to develop a distributed scheduling system, method and storage medium for big data processing.
Disclosure of Invention
In order to solve the above problems, the present invention provides a distributed scheduling system, method and storage medium for big data processing.
In a first aspect, the present invention provides a big data processing-oriented distributed scheduling system, including:
the scheduling center module, which is responsible for the dependency configuration and job development of workflows, and persists each configured workflow through an API into a to-be-executed workflow table in the relational database;
the leader module, which serves as the task-flow segmentation and distribution node in the cluster, segments the workflow configured by the scheduling center according to its dependency relationships, and sends the segmented task nodes to the follower nodes;
the follower module, also called the executor, which executes the specific computing tasks distributed by the leader module, submits task results, and stores task execution logs;
the coordinator module, which periodically takes tasks to be executed out of the database, and load-balances the leader modules with a Round-Robin algorithm according to the current load of each leader module;
the task queue module, which is a message queue comprising a workflow topic, a task topic and a task result topic, and is used to realize task dependencies between workflows; and
the metadata module, which comprises two databases, a relational database and a distributed in-memory database: the relational database persistently stores workflow execution records, and the distributed in-memory database takes workflow-related metadata out of the relational database and loads it into memory.
In a second aspect, the present invention provides a distributed scheduling method for big data processing, which adopts the distributed scheduling system for big data processing of the present invention, the method comprising the following steps:
receiving the dependency configuration and job development of a workflow, and persisting the configured workflow through an API into a to-be-executed workflow table in the relational database;
segmenting the workflow configured by the scheduling center according to its dependency relationships, and sending the segmented task nodes to follower nodes;
executing the specific computing tasks distributed by the leader module, submitting task results and storing task execution logs; and
periodically taking tasks to be executed out of the database, and load-balancing the leader modules with a Round-Robin algorithm according to the current load of each leader module.
Further, the coordinator service periodically scans the to-be-executed workflow table in the relational database to obtain commands to be executed, periodically requests the load information of the Leader cluster from the ZooKeeper cluster, and uses a Round-Robin algorithm to assign each workflow to a Leader according to the current CPU and memory margin of each Leader machine; finally, the workflow tagged with its Leader is sent to the process_instance topic in the message queue, where it waits for that Leader to consume the topic and execute the workflow.
Further, the Leader consumes the process_instance topic in the message queue and judges from the leader_host_name field of the message whether it should execute the workflow; if so, it splits the workflow into several computing tasks according to the workflow's dependency relationships and estimates the computing resources each task needs; it obtains the load information of each machine in the Follower cluster from the ZooKeeper cluster and uses a Round-Robin algorithm to balance which executor receives each task; each split-out computing task is tagged with its executor's follower_host_name and sent to the task_instance topic in the message queue; while the workflow execution thread is suspended, the Leader consumes the data that Followers return through the task_instance_result topic of the message queue and updates the workflow's execution result; once the execution state of the whole workflow reaches a final state, the execution result is persisted into the relational database.
Further, the Follower consumes the data of the task_instance topic in the message queue and matches the follower_host_name to the corresponding executor for execution; after a task finishes, its execution result is written back to the task_instance_result topic in the message queue to wait for the Leader to consume it, and once the Leader has written the completed task results back to the relational database, the execution of the whole task flow is complete.
Further, after the system starts, the Leaders and Followers register themselves under the leader_messages and follower_messages znodes on ZooKeeper, report the CPU and memory information of their own machines, and maintain heartbeats; every Leader and Follower watches these znodes, and once a Leader or Follower is found to be down, a workflow fault-tolerance flow is entered, which comprises a Leader fault-tolerance flow and a Follower fault-tolerance flow.
Further, the Leader fault-tolerance flow is specifically as follows:
Each machine in the Leader cluster watches the Leader znodes on the ZooKeeper cluster. Once a Leader is found to be down, a distributed-lock mechanism based on the ZooKeeper cluster is triggered: one surviving Leader acquires the distributed lock, triggers the workflow fault-tolerance logic, inserts the information of the workflows that need fault tolerance into a fault-tolerance command table in the relational database, and then takes over those workflows, completing the Leader's distributed fault-tolerance process.
Further, the Follower fault-tolerance flow is specifically as follows:
Each machine in the Follower cluster registers itself as a znode on the ZooKeeper cluster. If a Follower executing tasks goes down, the Leader's watch mechanism is triggered: all tasks running on the downed Follower are stopped, the Leader marks the workflow as being in a fault-tolerant state, and a surviving Follower is reselected as the executor of the workflow's remaining tasks.
In a third aspect, the present invention provides a storage medium storing a computer-readable program which, when invoked, executes the steps of the distributed scheduling method for big data processing according to the present invention.
The invention has the following advantages:
(1) The task distribution mode takes dependencies between tasks into account, which avoids downstream tasks running on empty data when an upstream task times out or runs empty, benefits the data flow as a whole, and reduces the burden on developers.
(2) A fault-tolerance strategy is designed: when a single point of failure occurs anywhere in the system, the corresponding workflows are taken over and continue to execute.
(3) The scheduling cluster as a whole supports linear scaling of both the Leader nodes and the Follower (worker) nodes.
Drawings
Fig. 1 is an architecture diagram of a Follower executor node in this embodiment;
FIG. 2 is the overall architecture diagram of this embodiment;
FIG. 3 is a flowchart of workflow execution in this embodiment;
FIG. 4 is a schematic diagram of Leader fault tolerance in this embodiment;
Fig. 5 is a schematic diagram of Follower fault tolerance in this embodiment.
Detailed Description
The invention will be further explained with reference to the drawings.
In this embodiment, a distributed scheduling system for big data processing includes:
The scheduling center module provides a scheduling center Web interface that gives users a simple and convenient visual window for configuring tasks, together with monitoring, operation and maintenance functions for jobs on the scheduling platform. The scheduling center module is responsible for the dependency configuration and job development of workflows, and persists each configured workflow through an API into a to-be-executed workflow table in the relational database (DB); workflow dependencies are described in JSON, with the ids of each job's predecessor and successor jobs stored in that job's JSON data.
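The JSON dependency encoding described above can be sketched as follows. This is a minimal illustration: the field names (`pre_job_ids`, `post_job_ids`, `workflow_id`) are assumptions, since the patent does not give the schema.

```python
import json

# Hypothetical JSON for a three-job workflow: each job stores the ids of its
# predecessor and successor jobs, which together encode the dependency DAG.
workflow_json = """
{
  "workflow_id": "wf_demo",
  "jobs": [
    {"job_id": 1, "name": "ingest", "pre_job_ids": [],  "post_job_ids": [2]},
    {"job_id": 2, "name": "clean",  "pre_job_ids": [1], "post_job_ids": [3]},
    {"job_id": 3, "name": "mine",   "pre_job_ids": [2], "post_job_ids": []}
  ]
}
"""

def root_jobs(workflow):
    """Jobs with no predecessors: the entry points from which execution starts."""
    return [j["job_id"] for j in workflow["jobs"] if not j["pre_job_ids"]]

wf = json.loads(workflow_json)
print(root_jobs(wf))  # [1]
```

A scheduler can start all root jobs immediately and release each remaining job once all of its `pre_job_ids` have finished.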
The leader module serves as the task-flow segmentation and distribution node in the cluster; it segments the workflow configured by the scheduling center according to its dependency relationships and sends the segmented task nodes to the follower nodes.
The follower module, also called the executor, executes the specific computing tasks distributed by the leader module, submits task results, and stores task execution logs.
The coordinator module periodically takes tasks to be executed out of the database and load-balances the leader modules with a Round-Robin algorithm according to the current load of each leader module.
The task queue module is a message queue (MQ) comprising a workflow topic, a task topic and a task result topic.
The metadata module comprises two databases, namely a relational database and a distributed memory database, wherein the relational database is used for persistently storing execution records of the workflow; the distributed memory database is used for taking out the metadata related to the workflow from the relational database and loading the metadata into the memory, so that the delay in the workflow running process is reduced, and the operation efficiency is improved.
The system is a distributed, multi-execution-engine task scheduling system aimed at the complex computing-task scenarios of big data platforms. Following a decentralized design, it segments user-defined computing-task workflows and distributes the segmented computing tasks to Followers in the cluster for execution. Task dependencies between workflows are realized with a message queue, giving the system the expressive power of complex dependency DAGs. High availability of the leader and follower modules is built on the distributed coordination service ZooKeeper, so the whole system can scale linearly during operation.
In a data platform, a complete data processing task comprises four stages: data access, data cleaning, data mining and analysis-result storage; that is, a complete data processing workflow involves several computing engines. As shown in fig. 1, the Follower is not itself the node that runs a computing task; instead, a gateway node of the big data platform, i.e. the node from which computing tasks are submitted, serves as the Follower node. The Follower node carries the gateway clients of the computing engines, such as the Sqoop client, Spark client, Flink client and Hive client. Because it is not directly the node that runs the computation, it can schedule computing tasks across multiple execution engines, decouples the scheduling platform from the computing platform, and avoids resource competition.
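A gateway Follower of this kind mainly needs to pick the right client for each task's engine. The sketch below assumes a simple mapping from engine name to submit command; the command templates are illustrative stand-ins, not taken from the patent (real clusters pass many more flags).

```python
# Map of engine name to a submit-command template on the gateway node.
# Templates are illustrative; a production gateway would add queue, resource
# and configuration flags per engine.
ENGINE_COMMANDS = {
    "sqoop": "sqoop import {args}",
    "hive":  "hive -f {args}",
    "spark": "spark-submit {args}",
    "flink": "flink run {args}",
}

def build_submit_command(engine, args):
    """Build the shell command the gateway would run for a task of this engine."""
    if engine not in ENGINE_COMMANDS:
        raise ValueError("no client installed for engine: " + engine)
    return ENGINE_COMMANDS[engine].format(args=args)

print(build_submit_command("spark", "etl_job.py"))  # spark-submit etl_job.py
```

Keeping this mapping on the gateway is what decouples the scheduler from the compute engines: adding an engine means installing its client and adding one entry.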
In this embodiment, the distributed scheduling method for big data processing adopts the distributed scheduling system for big data processing described in this embodiment, and comprises the following steps:
receiving the dependency configuration and job development of a workflow, and persisting the configured workflow through an API into a to-be-executed workflow table in the relational database;
segmenting the workflow configured by the scheduling center according to its dependency relationships, and sending the segmented task nodes to follower nodes;
executing the specific computing tasks distributed by the leader module, submitting task results and storing task execution logs; and
periodically taking tasks to be executed out of the database, and load-balancing the leader modules with a Round-Robin algorithm according to the current load of each leader module.
As shown in fig. 2 and 3, the specific flow of the method is as follows:
The scheduling center module is responsible for the dependency configuration and job development of workflows; each configured workflow is persisted through the API into the to-be-executed workflow table in the relational database. Workflow dependencies are described in JSON, with the ids of each job's predecessor and successor jobs stored in that job's JSON data.
The coordinator service (i.e., the coordinator) periodically scans the to-be-executed workflow table in the relational database to obtain commands to be executed, periodically requests the load information of the Leader cluster from ZooKeeper (ZK), and uses a Round-Robin algorithm to assign each workflow to a Leader according to the current CPU and memory margin of each Leader machine. Finally, the workflow tagged with its Leader is sent to the process_instance topic in the message queue, where it waits for that Leader to consume the topic and execute the workflow.
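One plausible reading of "Round-Robin according to the CPU and memory margin" is to rotate over only those Leaders that still have headroom. The sketch below follows that reading; the load threshold and message fields are assumptions.

```python
from itertools import cycle

def assign_workflows(workflows, leader_loads, max_load=0.8):
    """Round-robin workflows over leaders whose load is below max_load.
    leader_loads: {host: load in [0, 1]}. The 0.8 cutoff is an assumption;
    the patent only says assignment respects CPU and memory margin."""
    eligible = [h for h, load in leader_loads.items() if load < max_load]
    if not eligible:
        raise RuntimeError("no leader has spare capacity")
    rr = cycle(eligible)
    # Tag each workflow with its leader, like the patent's leader_host_name field.
    return [{"workflow": wf, "leader_host_name": next(rr)} for wf in workflows]

msgs = assign_workflows(["wf1", "wf2", "wf3"],
                        {"leader1": 0.2, "leader2": 0.9, "leader3": 0.4})
print([m["leader_host_name"] for m in msgs])  # ['leader1', 'leader3', 'leader1']
```

Each resulting message would then be published to the process_instance topic; only the Leader whose host name matches the tag executes the workflow.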
The Leader consumes the process_instance topic in the message queue and judges from the leader_host_name field of the message whether it should execute the workflow. If so, it splits the workflow into several computing tasks according to the workflow's dependency relationships and estimates the computing resources each task needs. It obtains the load information of each machine in the Follower cluster from the ZooKeeper cluster and uses a Round-Robin algorithm to balance which executor receives each task; each split-out computing task is tagged with its executor's follower_host_name and sent to the task_instance topic in the message queue. While the workflow execution thread is suspended, the Leader consumes the data that Followers return through the task_instance_result topic of the message queue and updates the workflow's execution result. Once the execution state of the whole workflow reaches a final state, the execution result is persisted into the relational database.
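Splitting a workflow into tasks ordered by their dependency relationships is a topological sort. The patent does not name an algorithm; the sketch below uses Kahn's algorithm as one standard way to do it.

```python
from collections import defaultdict, deque

def split_workflow(pre_deps):
    """Order tasks so every task runs after all of its predecessors.
    pre_deps: {task: [predecessor tasks]}. Kahn's topological sort."""
    indegree = {t: len(pres) for t, pres in pre_deps.items()}
    successors = defaultdict(list)
    for task, pres in pre_deps.items():
        for p in pres:
            successors[p].append(task)
    # Start from the tasks with no unmet dependencies (sorted for determinism).
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for s in successors[t]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if len(order) != len(pre_deps):
        raise ValueError("dependency cycle detected")
    return order

deps = {"ingest": [], "clean": ["ingest"], "mine": ["clean"], "store": ["mine"]}
print(split_workflow(deps))  # ['ingest', 'clean', 'mine', 'store']
```

Tasks that end up adjacent with no path between them can be dispatched to Followers in parallel; the cycle check guards against a misconfigured workflow.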
The Follower consumes the data of the task_instance topic in the message queue and matches the follower_host_name to the corresponding executor for execution. After a task finishes, its execution result is written back to the task_instance_result topic in the message queue to wait for the Leader to consume it. Once the Leader has written the completed task results back to the relational database, the execution of the whole task flow is complete.
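The Follower's consume-execute-report loop can be sketched with in-memory queues standing in for the task_instance and task_instance_result topics; a real Follower would consume a message-queue topic and run the task through an engine client.

```python
import queue

def follower_loop(host, task_q, result_q, run_task):
    """Drain the task queue: run tasks addressed to this host, publish results.
    queue.Queue stands in for the MQ topics; message field names are assumptions."""
    while True:
        try:
            msg = task_q.get_nowait()
        except queue.Empty:
            return
        if msg["follower_host_name"] != host:
            continue  # in a real MQ, consumer-group routing would handle this
        ok = run_task(msg["task"])
        result_q.put({"task": msg["task"],
                      "status": "success" if ok else "failed"})

task_q, result_q = queue.Queue(), queue.Queue()
task_q.put({"task": "t1", "follower_host_name": "follower1"})
task_q.put({"task": "t2", "follower_host_name": "follower2"})
follower_loop("follower1", task_q, result_q, run_task=lambda t: True)
res = result_q.get()
print(res)  # {'task': 't1', 'status': 'success'}
```

The Leader then consumes `result_q` (the task_instance_result topic) to update the workflow state.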
As a scheduling system with distributed capabilities, fault-tolerant design is a core concern of the whole system, because distributed systems are inherently unreliable. Distributed fault tolerance of the whole scheduling system is realized on top of ZooKeeper. After the system starts, the Leaders and Followers register themselves under the leader_mechs and follower_mechs znodes on ZooKeeper, report the CPU and memory information of their own machines, and maintain heartbeats. Every Leader and Follower watches these znodes, and once a Leader or Follower is found to be down, a workflow fault-tolerance flow is entered, which comprises a Leader fault-tolerance flow and a Follower fault-tolerance flow.
As shown in fig. 4, in this embodiment the Leader fault-tolerance flow is specifically as follows:
Each machine in the Leader cluster watches the Leader znodes on ZooKeeper. Once a Leader crash is detected, a distributed-lock mechanism based on ZooKeeper is triggered: one of the surviving Leaders acquires the distributed lock, triggers the workflow fault-tolerance logic, inserts the information of the workflows that need fault tolerance into the fault-tolerance command table in the relational database, and then takes over those workflows, completing the Leader's distributed fault-tolerance process. As shown in fig. 4, after Leader1 goes down, Leader2 acquires the distributed lock, triggers the fault-tolerance logic, inserts the workflow information into the fault-tolerance command table, and takes over the workflows.
In this embodiment, the Follower fault-tolerance flow is specifically as follows:
Each machine in the Follower cluster registers itself as a znode on the ZooKeeper cluster. If a Follower executing tasks goes down, the Leader's watch mechanism is triggered: all tasks running on the downed Follower are stopped, the Leader marks the workflow as being in a fault-tolerant state, and a surviving Follower is reselected as the executor of the workflow's remaining tasks.
As shown in fig. 5, when Follower1 goes down, its tasks are re-executed by Follower2.
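The reassignment step can be sketched as moving the downed Follower's running tasks to a survivor. Picking the least-loaded survivor is our assumption; the patent only says a surviving Follower is reselected.

```python
def failover_tasks(assignments, dead_follower):
    """Reassign a downed follower's running tasks to the least-loaded survivor.
    assignments: {follower_host: [running task ids]}. Returns new assignments."""
    survivors = {h: list(tasks) for h, tasks in assignments.items()
                 if h != dead_follower}
    orphaned = assignments.get(dead_follower, [])
    if survivors and orphaned:
        # Least-loaded survivor takes the orphaned tasks (an assumption).
        target = min(survivors, key=lambda h: len(survivors[h]))
        survivors[target].extend(orphaned)
    return survivors

before = {"follower1": ["t1", "t2"], "follower2": ["t3"]}
print(failover_tasks(before, "follower1"))  # {'follower2': ['t3', 't1', 't2']}
```

In the running system the Leader would also republish the orphaned tasks to the task_instance topic tagged with the new follower_host_name.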
In this embodiment, a storage medium stores a computer-readable program which, when invoked, executes the steps of the distributed scheduling method for big data processing described in this embodiment.
Claims (9)
1. A big data processing-oriented distributed scheduling system, comprising:
the scheduling center module, which is responsible for the dependency configuration and job development of workflows, and persists each configured workflow through an API into a to-be-executed workflow table in the relational database;
the leader module, which serves as the task-flow segmentation and distribution node in the cluster, segments the workflow configured by the scheduling center according to its dependency relationships, and sends the segmented task nodes to the follower nodes;
the follower module, also called the executor, which executes the specific computing tasks distributed by the leader module, submits task results, and stores task execution logs;
the coordinator module, which periodically takes tasks to be executed out of the database, and load-balances the leader modules with a Round-Robin algorithm according to the current load of each leader module;
the task queue module, which is a message queue comprising a workflow topic, a task topic and a task result topic, and is used to realize task dependencies between workflows; and
the metadata module, which comprises two databases, a relational database and a distributed in-memory database: the relational database persistently stores workflow execution records, and the distributed in-memory database takes workflow-related metadata out of the relational database and loads it into memory.
2. A big-data-processing-oriented distributed scheduling method, using the big-data-processing-oriented distributed scheduling system of claim 1, the method comprising the following steps:
receiving workflow dependency configuration and job development, and persisting the configured workflow into the to-be-executed workflow table of the relational database through an API (application programming interface);
segmenting the workflow configured by the scheduling center according to its dependency relationships, and sending the segmented task nodes to follower nodes;
executing the specific computing tasks distributed by the leader module, submitting task results, and storing task execution logs;
and fetching tasks to be executed from the database at regular intervals, and load-balancing them across the leader modules with a Round-Robin algorithm according to the current load of each leader module.
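The Round-Robin dispatch step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Leader` dataclass, the headroom thresholds, and the host names are all assumed for the example.

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass
class Leader:
    host: str
    cpu_free: float   # fraction of CPU headroom (reported via the registry)
    mem_free: float   # fraction of memory headroom

def make_round_robin(leaders, cpu_min=0.2, mem_min=0.2):
    """Round-Robin over the leader ring, skipping overloaded machines."""
    ring = cycle(leaders)
    def pick():
        for _ in range(len(leaders)):
            cand = next(ring)
            if cand.cpu_free >= cpu_min and cand.mem_free >= mem_min:
                return cand
        raise RuntimeError("no leader has enough CPU/memory headroom")
    return pick

leaders = [Leader("leader-1", 0.6, 0.5),
           Leader("leader-2", 0.1, 0.9),   # low CPU headroom: skipped
           Leader("leader-3", 0.7, 0.8)]
pick = make_round_robin(leaders)
assigned = [pick().host for _ in range(4)]
# → ["leader-1", "leader-3", "leader-1", "leader-3"]
```

Keeping per-workflow state out of the picker means any coordinator instance can recreate it from the latest load snapshot.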
3. The big-data-processing-oriented distributed scheduling method of claim 2, wherein: the coordinator service periodically scans the to-be-executed workflow table in the relational database to obtain commands to be executed, periodically requests load information of the Leader cluster from ZooKeeper, and allocates each workflow to a Leader with a Round-Robin algorithm according to the current CPU and memory headroom of each Leader machine; finally, the workflow, tagged with its Leader, is sent to the process_instance topic in the message queue, where it waits for that Leader to consume the topic and execute the workflow.
4. The big-data-processing-oriented distributed scheduling method of claim 3, wherein: a Leader consumes the process_instance topic in the message queue and judges from the leader_host_name field of the message whether it should execute the workflow; if so, it splits the workflow into a plurality of computing tasks according to the workflow's dependency relationships and estimates the computing resources each task needs; it obtains load information of each machine in the Follower cluster from the ZooKeeper cluster and load-balances the tasks across executors with a Round-Robin algorithm; each independent computing task produced by the split is tagged with its executor's follower_host_name and sent to the task_instance topic in the message queue; while the workflow execution thread is suspended, the Leader consumes the data the Followers return through the task_instance_result topic of the message queue and updates the workflow's execution result; and once the execution state of the whole workflow reaches a final state, the workflow's result state is persisted into the relational database.
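The dependency-based split described above is, in essence, a topological ordering of the workflow DAG. A minimal sketch using Kahn's algorithm follows; the task names and the `split_workflow` helper are hypothetical, and the patent does not specify which ordering algorithm is used.

```python
from collections import defaultdict, deque

def split_workflow(deps):
    """Kahn's algorithm: order tasks so each runs after its dependencies.

    deps maps task -> list of upstream tasks it depends on.
    Returns a dispatchable order; raises on a dependency cycle.
    """
    indegree = {t: len(up) for t, up in deps.items()}
    downstream = defaultdict(list)
    for t, ups in deps.items():
        for u in ups:
            downstream[u].append(t)
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for d in sorted(downstream[t]):
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    if len(order) != len(deps):
        raise ValueError("workflow has a dependency cycle")
    return order

# extract -> clean -> aggregate; report needs both clean and aggregate
deps = {"extract": [], "clean": ["extract"],
        "aggregate": ["clean"], "report": ["clean", "aggregate"]}
# split_workflow(deps) → ["extract", "clean", "aggregate", "report"]
```

In the claimed system, each task in this order would additionally carry its estimated resources and its follower_host_name tag before being published to the task topic.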
5. The big-data-processing-oriented distributed scheduling method of claim 4, wherein: after a task finishes executing, its execution result is written back to the task_instance_result topic in the message queue, where it waits for the Leader to consume it; after the Leader writes the executed task's result back to the relational database, execution of the whole task flow is complete.
6. The big-data-processing-oriented distributed scheduling method according to claim 4 or 5, wherein: after the system starts, each Leader and Follower registers itself with the leader_mechs and follower_mechs znodes on ZooKeeper, reports its machine's CPU and memory information, and maintains a heartbeat; each Leader and Follower watches these znodes; once a Leader or Follower is found to be down, the workflow fault-tolerance flow is entered, which comprises a Leader fault-tolerance flow and a Follower fault-tolerance flow.
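The register-and-heartbeat scheme above can be sketched without a live ZooKeeper by letting an in-memory registry stand in for the leader_mechs/follower_mechs znodes (in a real deployment these would be ephemeral znodes watched by every node). All class, field, and host names here are illustrative assumptions.

```python
class Registry:
    """In-memory stand-in for the leader_mechs / follower_mechs znodes."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout   # seconds of silence before a node is "down"
        self.nodes = {}          # host -> role, last heartbeat, load info

    def register(self, host, role, cpu_free, mem_free, now):
        self.nodes[host] = {"role": role, "beat": now,
                            "cpu_free": cpu_free, "mem_free": mem_free}

    def heartbeat(self, host, now):
        self.nodes[host]["beat"] = now

    def down_nodes(self, now):
        """Hosts with a stale heartbeat; these trigger the fault-tolerance flow."""
        return sorted(h for h, info in self.nodes.items()
                      if now - info["beat"] > self.timeout)

reg = Registry(timeout=3.0)
reg.register("leader-1", "leader", 0.5, 0.5, now=0.0)
reg.register("follower-1", "follower", 0.8, 0.7, now=0.0)
reg.heartbeat("leader-1", now=5.0)   # follower-1 misses its heartbeats
# reg.down_nodes(now=5.0) → ["follower-1"]
```

ZooKeeper's ephemeral znodes give the same effect natively: when a session's heartbeats stop, the znode disappears and watchers on the parent are notified.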
7. The big-data-processing-oriented distributed scheduling method of claim 6, wherein the Leader fault-tolerance flow specifically comprises:
each machine of the Leader cluster watches the znode on ZooKeeper; once a Leader crash is detected, a ZooKeeper-based distributed lock mechanism is triggered: one of the surviving Leaders acquires the distributed lock, the workflow fault-tolerance logic fires, the information of the workflows needing fault tolerance is inserted into a fault-tolerant command table in the relational database, and the Leader holding the distributed lock takes over those workflows, completing the Leader's distributed fault-tolerance flow.
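The race among surviving Leaders can be sketched with a process-local lock standing in for the ZooKeeper lock recipe (e.g. kazoo's `Lock`); exactly one contender wins the non-blocking acquire and takes over. Every name below is an illustrative assumption, not the patent's implementation.

```python
import threading

takeover_lock = threading.Lock()   # stands in for the ZooKeeper distributed lock
fault_tolerant_commands = []       # stands in for the fault-tolerant command table
winner = []

def on_leader_crash(my_host, crashed_host):
    # Non-blocking acquire: exactly one surviving Leader wins the race.
    if takeover_lock.acquire(blocking=False):
        fault_tolerant_commands.append(
            {"workflow_host": crashed_host, "taken_over_by": my_host})
        winner.append(my_host)
        # In practice the lock is held until takeover completes, then released.

survivors = ["leader-2", "leader-3"]
threads = [threading.Thread(target=on_leader_crash, args=(h, "leader-1"))
           for h in survivors]
for t in threads: t.start()
for t in threads: t.join()
# exactly one survivor has taken over leader-1's workflows
```

Writing the takeover intent to a durable command table before acting means the takeover itself can be replayed if the winning Leader also fails mid-recovery.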
8. The big-data-processing-oriented distributed scheduling method of claim 7, wherein the Follower fault-tolerance flow specifically comprises:
each machine of the Follower cluster registers itself with a znode on ZooKeeper; if a Follower executing tasks goes down, the Leader's monitoring mechanism is triggered, all tasks currently running on the downed Follower are stopped, the Leader marks the workflow as being in a fault-tolerant state, and a surviving Follower is reselected as the executor of the workflow's remaining tasks.
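The reselection step above can be sketched as follows; the task names, host names, and the round-robin spread over survivors are illustrative assumptions (the patent only requires that a surviving Follower be chosen).

```python
def reassign_tasks(assignments, down_host, survivors):
    """Move every task assigned to the downed follower onto survivors.

    assignments maps task -> follower host. Surviving hosts are reused
    round-robin so the remaining work is spread evenly.
    """
    if not survivors:
        raise RuntimeError("no surviving follower to take over")
    moved = sorted(t for t, h in assignments.items() if h == down_host)
    for i, task in enumerate(moved):
        assignments[task] = survivors[i % len(survivors)]
    return moved

assignments = {"t1": "follower-1", "t2": "follower-2", "t3": "follower-1"}
moved = reassign_tasks(assignments, "follower-1", survivors=["follower-2"])
# t1 and t3 now run on follower-2; the Leader marks the workflow fault-tolerant
```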
9. A storage medium in which a computer-readable program is stored, wherein the computer-readable program, when invoked by a processor, performs the steps of the big-data-processing-oriented distributed scheduling method of any one of claims 2 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011069582.0A CN112162841A (en) | 2020-09-30 | 2020-09-30 | Distributed scheduling system, method and storage medium for big data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112162841A true CN112162841A (en) | 2021-01-01 |
Family
ID=73861133
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112162841A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060101081A1 (en) * | 2004-11-01 | 2006-05-11 | Sybase, Inc. | Distributed Database System Providing Data and Space Management Methodology |
US20150067028A1 (en) * | 2013-08-30 | 2015-03-05 | Indian Space Research Organisation | Message driven method and system for optimal management of dynamic production workflows in a distributed environment |
WO2015081808A1 (en) * | 2013-12-03 | 2015-06-11 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for data transmission |
US20160182614A1 (en) * | 2014-12-23 | 2016-06-23 | Cisco Technology, Inc. | Elastic scale out policy service |
CN107766147A (en) * | 2016-08-23 | 2018-03-06 | 上海宝信软件股份有限公司 | Distributed data analysis task scheduling system |
CN110888719A (en) * | 2019-09-18 | 2020-03-17 | 广州市巨硅信息科技有限公司 | Distributed task scheduling system and method based on web service |
CN111209301A (en) * | 2019-12-29 | 2020-05-29 | 南京云帐房网络科技有限公司 | Method and system for improving operation performance based on dependency tree splitting |
CN111338774A (en) * | 2020-02-21 | 2020-06-26 | 华云数据有限公司 | Distributed timing task scheduling system and computing device |
CN111400017A (en) * | 2020-03-26 | 2020-07-10 | 华泰证券股份有限公司 | Distributed complex task scheduling method |
Non-Patent Citations (2)
Title |
---|
Fang Shudong et al.: "Hadoop Big Data Technology and Applications" (《Hadoop大数据技术与应用》), vol. 1, 31 January 2020, Zhejiang Science and Technology Press, pages: 285 - 288 * |
Xu Xin: "Information Analysis Methods Based on Text Feature Computation" (《基于文本特征计算的信息分析方法》), vol. 1, 30 November 2015, Shanghai Scientific and Technological Literature Press, pages: 19 - 20 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113010295A (en) * | 2021-03-30 | 2021-06-22 | 中信银行股份有限公司 | Stream computing method, device, equipment and storage medium |
CN113010295B (en) * | 2021-03-30 | 2024-06-11 | 中信银行股份有限公司 | Stream computing method, device, equipment and storage medium |
CN113438281A (en) * | 2021-06-05 | 2021-09-24 | 济南浪潮数据技术有限公司 | Storage method, device, equipment and readable medium of distributed message queue |
CN113438281B (en) * | 2021-06-05 | 2023-02-28 | 济南浪潮数据技术有限公司 | Storage method, device, equipment and readable medium of distributed message queue |
CN113535362A (en) * | 2021-07-26 | 2021-10-22 | 北京计算机技术及应用研究所 | Distributed scheduling system architecture and micro-service workflow scheduling method |
CN113821322A (en) * | 2021-09-10 | 2021-12-21 | 浙江数新网络有限公司 | Loosely-coupled distributed workflow coordination system and method |
CN115840631A (en) * | 2023-01-04 | 2023-03-24 | 中科金瑞(北京)大数据科技有限公司 | RAFT-based high-availability distributed task scheduling method and equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112162841A (en) | Distributed scheduling system, method and storage medium for big data processing | |
US11366797B2 (en) | System and method for large-scale data processing using an application-independent framework | |
CN107038069B (en) | Dynamic label matching DLMS scheduling method under Hadoop platform | |
US10042886B2 (en) | Distributed resource-aware task scheduling with replicated data placement in parallel database clusters | |
US9542223B2 (en) | Scheduling jobs in a cluster by constructing multiple subclusters based on entry and exit rules | |
US8914805B2 (en) | Rescheduling workload in a hybrid computing environment | |
US8739171B2 (en) | High-throughput-computing in a hybrid computing environment | |
US7650331B1 (en) | System and method for efficient large-scale data processing | |
CN114741207B (en) | GPU resource scheduling method and system based on multi-dimensional combination parallelism | |
Ju et al. | iGraph: an incremental data processing system for dynamic graph | |
CN107463442B (en) | Satellite-borne multi-core SoC task level load balancing parallel scheduling method | |
CN112114973B (en) | Data processing method and device | |
Shen et al. | Defuse: A dependency-guided function scheduler to mitigate cold starts on faas platforms | |
CN115373835A (en) | Task resource adjusting method and device for Flink cluster and electronic equipment | |
CN112256414A (en) | Method and system for connecting multiple computing storage engines | |
US11748164B2 (en) | FAAS distributed computing method and apparatus | |
CN111459622A (en) | Method and device for scheduling virtual CPU, computer equipment and storage medium | |
Bao et al. | BC-BSP: A BSP-based parallel iterative processing system for big data on cloud architecture | |
CN112948096A (en) | Batch scheduling method, device and equipment | |
CN113821322A (en) | Loosely-coupled distributed workflow coordination system and method | |
CN107528871A (en) | Data analysis in storage system | |
CN112148546A (en) | Static safety analysis parallel computing system and method for power system | |
CN114237858A (en) | Task scheduling method and system based on multi-cluster network | |
JP2021060707A (en) | Synchronization control system and synchronization control method | |
Xie et al. | A resource scheduling algorithm based on trust degree in cloud computing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||