CN110362390B - Distributed data integration job scheduling method and device - Google Patents
- Publication number
- CN110362390B (application CN201910489422.2A / CN201910489422A)
- Authority
- CN
- China
- Prior art keywords
- job
- scheduling
- information
- module
- resource
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a distributed data integration job scheduling method and device for the special scenarios that data integration may face. The job scheduling device is responsible for issuing data integration jobs to the job running device; the job running device receives scheduling tasks, starts job execution, sends job running state information to the job management module, feeds back working-node computing resources to the resource scheduling module, and feeds back lost-connection or fault information to the job preloading module. The invention combines the following characteristics: (1) high availability, fault tolerance and weak consistency; (2) low latency for quasi-real-time job scheduling; (3) multi-tenant concurrency control for cloud service applications; (4) computing-resource isolation and parallel scheduling of multiple jobs; (5) a priority scheduling mechanism.
Description
Technical Field
The invention relates to the technical field of big data infrastructure, and in particular to a distributed data integration job scheduling method and device.
Background
With the evolution of the digital economy, service digitization in various industries has developed rapidly, and digital services have gradually become a new center of gravity. However, service digitization produces a large number of data islands, which have become a common pain point in delivering digital services. Industries therefore urgently need data integration to break down data islands and to integrate and manage data resources, so that the associated value among data can be effectively exploited.
Data integration often involves scheduling thousands of jobs, including job types such as data exchange and data preprocessing, and the design of a scheduling system must consider various complex scenarios. For example, some scenarios not only have a large number of jobs but also require parallel processing; some require quasi-real-time execution, so priority scheduling must be considered; some jobs occupy more computing resources, or occupy them longer, so resource isolation must be considered to avoid affecting other jobs; some jobs may suffer failures such as downtime of data sources and targets, network interruption, or downtime of running nodes, so a fault-tolerance mechanism is needed; and in multi-tenant scenarios, concurrency control must be handled.
The prior art fails to meet these complex data integration job scheduling requirements. For example, the traditional LTS scheduling system isolates jobs by thread: if the execution thread of one job exhausts all the memory of the current process, every job in that process fails. It lacks the capability to schedule data integration jobs and is better suited to lightweight tasks. Chinese patent CN201610800080 discloses a distributed task scheduling system and method aimed at the heavy code-writing burden that parallel-computing program development places on developers; it is essentially scheduling for a single large-scale distributed computing job and does not consider parallel scheduling of multiple tasks. Chinese patent CN201610197298 discloses a task scheduling method, apparatus and system providing multi-channel, multi-task distributed scheduling, which solves the starvation of other jobs caused by one task's overly long scheduling time, but considers neither low-latency access to job metadata nor how to guarantee job metadata consistency.
Chinese patent CN201410748604 discloses a distributed task scheduling system and method that ensures the reliability of the system itself, supports independent or associated tasks, and supports task rollback and distribution. It is, however, unsuited to complex data integration scenarios: rollback is not a key concern there, and data integration tasks need no association with one another. It also considers neither the high likelihood of node downtime under large-scale complex data integration tasks, which demands high availability, nor how, in the presence of quasi-real-time tasks, to minimize latency while guaranteeing job metadata consistency.
Disclosure of Invention
Aiming at the special scenarios that data integration may face, the invention provides a job scheduling device responsible for issuing data integration jobs to a job running device; the job running device receives scheduling tasks, starts job execution, sends job running state information to a job management module, feeds back working-node computing resources to a resource scheduling module, and feeds back lost-connection or fault information to a job preloading module. The invention combines the following characteristics: (1) high availability, fault tolerance and weak consistency; (2) low latency for quasi-real-time job scheduling; (3) multi-tenant concurrency control for cloud service applications; (4) computing-resource isolation and parallel scheduling of multiple jobs; (5) a priority scheduling mechanism.
The invention achieves this aim through the following technical scheme: a distributed data integration job scheduling method comprising the following steps:
(1) the job scheduling device issues the data integration job to the job running device, wherein the job scheduling device comprises a job management module, a job preloading module and a resource scheduling module: (1.1) the job management module receives, caches and stores job-related meta-information and performs concurrency control;
(1.2) the job preloading module acquires the jobs to be processed from the job management module and determines their scheduling priority order;
(1.3) the resource scheduling module completes resource allocation and scheduling distribution by acquiring the job preloading information and the computing-resource information of the job running device;
(2) the job running device receives the scheduling task and starts job execution, feeds back job running state information to the job management module, feeds back working-node computing resources to the resource scheduling module, and feeds back lost-connection or fault information to the job preloading module.
Preferably, the job management module includes an information receiving unit, an information caching unit, a persistent storage unit, and a concurrency control unit, which operate as follows:
(i) the information receiving unit receives job submissions, job meta-information modifications and scheduling strategy updates; receives information on jobs with unallocated resources fed back by the resource scheduling module and updates job states; and receives job state information fed back by the job running device and updates the job states;
(ii) the information caching unit locally caches job meta-information and state information and supports frequent real-time queries;
(iii) the persistent storage unit persists job metadata according to the cache content and maintains data consistency between the cache layer and the storage layer;
(iv) the concurrency control unit assigns a read-write lock to access to each job resource.
Preferably, in step (iii), the data consistency between the cache layer and the storage layer is maintained by the following method:
(a) fault-tolerant storage: updated data is written into a local file, and written into the storage layer after the network recovers;
(b) a fuse: when fault-tolerance triggering reaches a preset threshold, the fuse opens, the service degrades, and no new task is scheduled;
(c) an encapsulated job state interface of the job running device: each time a job scheduling node starts, it acquires and audits the job running states, ensuring that the job states in the job running device are consistent with the states in the metadata storage layer.
Preferably, the job preloading module includes a real-time query unit, a job preloading unit, and a fault processing unit, which operate as follows:
(I) the real-time query unit queries the job metadata cache in real time to acquire the unlocked job to be scheduled;
(II) the job preloading unit adds the unlocked jobs to be scheduled into a bounded ordered queue, sorted by job scheduling time and job priority;
(III) the fault processing unit receives fault information from the job running device and performs fault-tolerance processing. Fault-tolerance processing means that when a working node goes down or the network is disconnected for a long time while a job is running, the job running device notifies the job preloading module; the job preloading module connects to the working node to judge whether the connection is truly unavailable, and if so, puts the job straight back into the queue so that the lost job is eventually dispatched to another available node. If the disconnected node later recovers, the job running device directly kills the job process there, ensuring the same job never runs on two nodes under one job running device at the same time.
Preferably, a bounded ordered blocking queue loads all jobs to be scheduled. Bounded means an upper limit on the number of jobs is guaranteed, with the limit parameter set by scenario evaluation; ordered means jobs with earlier trigger times and higher priority are placed nearer the head of the queue for preferential scheduling. When a job is stopped or deleted, removing that job from the queue is supported. A producer-consumer model with thread blocking and wakeup reduces the CPU burden.
Preferably, the resource scheduling module includes a resource acquisition unit, a resource allocation unit, and a scheduling distribution unit, which operate as follows:
1) the resource acquisition unit acquires the computing resources of the job running cluster and caches them in memory;
2) the resource allocation unit acquires all the jobs from the bounded ordered queue and allocates computing resources to each job according to the job priority order;
3) the scheduling distribution unit assigns an executor to the scheduled job and sends the job meta-information, the allocated computing resources and the executor configuration to the job running cluster.
Preferably, in step (2), the job running device includes a main control node and working nodes; the main control node is responsible for management and coordination, and the working nodes are responsible for executing data integration jobs. The main control node receives the job meta-information, job resource allocation information and job executor information distributed by the resource scheduling module and starts job execution. The agent program on a working node collects job state information and sends it to the main control node, which forwards it to the job management module; the agent collects working-node computing resources and sends them to the main control node, which forwards them to the resource scheduling module; the agent sends heartbeat information to the main control node, and the main control node sends lost-connection or fault information to the job preloading module. The executor on a working node has a retry mechanism: once the data-flow source or target goes down or loses connection, the executor retries on a timer, ensuring that operation continues normally after the source and target recover.
Preferably, the job running device performs distributed resource management based on a Mesos cluster system: the main control node provides low-latency local metadata management with RAM plus a WAL log, uses the PAXOS algorithm to keep the job states of a large number of working nodes synchronized, and pushes specific physical resources to the job scheduling system based on the main control node's unified cluster physical-resource management interface and a specific resource-sharing strategy. The job running cluster provides two modes, a multi-language driver package and JSON RPC, for job scheduling system registration and acquisition of specific callback events. The agent program is responsible for collecting working-node resources, running a specific scheduled task through the executor, and returning the executor's execution result and task state to the main control node, which then forwards them to the job scheduling device.
A distributed data integration job scheduling apparatus, comprising a job scheduling device and a job running device that interact with each other. The job scheduling device comprises a job management module, a job preloading module and a resource scheduling module: the job management module receives, caches and stores job-related meta-information and performs concurrency control; the job preloading module acquires the jobs to be processed from the job management module and determines their scheduling priority order; the resource scheduling module completes resource allocation and scheduling distribution by acquiring the job preloading information and the computing-resource information of the job running device. The job running device comprises a main control node, responsible for management and coordination, and working nodes, responsible for executing data integration jobs.
Preferably, the job scheduling device and the job running device are both registered in ZooKeeper. The job scheduling device adopts an active-standby mode: once the active device goes down, ZooKeeper elects the standby device to take over the job scheduling work. The main control nodes in the job running device likewise adopt an active-standby mode: once the active main control node goes down, ZooKeeper elects the standby main control node to take over the management and coordination work.
Preferably, the job scheduling device audits and maintains its cache based on job state information fed back by the job running device, and the job metadata database stays consistent with the job metadata cached by the job scheduling device. When the job scheduling device fails and the standby job scheduling device takes over, the standby interacts with the job metadata database once to rebuild the job metadata cache, and thereafter audits and maintains the cached metadata by receiving job state feedback from the job running device, keeping the metadata consistent across the distributed system.
The invention has the following beneficial effects: (1) in terms of the CAP theorem, the method satisfies availability and partition tolerance, and adopts mechanisms that preserve job metadata consistency as far as possible; (2) multi-tenant concurrency control is realized with distributed read-write locks, providing multi-tenant data integration service in cloud-service form; (3) because data integration jobs require frequent scheduling, job metadata uses a caching mechanism, effectively reducing the latency and interruption risks caused by frequent metadata access.
Drawings
FIG. 1 is a schematic flow diagram of the apparatus of the present invention;
FIG. 2 is a schematic diagram of the high availability mechanism of the apparatus of the present invention;
FIG. 3 is a schematic flow diagram of the method of the present invention;
FIG. 4 is a schematic flow diagram of a job management module of the present invention;
FIG. 5 is a schematic representation of the operation of the job management module of the present invention;
FIG. 6 is a process flow diagram of a job preloading module of the present invention;
FIG. 7 is a schematic flow diagram of a resource scheduling module of the present invention;
FIG. 8 is a schematic flow diagram of the job running device of the present invention.
Detailed Description
The invention will be further described with reference to specific examples, but the scope of the invention is not limited thereto:
Example: as shown in FIG. 1, a distributed data integration job scheduling apparatus is composed of a job scheduling device and a job running device, which interact with each other. The job scheduling device comprises a job management module, a job preloading module and a resource scheduling module: the job management module receives, caches and stores job-related meta-information and performs concurrency control; the job preloading module acquires the jobs to be processed from the job management module and determines their scheduling priority order; the resource scheduling module completes resource allocation and scheduling distribution by acquiring the job preloading information and the computing-resource information of the job running device. The job running device comprises a main control node, responsible for management and coordination, and working nodes, responsible for executing data integration jobs.
As shown in FIG. 2, both the job scheduling device and the job running device are registered in ZooKeeper. The job scheduling device adopts an active-standby mode: once the active device goes down, ZooKeeper elects the standby device to take over the job scheduling work. The main control nodes in the job running device likewise adopt an active-standby mode: once the active main control node goes down, ZooKeeper elects the standby main control node to take over the management and coordination work.
The job scheduling device audits and maintains its cache based on job state information fed back by the job running device, and the job metadata database stays consistent with the job metadata cached by the job scheduling device. When the job scheduling device fails and the standby takes over, it interacts with the job metadata database once to rebuild the job metadata cache, then audits and maintains the cached metadata by receiving job state feedback from the job running device, keeping it consistent across the distributed system; weak consistency is thereby ensured.
As shown in fig. 3, a distributed data integration job scheduling method includes the following steps:
s100: the job scheduling device issues the data integration job to the job running device, and the job scheduling device consists of a job management module, a job preloading module and a resource scheduling module, and specifically comprises the following parts:
s101: and the operation management module receives, caches and stores the operation related meta information and performs concurrency control. The job management module consists of an information receiving unit, a storage processing unit and a concurrency control unit. As shown in fig. 4, the specific operations are as follows:
(1) the information receiving unit S101-1 receives job submissions, job meta-information modifications and scheduling strategy updates; receives information on jobs with unallocated resources fed back by the resource scheduling module and updates job states; and receives job state information fed back by the job running device and updates the job states;
(2) the information caching unit S101-2 locally caches job meta-information and state information and supports frequent real-time queries;
(3) the persistent storage unit S101-3 persists job metadata according to the cache content and maintains data consistency between the cache layer and the storage layer;
(4) the concurrency control unit S101-4 assigns a read-write lock to access to each job resource.
Specifically, as shown in FIG. 5, the job management module of the present invention is responsible for receiving jobs and maintaining the job state machine; the main job states may include: not started, waiting to be scheduled, suspended, running, stopped, abnormal, and finished. The job management module provides job state operation interfaces, such as interfaces for stopping a running job, suspending a job, scheduling a job, resuming a job, stopping a job normally, and stopping a job abnormally. It maintains various job scheduling policies, such as repeated jobs, timed jobs, cron jobs, and one-off jobs. An internal history storage module records all scheduling history. A read-write lock is added to operations on the cache layer and the metadata persistence layer to realize concurrency control: if a concurrent thread performs a write operation, the lock is upgraded to an exclusive lock that no other thread can seize; conversely, if the concurrent thread performs a read operation, the lock is a shared lock that other threads can seize concurrently. The job management module adds a cache layer on top of the job metadata storage layer, sustaining frequent metadata queries and calls so as to support the frequent access and frequent scheduling of quasi-real-time data integration tasks; at the implementation level the cache layer is abstracted as an SPI interface, with cache-layer implementations such as Caffeine, JDK, Guava and Redis supported. The persistence layer is likewise abstracted as an SPI interface and supports relational databases, MongoDB, and other databases.
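The patent describes the concurrency control only in prose (shared lock for concurrent reads, exclusive lock for writes, one lock per job resource). A minimal illustrative sketch in Python follows; the class and field names are hypothetical, and a Java implementation such as the one implied by the patent's ecosystem would more likely use `ReentrantReadWriteLock` directly.

```python
import threading

class JobRWLock:
    """Illustrative shared/exclusive lock guarding one job's metadata:
    reads are shared, writes are exclusive, as described above."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # count of threads holding the shared lock
        self._writer = False     # True while a write operation holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writer:          # a write op holds the exclusive lock
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:   # wait for all holders
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

# One lock per job resource, as the concurrency control unit assigns them
job_locks = {"job-42": JobRWLock()}   # "job-42" is a hypothetical job id
```

Multiple readers can hold the lock at once, while a writer blocks until every reader releases, matching the shared-versus-exclusive behavior the paragraph describes.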
Because instability factors such as the network may cause data inconsistency when job data is written into the persistence layer, at the implementation level the job management module adopts "triple insurance" to preserve metadata consistency as far as possible: (1) fault-tolerant storage: updated data is written into a local database/file, and written into the storage layer after the network recovers; (2) a fuse: when fault-tolerance triggering reaches a certain threshold, the fuse opens, the service degrades, and no new task is scheduled; (3) an encapsulated job state interface of the job running system: each time a job scheduling node starts, it acquires and audits the job running states, ensuring that job states in the job running system are consistent with the states in the metadata storage layer.
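The "fuse" of item (2) is a circuit breaker. A minimal sketch, assuming a simple consecutive-failure threshold (the patent does not specify how the threshold is counted, and the names here are illustrative):

```python
class Fuse:
    """Minimal circuit-breaker ("fuse") sketch: once fault-tolerance has
    been triggered `threshold` consecutive times, the fuse opens, the
    service degrades, and no new task is scheduled."""

    def __init__(self, threshold=3):      # threshold value is illustrative
        self.threshold = threshold
        self.failures = 0
        self.open = False                 # open fuse == degraded service

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.open = True              # disconnect: stop scheduling

    def record_success(self):
        self.failures = 0                 # healthy write resets the count

    def may_schedule(self):
        return not self.open
```

The scheduler would consult `may_schedule()` before dispatching a new task; how the fuse is later reset (half-open probing, manual reset) is left unspecified here, as it is in the source.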
S102: as shown in FIG. 6, the job preloading module acquires the jobs to be processed from the job management module and determines their scheduling priority order. The job preloading module consists of a real-time query unit, a job preloading unit and a fault processing unit. The specific operations are as follows:
(1) the real-time query unit S102-1 queries the job metadata cache in real time and acquires unlocked jobs to be scheduled;
(2) the job preloading unit S102-2 adds the unlocked jobs to be scheduled into the bounded ordered queue, sorted by job scheduling time and job priority;
(3) the fault processing unit S102-3 receives fault information from the job running cluster and performs fault-tolerance processing.
Specifically, a bounded ordered blocking queue is built to load all jobs to be scheduled. Bounded means an upper limit on the number of jobs is guaranteed, with the limit parameter set by scenario evaluation; ordered means jobs with earlier trigger times and higher priority are placed nearer the head of the queue for preferential scheduling. When a job is stopped or deleted, removing that job from the queue is supported. A producer-consumer model with thread blocking and wakeup reduces the CPU burden.
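The bounded ordered blocking queue can be sketched as follows; this is a minimal single-process illustration in Python (the field layout and sort key are assumptions consistent with the description: earlier trigger time first, then higher priority):

```python
import heapq
import threading

class BoundedOrderedQueue:
    """Bounded ordered blocking queue sketch: jobs with earlier trigger
    time and higher priority sort toward the head; producers block when
    the bound is reached and consumers block when empty, so no thread
    busy-polls (producer-consumer with blocking/wakeup)."""

    def __init__(self, capacity):
        self.capacity = capacity                 # bounded: upper job limit
        self._heap = []                          # (trigger_time, -priority, job_id)
        self._cond = threading.Condition()

    def put(self, trigger_time, priority, job_id):
        with self._cond:
            while len(self._heap) >= self.capacity:
                self._cond.wait()                # block producer at the bound
            heapq.heappush(self._heap, (trigger_time, -priority, job_id))
            self._cond.notify_all()

    def take(self):
        with self._cond:
            while not self._heap:
                self._cond.wait()                # block consumer when empty
            item = heapq.heappop(self._heap)
            self._cond.notify_all()
            return item[2]

    def remove(self, job_id):
        """Support removing a stopped/deleted job from the queue."""
        with self._cond:
            self._heap = [e for e in self._heap if e[2] != job_id]
            heapq.heapify(self._heap)
            self._cond.notify_all()
```

The negated priority in the heap key makes the tuple ordering deliver earliest-trigger-time-first and, within a tick, highest-priority-first, which is the ordering the paragraph specifies.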
Fault-tolerance processing means that when a working node goes down or the network is disconnected for a long time while a job is running, the job running device notifies the job preloading module; the job preloading module connects to the working node to judge whether the connection is truly unavailable, and if so, puts the job straight back into the queue so that the lost job is eventually scheduled onto another available node. If the disconnected node later recovers, the job running device directly kills the job process there, ensuring the same job never runs on two nodes under one job running device at the same time.
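The preloading module's side of this handshake can be sketched as below. The helper names and the probe callback are hypothetical; the point is the double check before re-queueing, so a transient alarm does not duplicate a still-running job:

```python
def handle_node_fault(node, job, queue, probe):
    """Fault-handling sketch: on a downtime / long-partition report from
    the job running device, the preloading module probes the worker
    itself; only if the connection is confirmed unavailable is the job
    put back into the queue for rescheduling onto another node."""
    if probe(node):                 # double-check: is the node actually alive?
        return "node-alive"         # false alarm, leave the job where it runs
    queue.append(job)               # lost job re-enters the scheduling queue
    return "requeued"
```

Killing the duplicate process if the lost node later recovers is the job running device's responsibility, not shown here, as the text states.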
S103: the resource scheduling module completes resource allocation and scheduling distribution by acquiring the job preloading information and the computing-resource information of the job running device. The resource scheduling module consists of a resource acquisition unit, a resource allocation unit and a scheduling distribution unit, as shown in FIG. 7:
(1) the resource acquisition unit S103-1 acquires the computing resources of the job running cluster and caches them in memory;
(2) the resource allocation unit S103-2 acquires all the jobs from the bounded ordered queue and allocates computing resources to each job in job priority order;
(3) the scheduling distribution unit S103-3 assigns an executor to the scheduled job and sends the job meta-information, the allocated computing resources and the executor configuration to the job running cluster.
The executor may be implemented as a Linux container executor, a Docker executor, or another container executor; these container executors realize the isolation of computing resources.
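The resource allocation unit's priority-ordered pass can be sketched as follows. The patent does not specify the allocation policy beyond "in job priority order", so this is a minimal greedy illustration with a single CPU dimension; the field names and the feedback of unsatisfied jobs (cf. the unallocated-resource information received by the job management module) are illustrative:

```python
def allocate(jobs, free_cpu):
    """Greedy allocation sketch: walk jobs from highest to lowest
    priority, granting each job's requested CPU while resources last;
    jobs that cannot be satisfied are reported back as unallocated."""
    scheduled, unallocated = [], []
    for job in sorted(jobs, key=lambda j: -j["priority"]):
        if job["cpu"] <= free_cpu:
            free_cpu -= job["cpu"]
            scheduled.append(job["id"])      # will be sent to an executor
        else:
            unallocated.append(job["id"])    # fed back to the job management module
    return scheduled, unallocated
```

A scheduled job would then be paired with an executor configuration (e.g. a Docker executor) and dispatched to the job running cluster.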
S200: the job running device receives the scheduling task and starts job execution, sends job running state information to the job management module, feeds back working-node computing resources to the resource scheduling module, and feeds back lost-connection or fault information to the job preloading module.
The job running device has a main control node and working nodes; the main control node is responsible for management and coordination, and the working nodes are responsible for executing data integration jobs. The main control node receives the job meta-information, job resource allocation information and job executor information distributed by the resource scheduling module and starts job execution. The agent program on a working node collects job state information and sends it to the main control node, which forwards it to the job management module; the agent collects working-node computing resources and sends them to the main control node, which forwards them to the resource scheduling module; the agent sends heartbeat information to the main control node, and the main control node sends lost-connection or fault information to the job preloading module. The executor on a working node has a retry mechanism: once the data-flow source or target goes down or loses connection, the executor retries on a timer, ensuring that normal operation continues after the source and target recover.
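The executor's timed retry can be sketched as a small wrapper; retry count, delay, and the `ConnectionError` trigger are illustrative assumptions, since the patent specifies only "timed retry" on source/target downtime or disconnection:

```python
import time

def run_with_retry(step, retries=3, delay=0.01):
    """Timed-retry sketch for the executor: when the data-flow source or
    target is down or disconnected, retry on a timer so the data flow
    resumes once the endpoint recovers."""
    for attempt in range(retries + 1):
        try:
            return step()                # e.g. read a batch / write a batch
        except ConnectionError:
            if attempt == retries:
                raise                    # give up after the final retry
            time.sleep(delay)            # wait for the timer, then retry
```

A production executor would typically report each failed attempt to the main control node as well; that feedback path is omitted here for brevity.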
As shown in fig. 8, the job running apparatus may perform distributed resource management based on a Mesos cluster framework. The master node provides low-latency local metadata management using RAM plus a write-ahead log (WAL), maintains job-state synchronization across a large number of worker nodes using the Paxos algorithm, and pushes specific physical resources to the job scheduling system based on the master node's unified cluster physical-resource management interface and its resource-sharing policy. The job running cluster offers two modes, a multi-language driver package and JSON-RPC, for job scheduling system registration and for obtaining specific callback events. The agent is responsible for collecting worker-node resources, running the assigned scheduling task through the executor, and returning the executor's execution result and task state to the master node, which then forwards them to the job scheduling device.
While the invention has been described in connection with specific embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (8)
1. A distributed data integration job scheduling method, characterized by comprising the following steps: (1) the job scheduling device issues the data integration job to the job running device, wherein the job scheduling device comprises a job management module, a job preloading module, and a resource scheduling module:
(1.1) the job management module receives, caches, and persistently stores job-related meta-information and performs concurrency control; the job management module comprises an information receiving unit, an information caching unit, a persistent storage unit, and a concurrency control unit, and specifically operates as follows:
(1.1.1) the information receiving unit receives job submissions, job meta-information modifications, and scheduling policy updates; receives, from the resource scheduling module, feedback on jobs to which resources could not be allocated, and updates the job states; and receives job state information fed back by the job running device and updates the job state;
(1.1.2) the information caching unit locally caches the job meta-information and state information and supports frequent real-time queries;
(1.1.3) the persistent storage unit persists the job metadata from the cached content and maintains data consistency between the cache layer and the storage layer; the consistency between the cache layer and the storage layer is maintained as follows:
(a) using fault-tolerant storage: updated data is written to a local file and then written to the storage layer once the network returns to normal;
(b) using a circuit breaker: when the fault-tolerance mechanism is triggered and a preset threshold is reached, the circuit breaker opens, the service is degraded, and no new tasks are scheduled;
(c) encapsulating the job-state interface of the job running device: each time a job scheduling node starts, it obtains the job running state and audits it, ensuring that the job state in the job running device is consistent with the state in the metadata storage layer;
(1.1.4) the concurrency control unit allocates a read-write lock to control access to each job resource;
(1.2) the job preloading module acquires the jobs to be processed from the job management module and determines the bounded ordered queue and the scheduling priority; the bounded ordered queue holds all jobs to be scheduled, where bounded means the number of jobs has a guaranteed upper limit, the upper-limit parameter being set by scenario evaluation, and ordered means jobs with earlier trigger times and higher priorities are placed toward the front of the queue for preferential scheduling; when a job is stopped or deleted, the specified pending job can be removed from the queue; a producer-consumer model is adopted, using thread blocking and wakeup to reduce CPU load;
(1.3) the resource scheduling module completes resource allocation and scheduling distribution by acquiring the job preloading information and the computing-resource information of the job running device;
(2) the job running device receives the scheduled task and starts job execution; it feeds back job running state information to the job management module, feeds back worker-node computing resources to the resource scheduling module, and feeds back loss-of-connection or fault information to the job preloading module.
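The bounded ordered queue and producer-consumer scheme of step (1.2) can be sketched with Python's standard library: a size-capped priority queue ordered by (trigger time, priority), whose blocking `put`/`get` give the thread blocking-wakeup behaviour (no busy polling). This is a minimal illustration, not the patent's implementation; names are hypothetical:

```python
import queue

class BoundedOrderedJobQueue:
    """Bounded, ordered queue of pending jobs: earlier trigger time first,
    then higher priority (lower number = higher priority)."""

    def __init__(self, max_jobs):
        # maxsize gives the "bounded" guarantee on the number of jobs
        self._q = queue.PriorityQueue(maxsize=max_jobs)

    def put(self, trigger_time, priority, job_id):
        # producer side: blocks (rather than spins) when the queue is full
        self._q.put((trigger_time, priority, job_id))

    def get(self):
        # consumer side: blocks until a job is available (blocking-wakeup)
        return self._q.get()
```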
2. The distributed data integration job scheduling method according to claim 1, wherein: the job preloading module comprises a real-time query unit, a job preloading unit, and a fault processing unit, and step (1.2) operates as follows:
(I) the real-time query unit queries the job metadata cache in real time to obtain the unlocked jobs to be scheduled;
(II) the job preloading unit adds the unlocked jobs to be scheduled to the bounded ordered queue, sorted by job scheduling time and job priority;
(III) the fault processing unit receives fault information from the job running device and performs fault-tolerance processing; fault-tolerance processing means that when a worker node goes down or its network is disconnected for a long time during job execution, the job running device notifies the job preloading module, which attempts to connect to the worker node to judge whether the connection is truly unavailable; if it is unavailable, the job is put back into the queue, so that the lost job is eventually dispatched to another available node; if the disconnected node later recovers, the job running device directly kills the stale job process, ensuring that the same job never runs on two nodes simultaneously under one job running device.
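The fault-tolerance decision in step (III) can be sketched as follows; a hypothetical illustration of the re-queue rule only (the probe callback and names are assumptions, and the kill-on-recovery guarantee is enforced separately by the job running device):

```python
def handle_worker_fault(node_id, job, pending_queue, probe):
    """Fault processing: probe the reported node once more; only if it is
    truly unreachable is the job put back into the queue so it can be
    dispatched to another available node."""
    if probe(node_id):
        # connection is actually fine -- a transient glitch, do nothing
        return "node reachable; job left in place"
    pending_queue.append(job)   # lost job will be rescheduled elsewhere
    return "job re-queued for another node"
```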
3. The distributed data integration job scheduling method according to claim 1, wherein: the resource scheduling module comprises a resource obtaining unit, a resource allocation unit, and a scheduling distribution unit, and step (1.3) specifically operates as follows:
1) the resource obtaining unit obtains the computing resources of the running cluster and caches them in memory;
2) the resource allocation unit acquires all jobs from the bounded ordered queue and allocates computing resources to each job in order of job priority;
3) the scheduling distribution unit assigns an executor to the scheduled job and sends the job meta-information, the allocated computing resources, and the executor configuration to the job running cluster.
4. The distributed data integration job scheduling method according to claim 1, wherein: in step (2), the job running device comprises a master node and worker nodes, the master node being responsible for management and coordination and the worker nodes for executing the data integration jobs; the master node receives the job meta-information, job resource allocation information, and job executor information distributed by the resource scheduling module and starts job execution; the agent on each worker node collects job state information and sends it to the master node, which forwards it to the job management module; the agent collects worker-node computing resources and sends them to the master node, which forwards them to the resource scheduling module; the agent sends heartbeat information to the master node, which reports loss-of-connection or fault information to the job preloading module; and the executor on a worker node is equipped with a retry mechanism, such that once the data-flow source or target goes down or loses its connection, the executor retries at timed intervals, ensuring that normal operation resumes after recovery.
5. The distributed data integration job scheduling method according to claim 1, wherein: the job running device manages distributed system resources based on a Mesos cluster framework; the master node provides low-latency local metadata management using RAM plus a write-ahead log (WAL), maintains job-state synchronization of the worker nodes using the Paxos algorithm, and pushes specific physical resources to the job scheduling system based on a unified cluster physical-resource management interface and a specific resource-sharing policy; the job running cluster provides two modes, a multi-language driver package and JSON-RPC, for job scheduling system registration and for obtaining specific callback events; the agent is responsible for collecting worker-node resources, running the assigned scheduling task through the executor, and returning the executor's execution result and task state to the master node; and the master node then forwards these data to the job scheduling device.
6. A distributed data integration job scheduling apparatus to which the method of claim 1 is applied, comprising: a job scheduling device and a job running device that exchange information with each other; the job scheduling device comprises a job management module, a job preloading module, and a resource scheduling module; the job management module receives, caches, and persistently stores job-related meta-information and performs concurrency control; the job preloading module acquires the jobs to be processed from the job management module and determines the scheduling priority; the resource scheduling module completes resource allocation and scheduling distribution by acquiring the job preloading information and the computing-resource information of the job running device; and the job running device comprises a master node and worker nodes, the master node being responsible for management and coordination and the worker nodes for executing the data integration jobs.
7. The distributed data integration job scheduling apparatus according to claim 6, wherein: both the job scheduling device and the job running device are registered in ZooKeeper; the job scheduling device adopts an active-standby mode, and once the active device goes down, ZooKeeper elects a standby device to take over the job scheduling work; the master nodes in the job running device likewise adopt an active-standby mode, and once the active master node goes down, ZooKeeper elects a standby master node to take over the management and coordination work.
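The active-standby failover rule in claim 7 can be modelled in a few lines. Real deployments implement this with ZooKeeper ephemeral sequential znodes (the candidate holding the lowest-numbered live node is leader; when its session ends, the next takes over); the pure-Python toy below only illustrates that rule, and all names are hypothetical:

```python
def elect_leader(candidates):
    """Toy model of ZooKeeper-style active-standby election: candidates
    map a sequence number to a holder name (None = node gone because the
    process is down). The live candidate with the smallest sequence
    number is leader; when it vanishes, the next candidate takes over."""
    live = {seq: name for seq, name in candidates.items() if name is not None}
    return live[min(live)] if live else None
```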
8. The distributed data integration job scheduling apparatus according to claim 7, wherein: the job scheduling device audits and maintains job state information fed back by the job running device, and the job metadata base maintains consistency with the job metadata cache of the job scheduling device; when the job scheduling device fails, the standby job scheduling device, upon taking over, interacts with the job metadata base to rebuild the job metadata cache mechanism, and audits and maintains the cached metadata by receiving job state feedback from the job running device, so that the metadata cache remains consistent across the distributed system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910489422.2A CN110362390B (en) | 2019-06-06 | 2019-06-06 | Distributed data integration job scheduling method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110362390A CN110362390A (en) | 2019-10-22 |
CN110362390B true CN110362390B (en) | 2021-09-07 |
Family
ID=68215696
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910489422.2A Active CN110362390B (en) | 2019-06-06 | 2019-06-06 | Distributed data integration job scheduling method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110362390B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111045802B (en) * | 2019-11-22 | 2024-01-26 | 中国联合网络通信集团有限公司 | Redis cluster component scheduling system and method and platform equipment |
CN111124806B (en) * | 2019-11-25 | 2023-09-05 | 山东鲁软数字科技有限公司 | Method and system for monitoring equipment state in real time based on distributed scheduling task |
CN111338770A (en) * | 2020-02-12 | 2020-06-26 | 咪咕文化科技有限公司 | Task scheduling method, server and computer readable storage medium |
CN111580990A (en) * | 2020-05-08 | 2020-08-25 | 中国建设银行股份有限公司 | Task scheduling method, scheduling node, centralized configuration server and system |
CN112200534A (en) * | 2020-09-24 | 2021-01-08 | 中国建设银行股份有限公司 | Method and device for managing time events |
CN112328383A (en) * | 2020-11-19 | 2021-02-05 | 湖南智慧畅行交通科技有限公司 | Priority-based job concurrency control and scheduling algorithm |
CN112131318B (en) * | 2020-11-30 | 2021-03-16 | 北京优炫软件股份有限公司 | Pre-written log record ordering system in database cluster |
CN112527488A (en) * | 2020-12-21 | 2021-03-19 | 浙江百应科技有限公司 | Distributed high-availability task scheduling method and system |
CN112835717A (en) * | 2021-02-05 | 2021-05-25 | 远光软件股份有限公司 | Integrated application processing method and device for cluster |
CN113778676B (en) * | 2021-09-02 | 2023-05-23 | 山东派盟网络科技有限公司 | Task scheduling system, method, computer device and storage medium |
CN113986507A (en) * | 2021-11-01 | 2022-01-28 | 佛山技研智联科技有限公司 | Job scheduling method and device, computer equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6490572B2 (en) * | 1998-05-15 | 2002-12-03 | International Business Machines Corporation | Optimization prediction for industrial processes |
CN101309208A (en) * | 2008-06-21 | 2008-11-19 | 华中科技大学 | Job scheduling system suitable for grid environment and based on reliable expense |
CN101599026A (en) * | 2009-07-09 | 2009-12-09 | 浪潮电子信息产业股份有限公司 | A kind of cluster job scheduling system with resilient infrastructure |
CN104317650A (en) * | 2014-10-10 | 2015-01-28 | 北京工业大学 | Map/Reduce type mass data processing platform-orientated job scheduling method |
CN104462370A (en) * | 2014-12-09 | 2015-03-25 | 北京百度网讯科技有限公司 | Distributed task scheduling system and method |
US9141433B2 (en) * | 2009-12-18 | 2015-09-22 | International Business Machines Corporation | Automated cloud workload management in a map-reduce environment |
CN109327509A (en) * | 2018-09-11 | 2019-02-12 | 武汉魅瞳科技有限公司 | A kind of distributive type Computational frame of the lower coupling of master/slave framework |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110362390B (en) | Distributed data integration job scheduling method and device | |
US11561841B2 (en) | Managing partitions in a scalable environment | |
US10817478B2 (en) | System and method for supporting persistent store versioning and integrity in a distributed data grid | |
US8799248B2 (en) | Real-time transaction scheduling in a distributed database | |
CN104793988B (en) | The implementation method and device of integration across database distributed transaction | |
Lin et al. | Towards a non-2pc transaction management in distributed database systems | |
JP5214105B2 (en) | Virtual machine monitoring | |
US20040158549A1 (en) | Method and apparatus for online transaction processing | |
US20090172142A1 (en) | System and method for adding a standby computer into clustered computer system | |
EP3198430A1 (en) | System and method for supporting dynamic thread pool sizing in a distributed data grid | |
US9128895B2 (en) | Intelligent flood control management | |
JP2015506523A (en) | Dynamic load balancing in a scalable environment | |
EP2673711A1 (en) | Method and system for reducing write latency for database logging utilizing multiple storage devices | |
US11550820B2 (en) | System and method for partition-scoped snapshot creation in a distributed data computing environment | |
CN114064414A (en) | High-availability cluster state monitoring method and system | |
US9703634B2 (en) | Data recovery for a compute node in a heterogeneous database system | |
CN111580951A (en) | Task allocation method and resource management platform | |
CN113342511A (en) | Distributed task management system and method | |
JP2010218159A (en) | Management device, database management method and program | |
Lev-Ari et al. | Quick: a queuing system in cloudkit | |
JPH01112444A (en) | Data access system | |
Martin et al. | Near real-time analytics with ibm db2 analytics accelerator | |
WO2016048831A1 (en) | System and method for supporting dynamic thread pool sizing in a distributed data grid | |
CN117931302A (en) | Parameter file saving and loading method, device, equipment and storage medium | |
CN114416372A (en) | Request processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | ||
Address after: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province Patentee after: Yinjiang Technology Co.,Ltd. Address before: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province Patentee before: ENJOYOR Co.,Ltd. |