CN115185673B - Distributed timing task scheduling method, system, storage medium and program product - Google Patents
- Publication number
- CN115185673B (application CN202210541507.2A)
- Authority
- CN
- China
- Prior art keywords
- task
- node
- directory
- thread
- distributed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a distributed timing task scheduling method, system, storage medium and program product. The method comprises: a scheduling server periodically pulls, from a database, the tasks to be executed within a preset future time period, and creates a corresponding temporary directory node under a zk task queue directory node for each task that has reached its execution time; and a thread of a distributed service node that is in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task. The method, system, storage medium and program product provided by the invention avoid excessive access to the database and reduce the access pressure on it; they also avoid task backlog and ensure high availability of distributed server resources.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a distributed timing task scheduling method, system, storage medium, and program product.
Background
Many scenarios require data to be pushed on a schedule. For example, business personnel in an enterprise need to receive report data every day. Such pushes are completed by corresponding timing data push tasks. To push the data as promptly as possible, timing data push tasks are typically executed by distributed services.
Timing data push tasks are unevenly distributed over time: a large number of tasks may be configured to execute within a particular time period, for example around 9 a.m. An existing distributed timing task system pulls tasks from the database at a preset time interval, and this interval determines how frequently the database is accessed. An access frequency that is too high or too low causes problems: if it is too high, the database comes under excessive pressure; if it is too low, tasks back up and server resources cannot be fully utilized.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a distributed timing task scheduling method, a distributed timing task scheduling system, a storage medium and a program product.
The invention provides a distributed timing task scheduling method, comprising: a scheduling server periodically pulls, from a database, the tasks to be executed within a preset future time period, and creates a corresponding temporary directory node under a zk task queue directory node for each task that has reached its execution time; and a thread of a distributed service node that is in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
According to the distributed timing task scheduling method provided by the invention, the method further comprises: after coming online, the distributed service node creates a corresponding temporary directory node under a zk node list directory node; the scheduling server monitors the state of the temporary directory nodes created under the zk node list directory node and, in response to learning that a distributed service node has gone offline, acquires the incomplete tasks of the offline distributed service node and re-creates a corresponding temporary directory node under the zk task queue directory node for each of them.
According to the distributed timing task scheduling method provided by the invention, the temporary directory nodes created under the zk task queue directory node are temporary sequence number directory nodes; and taking the task corresponding to one temporary directory node from the zk task queue directory node after the thread acquires the lock comprises: after the thread acquires the lock, taking the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
According to the distributed timing task scheduling method provided by the invention, the method further comprises: the distributed service node updates the state of the corresponding task in the database after the thread takes the task and after the thread executes the task; and before the corresponding temporary directory nodes are re-created under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the method further comprises: querying the database according to the information of the offline distributed service node to acquire the incomplete tasks.
According to the distributed timing task scheduling method provided by the invention, the thread of the distributed service node in the idle state queuing to acquire the lock comprises: the thread creates a corresponding temporary sequence number directory node under a zk thread queue directory node; and the thread obtains the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node, and acquires the lock if the sequence number of its own temporary sequence number directory node is the smallest.
According to the distributed timing task scheduling method provided by the invention, after the thread of the distributed service node in the idle state queues and acquires the lock, the method further comprises: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
The invention also provides a distributed timing task scheduling system, which comprises: scheduling server, zookeeper server and distributed service node cluster, distributed service node cluster includes at least one distributed service node, wherein: the scheduling server is used for: the method comprises the steps of regularly pulling tasks to be executed in a preset time period from a database, and creating corresponding temporary directory nodes under zk task queue directory nodes for the tasks reaching the execution time respectively; the distributed service node is configured to: and queuing and acquiring a lock by a thread in an idle state, taking a task corresponding to the temporary directory node from the zk task queue directory node after the thread acquires the lock, releasing the lock, and queuing and acquiring the lock again after executing the task.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps executed by the scheduling server or the distributed service node in any of the above-mentioned distributed timing task scheduling methods when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method as described in any one of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor performs the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method as described in any one of the above.
According to the distributed timing task scheduling method, system, storage medium and program product provided by the invention, the scheduling server periodically pulls from the database the tasks to be executed within a preset future time period, so a relatively long pull interval can be set, which avoids excessive access to the database and reduces the access pressure on it. Because the tasks to be executed within that period are pulled in advance and a corresponding temporary directory node is created under the zk task queue directory node for each task that has reached its execution time, the tasks are already stored in temporary directory nodes of the zookeeper when threads read and execute them, which reduces the time threads wait to read task data. Meanwhile, idle threads of the distributed service nodes queue to acquire the lock, and each takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, so threads are fully utilized and resources are not wasted. Task backlog is thereby avoided and high availability of distributed server resources is ensured.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a distributed timing task scheduling method provided by the invention;
FIG. 2 is a second flow chart of a method for scheduling distributed timing tasks according to the present invention;
FIG. 3 is a schematic diagram of a distributed timed task scheduling system provided by the present invention;
FIG. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The distributed timed task scheduling method, system, storage medium and program product of the present invention are described below in conjunction with fig. 1-4.
Fig. 1 is a schematic flow chart of a distributed timing task scheduling method provided by the invention. As shown in fig. 1, the method includes:
Step 101, the scheduling server periodically pulls, from the database, the tasks to be executed within a preset future time period, and creates a corresponding temporary directory node under the zk task queue directory node for each task that has reached its execution time.
The scheduling server periodically pulls from the database all tasks to be executed within the preset future time period (rather than a single task, unless only one task falls within that period); that is, tasks are acquired in batches by time period. The pulls can be performed at a preset time interval. To avoid pulling the same task twice, the future time period covered by each pull can be set according to that interval. For example, if a batch is pulled once per minute, each pull can fetch the tasks to be executed within the next minute from the current time: the pull at 9:00 a.m. fetches the tasks to be executed between 9:00 and 9:01.
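The batch pull described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names pull_window and pull_tasks, the execute_at field, and the choice of a half-open window (so that a task due exactly at 9:01 belongs to the next pull) are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def pull_window(now, interval):
    """Return the half-open window [now, now + interval) covered by one pull (assumed convention)."""
    return now, now + interval


def pull_tasks(rows, now, interval):
    """Select every task whose execution time falls inside the current window."""
    start, end = pull_window(now, interval)
    return [row for row in rows if start <= row["execute_at"] < end]


# Hypothetical task rows standing in for the database table.
rows = [
    {"id": 1, "execute_at": datetime(2022, 5, 17, 9, 0, 30)},
    {"id": 2, "execute_at": datetime(2022, 5, 17, 9, 1, 0)},
    {"id": 3, "execute_at": datetime(2022, 5, 17, 8, 59, 59)},
]

one_minute = timedelta(minutes=1)
batch = pull_tasks(rows, datetime(2022, 5, 17, 9, 0), one_minute)
next_batch = pull_tasks(rows, datetime(2022, 5, 17, 9, 1), one_minute)
```

Because consecutive windows share no instant, a task is returned by exactly one pull as long as the pulls are aligned with the interval.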
zookeeper is a distributed coordination service framework that originated as a sub-project of Apache Hadoop. It is mainly used to solve data management problems frequently encountered in distributed applications, such as unified naming services, state synchronization services, cluster management, and management of distributed application configuration items. The zookeeper maintains a data structure similar to a file system, in which each directory entry is called a znode (directory node). As in a file system, znodes can be freely added and deleted, and child znodes can be added and deleted under a znode; the only difference is that a znode can itself store data. There are four types of znode:
(1) Persistent directory node. After the client disconnects from the zookeeper, the node still exists.
(2) Persistent sequence number directory node. After the client disconnects from the zookeeper, the node still exists; in addition, the zookeeper numbers the node names sequentially.
(3) Temporary directory node. After the client disconnects from the zookeeper, the node is deleted.
(4) Temporary sequence number directory node. After the client disconnects from the zookeeper, the node is deleted; in addition, the zookeeper numbers the node names sequentially.
From the above, it follows that the persistent sequence number directory nodes are a subset of the persistent directory nodes and the temporary sequence number directory nodes are a subset of the temporary directory nodes.
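The behaviour of the four znode types can be illustrated with a toy in-memory model. MiniZk below is a hypothetical stand-in written for this example only; it is not the zookeeper client API, and its per-parent counter is a simplification of how a real zookeeper assigns sequence numbers. The ten-digit zero-padded suffix mirrors the format zookeeper appends to sequential node names.

```python
class MiniZk:
    """Toy in-memory stand-in for a zookeeper tree (illustration only)."""

    def __init__(self):
        self.nodes = {}     # path -> (data, owning session or None)
        self.counters = {}  # parent path -> next sequence number

    def create(self, path, data=b"", session=None, ephemeral=False, sequence=False):
        if sequence:
            # Append a zero-padded, monotonically increasing suffix to the name.
            parent = path.rsplit("/", 1)[0]
            number = self.counters.get(parent, 0)
            self.counters[parent] = number + 1
            path = f"{path}{number:010d}"
        # Only temporary (ephemeral) nodes are tied to the creating session.
        self.nodes[path] = (data, session if ephemeral else None)
        return path

    def close_session(self, session):
        """Delete every temporary node owned by a disconnected session."""
        for path in [p for p, (_, owner) in self.nodes.items() if owner == session]:
            del self.nodes[path]


zk = MiniZk()
zk.create("/a", session="s1")                                      # (1) persistent
p_seq = zk.create("/a/p-", session="s1", sequence=True)            # (2) persistent sequence number
zk.create("/a/e", session="s1", ephemeral=True)                    # (3) temporary
e_seq = zk.create("/a/es-", session="s1", ephemeral=True, sequence=True)  # (4) temporary sequence number
zk.close_session("s1")
remaining = sorted(zk.nodes)
```

After the session closes, only the two persistent nodes remain, which is the property the method relies on to detect offline nodes.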
zookeeper is abbreviated zk. The zk task queue directory node can be a persistent directory node of the zookeeper, and the scheduling server creates corresponding temporary directory nodes under the zk task queue directory node respectively for the tasks reaching the execution time, namely, adds the temporary directory nodes corresponding to the tasks respectively under the zk task queue directory node. The tasks pulled from the database by the scheduling server include information about the execution of the task, such as data source information, etc. The temporary directory nodes which are added under the zk task queue directory nodes and respectively correspond to the tasks also contain the information for executing the tasks.
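As a rough sketch of this step, the function below moves tasks that have reached their execution time into a task queue and stores the information needed to execute each task as the node data. The dictionary standing in for the zk task queue directory node, the field names id, execute_at and source, and the JSON encoding are assumptions for illustration; the source does not specify a data format.

```python
import json
from datetime import datetime


def enqueue_due_tasks(pending, now, task_queue, next_seq):
    """Create one queue entry per task whose execution time has arrived.

    Returns the tasks that are not yet due and the next unused sequence number.
    """
    still_pending = []
    for task in sorted(pending, key=lambda t: t["execute_at"]):
        if task["execute_at"] <= now:
            name = f"task-{next_seq:010d}"  # stand-in for a sequence-numbered node name
            payload = {"id": task["id"], "source": task["source"]}
            task_queue[name] = json.dumps(payload).encode()
            next_seq += 1
        else:
            still_pending.append(task)
    return still_pending, next_seq


# Hypothetical tasks pulled for the 9:00 to 9:01 window.
pending = [
    {"id": 1, "execute_at": datetime(2022, 5, 17, 9, 0, 10), "source": "db_a"},
    {"id": 2, "execute_at": datetime(2022, 5, 17, 9, 0, 40), "source": "db_b"},
]
task_queue = {}
rest, seq = enqueue_due_tasks(pending, datetime(2022, 5, 17, 9, 0, 15), task_queue, 0)
```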
Step 102, queuing and acquiring a lock by a thread in an idle state of a distributed service node, taking a task corresponding to the temporary directory node from the zk task queue directory node after the thread acquires the lock, releasing the lock, and queuing and acquiring the lock again after executing the task.
Multiple distributed service nodes may constitute a distributed service node cluster and perform tasks jointly. In the invention, every distributed service node can execute the tasks to be executed. A thread in the idle state is a thread that is not currently executing a task. An idle thread of a distributed service node queues to acquire the lock and, after acquiring it, takes the task corresponding to one temporary directory node from the zk task queue directory node. Because each temporary directory node created under the zk task queue directory node corresponds to one task and contains the task-related information, taking a task may include reading the data of the temporary directory node to obtain the task information and updating the state of that temporary directory node (for example, to indicate that a thread has taken the task for processing). After the scheduling server observes this state change, it disconnects the connection of the corresponding temporary directory node under the zk task queue directory node, and that temporary directory node disappears.
After taking the task corresponding to one temporary directory node, the thread releases the lock and executes the task; after executing the task, it queues to acquire the lock again so as to take and execute another task.
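The take-one-task-then-release cycle of step 102 can be sketched with local threads. This is a simplified model under stated assumptions: threading.Lock stands in for the zookeeper-based lock (it does not guarantee queue order), a deque stands in for the zk task queue directory node, and a worker simply stops when the queue is empty instead of listening for new tasks.

```python
import threading
from collections import deque


def worker(name, lock, task_queue, executed):
    """Repeatedly: acquire the lock, take exactly one task, release, then execute."""
    while True:
        with lock:                        # wait for and acquire the lock
            if not task_queue:
                return                    # sketch only: stop when nothing is queued
            task = task_queue.popleft()   # take the task of one queue node
        # The lock is released here, so other threads can take tasks
        # while this thread executes the one it took.
        executed[name].append(task)


def run_workers(tasks, thread_count=3):
    lock = threading.Lock()
    task_queue = deque(tasks)
    # Each thread records into its own list, so no extra synchronisation is needed.
    executed = {f"thread-{i}": [] for i in range(thread_count)}
    threads = [
        threading.Thread(target=worker, args=(name, lock, task_queue, executed))
        for name in executed
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return executed


tasks = [f"Task{i}" for i in range(1, 7)]
executed = run_workers(tasks)
```

Holding the lock only while taking a task, and not while executing it, is what lets several threads execute tasks concurrently without two of them taking the same one.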
According to the distributed timing task scheduling method provided by the invention, the scheduling server periodically pulls from the database the tasks to be executed within a preset future time period, so a relatively long pull interval can be set, which avoids excessive access to the database and reduces the access pressure on it. Because the tasks to be executed within that period are pulled in advance and a corresponding temporary directory node is created under the zk task queue directory node for each task that has reached its execution time, the tasks are already stored in temporary directory nodes of the zookeeper when threads read and execute them, which reduces the time threads wait to read task data. Meanwhile, idle threads of the distributed service nodes queue to acquire the lock, and each takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, so threads are fully utilized and resources are not wasted. Task backlog is thereby avoided and high availability of distributed server resources is ensured.
The invention provides a distributed timing task scheduling method, which further comprises the following steps: creating a corresponding temporary directory node under the zk node list directory node after the distributed service node is online; the scheduling server monitors the state of the temporary directory node created under the zk node list directory node, acquires the incomplete task of the offline distributed service node in response to acquiring the offline information of the distributed service node, and re-creates the corresponding temporary directory node under the zk task queue directory node according to the incomplete task of the offline distributed service node.
The zk node list directory node may be a persistent directory node of the zookeeper. After a distributed service node comes online, it creates a corresponding temporary directory node under the zk node list directory node to represent itself. Because this is a temporary directory node, it disappears if the distributed service node goes offline (an offline distributed service node is necessarily disconnected from the zookeeper).
The scheduling server learns that a distributed service node has gone offline by monitoring the state of the temporary directory nodes created under the zk node list directory node: if a temporary directory node disappears, the corresponding distributed service node is offline. In response, the scheduling server acquires the incomplete tasks of the offline distributed service node and re-creates a corresponding temporary directory node under the zk task queue directory node for each of them, so that idle threads can execute these tasks again after acquiring the lock.
According to the distributed timing task scheduling method provided by the invention, each distributed service node creates a corresponding temporary directory node under the zk node list directory node after coming online; the scheduling server monitors the state of these temporary directory nodes and, in response to learning that a distributed service node has gone offline, acquires its incomplete tasks and re-creates a corresponding temporary directory node under the zk task queue directory node for each of them. This realizes a re-push mechanism for tasks whose pushing failed because a distributed service node became abnormal.
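The offline handling can be sketched as a comparison between the previously known children of the node list and the current ones. The function name handle_node_list_change, the callback incomplete_tasks_of and the list used as the task queue are hypothetical names introduced for this example; how the change notification is delivered is left out.

```python
def handle_node_list_change(known_nodes, current_nodes, incomplete_tasks_of, task_queue):
    """Re-queue the incomplete tasks of every node that disappeared from the node list."""
    offline = sorted(set(known_nodes) - set(current_nodes))
    for node in offline:
        for task_id in incomplete_tasks_of(node):
            task_queue.append(task_id)  # stand-in for re-creating a temporary directory node
    return offline


# Hypothetical bookkeeping: tasks each node has taken but not finished.
in_progress = {"node-a": [11, 12], "node-b": [13]}
task_queue = [20]

offline = handle_node_list_change(
    known_nodes=["node-a", "node-b"],
    current_nodes=["node-b"],  # node-a's temporary node has disappeared
    incomplete_tasks_of=lambda node: in_progress.get(node, []),
    task_queue=task_queue,
)
```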
According to the distributed timing task scheduling method provided by the invention, the temporary directory nodes created under the zk task queue directory node are temporary sequence number directory nodes; and taking the task corresponding to one temporary directory node from the zk task queue directory node after the thread acquires the lock comprises: after the thread acquires the lock, taking the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
The temporary directory nodes created under the zk task queue directory node are temporary sequence number directory nodes; that is, they are numbered sequentially according to their creation time, and the earlier a node is created, the smaller its number.
After acquiring the lock, the thread takes the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes. For example, if the zk task queue directory node contains tasks from two or more pulls from the database, the tasks pulled earlier should be executed first. When creating temporary directory nodes under the zk task queue directory node for the tasks that have reached their execution time, the scheduling server can also create them in order of task execution time. Thus, when a thread that has acquired the lock takes a task, it can take the task with the smallest sequence number first, so that tasks with earlier execution times are executed preferentially.
According to the distributed timing task scheduling method provided by the invention, the temporary directory nodes created under the zk task queue directory node are set to be temporary sequence number directory nodes, and after acquiring the lock a thread takes the task corresponding to one temporary directory node according to the sequence numbers of the temporary sequence number directory nodes, so that tasks with earlier execution times are executed preferentially.
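Taking the task with the smallest sequence number can be sketched as below. The example assumes node names that end in a ten-digit zero-padded sequence number, as zookeeper produces for sequential nodes, and it sorts explicitly because a child listing should not be assumed to arrive in order; the helper names are hypothetical.

```python
def sequence_of(node_name):
    """Extract the numeric sequence suffix (last ten characters) of a node name."""
    return int(node_name[-10:])


def take_first(children):
    """Remove and return the child with the smallest sequence number."""
    first = min(children, key=sequence_of)
    children.remove(first)
    return first


# Children as they might be listed, in no particular order.
children = ["task-0000000012", "task-0000000003", "task-0000000007"]
order = [take_first(children) for _ in range(3)]
```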
According to the distributed timing task scheduling method provided by the invention, the method further comprises: the distributed service node updates the state of the corresponding task in the database after the thread takes the task and after the thread executes the task; and before the corresponding temporary directory nodes are re-created under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the method further comprises: querying the database according to the information of the offline distributed service node to acquire the incomplete tasks.
The distributed service node updates the state of the corresponding task in the database both after the thread takes the task and after the thread executes it. For example, after a thread takes a task, the state of the corresponding task in the database is updated to indicate that the task has been taken and is being processed; after the thread executes the task, the state is updated to indicate that the task has been executed or processed.
The distributed service node may run multiple threads for task processing. Before the scheduling server re-creates the corresponding temporary directory nodes under the zk task queue directory node, it queries the database according to the information of the offline distributed service node to acquire the incomplete tasks, namely the tasks that the offline distributed service node marked in the database as being processed but not as executed.
According to the distributed timing task scheduling method provided by the invention, the task state in the database is updated by using the distributed service node, so that the scheduling server can conveniently acquire the information of the unfinished task when the distributed service node is off line, and the task processing is carried out again.
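The state bookkeeping can be sketched with an in-memory SQLite database. The table name task, the columns status and node, and the state values PENDING, PROCESSING and DONE are assumptions made for this example; the source only states that a taken task is marked as being processed and a finished task as executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, status TEXT, node TEXT)")
conn.executemany("INSERT INTO task (id, status) VALUES (?, 'PENDING')", [(1,), (2,), (3,)])


def mark_taken(conn, task_id, node):
    """Record that a thread on the given node has taken the task."""
    conn.execute("UPDATE task SET status = 'PROCESSING', node = ? WHERE id = ?", (node, task_id))


def mark_done(conn, task_id):
    """Record that the task has been executed."""
    conn.execute("UPDATE task SET status = 'DONE' WHERE id = ?", (task_id,))


def incomplete_tasks(conn, node):
    """Tasks the node took but never finished: candidates for re-queuing."""
    rows = conn.execute(
        "SELECT id FROM task WHERE node = ? AND status = 'PROCESSING' ORDER BY id", (node,)
    )
    return [row[0] for row in rows]


mark_taken(conn, 1, "node-a")
mark_taken(conn, 2, "node-a")
mark_taken(conn, 3, "node-b")
mark_done(conn, 1)
unfinished = incomplete_tasks(conn, "node-a")  # node-a goes offline with task 2 in progress
```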
According to the distributed timing task scheduling method provided by the invention, the thread of the distributed service node in the idle state queuing to acquire the lock comprises: the thread creates a corresponding temporary sequence number directory node under the zk thread queue directory node; and the thread obtains the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node, and acquires the lock if the sequence number of its own temporary sequence number directory node is the smallest.
The zk thread queue directory node may be a persistent directory node of the zookeeper. Each idle thread of a distributed service node creates a corresponding temporary sequence number directory node under the zk thread queue directory node, and the sequence numbers increase in order of creation. The thread obtains the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node and acquires the lock if its own sequence number is the smallest. After acquiring the lock, the thread can access the zk task queue directory node. The thread that has acquired the lock can process a task while threads without the lock wait, which avoids the dirty data that would result from operating on the same piece of data at the same time.
After the thread acquires the lock, the connection of the corresponding temporary directory node under the zk thread queue directory node is disconnected, and the corresponding temporary directory node disappears.
According to the distributed timing task scheduling method provided by the invention, the corresponding temporary sequence number directory nodes are created under the zk thread queue directory nodes to control the acquisition of the locks by the threads, so that the reliability of the control of the lock functions is improved.
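The smallest-sequence-number rule can be sketched with a small in-memory queue. ThreadQueue and its methods join, holds_lock and leave are hypothetical names for this illustration; removing a thread's node is used here to model the disappearance of its temporary directory node, after which the next thread in line holds the lock.

```python
class ThreadQueue:
    """Toy model of the zk thread queue directory node (illustration only)."""

    def __init__(self):
        self.children = []
        self._next = 0

    def join(self, thread_name):
        """Create a sequence-numbered node for an idle thread and return its name."""
        node = f"{thread_name}-{self._next:010d}"
        self._next += 1
        self.children.append(node)
        return node

    def holds_lock(self, node):
        """A thread holds the lock when its node has the smallest sequence number."""
        return node == min(self.children, key=lambda name: int(name[-10:]))

    def leave(self, node):
        """Model the thread's temporary node disappearing."""
        self.children.remove(node)


queue = ThreadQueue()
first = queue.join("thread-a")
second = queue.join("thread-b")
before = (queue.holds_lock(first), queue.holds_lock(second))
queue.leave(first)
after = queue.holds_lock(second)
```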
According to the distributed timing task scheduling method provided by the invention, after a thread in an idle state of a distributed service node is queued to acquire a lock, the method further comprises the following steps: and in response to the temporary directory node under the zk task queue directory node being empty, the thread listens to the zk task queue directory node.
After an idle thread of a distributed service node queues and acquires the lock, it needs to take a task from the zk task queue directory node. If there is no temporary directory node under the zk task queue directory node, no task is currently waiting to be processed; the thread then monitors the state of the zk task queue directory node and, when a new temporary directory node is created, takes the corresponding task.
According to the distributed timing task scheduling method provided by the invention, the thread monitors the zk task queue directory node by responding to the fact that the temporary directory node under the zk task queue directory node is empty, so that the reliability and timeliness of the thread for acquiring the task are ensured.
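The take-or-listen behaviour can be sketched as follows. This is an illustrative in-memory model under stated assumptions, not ZooKeeper client code: `TaskQueue`, `take_or_watch` and `add_task` are hypothetical names, and the watch is modelled as a one-shot callback that fires when a task is added.

```python
class TaskQueue:
    """In-memory stand-in (an assumption) for the zk task queue directory node."""

    def __init__(self):
        self._tasks = []     # pending tasks, oldest first
        self._watchers = []  # one-shot callbacks registered while the queue was empty

    def take_or_watch(self, on_change):
        """Return a task if one exists; otherwise register a watch and return None."""
        if self._tasks:
            return self._tasks.pop(0)
        self._watchers.append(on_change)
        return None

    def add_task(self, task):
        """Add a task and notify every watcher exactly once."""
        self._tasks.append(task)
        watchers, self._watchers = self._watchers, []
        for notify in watchers:
            notify()


task_queue = TaskQueue()
taken = []


def worker():
    # Try to take a task; if the queue is empty, wait to be called back.
    task = task_queue.take_or_watch(worker)
    if task is not None:
        taken.append(task)


worker()                      # queue is empty, so the worker starts listening
task_queue.add_task("Task1")  # the watch fires and the worker takes the task
print(taken)  # ['Task1']
```

Listening instead of polling means the thread learns about a new task as soon as it is written, without repeatedly reading an empty queue.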
FIG. 2 is a second flow chart of the distributed timing task scheduling method according to the present invention. As shown in FIG. 2, serviceA is a service provided by the scheduling server, serviceB is a service provided by a distributed server, and the method includes:
The scheduling server periodically pulls from the database (Database) all unexecuted tasks within a period of time and writes the tasks that have reached their execution time into the zk task queue node task_queue, which is the zk task queue directory node described above. Writing a task into task_queue means creating a corresponding temporary directory node under task_queue, such as Task1, Task2 and Task3, each corresponding to a different task.
When a distributed service node comes online, it writes itself into the node_list node, which is the zk node list directory node described above; that is, the distributed service node creates a temporary directory node corresponding to itself under the node_list node.
Each idle thread (a thread in the idle state) of a distributed service node writes itself into the thread_queue node, which is the zk thread queue directory node described above; that is, each thread creates a temporary directory node corresponding to itself under the thread_queue node, and this node is a temporary sequence number directory node. The thread then obtains the sequence numbers of the other temporary sequence number directory nodes and acquires the lock if its own sequence number is the smallest.
The thread that has acquired the lock goes to the zk task queue directory node (task_queue), takes the task corresponding to one temporary directory node, and releases the lock. If the task queue is empty (there is no temporary directory node under the zk task queue directory node), the thread listens to the zk task queue directory node.
The distributed service node updates the state of the corresponding task in the database after the thread takes the task and again after the task has been processed. The scheduling server listens to the state of the node_list node to learn when a distributed service node goes offline. After a distributed service node goes offline, the scheduling server queries the database for that node's incomplete tasks and rewrites them to the task_queue node so that a thread can take them out and process them.
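The re-push step can be sketched as below. This is a simplified illustration under stated assumptions: the database is a list of dictionaries, the field names `id`, `node` and `state` and the state values `running`, `done` and `pending` are hypothetical, and the offline node is found by comparing two snapshots of the node list rather than by a real listener.

```python
def find_offline_nodes(previous_nodes, current_nodes):
    """Nodes present in the earlier node_list snapshot but missing from the new one."""
    return sorted(set(previous_nodes) - set(current_nodes))


def requeue_incomplete(db_tasks, offline_nodes, task_queue):
    """Rewrite every unfinished task of an offline node back into the task queue."""
    for task in db_tasks:
        if task["node"] in offline_nodes and task["state"] != "done":
            task["state"] = "pending"  # hypothetical state value
            task["node"] = None        # no node owns the task any more
            task_queue.append(task["id"])


db_tasks = [
    {"id": "Task1", "node": "node-1", "state": "running"},
    {"id": "Task2", "node": "node-1", "state": "done"},
    {"id": "Task3", "node": "node-2", "state": "running"},
]
task_queue = []

offline = find_offline_nodes(["node-1", "node-2"], ["node-2"])
requeue_incomplete(db_tasks, offline, task_queue)
print(offline, task_queue)  # ['node-1'] ['Task1']
```

Only the unfinished task of the offline node is pushed again; the finished task and the task running on a live node are left alone, which is why the task state has to be recorded in the database in the first place.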
The distributed timing task scheduling method provided by the invention pulls the tasks of a time period in batches, makes full use of server resources and reduces access pressure on the database. It also adds a mechanism for re-pushing tasks whose pushing failed because of abnormal node service. This addresses the database pressure caused by fetching single tasks from the database too frequently, the task backlog caused by fetching them too infrequently, and the task pushing failures caused by abnormal node service. With this method, the distributed timing task pushing service makes full use of server resources at the peak times at which timed tasks are triggered, so that task backlog is minimized while the database does not bear excessive access pressure.
The distributed timed task scheduling system provided by the invention is described below, and the distributed timed task scheduling system described below and the distributed timed task scheduling method described above can be referred to correspondingly.
Fig. 3 is a schematic structural diagram of a distributed timing task scheduling system provided by the present invention. As shown in Fig. 3, the system includes a scheduling server 1, a zookeeper server 2 and a distributed service node cluster 3 comprising at least one distributed service node 31, wherein: the scheduling server 1 is configured to periodically pull tasks to be executed in a preset time period from a database and, for the tasks that have reached their execution time, create corresponding temporary directory nodes under the zk task queue directory node; the distributed service node 31 is configured such that a thread in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues again to acquire the lock after executing the task.
According to the distributed timing task scheduling system provided by the invention, the scheduling server pulls tasks to be executed in a preset time period in the future from the database at fixed time, and can set a relatively longer time interval to pull the tasks, so that excessive access to the database is avoided, and the access pressure to the database is reduced; by pulling tasks to be executed within a period of time and creating corresponding temporary directory nodes under zk task queue directory nodes respectively for the tasks reaching the execution time, the task to be executed is stored in the temporary directory nodes of the zookeeper in advance so as to be read and executed by a thread, and the waiting time of the thread when the task data is read is reduced; meanwhile, the thread in the idle state of the distributed service node queues to acquire the lock, and the thread acquires the task corresponding to one temporary directory node from the zk task queue directory node after the lock is acquired, so that the thread is fully utilized, and the resource waste is avoided; therefore, task backlog is avoided, and high availability of distributed server resources is ensured.
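The batch pull described above can be sketched as follows. It is a minimal model under stated assumptions, not the patented implementation: times are plain numbers of seconds, the database is a list of dictionaries with hypothetical fields `id`, `run_at` and `executed`, and `pull_window` and `push_due` are hypothetical helper names.

```python
def pull_window(db_tasks, now, window):
    """One database read: unexecuted tasks whose execution time is before now + window."""
    return [t for t in db_tasks if not t["executed"] and t["run_at"] < now + window]


def push_due(pulled, now, task_queue, already_pushed):
    """Push pulled tasks that have reached their execution time, earliest first."""
    for task in sorted(pulled, key=lambda t: t["run_at"]):
        if task["run_at"] <= now and task["id"] not in already_pushed:
            task_queue.append(task["id"])
            already_pushed.add(task["id"])


db_tasks = [
    {"id": "Task1", "run_at": 5, "executed": False},
    {"id": "Task2", "run_at": 20, "executed": False},
    {"id": "Task3", "run_at": 90, "executed": False},
]
task_queue, already_pushed = [], set()

pulled = pull_window(db_tasks, now=0, window=60)         # single database access
push_due(pulled, now=10, task_queue=task_queue, already_pushed=already_pushed)
push_due(pulled, now=25, task_queue=task_queue, already_pushed=already_pushed)
print(task_queue)  # ['Task1', 'Task2']
```

One pull serves several pushes: Task1 and Task2 become due at different moments but come from the same database read, while Task3 lies outside the window and waits for the next pull.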
According to the distributed timing task scheduling system provided by the invention, the distributed service node 31 is further used for creating a corresponding temporary directory node under the zk node list directory node after being on line; the scheduling server 1 is further configured to monitor a state of a temporary directory node created under the zk node list directory node, obtain an incomplete task of the offline distributed service node in response to obtaining the offline information of the distributed service node, and re-create a corresponding temporary directory node under the zk task queue directory node with the incomplete task of the offline distributed service node.
According to the distributed timing task scheduling system provided by the invention, the corresponding temporary directory nodes are created under the zk node list directory nodes after the distributed service nodes are on line, the scheduling server monitors the states of the temporary directory nodes created under the zk node list directory nodes, and in response to acquiring the offline information of the distributed service nodes, incomplete tasks of the offline distributed service nodes are acquired, and the incomplete tasks of the offline distributed service nodes are respectively created under the zk task queue directory nodes again, so that a task pushing failure re-pushing mechanism caused by the abnormity of the distributed nodes is realized.
According to the distributed timing task scheduling system provided by the invention, the temporary directory node created under the zk task queue directory node is a temporary sequence number directory node; when taking the task corresponding to one temporary directory node from the zk task queue directory node after the thread acquires the lock, the distributed service node 31 is specifically configured to: after the thread acquires the lock, take the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
According to the distributed timing task scheduling system provided by the invention, the temporary directory node created under the zk task queue directory node is set as the temporary sequence number directory node, and the task corresponding to the temporary directory node is taken away from the zk task queue directory node according to the sequence number of the temporary sequence number directory node after the thread acquires the lock, so that the task with early execution time can be executed preferentially.
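Taking tasks in sequence-number order can be sketched as below. ZooKeeper appends a zero-padded 10-digit counter to the name of a sequential node; the `task-` prefix and the helper names `sequence_number` and `take_earliest` are assumptions made for illustration, and a plain list stands in for the children of the zk task queue directory node.

```python
def sequence_number(node_name):
    """Parse the 10-digit counter that ends a sequential node name."""
    return int(node_name[-10:])


def take_earliest(children):
    """Remove and return the child node with the smallest sequence number."""
    earliest = min(children, key=sequence_number)
    children.remove(earliest)
    return earliest


# Child names are returned in no particular order, so they must be sorted by counter.
children = ["task-0000000002", "task-0000000000", "task-0000000001"]
first = take_earliest(children)
second = take_earliest(children)
print(first, second)  # task-0000000000 task-0000000001
```

Because the scheduling server writes tasks as they reach their execution time, the smallest counter belongs to the task written first, so the task with the earliest execution time is taken first.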
According to the distributed timing task scheduling system provided by the invention, the distributed service node 31 is further used for updating the state of the corresponding task in the database after the thread takes the task and after the task is executed; the scheduling server 1 is further configured to, before re-creating the corresponding temporary directory nodes under the zk task queue directory node for the incomplete tasks of the offline distributed service node, query the database according to the information of the offline distributed service node to obtain the incomplete tasks.
According to the distributed timing task scheduling system provided by the invention, the task state in the database is updated by using the distributed service node, so that the scheduling server can conveniently acquire the information of unfinished tasks when the distributed service node is offline, and the task processing is carried out again.
According to the distributed timing task scheduling system provided by the invention, when the distributed service node 31 is used for queuing and acquiring locks for threads in an idle state, the distributed service node is specifically used for: creating a corresponding temporary sequence number directory node under the zk thread queue directory node by the thread in the idle state; and the thread acquires the serial numbers of other temporary sequence number directory nodes under the zk thread queue directory node, and acquires a lock if the serial number of the temporary sequence number directory node corresponding to the thread is minimum.
According to the distributed timing task scheduling system provided by the invention, the corresponding temporary sequence number directory nodes are created under the zk thread queue directory nodes to control the acquisition of the locks by the threads, so that the reliability of the control of the lock functions is improved.
According to the distributed timing task scheduling system provided by the invention, after the thread in the idle state of the distributed service node 31 queues to acquire the lock, the distributed service node 31 is further configured such that: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
According to the distributed timing task scheduling system provided by the invention, the thread monitors the zk task queue directory node by responding to the fact that the temporary directory node under the zk task queue directory node is empty, so that the reliability and timeliness of the thread for acquiring tasks are ensured.
Fig. 4 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Fig. 4, the electronic device may include: processor 410, communication interface (Communications Interface) 420, memory 430 and communication bus 440, wherein processor 410, communication interface 420 and memory 430 communicate with each other via communication bus 440. Processor 410 may invoke logic instructions in memory 430 to perform a distributed timed task scheduling method that includes the steps performed by a scheduling server: periodically pulling tasks to be executed in a preset time period from a database and, for the tasks that have reached their execution time, creating corresponding temporary directory nodes under the zk task queue directory node; or the steps performed by the distributed service node: a thread in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues again to acquire the lock after executing the task.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program that can be stored on a non-transitory computer readable storage medium and that, when executed by a processor, performs the distributed timing task scheduling method provided above. The method includes the steps performed by a scheduling server: periodically pulling tasks to be executed in a preset time period from a database and, for the tasks that have reached their execution time, creating corresponding temporary directory nodes under the zk task queue directory node; or the steps performed by the distributed service node: a thread in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues again to acquire the lock after executing the task.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the distributed timed task scheduling method provided above. The method includes the steps performed by a scheduling server: periodically pulling tasks to be executed in a preset time period from a database and, for the tasks that have reached their execution time, creating corresponding temporary directory nodes under the zk task queue directory node; or the steps performed by the distributed service node: a thread in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues again to acquire the lock after executing the task.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A distributed timing task scheduling method, comprising:
the scheduling server pulls tasks to be executed in a future preset period from the database according to preset time intervals, and creates corresponding temporary directory nodes under zk task queue directory nodes respectively for the tasks reaching the execution time; wherein the future preset period is set according to the preset time interval;
a thread of the distributed service node that is in an idle state queues to acquire a lock, takes a task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and re-queues to acquire the lock after executing the task;
after the thread in the idle state of the distributed service node queues to acquire the lock, the method further comprises: and in response to the temporary directory node under the zk task queue directory node being empty, the thread listens to the zk task queue directory node.
2. The distributed timed task scheduling method according to claim 1, characterized in that the method further comprises:
creating a corresponding temporary directory node under the zk node list directory node after the distributed service node is online;
the scheduling server monitors the state of the temporary directory node created under the zk node list directory node, acquires the incomplete task of the offline distributed service node in response to acquiring the offline information of the distributed service node, and re-creates the corresponding temporary directory node under the zk task queue directory node according to the incomplete task of the offline distributed service node.
3. The distributed timed task scheduling method according to claim 1, wherein the temporary directory node created under the zk task queue directory node is a temporary sequence numbered directory node;
after the thread acquires the lock, the task corresponding to one temporary directory node is taken away from the zk task queue directory node, and the method comprises the following steps: and after the thread acquires the lock, taking a task corresponding to the temporary directory node from the zk task queue directory node according to the sequence number of the temporary directory node.
4. The distributed timed task scheduling method according to claim 2, characterized in that the method further comprises: the distributed service node updates the state of the corresponding task in the database after the thread takes the task and executes the task;
before the corresponding temporary directory nodes are re-created under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the method further includes: querying the database according to the information of the offline distributed service node to obtain the incomplete tasks.
5. The method of claim 1, wherein the thread in the idle state of the distributed service node queuing to acquire a lock comprises:
the thread in the idle state of the distributed service node creates a corresponding temporary sequence number directory node under the zk thread queue directory node;
and the thread acquires the serial numbers of other temporary sequence number directory nodes under the zk thread queue directory node, and acquires a lock if the serial number of the temporary sequence number directory node corresponding to the thread is minimum.
6. A distributed timed task scheduling system comprising a scheduling server, a zookeeper server, and a distributed service node cluster, the distributed service node cluster comprising at least one distributed service node, wherein:
the scheduling server is used for: pulling tasks to be executed in a future preset period from a database according to preset time intervals at fixed time, and creating corresponding temporary directory nodes under zk task queue directory nodes respectively for the tasks reaching the execution time; wherein the future preset period is set according to the preset time interval;
the distributed service node is configured to: queuing and acquiring a lock by a thread in an idle state, taking a task corresponding to the temporary directory node from the zk task queue directory node after the thread acquires the lock, releasing the lock, and queuing again to acquire the lock after executing the task;
after the distributed service node queues the thread in the idle state to acquire the lock, the distributed service node is further configured to: and in response to the temporary directory node under the zk task queue directory node being empty, the thread listens to the zk task queue directory node.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, performs the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method according to any one of claims 1 to 5.
8. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210541507.2A CN115185673B (en) | 2022-05-17 | 2022-05-17 | Distributed timing task scheduling method, system, storage medium and program product |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115185673A CN115185673A (en) | 2022-10-14 |
CN115185673B true CN115185673B (en) | 2023-10-31 |
Family
ID=83513972
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210541507.2A Active CN115185673B (en) | 2022-05-17 | 2022-05-17 | Distributed timing task scheduling method, system, storage medium and program product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115185673B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118332174B (en) * | 2024-06-13 | 2024-10-29 | 荣耀终端有限公司 | Data crawling method, system and computer readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108132833A (en) * | 2017-12-08 | 2018-06-08 | 广州视源电子科技股份有限公司 | Distributed timing task scheduling method and device based on zookeeper technology |
CN111290854A (en) * | 2020-01-20 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Task management method, device and system, computer storage medium and electronic equipment |
CN112307105A (en) * | 2020-11-03 | 2021-02-02 | 平安普惠企业管理有限公司 | Timing task running method, device, equipment and storage medium based on multithreading |
CN112486695A (en) * | 2020-12-07 | 2021-03-12 | 浪潮云信息技术股份公司 | Distributed lock implementation method under high concurrency service |
US20210374152A1 (en) * | 2020-06-02 | 2021-12-02 | Illinois Institute Of Technology | Label-based data representation i/o process and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||