CN115185673B - Distributed timing task scheduling method, system, storage medium and program product - Google Patents

Publication number
CN115185673B
Authority
CN
China
Prior art keywords
task
node
directory
thread
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210541507.2A
Other languages
Chinese (zh)
Other versions
CN115185673A (en)
Inventor
缪桓举
赵谦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seashell Housing Beijing Technology Co Ltd
Original Assignee
Seashell Housing Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seashell Housing Beijing Technology Co Ltd
Priority to CN202210541507.2A
Publication of CN115185673A
Application granted
Publication of CN115185673B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a distributed timing task scheduling method, system, storage medium and program product. The method comprises the following steps: a scheduling server periodically pulls from a database the tasks to be executed within a preset future time period and, for each task that reaches its execution time, creates a corresponding temporary directory node under the zk task queue directory node; a thread of a distributed service node that is in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task. The method, system, storage medium and program product provided by the invention avoid excessive access to the database and reduce the access pressure on it; they also avoid task backlog and ensure high availability of distributed server resources.

Description

Distributed timing task scheduling method, system, storage medium and program product
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a distributed timing task scheduling method, system, storage medium, and program product.
Background
There are many scenarios where there is a need for timing data pushing. For example, business personnel in an enterprise need to acquire report data on a daily basis. The timing data pushing is completed by the corresponding timing data pushing task. To meet timely pushing of timing data as much as possible, timing data pushing tasks are typically performed by distributed services.
Timing data push tasks are unevenly distributed over time, and a large number of tasks may be configured to execute within a particular time period. For example, a large number of push tasks are configured to be performed around 9 a.m. Existing distributed timing task schemes pull tasks from the database at a preset time interval, and this interval determines how frequently the database is accessed. An access frequency that is too high or too low causes problems: too high a frequency puts excessive pressure on the database, while too low a frequency leads to a backlog of tasks and leaves server resources underutilized.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a distributed timing task scheduling method, a distributed timing task scheduling system, a storage medium and a program product.
The invention provides a distributed timing task scheduling method, which comprises the following steps: the scheduling server periodically pulls from the database the tasks to be executed within a preset future time period and, for each task that reaches its execution time, creates a corresponding temporary directory node under the zk task queue directory node; and a thread of a distributed service node that is in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
According to the distributed timing task scheduling method provided by the invention, the method further comprises: after coming online, the distributed service node creates a corresponding temporary directory node under the zk node list directory node; the scheduling server monitors the state of the temporary directory nodes created under the zk node list directory node and, in response to learning that a distributed service node has gone offline, acquires the incomplete tasks of the offline distributed service node and re-creates a corresponding temporary directory node under the zk task queue directory node for each of those incomplete tasks.
According to the distributed timing task scheduling method provided by the invention, the temporary directory nodes created under the zk task queue directory node are temporary sequence number directory nodes; and the step in which the thread, after acquiring the lock, takes the task corresponding to one temporary directory node from the zk task queue directory node comprises: after acquiring the lock, the thread takes the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
According to the distributed timing task scheduling method provided by the invention, the method further comprises: the distributed service node updates the state of the corresponding task in the database after a thread takes the task and after the thread finishes executing the task; and before the corresponding temporary directory nodes are re-created under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the method further comprises: querying the database according to the information of the offline distributed service node to obtain the incomplete tasks.
According to the distributed timing task scheduling method provided by the invention, the queuing of a thread of the distributed service node that is in the idle state to acquire the lock comprises: the thread in the idle state creates a corresponding temporary sequence number directory node under the zk thread queue directory node; and the thread obtains the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node, and acquires the lock if the sequence number of its own temporary sequence number directory node is the smallest.
According to the distributed timing task scheduling method provided by the invention, after the thread in the idle state of the distributed service node queues and acquires the lock, the method further comprises: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
The invention also provides a distributed timing task scheduling system, which comprises a scheduling server, a zookeeper server and a distributed service node cluster, the distributed service node cluster including at least one distributed service node, wherein: the scheduling server is configured to periodically pull from a database the tasks to be executed within a preset future time period and, for each task that reaches its execution time, create a corresponding temporary directory node under the zk task queue directory node; and the distributed service node is configured so that a thread in an idle state queues to acquire a lock, takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps executed by the scheduling server or the distributed service node in any of the above-mentioned distributed timing task scheduling methods when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method as described in any one of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor performs the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method as described in any one of the above.
According to the distributed timing task scheduling method, system, storage medium and program product provided by the invention, the scheduling server periodically pulls from the database the tasks to be executed within a preset future time period, so a relatively long interval can be set between pulls, which avoids excessive access to the database and reduces the access pressure on it. Because the tasks to be executed within a period of time are pulled in a batch and a corresponding temporary directory node is created under the zk task queue directory node for each task that reaches its execution time, the tasks are stored in zookeeper temporary directory nodes in advance, ready to be read and executed by threads, which reduces the time a thread waits when reading task data. Meanwhile, threads of the distributed service nodes that are in an idle state queue to acquire the lock, and each thread takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, so threads are fully utilized and resources are not wasted. Task backlog is therefore avoided and high availability of distributed server resources is ensured.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a distributed timing task scheduling method provided by the invention;
FIG. 2 is a second flow chart of a method for scheduling distributed timing tasks according to the present invention;
FIG. 3 is a schematic diagram of a distributed timed task scheduling system provided by the present invention;
FIG. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The distributed timed task scheduling method, system, storage medium and program product of the present invention are described below in conjunction with fig. 1-4.
Fig. 1 is a schematic flow chart of a distributed timing task scheduling method provided by the invention. As shown in fig. 1, the method includes:
Step 101, the scheduling server periodically pulls from the database the tasks to be executed within a preset future time period and, for each task that reaches its execution time, creates a corresponding temporary directory node under the zk task queue directory node.
The scheduling server periodically pulls from the database the tasks to be executed within a preset future time period (rather than pulling a single task, unless only one task is due within that period); that is, tasks are acquired in batches, one time period at a time. The pulls may be performed at a preset time interval. To avoid pulling the same task twice, the future preset time period covered by each pull may be set according to that interval. For example, if a batch is pulled once every minute, each pull can fetch the tasks to be executed within the next minute from the current time: at 9:00 a.m., the tasks to be executed between 9:00 and 9:01 are pulled.
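The windowing described above can be illustrated with a minimal sketch. The function names, the `run_at` field and the half-open window `[now, now + interval)` are assumptions made for illustration; the description only states that the pulled period is set according to the pull interval so that tasks are not pulled twice.

```python
from datetime import datetime, timedelta

def pull_window(now, interval):
    """Return the half-open window [now, now + interval) covered by one pull (assumed convention)."""
    return now, now + interval

def pull_due_tasks(tasks, now, interval):
    """Select the tasks whose execution time falls inside the current pull window."""
    start, end = pull_window(now, interval)
    return [t for t in tasks if start <= t["run_at"] < end]

# Hypothetical task records standing in for rows read from the database.
tasks = [
    {"id": 1, "run_at": datetime(2022, 5, 17, 9, 0, 30)},
    {"id": 2, "run_at": datetime(2022, 5, 17, 9, 1, 0)},
    {"id": 3, "run_at": datetime(2022, 5, 17, 9, 1, 45)},
]
interval = timedelta(minutes=1)

first = pull_due_tasks(tasks, datetime(2022, 5, 17, 9, 0), interval)
second = pull_due_tasks(tasks, datetime(2022, 5, 17, 9, 1), interval)
```

With a half-open window, a task due exactly at 9:01 belongs to the 9:01 pull only, so consecutive pulls do not overlap.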
zookeeper is a distributed service framework that originated as a sub-project of Apache Hadoop. It is mainly used to solve data management problems frequently encountered in distributed applications, such as unified naming services, state synchronization services, cluster management, and management of distributed application configuration items. zookeeper maintains a data structure similar to a file system in which each directory entry is called a znode (directory node). As in a file system, znodes can be freely added and deleted, and child znodes can be added and deleted under a znode; the only difference is that a znode can also store data. There are four types of znode:
(1) Persistent directory node. After the client disconnects from zookeeper, the node still exists.
(2) Persistent sequence number directory node. After the client disconnects from zookeeper, the node still exists; in addition, zookeeper numbers the node names sequentially.
(3) Temporary directory node. After the client disconnects from zookeeper, the node is deleted.
(4) Temporary sequence number directory node. After the client disconnects from zookeeper, the node is deleted; in addition, zookeeper numbers the node names sequentially.
From the above, it follows that the persistent sequence number directory nodes are a subset of the persistent directory nodes and the temporary sequence number directory nodes are a subset of the temporary directory nodes.
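The four node types can be illustrated with a toy in-memory model. `FakeZk`, its method names and the session strings are hypothetical and are not the zookeeper client API; the sketch also uses a single global counter, whereas zookeeper keeps sequence counters per parent node.

```python
import itertools

class FakeZk:
    """Toy in-memory stand-in for zookeeper (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}                  # path -> (data, owning session or None)
        self._seq = itertools.count()    # simplified: one global sequence counter

    def create(self, path, data=b"", session=None, ephemeral=False, sequence=False):
        if sequence:
            # Sequential nodes get a zero-padded 10-digit suffix appended to the name.
            path = f"{path}{next(self._seq):010d}"
        # Only temporary (ephemeral) nodes remember which session owns them.
        self.nodes[path] = (data, session if ephemeral else None)
        return path

    def close_session(self, session):
        """Disconnecting a client deletes the temporary nodes it created."""
        self.nodes = {p: v for p, v in self.nodes.items() if v[1] != session}

zk = FakeZk()
a = zk.create("/p", session="s1")                                 # persistent
b = zk.create("/ps-", session="s1", sequence=True)                # persistent, sequence number
c = zk.create("/e", session="s1", ephemeral=True)                 # temporary
d = zk.create("/es-", session="s1", ephemeral=True, sequence=True)  # temporary, sequence number
zk.close_session("s1")
remaining = sorted(zk.nodes)
```

After the session closes, only the two persistent nodes remain, which is the property the method relies on for detecting offline service nodes.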
zookeeper is abbreviated as zk. The zk task queue directory node may be a persistent directory node of zookeeper. For each task that reaches its execution time, the scheduling server creates a corresponding temporary directory node under the zk task queue directory node; that is, it adds one temporary directory node per task under the zk task queue directory node. Each task pulled from the database by the scheduling server includes the information needed to execute it, such as data source information. The temporary directory node added under the zk task queue directory node for a task also contains this execution information.
Step 102, a thread of a distributed service node that is in an idle state queues to acquire a lock; after acquiring the lock, the thread takes the task corresponding to one temporary directory node from the zk task queue directory node, releases the lock, and queues to acquire the lock again after executing the task.
Multiple distributed service nodes may form a distributed service node cluster that performs tasks collectively. In the invention, every distributed service node can execute the tasks to be executed. A thread in the idle state is a thread that is not currently executing a task. Threads of a distributed service node that are in the idle state queue to acquire the lock, and a thread that has acquired the lock takes the task corresponding to one temporary directory node from the zk task queue directory node. Because each temporary directory node created under the zk task queue directory node corresponds to one task and contains the task's information, taking a task may include reading the data of the temporary directory node to obtain the task information and updating the state of that temporary directory node (for example, to indicate that a thread has taken the task for processing). After the scheduling server detects the state change, it disconnects the connection for that temporary directory node under the zk task queue directory node, and the temporary directory node disappears.
After taking the task corresponding to one temporary directory node, the thread releases the lock and executes the task; after executing the task, it queues to acquire the lock again so that it can take and execute another task.
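The take-then-release cycle of step 102 can be sketched with local threads. This is a simplification under stated assumptions: a `threading.Lock` stands in for the zk-based lock, a Python list stands in for the children of the zk task queue directory node, and a worker simply stops when the queue is empty, whereas the method has the thread listen for new nodes.

```python
import threading

task_queue = [f"task{i}" for i in range(1, 7)]   # stands in for nodes under the zk task queue node
queue_lock = threading.Lock()                    # stands in for the zk-based lock
done = []                                        # (thread name, task) pairs, for inspection
done_lock = threading.Lock()

def worker(name):
    while True:
        with queue_lock:                 # queue for and acquire the lock
            if not task_queue:
                return                   # simplification: stop instead of listening
            task = task_queue.pop(0)     # take exactly one task while holding the lock
        # The lock is released here, before the task is executed,
        # so other idle threads can take tasks in the meantime.
        with done_lock:
            done.append((name, task))    # "executing" the task is just recording it

threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Which thread ends up with which task varies from run to run, but every task is taken exactly once because the lock is held only while a task is being removed from the queue.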
According to the distributed timing task scheduling method provided by the invention, the scheduling server periodically pulls from the database the tasks to be executed within a preset future time period, so a relatively long interval can be set between pulls, which avoids excessive access to the database and reduces the access pressure on it. Because the tasks to be executed within a period of time are pulled in a batch and a corresponding temporary directory node is created under the zk task queue directory node for each task that reaches its execution time, the tasks are stored in zookeeper temporary directory nodes in advance, ready to be read and executed by threads, which reduces the time a thread waits when reading task data. Meanwhile, threads of the distributed service nodes that are in an idle state queue to acquire the lock, and each thread takes the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, so threads are fully utilized and resources are not wasted. Task backlog is therefore avoided and high availability of distributed server resources is ensured.
According to the distributed timing task scheduling method provided by the invention, the method further comprises: after coming online, the distributed service node creates a corresponding temporary directory node under the zk node list directory node; the scheduling server monitors the state of the temporary directory nodes created under the zk node list directory node and, in response to learning that a distributed service node has gone offline, acquires the incomplete tasks of the offline distributed service node and re-creates a corresponding temporary directory node under the zk task queue directory node for each of those incomplete tasks.
The zk node list directory node may be a persistent directory node of zookeeper. After coming online, the distributed service node creates a corresponding temporary directory node under the zk node list directory node to represent itself. Because this node is a temporary directory node, it disappears if the distributed service node goes offline (a distributed service node that goes offline is necessarily disconnected from zookeeper).
The scheduling server learns that a distributed service node has gone offline by monitoring the state of the temporary directory nodes created under the zk node list directory node: if a temporary directory node disappears, the corresponding distributed service node is offline. In response to learning that a distributed service node has gone offline, the scheduling server acquires the incomplete tasks of that node and re-creates corresponding temporary directory nodes under the zk task queue directory node, so that threads in the idle state can execute those tasks again after acquiring the lock.
According to the distributed timing task scheduling method provided by the invention, the distributed service node creates a corresponding temporary directory node under the zk node list directory node after coming online, the scheduling server monitors the state of the temporary directory nodes created under the zk node list directory node and, in response to learning that a distributed service node has gone offline, acquires the incomplete tasks of the offline node and re-creates corresponding temporary directory nodes for them under the zk task queue directory node, thereby implementing a mechanism for re-pushing tasks whose pushing failed because of a distributed node abnormality.
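The re-push mechanism can be sketched as a comparison of two snapshots of the node list followed by re-queuing. The function names, the `in_progress` mapping and the node names are hypothetical; in the method the scheduling server is notified through zookeeper monitoring rather than by comparing snapshots, and the incomplete tasks come from the database.

```python
def detect_offline(previous_nodes, current_nodes):
    """Names present in the previous snapshot of the node list but missing now."""
    return sorted(set(previous_nodes) - set(current_nodes))

def requeue_incomplete(offline_nodes, in_progress, task_queue):
    """Move the unfinished tasks of each offline node back onto the task queue."""
    for node in offline_nodes:
        task_queue.extend(in_progress.pop(node, []))
    return task_queue

previous = {"node-a", "node-b", "node-c"}
current = {"node-a", "node-c"}            # node-b's temporary directory node has disappeared
in_progress = {"node-a": ["task4"], "node-b": ["task2", "task5"]}
task_queue = ["task6"]

offline = detect_offline(previous, current)
requeue_incomplete(offline, in_progress, task_queue)
```

Tasks still being processed by nodes that remain online are left untouched; only the tasks of the vanished node are queued again.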
According to the distributed timing task scheduling method provided by the invention, the temporary directory nodes created under the zk task queue directory node are temporary sequence number directory nodes; and the step in which the thread, after acquiring the lock, takes the task corresponding to one temporary directory node from the zk task queue directory node comprises: after acquiring the lock, the thread takes the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
The temporary directory nodes created under the zk task queue directory node are temporary sequence number directory nodes; that is, they are numbered sequentially in order of creation, and the earlier a node is created, the smaller its number.
After acquiring the lock, the thread takes the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes. For example, if the zk task queue directory node holds tasks from two or more database pulls, the tasks pulled earlier should be executed first. When creating the temporary directory nodes under the zk task queue directory node for the tasks that reach their execution time, the scheduling server may also create them in order of task execution time. In this way, when a thread that has acquired the lock takes a task from the zk task queue directory node, it can take the task with the smallest sequence number first, so that tasks with earlier execution times are executed preferentially.
According to the distributed timing task scheduling method provided by the invention, the temporary directory nodes created under the zk task queue directory node are set to be temporary sequence number directory nodes, and after acquiring the lock a thread takes a task from the zk task queue directory node according to the sequence numbers of those nodes, so that tasks with earlier execution times are executed preferentially.
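Selecting the task with the smallest sequence number can be sketched as follows. The node names and the `task-` prefix are hypothetical; the sketch assumes the zero-padded 10-digit suffix that zookeeper appends to sequential node names.

```python
def sequence_of(node_name):
    """Extract the 10-digit numeric suffix of a sequential node name."""
    return int(node_name[-10:])

def next_task(children):
    """Return the child with the smallest sequence number, or None if there are no children."""
    return min(children, key=sequence_of) if children else None

# Children are returned in no particular order, so they must be compared by sequence number.
children = ["task-0000000012", "task-0000000003", "task-0000000007"]
chosen = next_task(children)
```

Because sequence numbers grow with creation order, the chosen node is the earliest-created task still in the queue.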
According to the distributed timing task scheduling method provided by the invention, the method further comprises: the distributed service node updates the state of the corresponding task in the database after a thread takes the task and after the thread finishes executing the task; and before the corresponding temporary directory nodes are re-created under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the method further comprises: querying the database according to the information of the offline distributed service node to obtain the incomplete tasks.
The distributed service node updates the state of the corresponding task in the database both after a thread takes the task and after the thread finishes executing it. For example, after a thread takes a task, the state of the task in the database is updated to indicate that the task has been taken and is being processed; after the thread executes the task, the state is updated to indicate that the task has been executed or processed.
The distributed service node may run multiple threads for task processing. Before re-creating the corresponding temporary directory nodes under the zk task queue directory node, the scheduling server queries the database according to the information of the offline distributed service node to obtain the incomplete tasks, that is, the tasks that the offline distributed service node marked in the database as being processed but not as executed.
According to the distributed timing task scheduling method provided by the invention, the distributed service node updates the task state in the database, so that when a distributed service node goes offline the scheduling server can readily obtain the information of its incomplete tasks and have those tasks processed again.
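The database lookup for incomplete tasks can be sketched with an in-memory SQLite database. The table layout, column names and status values (`PENDING`, `PROCESSING`, `DONE`) are assumptions made for illustration; the description only requires that tasks marked as being processed but not as executed can be found for a given offline node.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: which node took the task and what state it is in.
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, node TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tasks (id, node, status) VALUES (?, ?, ?)",
    [
        (1, "node-a", "PROCESSING"),   # taken by node-a, not finished
        (2, "node-a", "DONE"),         # taken by node-a and executed
        (3, "node-b", "PROCESSING"),   # belongs to a different node
        (4, None, "PENDING"),          # not yet taken by any node
    ],
)

def incomplete_tasks(conn, offline_node):
    """Ids of tasks the offline node marked as being processed but never marked as executed."""
    rows = conn.execute(
        "SELECT id FROM tasks WHERE node = ? AND status = 'PROCESSING' ORDER BY id",
        (offline_node,),
    )
    return [row[0] for row in rows]

result = incomplete_tasks(conn, "node-a")
```

Only task 1 is returned for `node-a`: task 2 was completed and task 3 is being processed by a node that is still online.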
According to the distributed timing task scheduling method provided by the invention, the queuing of a thread of the distributed service node that is in the idle state to acquire the lock comprises: the thread in the idle state creates a corresponding temporary sequence number directory node under the zk thread queue directory node; and the thread obtains the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node, and acquires the lock if the sequence number of its own temporary sequence number directory node is the smallest.
The zk thread queue directory node may be a persistent directory node of zookeeper. A thread of the distributed service node that is in the idle state creates a corresponding temporary sequence number directory node under the zk thread queue directory node, and the sequence numbers of these nodes increase in order of creation. The thread obtains the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node, and acquires the lock if the sequence number of its own node is the smallest. After acquiring the lock, the thread can access the zk task queue directory node. The thread that has acquired the lock can process a task, and threads without the lock wait, which avoids dirty data caused by operating on the same piece of data at the same time.
After the thread acquires the lock, the connection for its temporary directory node under the zk thread queue directory node is disconnected, and that temporary directory node disappears.
According to the distributed timing task scheduling method provided by the invention, temporary sequence number directory nodes are created under the zk thread queue directory node to control how threads acquire the lock, which improves the reliability of lock control.
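The smallest-sequence-number rule can be sketched with an in-memory queue. `ThreadQueue` and its method names are hypothetical and do not use zookeeper; `leave` models the disappearance of a thread's temporary node, and the exact moment at which that happens in a real deployment is not modelled here.

```python
import itertools

class ThreadQueue:
    """Toy model of the zk thread queue directory node (hypothetical, in-memory only)."""

    def __init__(self):
        self._seq = itertools.count()
        self.children = {}               # sequence number -> thread name

    def enqueue(self, thread_name):
        """Create the thread's sequence-numbered node and return its sequence number."""
        seq = next(self._seq)
        self.children[seq] = thread_name
        return seq

    def holds_lock(self, seq):
        """A thread holds the lock when its node has the smallest sequence number."""
        return seq in self.children and seq == min(self.children)

    def leave(self, seq):
        """Remove the thread's node so that the next node in line becomes the smallest."""
        self.children.pop(seq, None)

q = ThreadQueue()
a = q.enqueue("t1")
b = q.enqueue("t2")
before = (q.holds_lock(a), q.holds_lock(b))   # t1 queued first, so only t1 holds the lock
q.leave(a)
after = q.holds_lock(b)                       # t2 is now first in the queue
```

At most one queued thread satisfies `holds_lock` at any time, which is what serializes access to the task queue.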
According to the distributed timing task scheduling method provided by the invention, after the thread in the idle state of the distributed service node queues and acquires the lock, the method further comprises: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
After a thread of the distributed service node that is in the idle state has queued and acquired the lock, it needs to obtain a task from the zk task queue directory node. If there is no temporary directory node under the zk task queue directory node, no task is currently waiting to be processed; the thread then monitors the state of the zk task queue directory node and obtains the corresponding task once a new temporary directory node is created.
According to the distributed timing task scheduling method provided by the invention, the thread listens to the zk task queue directory node in response to there being no temporary directory node under it, which ensures that the thread obtains tasks reliably and promptly.
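Listening on an empty queue can be sketched with a condition variable. `WatchedQueue` is a hypothetical in-process stand-in: `notify_all` plays the role of the notification a listener would receive from zookeeper, and the timeout is added only so the sketch cannot block forever.

```python
import threading

class WatchedQueue:
    """Toy task queue whose consumers wait until a task appears (hypothetical, in-process only)."""

    def __init__(self):
        self._tasks = []
        self._changed = threading.Condition()

    def put(self, task):
        with self._changed:
            self._tasks.append(task)
            self._changed.notify_all()   # wake the threads listening on the queue

    def take(self, timeout=None):
        with self._changed:
            # wait_for returns at once if a task is already present,
            # and returns False if the timeout expires with the queue still empty.
            if not self._changed.wait_for(lambda: bool(self._tasks), timeout):
                return None
            return self._tasks.pop(0)

q = WatchedQueue()
empty_result = q.take(timeout=0.01)      # nothing queued yet

results = []
consumer = threading.Thread(target=lambda: results.append(q.take(timeout=5)))
consumer.start()                         # the consumer starts listening on the empty queue
q.put("task1")                           # a new task appears and the listener picks it up
consumer.join()
```

The consumer obtains the task whether it starts waiting before or after `put` is called, because the predicate is checked before waiting.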
FIG. 2 is a second flow chart of the distributed timing task scheduling method according to the present invention. As shown in FIG. 2, serviceA is a service provided by the scheduling server, serviceB is a service provided by a distributed service node, and the method includes:
The scheduling server periodically pulls all unexecuted tasks within a period of time from the database and writes the tasks that have reached their execution time into the task_queue node, which is the zk task queue directory node. Writing a task into task_queue means creating a corresponding temporary directory node under it, such as Task1, Task2 and Task3, each corresponding to a different task.
When a distributed service node comes online, it writes itself into the node_list node, which is the zk node list directory node; that is, the distributed service node creates a temporary directory node corresponding to itself under the node_list node.
Each idle thread (a thread in an idle state) of a distributed service node writes itself into the thread_queue node, which is the zk thread queue directory node; that is, each thread creates a temporary sequence number directory node corresponding to itself under the thread_queue node. The thread then obtains the sequence numbers of the other temporary sequence number directory nodes, and if its own sequence number is the smallest, it obtains the lock.
The thread that has acquired the lock goes to the zk task queue directory node (task_queue), takes the task corresponding to one temporary directory node, and releases the lock. If the task queue is empty (no temporary directory node exists under the zk task queue directory node), the thread listens to the zk task queue directory node.
The distributed service node updates the state of the corresponding task in the database after the thread takes the task and again after the task has been processed. The scheduling server monitors the state of the node_list node to learn when a distributed service node goes offline. After a distributed service node goes offline, the scheduling server queries the database for that node's incomplete tasks and rewrites them to the task_queue node so that a thread can take out and process them.
The distributed timing task scheduling method provided by the invention pulls the tasks of a time period in batches, which makes full use of server resources and reduces access pressure on the database, and it adds a re-push mechanism for pushes that fail because of abnormal node service. This addresses the database pressure or task backlog caused by polling the database for single tasks at too high or too low a frequency, as well as task push failures caused by abnormal node service. With this method, the distributed timing task push service makes full use of server resources during the peak periods when many timed tasks are triggered, so that task backlog is minimized while the database is not subjected to excessive access pressure.
The distributed timed task scheduling system provided by the invention is described below, and the distributed timed task scheduling system described below and the distributed timed task scheduling method described above can be referred to correspondingly.
Fig. 3 is a schematic structural diagram of a distributed timing task scheduling system provided by the present invention. As shown in Fig. 3, the system includes: a scheduling server 1, a ZooKeeper server 2 and a distributed service node cluster 3 comprising at least one distributed service node 31, wherein: the scheduling server 1 is configured to periodically pull tasks to be executed within a preset time period from a database, and to create a corresponding temporary directory node under the zk task queue directory node for each task that has reached its execution time; the distributed service node 31 is configured such that a thread in an idle state queues to acquire a lock, takes away the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
According to the distributed timing task scheduling system provided by the invention, the scheduling server pulls tasks to be executed in a preset time period in the future from the database at fixed time, and can set a relatively longer time interval to pull the tasks, so that excessive access to the database is avoided, and the access pressure to the database is reduced; by pulling tasks to be executed within a period of time and creating corresponding temporary directory nodes under zk task queue directory nodes respectively for the tasks reaching the execution time, the task to be executed is stored in the temporary directory nodes of the zookeeper in advance so as to be read and executed by a thread, and the waiting time of the thread when the task data is read is reduced; meanwhile, the thread in the idle state of the distributed service node queues to acquire the lock, and the thread acquires the task corresponding to one temporary directory node from the zk task queue directory node after the lock is acquired, so that the thread is fully utilized, and the resource waste is avoided; therefore, task backlog is avoided, and high availability of distributed server resources is ensured.
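The split between pulling a window of tasks and publishing only those that have reached their execution time can be sketched with in-memory data. This is an illustration under assumptions: the `Task` fields, the function names, and the use of plain numbers for time are hypothetical, and a list stands in for both the database and the zk task queue directory node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    execute_at: float  # seconds, on the same clock as `now`


def pull_window(database, now, window):
    """One database access: fetch tasks due within the coming window."""
    return [t for t in database if now <= t.execute_at < now + window]


def due_tasks(pulled, now):
    """Tasks from the pulled batch that have reached their execution time."""
    due = [t for t in pulled if t.execute_at <= now]
    return sorted(due, key=lambda t: t.execute_at)


database = [Task("t1", 5), Task("t2", 30), Task("t3", 90)]
pulled = pull_window(database, now=0, window=60)  # t3 is left for the next pull

published = []  # stands in for nodes created under the task queue node
for now in (10, 40):
    for task in due_tasks(pulled, now):
        if task.task_id not in published:
            published.append(task.task_id)
```

The database is read once per window, while the check for due tasks runs against the already pulled batch, which is the source of the reduced database pressure described above.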
According to the distributed timing task scheduling system provided by the invention, the distributed service node 31 is further used for creating a corresponding temporary directory node under the zk node list directory node after being on line; the scheduling server 1 is further configured to monitor a state of a temporary directory node created under the zk node list directory node, obtain an incomplete task of the offline distributed service node in response to obtaining the offline information of the distributed service node, and re-create a corresponding temporary directory node under the zk task queue directory node with the incomplete task of the offline distributed service node.
According to the distributed timing task scheduling system provided by the invention, a distributed service node creates a corresponding temporary directory node under the zk node list directory node after coming online, and the scheduling server monitors the state of the temporary directory nodes created under the zk node list directory node. In response to learning that a distributed service node has gone offline, the scheduling server acquires the incomplete tasks of that node and re-creates a corresponding temporary directory node under the zk task queue directory node for each of them, which provides a re-push mechanism for task pushes that fail because of an abnormal distributed node.
According to the distributed timing task scheduling system provided by the invention, the temporary directory node created under the zk task queue directory node is a temporary sequence number directory node; when taking away the task corresponding to one temporary directory node from the zk task queue directory node after the thread acquires the lock, the distributed service node 31 is specifically configured such that: after acquiring the lock, the thread takes away the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
According to the distributed timing task scheduling system provided by the invention, the temporary directory node created under the zk task queue directory node is set as the temporary sequence number directory node, and the task corresponding to the temporary directory node is taken away from the zk task queue directory node according to the sequence number of the temporary sequence number directory node after the thread acquires the lock, so that the task with early execution time can be executed preferentially.
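Picking the task by sequence number can be sketched as below. The sketch assumes node names that end in the 10-digit zero-padded counter that ZooKeeper appends to sequential nodes; the `task-` prefix and the function names are hypothetical.

```python
def sequence_number(node_name):
    """Parse the zero-padded counter that ends a sequential node name."""
    return int(node_name[-10:])


def next_task(children):
    """Return the child with the smallest sequence number, or None if empty."""
    return min(children, key=sequence_number, default=None)


# Children are not guaranteed to be listed in creation order.
children = ["task-0000000012", "task-0000000003", "task-0000000007"]
first = next_task(children)
```

Since the scheduling server creates a node when a task reaches its execution time, the smallest sequence number corresponds to the task that became due first.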
According to the distributed timing task scheduling system provided by the invention, the distributed service node 31 is further configured to update the state of the corresponding task in the database after the thread takes the task and after the task has been executed; before re-creating corresponding temporary directory nodes under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the scheduling server 1 is further configured to query the database according to the information of the offline distributed service node to acquire the incomplete tasks.
According to the distributed timing task scheduling system provided by the invention, the task state in the database is updated by using the distributed service node, so that the scheduling server can conveniently acquire the information of unfinished tasks when the distributed service node is offline, and the task processing is carried out again.
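The re-push path can be sketched with in-memory data. This is an illustration under assumptions rather than the patented implementation: the row fields (`task_id`, `node`, `state`), the state values, and both function names are hypothetical, and a list of dictionaries stands in for the database table.

```python
def find_offline(previous_nodes, current_nodes):
    """Nodes present in the last node list snapshot but missing now."""
    return sorted(set(previous_nodes) - set(current_nodes))


def repush_incomplete(offline_nodes, task_table, task_queue):
    """Re-queue tasks that an offline node took but never finished."""
    for row in task_table:
        if row["node"] in offline_nodes and row["state"] != "done":
            row["node"] = None        # the task no longer belongs to any node
            row["state"] = "pending"
            task_queue.append(row["task_id"])


task_table = [
    {"task_id": "t1", "node": "node-a", "state": "done"},
    {"task_id": "t2", "node": "node-a", "state": "running"},
    {"task_id": "t3", "node": "node-b", "state": "running"},
]
task_queue = []  # stands in for the zk task queue directory node

offline = find_offline(["node-a", "node-b"], ["node-b"])
repush_incomplete(offline, task_table, task_queue)
```

Only the unfinished task of the offline node is queued again; the finished task and the tasks of nodes that are still online are left untouched, which is why the state updates written by the distributed service nodes matter.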
According to the distributed timing task scheduling system provided by the invention, when a thread in an idle state queues to acquire the lock, the distributed service node 31 is specifically configured such that: the thread in the idle state creates a corresponding temporary sequence number directory node under the zk thread queue directory node; and the thread acquires the sequence numbers of the other temporary sequence number directory nodes under the zk thread queue directory node, and acquires the lock if the sequence number of its own temporary sequence number directory node is the smallest.
According to the distributed timing task scheduling system provided by the invention, the corresponding temporary sequence number directory nodes are created under the zk thread queue directory nodes to control the acquisition of the locks by the threads, so that the reliability of the control of the lock functions is improved.
According to the distributed timing task scheduling system provided by the invention, after a thread in an idle state queues and acquires the lock, the distributed service node 31 is further configured such that: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
According to the distributed timing task scheduling system provided by the invention, the thread monitors the zk task queue directory node by responding to the fact that the temporary directory node under the zk task queue directory node is empty, so that the reliability and timeliness of the thread for acquiring tasks are ensured.
Fig. 4 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Fig. 4, the electronic device may include: a processor 410, a communication interface (Communications Interface) 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a distributed timing task scheduling method that includes the steps performed by the scheduling server: periodically pulling tasks to be executed within a preset time period from a database, and creating a corresponding temporary directory node under the zk task queue directory node for each task that has reached its execution time; or the steps performed by the distributed service node: a thread in an idle state queues to acquire a lock, takes away the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program that can be stored on a non-transitory computer readable storage medium. When executed by a processor, the computer program is capable of performing the distributed timing task scheduling method provided above, the method including the steps performed by the scheduling server: periodically pulling tasks to be executed within a preset time period from a database, and creating a corresponding temporary directory node under the zk task queue directory node for each task that has reached its execution time; or the steps performed by the distributed service node: a thread in an idle state queues to acquire a lock, takes away the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the distributed timing task scheduling method provided above, the method including the steps performed by the scheduling server: periodically pulling tasks to be executed within a preset time period from a database, and creating a corresponding temporary directory node under the zk task queue directory node for each task that has reached its execution time; or the steps performed by the distributed service node: a thread in an idle state queues to acquire a lock, takes away the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock, releases the lock, and queues to acquire the lock again after executing the task.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A distributed timing task scheduling method, comprising:
the scheduling server pulls tasks to be executed in a future preset period from the database according to preset time intervals, and creates corresponding temporary directory nodes under zk task queue directory nodes respectively for the tasks reaching the execution time; wherein the future preset period is set according to the preset time interval;
a thread in an idle state of the distributed service node queues to acquire a lock; after acquiring the lock, the thread takes away the task corresponding to one temporary directory node from the zk task queue directory node, releases the lock, and re-queues to acquire the lock after executing the task;
after the thread in the idle state of the distributed service node queues and acquires the lock, the method further comprises: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
2. The distributed timed task scheduling method according to claim 1, characterized in that the method further comprises:
creating a corresponding temporary directory node under the zk node list directory node after the distributed service node is online;
the scheduling server monitors the state of the temporary directory node created under the zk node list directory node, acquires the incomplete task of the offline distributed service node in response to acquiring the offline information of the distributed service node, and re-creates the corresponding temporary directory node under the zk task queue directory node according to the incomplete task of the offline distributed service node.
3. The distributed timed task scheduling method according to claim 1, wherein the temporary directory node created under the zk task queue directory node is a temporary sequence numbered directory node;
the thread taking away the task corresponding to one temporary directory node from the zk task queue directory node after acquiring the lock comprises: after the thread acquires the lock, taking away the task corresponding to one temporary directory node from the zk task queue directory node according to the sequence numbers of the temporary sequence number directory nodes.
4. The distributed timed task scheduling method according to claim 2, characterized in that the method further comprises: the distributed service node updates the state of the corresponding task in the database after the thread takes the task and executes the task;
before the corresponding temporary directory nodes are re-created under the zk task queue directory node for the incomplete tasks of the offline distributed service node, the method further comprises: querying the database according to the information of the offline distributed service node to acquire the incomplete tasks.
5. The method of claim 1, wherein the thread in an idle state of the distributed service node queuing to acquire a lock comprises:
the thread in the idle state of the distributed service node creates a corresponding temporary sequence number directory node under the zk thread queue directory node;
and the thread acquires the serial numbers of other temporary sequence number directory nodes under the zk thread queue directory node, and acquires a lock if the serial number of the temporary sequence number directory node corresponding to the thread is minimum.
6. A distributed timed task scheduling system comprising a scheduling server, a ZooKeeper server, and a distributed service node cluster, the distributed service node cluster comprising at least one distributed service node, wherein:
the scheduling server is used for: pulling tasks to be executed in a future preset period from a database according to preset time intervals at fixed time, and creating corresponding temporary directory nodes under zk task queue directory nodes respectively for the tasks reaching the execution time; wherein the future preset period is set according to the preset time interval;
the distributed service node is configured to: queuing and acquiring a lock by a thread in an idle state, taking a task corresponding to the temporary directory node from the zk task queue directory node after the thread acquires the lock, releasing the lock, and queuing again to acquire the lock after executing the task;
after the thread in the idle state queues and acquires the lock, the distributed service node is further configured such that: in response to there being no temporary directory node under the zk task queue directory node, the thread listens to the zk task queue directory node.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, performs the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method according to any one of claims 1 to 5.
8. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps performed by a scheduling server or a distributed service node in a distributed timed task scheduling method according to any one of claims 1 to 5.
CN202210541507.2A 2022-05-17 2022-05-17 Distributed timing task scheduling method, system, storage medium and program product Active CN115185673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210541507.2A CN115185673B (en) 2022-05-17 2022-05-17 Distributed timing task scheduling method, system, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210541507.2A CN115185673B (en) 2022-05-17 2022-05-17 Distributed timing task scheduling method, system, storage medium and program product

Publications (2)

Publication Number Publication Date
CN115185673A CN115185673A (en) 2022-10-14
CN115185673B true CN115185673B (en) 2023-10-31

Family

ID=83513972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210541507.2A Active CN115185673B (en) 2022-05-17 2022-05-17 Distributed timing task scheduling method, system, storage medium and program product

Country Status (1)

Country Link
CN (1) CN115185673B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118332174B (en) * 2024-06-13 2024-10-29 荣耀终端有限公司 Data crawling method, system and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108132833A (en) * 2017-12-08 2018-06-08 广州视源电子科技股份有限公司 Distributed timing task scheduling method and device based on zookeeper technology
CN111290854A (en) * 2020-01-20 2020-06-16 腾讯科技(深圳)有限公司 Task management method, device and system, computer storage medium and electronic equipment
CN112307105A (en) * 2020-11-03 2021-02-02 平安普惠企业管理有限公司 Timing task running method, device, equipment and storage medium based on multithreading
CN112486695A (en) * 2020-12-07 2021-03-12 浪潮云信息技术股份公司 Distributed lock implementation method under high concurrency service
US20210374152A1 (en) * 2020-06-02 2021-12-02 Illinois Institute Of Technology Label-based data representation i/o process and system

Also Published As

Publication number Publication date
CN115185673A (en) 2022-10-14

Similar Documents

Publication Publication Date Title
CN108388479B (en) Delayed message pushing method and device, computer equipment and storage medium
US8832173B2 (en) System and method of multithreaded processing across multiple servers
US10338958B1 (en) Stream adapter for batch-oriented processing frameworks
US8938421B2 (en) Method and a system for synchronizing data
CN108563502B (en) Task scheduling method and device
US10133797B1 (en) Distributed heterogeneous system for data warehouse management
US20050038772A1 (en) Fast application notification in a clustered computing system
CN113448712A (en) Task scheduling execution method and device
CN110647570B (en) Data processing method and device and electronic equipment
CN110543512B (en) Information synchronization method, device and system
CN115185673B (en) Distributed timing task scheduling method, system, storage medium and program product
CN110619014A (en) ETL-based data extraction method
CN115185787B (en) Method and device for processing transaction log
US8301750B2 (en) Apparatus, system, and method for facilitating communication between an enterprise information system and a client
CN112433830A (en) ZooKeeper-based distributed task scheduling method, system and storage medium
CN111158930A (en) Redis-based high-concurrency time-delay task system and processing method
US20200236165A1 (en) System and method for synchronization of media objects between devices operating in a multiroom system
CN113761052A (en) Database synchronization method and device
US10922145B2 (en) Scheduling software jobs having dependencies
CN115189931A (en) Distributed key management method, device, equipment and storage medium
CN113722390A (en) Data storage method and system
US11836125B1 (en) Scalable database dependency monitoring and visualization system
EP3214549A1 (en) Information processing device, method, and program
EP3198803B1 (en) Message service
CN115543585B (en) Enterprise number card data synchronization method, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant