CN111782360A - Distributed task scheduling method and device

Info

Publication number: CN111782360A
Application number: CN202010594993.5A
Authority: CN
Inventors: 王轶凡; 张楠; 陈灿; 申木川
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2020-06-28
Filing date: 2020-06-28
Publication date: 2020-10-16
Anticipated expiration: 2040-06-28
Also published as: CN111782360B

Abstract

The embodiment of the application provides a distributed task scheduling method and a distributed task scheduling device, wherein the method comprises the following steps: independently accessing a database for storing task configuration information to acquire configuration information of a target task; if the target task is determined not to designate a server, after the target task is determined to meet a preset triggering condition, adding the target task into an asynchronous queue so as to enable one of a plurality of servers in distributed arrangement to acquire the target task from the asynchronous queue in a competitive random access mode; and monitoring the server executing the target task in real time, and controlling the target task to recover the normal running state when the target task is in the abnormal running state. The method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, further can comprehensively and effectively improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of the task scheduling system of an enterprise.

Description

Distributed task scheduling method and device

Technical Field

The application relates to the technical field of data processing, in particular to a distributed task scheduling method and device.

Background

In most enterprises, the task scheduling system is a very important piece of infrastructure, and most existing task scheduling tools are implemented on the basis of a Java thread pool and the time-slice rotation principle.

At present, a task scheduling process often designates one server to execute a plurality of independent tasks in sequence, but the task allocation process, the processing timeliness and abnormal conditions are not effectively controlled. As a result, task acquisition often fails because a plurality of servers acquire tasks simultaneously, a server is overloaded because a single task runs for too long, or an abnormal task execution is not found in time. Any of these cases blocks the task scheduling process (for example, resources cannot be released because the queue is congested), so the task processing efficiency of the whole task scheduling process is low.

Disclosure of Invention

Aiming at the problems in the prior art, the application provides a distributed task scheduling method and device, which can effectively improve the efficiency, reliability and effectiveness of a task allocation process, a task execution process and an exception handling process, further can comprehensively and effectively improve the efficiency of the whole task scheduling process, and can improve the robustness and task processing efficiency of a task scheduling system of an enterprise.

In order to solve the technical problem, the application provides the following technical scheme:

in a first aspect, the present application provides a distributed task scheduling method, including:

separately accessing a database for storing task configuration information to acquire configuration information of a target task from the database;

if it is determined according to the configuration information of the target task that the target task does not designate a server, adding the target task into an asynchronous queue after determining that the target task meets a preset triggering condition, so that one of a plurality of servers in distributed arrangement obtains the target task from the asynchronous queue in a competitive random access mode;

and monitoring a server executing the target task in real time, and controlling the target task to restore to a normal operation state again by applying a preset scheduling mode when the target task is monitored to be in an abnormal operation state.

Further, the individually accessing a database for storing task configuration information to obtain the configuration information of the target task from the database includes:

checking whether the database is in a locked state currently, and if not, applying a database lock to lock the database;

and reading the configuration information of the target task from the database.

Further, the method also includes:

if the check shows that the database is currently in the locked state, checking again, after a preset time interval, whether the database is currently in the locked state.

Further, if it is determined that the target task does not specify a server according to the configuration information of the target task, after it is determined that the target task meets a preset trigger condition, adding the target task to an asynchronous queue, so that one of a plurality of servers in distributed configuration obtains the target task from the asynchronous queue in a competitive random access manner, including:

judging whether the target task designates a server or not from the configuration information of the target task, and if not, determining that the resource occupancy rate of the target task is smaller than an occupancy threshold value;

and judging whether the current task number of the asynchronous queue is equal to or greater than a preset maximum concurrency threshold value, and if not, adding the target task into the asynchronous queue after determining that the target task meets a preset trigger condition according to the configuration information of the target task, so that one of a plurality of servers in distributed arrangement obtains the target task from the asynchronous queue in a competitive random access mode.

Further, the method also includes:

if it is determined that the target task designates a server, determining that the resource occupancy rate of the target task is greater than or equal to an occupancy threshold value;

and acquiring the identifier of the server appointed by the target task from the configuration information of the target task, and adding the target task and the identifier of the corresponding appointed server into an asynchronous queue so that the server appointed by the target task acquires the target task from the asynchronous queue.

Further, after the adding the target task to the asynchronous queue, the method further includes:

modifying the state information of the target task in the asynchronous queue into information for indicating that the task is in a to-be-processed state;

and releasing the database lock of the database.

Further, before the monitoring the server that is executing the target task in real time, the method further includes:

when or after the server acquires the target task in the asynchronous queue, modifying the state information of the target task in the asynchronous queue into information for indicating that the task is in an executing state;

recording the starting time of the server for executing the target task and the identification of the server;

correspondingly, the monitoring the server executing the target task in real time, and when the target task is monitored to be in an abnormal operation state, controlling the target task to resume a normal operation state again by applying a preset scheduling mode, including:

if the time for executing the target task by the server exceeds an execution time threshold according to the starting time for executing the target task by the server and the identifier of the server, determining that the target task is in an abnormal operation state currently;

carrying out interrupt processing on the current execution process of the target task;

and adding the target task into the asynchronous queue again, and/or restarting a server executing the target task.

In a second aspect, the present application provides a distributed task scheduling apparatus, including:

the task acquisition module is used for independently accessing a database for storing task configuration information to acquire the configuration information of the target task from the database;

the distributed task execution module is used for, if it is determined according to the configuration information of the target task that the target task does not designate a server, adding the target task into an asynchronous queue after determining that the target task meets a preset trigger condition, so that one of a plurality of servers in distributed arrangement obtains the target task from the asynchronous queue in a competitive random access mode;

and the exception handling module is used for monitoring the server executing the target task in real time, and controlling the target task to recover the normal running state again by applying a preset scheduling mode when the target task is monitored to be in the abnormal running state.

Further, the task obtaining module comprises:

the locking access unit is used for checking whether the database is currently in a locked state or not, and if not, the database is locked by applying a database lock;

and the information reading unit is used for reading the configuration information of the target task from the database.

Further, the task obtaining module further includes:

and the repeated query unit is used for checking again, after a preset time interval, whether the database is currently in the locked state if the check shows that the database is currently in the locked state.

Further, the distributed task execution module is configured to perform the following:

Further, the distributed task execution module is further configured to perform the following:

Further, the apparatus also includes:

the first task state modification module is used for modifying the state information of the target task in the asynchronous queue into information for representing that the task is in a to-be-processed state;

and the unlocking module is used for releasing the database lock of the database.

Further, the apparatus also includes:

the second task state modification module is used for modifying the state information of the target task in the asynchronous queue into information for representing that the task is in an executing state when or after the server acquires the target task in the asynchronous queue;

the data recording module is used for recording the starting time of the server for executing the target task and the identifier of the server;

correspondingly, the exception handling module is configured to execute the following:

In a third aspect, the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the distributed task scheduling method when executing the program.

In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the distributed task scheduling method.

According to the above technical scheme, the distributed task scheduling method and device provided by the application include: separately accessing a database for storing task configuration information to acquire configuration information of a target task from the database; if it is determined according to the configuration information of the target task that the target task does not designate a server, adding the target task into an asynchronous queue after determining that the target task meets a preset trigger condition, so that one of a plurality of servers in distributed arrangement acquires the target task from the asynchronous queue in a competitive random access mode for execution; and monitoring a server executing the target task in real time, and applying a preset scheduling mode to restore the target task to a normal running state when the target task is monitored to be in an abnormal running state.

Because the database for storing the task configuration information is accessed separately, task allocation failures caused by a plurality of servers simultaneously accessing the database to acquire tasks can be effectively avoided, which improves the reliability and effectiveness of task allocation and, in turn, the efficiency of the task scheduling process.

Because tasks are divided into those executed by a designated server and those acquired competitively by the servers, tasks with a large data volume are executed by the designated server and tasks with a small data volume are executed by an idle server. If a task with a large data volume blocks the queue of the designated server or brings that server down, the other distributed servers can still work normally and the operation of the whole task scheduling process is not affected. Tasks that consume more resources are also not executed on the same server, which reduces the risk of a server going down and improves the efficiency of the task scheduling process.

Because executed tasks are monitored for abnormality, abnormal tasks can be handled in time, which improves task execution efficiency and the reliability of the whole task scheduling process. That is to say, the method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, can thereby improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of an enterprise's task scheduling system.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

Fig. 1 is a flowchart illustrating a distributed task scheduling method in an embodiment of the present application.

Fig. 2 is a first specific flowchart of step 100 in the distributed task scheduling method in the embodiment of the present application.

Fig. 3 is a second specific flowchart of step 100 in the distributed task scheduling method in the embodiment of the present application.

Fig. 4 is a first specific flowchart of step 200 in the distributed task scheduling method in the embodiment of the present application.

Fig. 5 is a second specific flowchart of step 200 in the distributed task scheduling method in the embodiment of the present application.

Fig. 6 is a flowchart illustrating a distributed task scheduling method including steps 010 and 020 in this embodiment.

Fig. 7 is a flowchart illustrating a distributed task scheduling method including steps 030, 040, and 310 through 330 in this embodiment of the present application.

Fig. 8 is a schematic diagram of a first structure of a distributed task scheduling apparatus in an embodiment of the present application.

Fig. 9 is a schematic diagram of a first structure of the task obtaining module 10 in the distributed task scheduling apparatus in the embodiment of the present application.

Fig. 10 is a schematic diagram of a second structure of the task obtaining module 10 in the distributed task scheduling apparatus in the embodiment of the present application.

Fig. 11 is a schematic diagram of a second structure of a distributed task scheduling apparatus in an embodiment of the present application.

Fig. 12 is a schematic diagram of a third structure of a distributed task scheduling apparatus in an embodiment of the present application.

Fig. 13 is a schematic diagram of an implementation of a distributed task scheduling system based on bypass asynchronous queue monitoring according to an application example of the present application.

Fig. 14 is a schematic diagram of a data processing process of a task computing module provided in an application example of the present application.

Fig. 15 is a diagram of a data processing procedure for each atomic task in an asynchronous queue provided by an application example of the present application.

Fig. 16 is a flowchart of task asynchronous execution and bypass monitoring provided by an application example of the present application.

Fig. 17 is a schematic structural diagram of an electronic device in the embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

To address the low task processing efficiency of existing task scheduling processes, embodiments of the present application respectively provide a distributed task scheduling method, a distributed task scheduling apparatus, an electronic device, and a computer-readable storage medium.

A database for storing task configuration information is separately accessed to obtain configuration information of a target task from the database. Separate access effectively avoids task allocation failures caused by a plurality of servers simultaneously accessing the database to obtain tasks, which improves the reliability and effectiveness of task allocation and, in turn, the efficiency of the task scheduling process.

If it is determined according to the configuration information of the target task that the target task does not designate a server, the target task is added into an asynchronous queue after it is determined to meet a preset trigger condition, so that one of a plurality of servers in distributed arrangement acquires the target task from the asynchronous queue in a competitive random access mode. Tasks are thus divided into those executed by a designated server and those acquired competitively by the servers, so that tasks with a large data volume are executed by the designated server and tasks with a small data volume are executed by an idle server. If a task with a large data volume blocks the queue of the designated server or brings that server down, the other servers in distributed arrangement can still work normally and the operation of the whole task scheduling process is not affected. Tasks that consume more resources are also not executed on the same server, which reduces the risk of a server going down and improves the efficiency of the task scheduling process.

The server executing the target task is monitored in real time, and when the target task is monitored to be in an abnormal running state, a preset scheduling mode is applied to restore the target task to a normal running state. Monitoring executed tasks for abnormality allows abnormal tasks to be handled in time, which improves the efficiency of task execution and the reliability of the whole task scheduling process. That is to say, the method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, can thereby improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of an enterprise's task scheduling system.

The following is a description of each of the embodiments.

To avoid the low task scheduling efficiency that results when the task scheduling process is blocked, for example because task acquisition fails when a plurality of servers acquire tasks simultaneously, a server is overloaded because a single task runs for too long, or an abnormal task execution is not found in time, the application provides an embodiment of a distributed task scheduling method. Referring to Fig. 1, the distributed task scheduling method specifically includes the following:

Step 100: a database for storing task configuration information is separately accessed to obtain configuration information of a target task from the database.

In step 100, separate access means that only one server is allowed to access the database during the current time period.

It is to be understood that in one or more embodiments provided herein, the task refers to a basic work element to be performed by a computer in a multiprogramming or multiprocessing environment, which is one or more sequences of instructions processed by a control program, and may specifically refer to a data file. Each task represents a single thread that is executed by a program or a group of programs. The first program executed under each task is the main program, and the others are all auxiliary programs.

Step 200: if it is determined according to the configuration information of the target task that the target task does not designate a server, the target task is added into an asynchronous queue after it is determined to meet a preset trigger condition, so that one of a plurality of servers in distributed arrangement acquires the target task from the asynchronous queue in a competitive random access mode.

It can be understood that the database is mainly used for storing the configuration information of each task. The configuration information includes the identification of the task, the task trigger conditions, the allowed concurrency number, the start and end time of task execution, each completion condition, and the address of the server that is to execute the task. The database also provides a database lock for the server currently accessing it.
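The configuration record described above could be modelled roughly as follows. This is only an illustrative sketch: the class name `TaskConfig`, the field names, and the string encoding of the trigger condition are assumptions made for the example and are not taken from the application.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TaskConfig:
    """Hypothetical shape of one row of task configuration information."""
    task_id: str                                 # identification of the task
    trigger_condition: str                       # e.g. "interval:300" (assumed encoding)
    max_concurrency: int                         # allowed concurrency number
    start_time: Optional[datetime] = None        # start of the execution window
    end_time: Optional[datetime] = None          # end of the execution window
    completion_condition: Optional[str] = None   # completion condition, if any
    server_address: Optional[str] = None         # None means no server is designated

    @property
    def designates_server(self) -> bool:
        # A task "designates a server" when an address is configured.
        return self.server_address is not None
```

With this shape, `TaskConfig("t1", "interval:300", 2).designates_server` is false, so such a task would be a candidate for competitive acquisition.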

The server that acquires the target task from the asynchronous queue then executes the target task.

In step 200, whether the target task meets the preset trigger condition is determined as follows. The task configuration in the configuration information of the target task is read to obtain the task number, the last completion time, the trigger condition and the maximum concurrency number. The task number is used to query the number of tasks currently executing in the asynchronous queue; if that number is larger than the maximum concurrency number, no further judgment is made; otherwise, the trigger condition is checked. According to the trigger condition of the task (for example interval type, time-point type, multiple-times-per-day type or designated-date type), the current time, the trigger time and the last successful completion time are compared to judge whether the task needs to run; if so, the task is fetched and put into the asynchronous queue to wait for processing.
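The trigger check described above might be sketched as follows. The function name `should_enqueue`, its parameters, and the exact rules for the two trigger types shown (interval and daily time point) are assumptions for illustration; the application does not specify how each trigger type is evaluated, and the other types it mentions are omitted here.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def should_enqueue(running_count: int,
                   max_concurrency: int,
                   trigger_type: str,
                   now: datetime,
                   last_completed: Optional[datetime],
                   interval: Optional[timedelta] = None,
                   time_point: Optional[time] = None) -> bool:
    """Return True if the task should be put into the asynchronous queue."""
    # Concurrency is checked first: more running instances than allowed
    # means the trigger condition is not evaluated at all.
    if running_count > max_concurrency:
        return False

    if trigger_type == "interval":
        # Assumed rule: run when at least `interval` has passed since the
        # last successful completion (or the task has never completed).
        return last_completed is None or now - last_completed >= interval

    if trigger_type == "time_point":
        # Assumed rule: run once per day after `time_point`, unless the task
        # already completed after today's trigger time.
        due = datetime.combine(now.date(), time_point)
        return now >= due and (last_completed is None or last_completed < due)

    raise ValueError(f"unsupported trigger type: {trigger_type}")
```

For example, an interval task with a five-minute interval that last completed ten minutes ago would be enqueued, while the same task that completed one minute ago would not.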

Random access refers to the procedure that starts when a user sends a random access preamble to try to access a network and lasts until a basic signaling connection is established between the user and the network. Random access is classified into competitive (contention-based) random access and non-competitive (non-contention-based) random access. Competitive random access refers to a random access procedure initiated by a server randomly selecting a preamble. In the competitive random access mode, the asynchronous queue is, for the servers, a resource pool to select from, and different servers can select the same resource, which results in resource contention. In the non-competitive random access mode, by contrast, a specific resource is reserved and allocated to a server at a certain time.
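Competitive acquisition from a shared queue can be illustrated with a small in-process sketch. Here threads stand in for the distributed servers and Python's `queue.Queue` stands in for the asynchronous queue; both are assumptions for the example, as is the function name `run_competing_servers`. The preamble mechanism mentioned above is not modelled.

```python
import queue
import threading


def run_competing_servers(task_ids, server_names):
    """Let several 'servers' compete for tasks; return {task_id: [servers that got it]}."""
    async_queue = queue.Queue()
    for task_id in task_ids:
        async_queue.put(task_id)

    claimed = {}
    claimed_lock = threading.Lock()

    def server_loop(name):
        while True:
            try:
                # Whichever server calls get_nowait() first wins the task;
                # Queue hands each item to exactly one consumer.
                task_id = async_queue.get_nowait()
            except queue.Empty:
                return
            with claimed_lock:
                claimed.setdefault(task_id, []).append(name)

    threads = [threading.Thread(target=server_loop, args=(n,)) for n in server_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed
```

Which server obtains a given task is not predictable, but every task ends up claimed by exactly one server, which is the property the competitive mode relies on.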

Step 300: a server executing the target task is monitored in real time, and when the target task is monitored to be in an abnormal running state, a preset scheduling mode is applied to restore the target task to a normal running state.

As can be seen from the above description, the distributed task scheduling method provided in the embodiment of the present application, by independently accessing the database for storing task configuration information, can effectively avoid the problem of task allocation failure caused by a plurality of servers accessing the database to obtain tasks at the same time, can effectively improve the reliability and effectiveness of task allocation, and can further effectively improve the efficiency of the task scheduling process; the tasks are divided into two types of tasks which can be executed by a designated server and can be competitively acquired by the servers, so that the tasks with large data volume can be effectively executed by the designated server, the tasks with small data volume are executed by an idle server, if the tasks with large data volume cause the queue of the designated server to be blocked or crashed, other distributed servers can still normally work, the operation of the whole task scheduling process cannot be influenced, the tasks with larger resource consumption cannot be executed on the same server, the risk of crash of the server can be effectively reduced, and the efficiency of the task scheduling process can be effectively improved; by monitoring the abnormality of the executed tasks, the abnormal tasks can be processed in time, and the task execution efficiency and the reliability of the whole task scheduling process are improved; that is to say, the method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, further can comprehensively and effectively improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of the enterprise task scheduling system.
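As an illustration of step 300, the following sketch treats a task as abnormal when its execution time exceeds a threshold, interrupts it and puts it back into the queue. The names `RunningTask` and `handle_overdue`, the use of plain numeric timestamps, and the callback-based interrupt and requeue hooks are assumptions for the example; restarting the server is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunningTask:
    task_id: str
    server_id: str      # identifier of the server executing the task
    started_at: float   # recorded start time, in seconds


def handle_overdue(running: List[RunningTask],
                   now: float,
                   threshold_seconds: float,
                   interrupt: Callable[[RunningTask], None],
                   requeue: Callable[[str], None]) -> List[str]:
    """Interrupt and requeue every task that has run longer than the threshold."""
    handled = []
    for task in running:
        # A task is considered abnormal once its execution time exceeds the threshold.
        if now - task.started_at > threshold_seconds:
            interrupt(task)          # stop the current execution
            requeue(task.task_id)    # add the task to the asynchronous queue again
            handled.append(task.task_id)
    return handled
```

A monitor would call `handle_overdue` periodically with the current time; tasks still within the threshold are left untouched.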

In order to lock and access the database when it is not occupied during separate access, in an embodiment of the distributed task scheduling method provided by the present application, referring to Fig. 2, step 100 in the distributed task scheduling method specifically includes the following:

Step 110: check whether the database is currently in a locked state, and if not, go to step 120.

Step 120: lock the database by applying a database lock.

Step 130: read the configuration information of the target task from the database.

As can be seen from the above description, the distributed task scheduling method provided in the embodiment of the present application can further avoid the problem of task allocation failure caused by multiple servers accessing the database to obtain tasks at the same time, and can effectively improve reliability and effectiveness of task allocation, thereby effectively improving efficiency of the task scheduling process.

In order to query again when the database is occupied during separate access, in an embodiment of the distributed task scheduling method provided by the present application, referring to Fig. 3, the following is further included after step 110 in the distributed task scheduling method:

If the check in step 110 shows that the database is currently in the locked state, step 140 is performed.

Step 140: after a preset time interval, check again whether the database is currently in the locked state.

As can be seen from the above description, the distributed task scheduling method provided in the embodiments of the present application provides a subsequent processing mode when the access to the database fails, and can effectively ensure the reliability and validity of the access to the database, and further can further improve the reliability and validity of task allocation, and further can effectively improve the efficiency of the task scheduling process.
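Steps 110 to 140 can be sketched as a non-blocking lock attempt with a fixed retry interval. In this sketch a `threading.Lock` stands in for the database lock, and the function name `read_config_exclusively`, the `max_attempts` cut-off and the `read_config` callback are assumptions for illustration. The check and the lock are merged into one atomic call here, since checking and locking in two separate operations could let two servers both see the database as unlocked.

```python
import threading
import time
from typing import Any, Callable, Optional


def read_config_exclusively(db_lock: threading.Lock,
                            read_config: Callable[[], Any],
                            retry_interval: float = 0.05,
                            max_attempts: Optional[int] = None) -> Any:
    """Acquire the lock (retrying while it is held) and read the task configuration.

    On success the lock is left held; the caller releases it after the task
    has been put into the queue. Returns None if max_attempts is exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        # Steps 110 and 120: check and lock in a single non-blocking call.
        if db_lock.acquire(blocking=False):
            try:
                return read_config()      # Step 130: read the configuration
            except Exception:
                db_lock.release()         # do not leave the lock held on failure
                raise
        if max_attempts is not None and attempts >= max_attempts:
            return None
        time.sleep(retry_interval)        # Step 140: wait, then check again
```

A real deployment would use a lock that is visible to all servers, such as a lock record in the database itself; the in-process lock only demonstrates the control flow.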

In order to add a small task that does not designate a server into a queue for random acquisition, in an embodiment of the distributed task scheduling method provided by the present application, referring to Fig. 4, step 200 in the distributed task scheduling method specifically includes the following:

Step 210: judge from the configuration information of the target task whether the target task designates a server, and if not, determine that the resource occupancy rate of the target task is smaller than an occupancy threshold value.

Step 220: judge whether the current task number of the asynchronous queue is equal to or greater than a preset maximum concurrency threshold, and if not, execute step 230.

Step 230: after determining that the target task meets a preset trigger condition according to the configuration information of the target task, add the target task into an asynchronous queue so that one of a plurality of servers in distributed arrangement acquires the target task from the asynchronous queue in a competitive random access mode.

As can be seen from the above description, the distributed task scheduling method provided in the embodiment of the present application can enable a task with a small data size to be executed by an idle server, and if a task with a large data size causes a queue of a designated server to be blocked or down, other servers in distributed settings can still work normally, and the operation of the whole task scheduling process is not affected.

In order to add a large task that designates a server into a queue for the designated server to obtain, in an embodiment of the distributed task scheduling method provided by the present application, referring to Fig. 5, the following is further included after step 220 in the distributed task scheduling method:

If it is determined in step 220 that the target task designates a server, it is determined that the resource occupancy rate of the target task is greater than or equal to the occupancy threshold, and step 240 is executed.

Step 240: acquire the identifier of the server designated by the target task from the configuration information of the target task, and add the target task together with the identifier of the designated server into an asynchronous queue so that the server designated by the target task acquires the target task from the asynchronous queue.

As can be seen from the above description, the distributed task scheduling method provided in the embodiment of the present application can enable a task with a large data size to be executed by a specified server, and if the task with the large data size causes a queue of the specified server to be blocked or down, other servers in distributed settings can still work normally, and the operation of the whole task scheduling process is not affected.
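The two routing paths (tasks open to competition and tasks bound to a designated server) could share one queue as in the sketch below. The class `AsyncQueue`, its methods, and the decision to apply the concurrency limit to both kinds of task are assumptions for the example. The sketch is single-threaded and leaves out the trigger-condition check.

```python
from typing import List, Optional, Tuple


class AsyncQueue:
    """Hypothetical queue whose entries optionally carry a designated server identifier."""

    def __init__(self, max_concurrency: int) -> None:
        self.max_concurrency = max_concurrency
        self._entries: List[Tuple[str, Optional[str]]] = []

    def try_add(self, task_id: str, designated_server: Optional[str] = None) -> bool:
        # Refuse the task when the queue already holds max_concurrency or more tasks.
        if len(self._entries) >= self.max_concurrency:
            return False
        self._entries.append((task_id, designated_server))
        return True

    def take_for(self, server_id: str) -> Optional[str]:
        # A server may take any undesignated task, or a task designated for itself.
        for i, (task_id, designated) in enumerate(self._entries):
            if designated is None or designated == server_id:
                self._entries.pop(i)
                return task_id
        return None
```

In this arrangement a large task tagged for `srv-1` is skipped by every other server, so those servers keep draining the small, undesignated tasks even if `srv-1` is slow or down.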

In order to unlock the database, in an embodiment of the distributed task scheduling method provided by the present application, referring to fig. 6, after step 200 and before step 300 in the distributed task scheduling method, the following contents are further included:

Step 010: modifying the state information of the target task in the asynchronous queue into information indicating that the task is in a to-be-processed state.

Step 020: releasing the database lock of the database.

As can be seen from the above description, the distributed task scheduling method provided in the embodiment of the present application can effectively unlock the database in time, so that other servers can access the database in time, and thus reliability and validity of accessing the database can be effectively ensured, and reliability and validity of task allocation can be further improved.
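Steps 010 and 020 can be sketched in Python as below. The sketch is an illustration under assumptions: a `threading.Lock` stands in for the database lock, the dictionary `task_log` stands in for the task status record, and the function acquires the lock itself only so that the example is self-contained (in the method, the lock has already been applied before the configuration is read).

```python
import threading

db_lock = threading.Lock()  # stand-in for the database lock
task_log = {}               # stand-in for the task status record


def publish_pending(task_ids):
    """Mark queued tasks as pending, then release the lock even if marking fails."""
    if not db_lock.acquire(blocking=False):
        return False  # another server currently holds the lock
    try:
        for task_id in task_ids:
            task_log[task_id] = "pending"  # step 010
    finally:
        db_lock.release()                  # step 020
    return True
```

Releasing in a `finally` clause is a design choice of the sketch: it keeps a failure during step 010 from leaving the database locked against the other servers.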

In order to process an abnormal task, in an embodiment of the distributed task scheduling method provided by the present application, referring to fig. 7, after step 020 and before step 300 in the distributed task scheduling method, the following contents are further specifically included:

Step 030: when or after the server acquires the target task from the asynchronous queue, modifying the state information of the target task in the asynchronous queue into information indicating that the task is in an executing state.

Step 040: recording the start time at which the server executes the target task and the identifier of the server.

Correspondingly, the step 300 specifically includes the following steps:

Step 310: if it is determined, according to the recorded start time at which the server executes the target task and the identifier of the server, that the time for which the server has been executing the target task exceeds an execution time threshold, determining that the target task is currently in an abnormal operation state.

Step 320: interrupting the current execution process of the target task.

Step 330: adding the target task to the asynchronous queue again, and/or restarting the server executing the target task.

As can be seen from the above description, the distributed task scheduling method provided in the embodiments of the present application can process an abnormal task in time, and improve the efficiency of task execution and the reliability of the entire task scheduling process.
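The timeout check of steps 310 to 330 can be sketched from the records kept in step 040. The following Python fragment is illustrative only: times are plain numbers of seconds, the record keys `start_time` and `server_id` are hypothetical, and the interruption of step 320 is represented simply by dropping the record of the stuck run.

```python
def find_overdue(running, now, threshold_seconds):
    """Step 310: return the ids of tasks whose run time exceeds the threshold."""
    return [task_id for task_id, rec in running.items()
            if now - rec["start_time"] > threshold_seconds]


def recover(running, pending_queue, now, threshold_seconds):
    """Steps 320-330: stop tracking overdue runs and queue the tasks again."""
    restarted = []
    for task_id in find_overdue(running, now, threshold_seconds):
        record = running.pop(task_id)   # step 320 (stand-in for the interruption)
        pending_queue.append(task_id)   # step 330: add the task to the queue again
        restarted.append((task_id, record["server_id"]))
    return restarted
```

The returned server identifiers are what a caller would need in order to restart the affected server, the other option named in step 330.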

In terms of software, the task scheduling process may stall, and task scheduling efficiency may drop, when task acquisition fails because a plurality of servers acquire tasks simultaneously, when a server is overloaded because a single task runs for too long, or when abnormal task execution cannot be found in time. In order to avoid these problems, the present application provides an embodiment of a distributed task scheduling device for executing all or part of the contents of the distributed task scheduling method. Referring to fig. 8, the distributed task scheduling device specifically includes the following contents:

The task obtaining module 10 is configured to independently access a database for storing task configuration information, so as to obtain the configuration information of the target task from the database.

The distributed task execution module 20 is configured to, if it is determined according to the configuration information of the target task that the target task does not designate a server, add the target task to an asynchronous queue after determining that the target task meets a preset trigger condition, so that one of a plurality of servers in a distributed arrangement acquires the target task from the asynchronous queue by competitive random access.

The exception handling module 30 is configured to monitor the server executing the target task in real time and, when it is detected that the target task is in an abnormal operation state, control the target task to return to a normal operation state by using a preset scheduling manner.

As can be seen from the above description, the distributed task scheduling device provided in the embodiment of the present application, by independently accessing the database for storing task configuration information, can effectively avoid the problem of task allocation failure caused by a plurality of servers accessing the database to obtain tasks at the same time, can effectively improve the reliability and effectiveness of task allocation, and can further effectively improve the efficiency of the task scheduling process; the tasks are divided into two types of tasks which can be executed by a designated server and can be competitively acquired by the servers, so that the tasks with large data volume can be effectively executed by the designated server, the tasks with small data volume are executed by an idle server, if the tasks with large data volume cause the queue of the designated server to be blocked or crashed, other distributed servers can still normally work, the operation of the whole task scheduling process cannot be influenced, the tasks with larger resource consumption cannot be executed on the same server, the risk of crash of the server can be effectively reduced, and the efficiency of the task scheduling process can be effectively improved; by monitoring the abnormality of the executed tasks, the abnormal tasks can be processed in time, and the task execution efficiency and the reliability of the whole task scheduling process are improved; that is to say, the method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, further can comprehensively and effectively improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of the enterprise task scheduling system.

In order to implement the locking and accessing process when the database is not occupied in the individual access, in an embodiment of the distributed task scheduling apparatus provided in the present application, referring to fig. 9, a task obtaining module 10 in the distributed task scheduling apparatus specifically includes the following contents:

The locking access unit 11 is configured to check whether the database is currently in a locked state and, if not, apply a database lock to lock the database.

The information reading unit 12 is configured to read the configuration information of the target task from the database.

As can be seen from the above description, the distributed task scheduling device provided in the embodiment of the present application can further avoid the problem of task allocation failure caused by a plurality of servers accessing the database to obtain tasks at the same time, and can effectively improve reliability and effectiveness of task allocation, thereby effectively improving efficiency of the task scheduling process.

In order to implement the re-query process when the database is occupied in the individual access, in an embodiment of the distributed task scheduling device provided in the present application, referring to fig. 10, a task obtaining module 10 in the distributed task scheduling device further includes the following contents:

The repeated query unit 13 is configured to, if the check shows that the database is currently in the locked state, check again after a preset time interval whether the database is in the locked state.

As can be seen from the above description, the distributed task scheduling device provided in the embodiment of the present application provides a subsequent processing mode when the access to the database fails, and can effectively ensure the reliability and validity of the access to the database, and further can further improve the reliability and validity of task allocation, and further can effectively improve the efficiency of the task scheduling process.
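The behaviour of the locking access unit 11 and the repeated query unit 13 can be sketched as a polling loop. This Python fragment is a hedged illustration: a `threading.Lock` stands in for the database lock, and the `max_attempts` bound is an assumption added so that the example terminates (the description above does not state a retry limit).

```python
import threading
import time


def acquire_with_retry(lock, interval_seconds, max_attempts):
    """Check the lock; if it is held, wait for the preset interval and check again.

    Returns the number of the attempt that obtained the lock, or None if
    every attempt found the lock held.
    """
    for attempt in range(1, max_attempts + 1):
        if lock.acquire(blocking=False):  # lock was idle: it is now held by the caller
            return attempt
        time.sleep(interval_seconds)      # lock was held: wait before the next check
    return None
```

A caller that receives an attempt number holds the lock and is responsible for releasing it once the configuration information has been read and the tasks have been queued.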

In order to add a small task to a queue of an unspecified server to be randomly acquired, in an embodiment of the distributed task scheduling apparatus provided in the present application, a distributed task execution module 20 in the distributed task scheduling apparatus is specifically configured to execute the following:

As can be seen from the above description, the distributed task scheduling apparatus provided in the embodiment of the present application enables a task with a small data size to be executed by an idle server; and if a task with a large data size blocks the queue of its designated server or brings that server down, the other distributed servers can still work normally, so the operation of the whole task scheduling process is not affected.

In order to add a large task of a designated server into a queue for the designated server to obtain, in an embodiment of the distributed task scheduling apparatus provided in the present application, the distributed task execution module 20 in the distributed task scheduling apparatus is further specifically configured to execute the following:

As can be seen from the above description, the distributed task scheduling apparatus provided in the embodiment of the present application enables a task with a large data size to be executed by its designated server; and if that task blocks the queue of the designated server or brings that server down, the other distributed servers can still work normally, so the operation of the whole task scheduling process is not affected.

In order to unlock the database, in an embodiment of the distributed task scheduling device provided in the present application, referring to fig. 11, the following is further specifically included in the distributed task scheduling device:

The first task state modification module 01 is configured to modify the state information of the target task in the asynchronous queue into information indicating that the task is in a to-be-processed state.

The unlocking module 02 is configured to release the database lock of the database.

As can be seen from the above description, the distributed task scheduling device provided in the embodiment of the present application can effectively unlock the database in time, so that other servers can access the database in time, and thus reliability and validity of accessing the database can be effectively ensured, and reliability and validity of task allocation can be further improved.

In order to process an abnormal task, in an embodiment of the distributed task scheduling device provided in the present application, referring to fig. 12, the following is further specifically included in the distributed task scheduling device:

The second task state modification module 03 is configured to modify, when or after the server acquires the target task from the asynchronous queue, the state information of the target task in the asynchronous queue into information indicating that the task is in an executing state.

The data recording module 04 is configured to record the start time at which the server executes the target task and the identifier of the server.

Correspondingly, the exception handling module 30 is specifically configured to execute the following:

As can be seen from the above description, the distributed task scheduling apparatus provided in the embodiments of the present application can process an abnormal task in time, and improve the efficiency of task execution and the reliability of the whole task scheduling process.

In order to further explain the scheme, the application also provides a specific application example in which a distributed task scheduling system implements the distributed task scheduling method. The application example is based on a time-slice-rotation task scheduling principle and an asynchronous task queue execution mechanism, comprehensively considers factors such as task period, expected execution time and priority, and provides a method for task extraction, distributed asynchronous execution and bypass-queue asynchronous monitoring. While guaranteeing normal task scheduling, the method monitors the execution state of tasks in real time, avoids the situations in which the task queue is interrupted or resources cannot be released because a task runs for too long, interrupts abnormal tasks in time and raises an early warning, handles them automatically or through manual intervention, and recycles system resources.

Referring to fig. 13, the distributed task scheduling system based on bypass asynchronous queue monitoring includes a database module, a task computing module, and an asynchronous execution and monitoring module. The database module mainly stores the data source, the task configuration table and the parameter table of the task scheduling system, and provides a database lock; the task configuration table includes the trigger condition of each task, the permitted concurrency number, the start and end time of task execution, the completion status of each run, and the address of the server required to execute the task. The task computing module is used for extracting tasks, establishing the task queue, and performing concurrency control and exception handling: it obtains the tasks to be processed by cyclically traversing the database configuration table and adds them to the asynchronous queue, forming tasks 1 to n in the asynchronous queue, where n is a positive integer greater than 2. A plurality of servers (machines 1 to N, where N is a positive integer greater than 2) compete to acquire tasks from the task queue. The asynchronous execution and monitoring module is used for asynchronously executing the tasks, monitoring their execution, updating the task parameter table and recording the execution result; abnormal tasks are handled automatically or through manual intervention and rejoin the task queue to wait for execution.

Referring to fig. 14, the data processing process of the task computing module specifically includes the following contents:

Step 201: acquiring the database lock; if the database lock is idle, locking it and starting to calculate tasks. That is: the database lock is read from the database; if it is already held, another server is calculating the tasks to be executed, and the lock is checked again after three seconds, which guarantees that only one server calculates tasks at any given time. If the lock is currently idle, the database is locked and execution continues.

Step 202: reading the executing tasks in the asynchronous queue, and judging whether the maximum concurrency number allowed for the task is exceeded. That is: the task configuration is read from the database to obtain the task number, the last completion time, the trigger condition and the maximum concurrency number. The number of instances of the task currently executing in the asynchronous queue is queried by task number; if this number is larger than the maximum concurrency number, no further judgment is made; otherwise, whether the trigger condition is met is judged next.

Step 203: acquiring the configuration of each task from the database, and comparing the trigger condition, the last completion time and the current time; tasks that meet the condition are grabbed and put into the asynchronous queue. That is: according to the trigger condition of the task (such as interval type, time-point type, multiple-times-per-day type, specified-date type and the like), the current time, the trigger time and the last successful completion time are compared to judge whether the task needs to run; if so, the task is grabbed and put into the asynchronous queue to wait for processing.

Step 204: releasing the database lock, recording the execution log, and entering the next cycle. That is: after the calculation is completed, the tasks to be processed are recorded in the task processing list log table with the status set to "pending". The database lock is then released, ready for the next round of calculation.
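The judgments made in steps 202 and 203 can be sketched in Python as follows. This is an illustration under assumptions, not the system's code: only the interval type and the time-point type of trigger are modelled, the configuration keys (`trigger_type`, `interval`, `time_of_day`, `last_completed`, `max_concurrency`) are hypothetical, and the sketch treats a task as ineligible once its running count reaches the limit (the text says "larger than"; `>=` is chosen here so that the limit is never exceeded).

```python
from datetime import datetime, timedelta


def is_due(config, now):
    """Step 203: compare the trigger condition, last completion time and current time."""
    last_done = config["last_completed"]
    if config["trigger_type"] == "interval":
        # Due if the task never ran, or the interval has elapsed since the last run.
        return last_done is None or now - last_done >= config["interval"]
    if config["trigger_type"] == "time_point":
        # Due if today's trigger time has passed and the task has not run since then.
        trigger_at = datetime.combine(now.date(), config["time_of_day"])
        return now >= trigger_at and (last_done is None or last_done < trigger_at)
    raise ValueError("unsupported trigger type: " + config["trigger_type"])


def select_tasks(configs, running_counts, now):
    """Steps 202-203: skip tasks at their concurrency limit, keep the ones that are due."""
    selected = []
    for config in configs:
        if running_counts.get(config["task_id"], 0) >= config["max_concurrency"]:
            continue  # step 202: concurrency limit reached, no further judgment
        if is_due(config, now):
            selected.append(config["task_id"])
    return selected
```

The other trigger types named in step 203 (multiple times per day, specified date) would add further branches to `is_due` following the same pattern.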

Referring to fig. 15, the data processing process of each atomic task in the asynchronous queue specifically includes the following contents:

Step 301: the Beforecall preparation phase, in which the task state is set to executing. That is: in the preparation stage before task execution, the task state value in the task processing list log table is set to executing, and the start time and the IP of the executing server are recorded.

Step 302: the Call execution phase, in which the task is executed, the execution result is acquired, and the execution time is monitored. That is: the task content is executed and the return result is awaited; abnormal conditions during execution (such as an error reported inside the task, an overlong execution time and the like) are captured, and the execution result is passed to step 303.

Step 303: the Aftercall completion phase, in which the completion time and the task execution status are recorded. That is: the completion status of the task is recorded, including the completion time, the execution result (success or failure) and error information.
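The three phases of steps 301 to 303 can be sketched as a wrapper around the task body. The Python fragment below is illustrative only: the dictionary `log` stands in for the task processing list log table, and the function name `run_atomic_task` and the record keys are assumptions introduced here.

```python
import time


def run_atomic_task(task_id, func, server_ip, log):
    """Run one task through the Beforecall, Call and Aftercall phases."""
    # Beforecall: mark the task as executing, record start time and server IP.
    log[task_id] = {"status": "executing", "start": time.time(), "server_ip": server_ip}
    try:
        # Call: execute the task content and wait for its result.
        result = func()
        outcome = {"status": "success", "result": result, "error": None}
    except Exception as exc:
        # An error raised inside the task is captured rather than propagated.
        outcome = {"status": "failure", "result": None, "error": str(exc)}
    # Aftercall: record completion time, execution result and error information.
    log[task_id].update(outcome, end=time.time())
    return log[task_id]
```

Because the Aftercall phase runs for both outcomes, every task that enters the executing state ends with a completion record, which is what the later monitoring relies on.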

Referring to fig. 16, the task asynchronous execution and bypass monitoring service flow specifically includes the following contents:

Step 401: acquiring the tasks to be executed from the task queue. That is: the servers compete to acquire tasks from the task queue, and database locking is used when a task is acquired, so that the same task is not acquired by several servers and executed repeatedly.

Step 402: submitting the task to a thread pool for execution. That is: the acquired task is encapsulated as a Callable and submitted to the thread pool for execution; if the number of threads in the thread pool is greater than a preset threshold (5 in the system), the task queues for execution, so as to avoid the server being blocked by tasks because of excessive resource occupation.

Step 403: presetting the task timeout and asynchronously waiting for the processing result. That is: a timeout is preset, the task is executed asynchronously and the execution of its thread is monitored through asynchronous bypass processing, while the flow returns to step 401 to start the next cycle and grab the next task; this avoids waiting for the execution result, which would reduce task-grabbing efficiency and make the task queue too long.

Step 404: executing asynchronously, waiting for the processing result, raising an early warning for abnormal conditions, and repairing manually or automatically. That is: the asynchronous execution result (Future) of the task is acquired; if the execution time of the task exceeds the preset time (24 hours in the system), the thread is automatically interrupted, memory resources are released, the thread pool slot is freed, and a timeout error is thrown to step 303 and recorded in the task processing log; if the task is interrupted because of an internal exception, the internal error is likewise reported and recorded. The system judges from the parameter table whether the task can be executed repeatedly; if so, the failed task is automatically added to the thread queue to wait for re-execution. If errors of the same category reach a certain threshold, or the task cannot be executed repeatedly, an early warning is issued for manual intervention; the system provides a manual interface through which the task can be added to the thread queue for re-execution after the error has been resolved.
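Steps 402 to 404 can be sketched with Python's `concurrent.futures`, used here as a stand-in for the Callable, Future and thread pool named above. The sketch rests on stated assumptions: the function names, the status strings and the `alert_threshold` parameter are hypothetical, the wait is written as a blocking call for brevity, and a running Python thread cannot be interrupted the way step 404 describes, so a timeout here only stops waiting.

```python
import concurrent.futures
import time

MAX_WORKERS = 5  # pool size from step 402; extra tasks wait in the pool's own queue


def run_with_timeout(pool, func, timeout_seconds):
    """Submit func to the pool and wait at most timeout_seconds for its result."""
    future = pool.submit(func)
    try:
        return ("success", future.result(timeout=timeout_seconds))
    except concurrent.futures.TimeoutError:
        # cancel() only stops a task that has not started yet; a task that is
        # already running keeps its thread until it finishes on its own.
        future.cancel()
        return ("timeout", None)
    except Exception as exc:
        # Internal error raised by the task itself.
        return ("failure", str(exc))


def next_action(status, repeatable, same_error_count, alert_threshold):
    """Step 404: decide between finishing, automatic re-execution and an early warning."""
    if status == "success":
        return "done"
    if repeatable and same_error_count < alert_threshold:
        return "requeue"  # failed task goes back to the queue automatically
    return "alert"        # not repeatable, or too many errors: manual intervention
```

A pool would be created once per server, for example with `concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS)`, and the status returned by `run_with_timeout` would be passed to `next_action` together with the repeatability flag read from the parameter table.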

In a specific example, the application example is mainly used for executing asynchronous tasks such as data file loading and asynchronous file generation. It adopts a distributed technology in which a plurality of servers compete for the database lock to acquire a task that needs asynchronous execution and add it to the thread queue. For each task, the server that executes it can be flexibly configured through the parameter table. For example, in production, four servers (machine 1, machine 2, machine 3 and machine 4) compete to obtain tasks in the configuration table for execution; task A loads a large number of data files every day, with the risk that the server runs for a long time or goes down, and task B needs to generate a large number of service files to send out every day. The configuration table can specify that task A can only be acquired by machine 1 and task B can only be acquired by machine 2. Other tasks that consume fewer resources are then acquired randomly, by competition, by machine 1, machine 2, machine 3 and machine 4. If task A blocks the queue of machine 1 or brings it down, the other three machines can still work normally, and the operation of the whole task scheduling system is not affected. Task A and task B, which consume more resources, are not executed on the same server, which reduces the risk of server downtime. If task A causes machine 1 to raise an interruption or timeout early warning, the fault can be removed by reloading the task or restarting the server, and normal operation continues. The system adopts a distributed task scheduling mechanism and is therefore safer and more reliable.

The application example is based on a time-slice-rotation task scheduling principle and an asynchronous task queue execution mechanism, and designs a distributed task scheduling system with a task scheduling tool and bypass asynchronous queue monitoring. The system makes full use of server resources for task calculation and scheduling, asynchronously monitors task execution, and handles abnormal tasks in time. It thereby solves the problems that resources cannot be released and that the scheduling tool runs inefficiently when a task on a single server becomes abnormal during task scheduling, and improves the robustness and task processing efficiency of the whole system.

In terms of hardware, the task scheduling process may stall, and task scheduling efficiency may drop, when task acquisition fails because a plurality of servers acquire tasks simultaneously, when a server is overloaded because a single task runs for too long, or when abnormal task execution cannot be found in time. In order to avoid these problems, the present application provides an embodiment of an electronic device for implementing all or part of the contents of the distributed task scheduling method, where the electronic device specifically includes the following contents:

Fig. 17 is a schematic block diagram of the system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 17, the electronic device 9600 can include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, fig. 17 is exemplary; other types of structures may also be used in addition to or in place of this structure to implement telecommunications or other functions.

In one embodiment, the distributed task scheduling functionality may be integrated into a central processor. Wherein the central processor may be configured to control:

As can be seen from the above description, according to the electronic device provided in the embodiment of the present application, by independently accessing the database for storing task configuration information, the problem of task allocation failure caused by a plurality of servers accessing the database to obtain tasks at the same time can be effectively avoided, the reliability and effectiveness of task allocation can be effectively improved, and further, the efficiency of the task scheduling process can be effectively improved; the tasks are divided into two types of tasks which can be executed by a designated server and can be competitively acquired by the servers, so that the tasks with large data volume can be effectively executed by the designated server, the tasks with small data volume are executed by an idle server, if the tasks with large data volume cause the queue of the designated server to be blocked or crashed, other distributed servers can still normally work, the operation of the whole task scheduling process cannot be influenced, the tasks with larger resource consumption cannot be executed on the same server, the risk of crash of the server can be effectively reduced, and the efficiency of the task scheduling process can be effectively improved; by monitoring the abnormality of the executed tasks, the abnormal tasks can be processed in time, and the task execution efficiency and the reliability of the whole task scheduling process are improved; that is to say, the method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, further can comprehensively and effectively improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of the enterprise task scheduling system.

In another embodiment, the distributed task scheduling apparatus may be configured separately from the central processor 9100, for example, the distributed task scheduling apparatus may be configured as a chip connected to the central processor 9100, and the distributed task scheduling function is implemented by the control of the central processor.

As shown in fig. 17, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 also does not necessarily include all of the components shown in fig. 17; in addition, the electronic device 9600 may further include components not shown in fig. 17, which can be referred to in the related art.

As shown in fig. 17, a central processor 9100, sometimes referred to as a controller or operational control, can include a microprocessor or other processor device and/or logic device, which central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.

The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. It may store failure-related information as well as a program for processing that information. The central processor 9100 can execute the program stored in the memory 9140 to realize information storage, processing, or the like.

The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.

The memory 9140 can be a solid state memory, e.g., Read Only Memory (ROM), Random Access Memory (RAM), a SIM card, or the like. There may also be a memory that holds information even when power is off, can be selectively erased, and is provided with more data, an example of which is sometimes called an EPROM or the like. The memory 9140 could also be some other type of device. Memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage portion 9142, the application/function storage portion 9142 being used for storing application programs and function programs or for executing a flow of operations of the electronic device 9600 by the central processor 9100.

The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).

The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.

Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.

An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps in the distributed task scheduling method in the foregoing embodiment, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements all the steps of the distributed task scheduling method in which an execution subject is a server or a client, for example, when the processor executes the computer program, the processor implements the following steps:

As can be seen from the above description, the computer-readable storage medium provided in the embodiment of the present application, by independently accessing the database for storing task configuration information, can effectively avoid the problem of task allocation failure caused by a plurality of servers accessing the database to obtain tasks at the same time, can effectively improve the reliability and effectiveness of task allocation, and can further effectively improve the efficiency of the task scheduling process; the tasks are divided into two types of tasks which can be executed by a designated server and can be competitively acquired by the servers, so that the tasks with large data volume can be effectively executed by the designated server, the tasks with small data volume are executed by an idle server, if the tasks with large data volume cause the queue of the designated server to be blocked or crashed, other distributed servers can still normally work, the operation of the whole task scheduling process cannot be influenced, the tasks with larger resource consumption cannot be executed on the same server, the risk of crash of the server can be effectively reduced, and the efficiency of the task scheduling process can be effectively improved; by monitoring the abnormality of the executed tasks, the abnormal tasks can be processed in time, and the task execution efficiency and the reliability of the whole task scheduling process are improved; that is to say, the method and the device can effectively improve the efficiency, reliability and effectiveness of the task allocation process, the task execution process and the exception handling process, further can comprehensively and effectively improve the efficiency of the whole task scheduling process, and can improve the robustness and the task processing efficiency of the enterprise task scheduling system.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Specific embodiments are used herein to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help in understanding the method and core idea of the invention. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims

1. A distributed task scheduling method, comprising:

2. The distributed task scheduling method of claim 1, wherein the individually accessing the database for storing task configuration information to obtain the configuration information of the target task from the database comprises:

and reading the configuration information of the target task from the database.

3. The distributed task scheduling method of claim 2, further comprising:

4. The distributed task scheduling method according to claim 1, wherein if it is determined that the target task does not specify a server according to the configuration information of the target task, after it is determined that the target task meets a preset trigger condition, the target task is added to an asynchronous queue, so that one of a plurality of servers in distributed configuration acquires the target task from the asynchronous queue in a competitive random access manner, including:

5. The distributed task scheduling method of claim 4, further comprising:

6. The distributed task scheduling method of claim 1, further comprising, after said adding the target task to the asynchronous queue:

and releasing the database lock of the database.

7. The distributed task scheduling method of claim 1, further comprising, before the monitoring the server that is executing the target task in real time:

8. A distributed task scheduler, comprising:

9. The distributed task scheduler of claim 8, wherein the task obtaining module comprises:

10. The distributed task scheduler of claim 9, wherein the task obtaining module further comprises:

11. The distributed task scheduler of claim 8, wherein the distributed task execution module is configured to:

12. The distributed task scheduler of claim 11, wherein the distributed task execution module is further configured to:

13. The distributed task scheduler of claim 8, further comprising:

14. The distributed task scheduler of claim 8, further comprising:

15. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the distributed task scheduling method of any one of claims 1 to 7 when executing the program.

16. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the distributed task scheduling method of any one of claims 1 to 7.