CN111427670A - Task scheduling method and system - Google Patents


Info

Publication number
CN111427670A
CN111427670A (application CN201910019054.5A)
Authority
CN
China
Prior art keywords
scheduling
task
application
item
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910019054.5A
Other languages
Chinese (zh)
Inventor
杨坤 (Yang Kun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201910019054.5A
Publication of CN111427670A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a task scheduling method and system, relating to the field of computer technology. In one embodiment, the method comprises: allocating a scheduling thread of a scheduling application to each shard item among at least one shard item of the current task type, where the scheduling thread selects an execution application to process the corresponding shard item and sends the scheduling policy preset for the current task type, together with the identifier of the shard item, to the selected execution application; and, when a task to be executed belonging to the current task type is received, the execution application acquires the business data of the shard item of that task according to the scheduling policy and the shard-item identifier, so as to process the shard item of the task. This embodiment achieves highly available task scheduling that supports both multi-task scheduling and task sharding.

Description

Task scheduling method and system
Technical Field
The invention relates to the field of computer technology, and in particular to a task scheduling method and a task scheduling system.
Background
Timed tasks are a common scenario in application services, such as computing each user's telephone charges once a month in the telecommunications field. In practice, some asynchronous-processing scenarios are also handled as timed tasks, such as sending an email to a user after the user places an order. Task scheduling systems are the standard solution to the timed-task problem, and existing systems fall mainly into three categories: single-machine timed-task systems represented by Timer, distributed-cluster multi-task scheduling systems represented by Quartz, and distributed scheduling systems supporting task sharding represented by TBSchedule and Elastic-Job.
In the process of implementing the invention, the inventor found the following. The first category cannot provide high availability or high-performance processing, and the second category cannot shard tasks or scale task scheduling flexibly. As for the third category, such systems mostly depend on ZooKeeper (a distributed application coordination service) and allow only one node to provide service at a time, so they cannot cope with large-scale task scheduling. Moreover, they monitor the running state of job nodes via Transmission Control Protocol (TCP) connections, and since networks in a distributed environment often experience jitter, service stability is poor. Finally, in the design of such systems, the task scheduling logic and the business processing logic are all placed in a single job node, so elastic scaling of business processing capacity and task scheduling capacity cannot be guaranteed.
Disclosure of Invention
In view of this, embodiments of the present invention provide a task scheduling method and system, which can implement high-availability task scheduling supporting multi-task scheduling and task fragmentation.
To achieve the above object, according to one aspect of the present invention, a task scheduling method is provided.
The task scheduling method of the embodiment of the invention comprises the following steps: allocating a scheduling thread of a scheduling application to each shard item among at least one shard item of the current task type, where the scheduling thread is used to select an execution application to process the corresponding shard item and to send the scheduling policy preset for the current task type, together with the identifier of the shard item, to the selected execution application; and, when a task to be executed belonging to the current task type is received, the execution application acquires the business data of the shard item of the task according to the scheduling policy and the shard-item identifier, so as to process the shard item of the task to be executed.
Optionally, the method further comprises: when an execution application is started, registering the task type corresponding to the execution application in a registry, and setting a scheduling policy for the registered task type. The scheduling policy comprises: the number of shard items of the task type, the scheduling interval, and the maximum amount of data acquired at one time.
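The three policy fields above can be sketched as a small configuration object. This is an illustrative stand-in, not code from the patent: the names `SchedulingPolicy`, `register_task_type`, and the in-memory `registry` dict are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SchedulingPolicy:
    shard_count: int        # number of shard items for the task type
    interval_seconds: int   # time between two consecutive scheduling rounds
    max_batch_size: int     # maximum amount of data acquired at one time

# In-memory stand-in for the registry: task type -> scheduling policy.
registry: dict[str, SchedulingPolicy] = {}

def register_task_type(task_type: str, policy: SchedulingPolicy) -> None:
    """Register a task type and attach its scheduling policy, as done at
    execution-application startup."""
    registry[task_type] = policy

# The mail-sending example used throughout the description.
register_task_type("send_mail", SchedulingPolicy(shard_count=10,
                                                 interval_seconds=3,
                                                 max_batch_size=100))
```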
Optionally, allocating a scheduling thread of a scheduling application to each shard item of the current task type specifically comprises: for any shard item of the current task type, the preemption threads of multiple scheduling applications attempt to acquire a preset data lock; the scheduling application that acquires the data lock starts a scheduling thread and assigns it to the shard item; after starting the scheduling thread, the scheduling application releases the data lock. The method further comprises: when the scheduling application corresponding to a scheduling thread becomes unavailable, reallocating a scheduling thread to the shard item that the thread was serving.
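The data-lock preemption step can be sketched as follows. The in-process `DataLock` class is an illustrative stand-in for the row-level database lock described later in the text; all names and the two-state encoding are assumptions.

```python
import threading

class DataLock:
    """Toy stand-in for the row-level data lock: two states, preempted/released."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self.state = "released"

    def try_acquire(self, owner: str) -> bool:
        # Atomically move released -> preempted; losers get False.
        with self._guard:
            if self.state == "released":
                self.state = "preempted"
                self.owner = owner
                return True
            return False

    def release(self) -> None:
        with self._guard:
            self.state = "released"

lock = DataLock()
# Three scheduling applications race for the lock of one shard item;
# exactly one wins and would start the scheduling thread for that shard.
winners = [app for app in ("scheduler-1", "scheduler-2", "scheduler-3")
           if lock.try_acquire(app)]
lock.release()  # released once the scheduling thread has been started
```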
Optionally, the scheduling thread may select the execution application for processing the corresponding shard item as follows: obtain the currently available execution applications from the registry, where the registry determines whether each execution application is available via the heartbeat signal that application sends; then, from the currently available execution applications, determine the one that will process the shard item according to a load-balancing policy.
Optionally, the execution application may obtain the business data of the shard item assigned to it as follows: in the data table holding the business data of the task to be executed, take the remainder of each record's primary key divided by the number of shard items in the scheduling policy, and select the records whose remainder equals the shard-item identifier, up to the maximum one-time data amount in the scheduling policy, as the business data of the shard item of the task to be executed.
Optionally, the method further comprises: when the amount of business data of the shard item acquired by the execution application is less than the total amount of business data of that shard item, reselecting the execution application for processing the shard item, and sending the scheduling policy of the current task type and the shard-item identifier to it; when another task to be executed belonging to the current task type is received, reselecting the execution application for processing each shard item, and sending the scheduling policy of the current task type and the corresponding shard-item identifier to it. The scheduling applications and the execution applications are deployed in different computer clusters.
To achieve the above object, according to another aspect of the present invention, there is provided a task scheduling system.
The task scheduling system of the embodiment of the invention may comprise: a scheduling system provided with at least one scheduling application, and an execution system provided with at least one execution application. The scheduling application may be used to start a scheduling thread and assign it to a shard item of the current task type; the scheduling thread is used to select an execution application for processing the shard item and to send the scheduling policy preset for the current task type, together with the shard-item identifier, to the selected execution application. The execution application is operable to: when a task to be executed belonging to the current task type is received, acquire the business data of the shard item of the task according to the scheduling policy and the shard-item identifier, so as to process the shard item of the task to be executed.
Optionally, the system may further comprise: a registry, used to register the task type corresponding to each execution application and scheduling application when that application starts, to determine whether each execution application is available via the heartbeat signal it sends, and to determine whether each scheduling application is available via the heartbeat signal it sends; and a control center, used to set a scheduling policy for each task type registered in the registry, the scheduling policy comprising the number of shard items of the task type, the scheduling interval, and the maximum amount of data acquired at one time.
Optionally, the multiple scheduling applications of the scheduling system may be used to: for any shard item of the current task type, attempt to acquire a preset data lock using their started preemption threads. The scheduling application that acquires the data lock may be used to: start a scheduling thread, assign it to the shard item, and release the data lock. The scheduling thread may further be used to: obtain the currently available execution applications from the registry, and determine, according to a load-balancing policy, the execution application for processing the corresponding shard item. The execution application may further be used to: in the data table holding the business data of the task to be executed, take the remainder of each record's primary key divided by the number of shard items in the scheduling policy, and select the records whose remainder equals the shard-item identifier, up to the maximum one-time data amount in the scheduling policy, as the business data of the shard item of the task to be executed.
Optionally, the registry may further be configured to store: the data lock and its current state; the correspondence between task types and implementation interfaces; the correspondence between task types and scheduling policies; the correspondence among execution applications, server addresses, and execution-application running states; the correspondence among scheduling applications, server addresses, scheduling threads, and scheduling-thread running states; and the correspondence among task types, shard-item identifiers, scheduling threads, and the current states of the shard items.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the task scheduling method provided by the present invention.
According to the technical solution of the invention, the embodiments of the invention have the following advantages or beneficial effects:
First, after each task type is started, a scheduling thread is allocated to each shard item; the scheduling thread selects an execution application meeting preset conditions for the shard item and sends the corresponding scheduling policy and shard-item identifier to that execution application, so that the execution application can obtain the business data of the shard item of the task to be executed. This arrangement separates the task scheduling logic (carried by the scheduling applications) from the business processing logic (carried by the execution applications), ensuring that business processing capacity and task scheduling capacity can scale elastically and independently.
Second, the invention achieves highly available task scheduling through multiple scheduling applications in the scheduling system, avoiding the problems caused by the heavy dependence of TBSchedule, Elastic-Job, and the like on ZooKeeper. Meanwhile, a database serves as the registry, storing task types, execution applications, scheduling applications, shard items, scheduling threads, scheduling policies, data locks, and related data; when the data volume grows, the database and its tables can be partitioned as needed, improving the horizontal scalability of the system and supporting larger-scale task scheduling scenarios.
Third, the invention replaces the TCP connections of the prior art with a heartbeat mechanism, accurately reflecting the running state of each application and keeping the system stable. By handling scheduling applications and execution applications coming online and going offline so that the scheduling policy does not change with the number of applications, the high availability of the system is further improved.
Further effects of the above optional features will be described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a diagram illustrating the main steps of a task scheduling method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of components of a task scheduling system according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
Fig. 1 is a schematic diagram of main steps of a task scheduling method according to an embodiment of the present invention.
As shown in fig. 1, the task scheduling method according to the embodiment of the present invention may be specifically executed according to the following steps:
Step S101: allocating a scheduling thread of a scheduling application to each shard item among at least one shard item of the current task type; the scheduling thread is used to select the execution application that will process the corresponding shard item, and to send the scheduling policy preset for the current task type, together with the shard-item identifier, to the selected execution application.
In the embodiment of the present invention, a task type refers to a class of tasks with similar characteristics; in practice, one task type corresponds to many tasks. Several examples of task types: sending an email to a user, sending a short message to a user, and calling a user. In a specific application, if the task to be executed is to send mail to 100 specified users, that task belongs to the task type "send an email to a user".
It will be appreciated that a task type is usually associated with one or more front-end servers that run the business processing logic, that is, servers capable of performing the specific tasks of that type. Generally, one or more execution applications are deployed on such a server to execute tasks directly: an execution application is a computer program that runs the business processing logic and thereby processes a task. In practice, a policy is needed to dispatch each task to a suitable execution application, which requires a scheduling unit. A scheduling unit is a computer program that runs the task scheduling logic and distributes tasks to appropriate execution applications; it is generally deployed on back-end servers responsible for task scheduling. The scheduling applications and the execution applications can be placed in different computer clusters, decoupling the task scheduling logic from the business processing logic.
With the rapid development of internet technology, executing a specific task in a real scenario often requires acquiring and processing a large amount of data; sending the whole task to a single execution application would be extremely inefficient and could not meet business requirements. It is therefore necessary to divide a task into multiple shard items, with one execution application processing one shard item, which greatly improves task execution efficiency. Generally, shard items can be partitioned by the primary key values of the task-related data. For example, suppose task A is to send mail to 100 users, and the data required to execute it comprises the mailbox of each of the 100 users, the content of each mail, and so on. If the task is to be divided into 10 shard items, it can be partitioned by the primary key (user ID) of the user data table: take each user ID modulo the shard count 10; the users whose remainder is 0 (i.e., whose ID ends in 0) form the first shard item, those whose remainder is 1 form the second shard item, those whose remainder is 2 form the third shard item, and so on. The remainder serves as the identifier of the corresponding shard item.
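The modulo partitioning in the example above can be sketched as follows; the function name `shard_of` is illustrative.

```python
def shard_of(primary_key: int, shard_count: int) -> int:
    """The remainder of the primary key divided by the shard count
    identifies the shard item the record belongs to."""
    return primary_key % shard_count

# The example from the text: 100 users split into 10 shard items by user ID.
user_ids = range(1, 101)
shards: dict[int, list[int]] = {}
for uid in user_ids:
    shards.setdefault(shard_of(uid, 10), []).append(uid)
```

With 100 consecutive user IDs and 10 shard items, each shard item ends up with exactly 10 users, so the shard items are balanced.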
In a specific application scenario, to manage the tasks belonging to one task type uniformly, a scheduling policy may be set for each task type. The scheduling policy may include: the number of shard items of the task type, the scheduling interval, and the maximum amount of data acquired at one time. The number of shard items bounds the total number of shards for the task type. The scheduling interval is the time between two consecutive scheduling rounds, i.e., between two consecutive times the applications process tasks; in practice, the thread that runs the scheduling logic (the scheduling thread described below) initiates task scheduling at this interval. The maximum amount of data acquired at one time is the largest amount of data the execution application may fetch in a single round when processing a task. In transactions involving massive data, even a single shard item can still hold a huge amount of data; to further improve processing efficiency, the maximum one-time data amount can be configured per task type to cap the amount of data processed in a single round.
In this step, a scheduling thread may be allocated to each shard item of the current task type; the scheduling thread runs the task scheduling logic and selects the execution application for processing that shard item. In a specific application, a scheduling thread can be allocated to a shard item through the following steps:
For any shard item of the current task type: the preemption threads of multiple scheduling applications attempt to acquire a preset data lock, i.e., they race for the lock, and the scheduling application that finally acquires the data lock starts a scheduling thread, which is assigned to the shard item. The scheduling application then releases the data lock. The data lock may be a global lock acting at the data-row level; the lock and its current state may be stored in a separate data table, the state taking one of two values indicating that the lock is preempted or released.
In embodiments of the invention, after a scheduling thread has been assigned to a shard item, it selects, from the currently available execution applications, one suitable for processing the shard item. Specifically, the scheduling thread first obtains the list of all currently available execution applications, then selects one from the list according to a load-balancing algorithm such as a random algorithm or consistent hashing. In practice, when an execution application starts, the task type it handles is registered in the registry, which can be a computer cluster backed by a database. Each execution application then checks its running state periodically and sends a heartbeat signal to the registry; the registry judges from the received heartbeats whether each execution application is available, and stores the running states in a data table. When a scheduling thread of a scheduling application starts, the scheduling application can also start an information-synchronization thread, which reads the execution-application state table in the registry and synchronizes the available execution applications to the scheduling thread. In practice, the scheduling thread may also use a load-balancing algorithm to find the least-loaded execution server and then pick an idle execution application on that server to process the shard item.
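The heartbeat-based availability check and load-balanced selection can be sketched as follows. The timeout value, the `available`/`pick_executor` names, and the consistent-hash-style ranking are all illustrative assumptions, not details from the patent.

```python
import hashlib

HEARTBEAT_TIMEOUT = 10.0  # seconds; illustrative value

def available(last_heartbeat: dict[str, float], now: float) -> list[str]:
    """An execution application counts as available while its most recent
    heartbeat is within the timeout window."""
    return sorted(app for app, ts in last_heartbeat.items()
                  if now - ts <= HEARTBEAT_TIMEOUT)

def pick_executor(apps: list[str], shard_id: int) -> str:
    """Consistent-hash-style choice: a given shard maps stably to one
    application as long as the application list is unchanged."""
    ranked = sorted(apps,
                    key=lambda a: hashlib.md5(f"{a}:{shard_id}".encode()).hexdigest())
    return ranked[0]

now = 100.0
heartbeats = {"exec-a": 95.0, "exec-b": 99.0, "exec-c": 80.0}  # exec-c timed out
apps = available(heartbeats, now)
executor = pick_executor(apps, shard_id=2)
```

The same shard ID always selects the same executor from an unchanged list, which keeps the mapping stable between scheduling rounds.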
In a specific application scenario, the scheduling server hosting a scheduling thread may go down; in that case the scheduling thread terminates and its correspondence with the shard item lapses. The locking mechanism described above can then be used to reallocate a scheduling thread to the shard item, achieving failover of the scheduling application.
After selecting the execution application, the scheduling thread may send the scheduling policy of the corresponding task type and the identifier of the corresponding shard item to it. Once the execution application has received these, it can determine the business data of the shard item.
Step S102: when a task to be executed belonging to the current task type is received, the execution application acquires the business data of the shard item of the task according to the scheduling policy and the shard-item identifier, so as to process the shard item of the task to be executed.
In this step, the execution application may learn of the task to be executed by monitoring the business database, and then acquire the business data of its shard item according to the corresponding scheduling policy and shard-item identifier, thereby processing the task. For example, suppose the shard-item identifier of one execution application is 2 and the scheduling policy is: 10 shard items, a scheduling interval of 3 s, and a maximum one-time data amount of 100. After the task is issued, the execution application takes the primary key (user ID) of each record in the corresponding data table modulo the shard count 10, and selects the records whose remainder is 2, up to 100 of them, as the business data of its shard item; that is, the first 100 records whose user ID ends in 2 are the required data.
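The record-selection step in this example can be sketched as follows; the `fetch_shard_batch` name, the in-memory table, and the `user_id` column are illustrative stand-ins for the business data table.

```python
def fetch_shard_batch(records: list[dict], shard_id: int,
                      shard_count: int, max_batch: int) -> list[dict]:
    """Keep the records whose primary key modulo the shard count equals this
    shard's identifier, capped at the policy's maximum one-time data amount."""
    matching = [r for r in records if r["user_id"] % shard_count == shard_id]
    return matching[:max_batch]

# The example from the text: shard id 2, 10 shard items, at most 100 records.
table = [{"user_id": uid} for uid in range(1, 1001)]
batch = fetch_shard_batch(table, shard_id=2, shard_count=10, max_batch=100)
```

In a real system this filter would typically be pushed into the database query rather than done in application memory, but the modulo-plus-limit logic is the same.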
In practice, an execution application often cannot acquire and process all the data of a shard item at once. When the amount of business data of the shard item acquired by the execution application is less than the total amount of business data of that shard item, then after the first round the scheduling thread reselects, according to the load-balancing policy, the execution application for processing the shard item, and sends it the scheduling policy of the current task type and the shard-item identifier. In addition, when another task to be executed belonging to the current task type is received, the scheduling thread reselects the execution application for processing each shard item, and sends it the scheduling policy of the current task type and the corresponding shard-item identifier.
In the technical solution of the embodiment of the invention, after each task type is started, a scheduling thread is allocated to each shard item; the scheduling thread selects an execution application meeting preset conditions for the shard item and sends the corresponding scheduling policy and shard-item identifier to that execution application, so that the execution application can obtain the business data of the shard item of the task to be executed. This arrangement separates the task scheduling logic (carried by the scheduling applications) from the business processing logic (carried by the execution applications), ensuring elastic scaling of both. In addition, the invention achieves highly available task scheduling through multiple scheduling applications in the scheduling system, avoiding the problems caused by the heavy dependence of TBSchedule, Elastic-Job, and the like on ZooKeeper; meanwhile, a database serves as the registry for task types, execution applications, scheduling applications, shard items, scheduling threads, scheduling policies, data locks, and related data, and when the data volume grows, the database and its tables can be partitioned as needed, improving horizontal scalability and supporting larger-scale task scheduling scenarios. Finally, the invention replaces the TCP connections of the prior art with a heartbeat mechanism, accurately reflecting application running states and keeping the system stable; by handling applications coming online and going offline so that the scheduling policy does not change with the number of applications, the high availability of the system is further improved.
Fig. 2 is a schematic diagram of components of a task scheduling system according to an embodiment of the present invention, where the task scheduling system may be used as a specific system for implementing the task scheduling method.
As shown in FIG. 2, the task scheduling system may include a registry, a control center, a scheduling system, and an execution system.
The execution system may comprise multiple front-end servers, each of which may host multiple execution applications. The execution system is responsible for the concrete business logic processing and exposes a Remote Procedure Call (RPC) service for the scheduling system to invoke; upon receiving an RPC request sent by a scheduling application in the scheduling system, the execution application can autonomously acquire the required shard data according to the scheduling policy and shard-item identifier carried in the request.
The registry may comprise a computer cluster deploying a plurality of registration applications and a database containing a plurality of data tables. It is used to register task types and scheduling applications, and to determine the running state of each application from the heartbeat signals sent periodically by the execution applications and the scheduling applications. Meanwhile, the registry can provide an interface to the control center, so that control-center staff can configure, issue (to the registry), start or stop scheduling policies for the registered task types. It can be understood that after a worker starts a task type, the steps of allocating scheduling threads and selecting execution applications can be carried out. In addition, a plurality of data tables can be maintained in the database of the registry to store the data involved in task scheduling. Specifically: the Work table may store the name of each task type, the task type identifier, and the implementation interface; the Schedule table may store the task types and the scheduling policies configured by the staff, and may also store the running states of the corresponding scheduling threads; the Schedule global lock table may store the data lock and its current state; the Manager table may store each scheduling application, the Internet Protocol (IP) address of its server, its scheduling threads, and the running state of the scheduling application; the Server table may store task types, fragment item identifiers, scheduling threads, and the current state of each fragment item (for example, to-be-started or started), and may also store the corresponding scheduling policies; the Execute table may store each execution application, its server address, and its running state.
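The table layout described above can be sketched as a relational schema. The following is a minimal illustration using SQLite; the patent does not specify column names or types, so every identifier below (columns such as `impl_interface`, sample values such as `order_sync`) is an assumption for illustration only.

```python
import sqlite3

# Illustrative schema for the six registry tables described in the text.
DDL = """
CREATE TABLE work     (task_type_id TEXT PRIMARY KEY, task_name TEXT, impl_interface TEXT);
CREATE TABLE schedule (task_type_id TEXT PRIMARY KEY, policy_json TEXT, thread_state TEXT);
CREATE TABLE schedule_global_lock (lock_name TEXT PRIMARY KEY, state TEXT);
CREATE TABLE manager  (scheduler_id TEXT PRIMARY KEY, server_ip TEXT, thread_id TEXT, state TEXT);
CREATE TABLE server   (task_type_id TEXT, shard_id INTEGER, thread_id TEXT, shard_state TEXT,
                       policy_json TEXT, PRIMARY KEY (task_type_id, shard_id));
CREATE TABLE execute  (executor_id TEXT PRIMARY KEY, server_ip TEXT, state TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Register one task type (workflow step 1) and its two fragment items (step 3).
conn.execute("INSERT INTO work VALUES ('order_sync', 'Order synchronisation', 'IOrderSync')")
conn.executemany("INSERT INTO server VALUES ('order_sync', ?, NULL, 'to_be_started', NULL)",
                 [(0,), (1,)])

shard_count = conn.execute(
    "SELECT COUNT(*) FROM server WHERE task_type_id = 'order_sync'").fetchone()[0]
print(shard_count)  # 2
```

In a production deployment these tables would live in a shared relational database and, as the next paragraph notes, could be split across databases and tables as data volume grows.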
It is worth mentioning that, as the data volume grows, these tables can be split across multiple databases and tables (database and table sharding) to achieve horizontal scaling, thereby improving data processing efficiency.
The scheduling system may comprise a plurality of back-end servers, each of which may deploy a plurality of scheduling applications. The scheduling system controls preemptive scheduling among the scheduling applications through a data lock mechanism, ensuring that at any moment only one scheduling application can be assigned to a given fragment item of a given task type; after a scheduling application acquires the data lock, it can start a scheduling thread for task scheduling.
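The data lock mechanism described above can be sketched as an atomic compare-and-set update against the registry database: every scheduling application issues the same conditional UPDATE, and only the one whose update actually changes a row holds the lock. A minimal single-process illustration, with table and column names assumed rather than taken from the patent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule_global_lock (lock_name TEXT PRIMARY KEY, holder TEXT)")
conn.execute("INSERT INTO schedule_global_lock VALUES ('shard_assignment', NULL)")

def try_acquire(conn, scheduler_id):
    """Return True iff this scheduler won the lock (rowcount 1 means the CAS applied)."""
    cur = conn.execute(
        "UPDATE schedule_global_lock SET holder = ? "
        "WHERE lock_name = 'shard_assignment' AND holder IS NULL", (scheduler_id,))
    conn.commit()
    return cur.rowcount == 1

def release(conn, scheduler_id):
    conn.execute("UPDATE schedule_global_lock SET holder = NULL WHERE holder = ?",
                 (scheduler_id,))
    conn.commit()

# Two scheduling applications race for the lock; only the first CAS succeeds.
results = {s: try_acquire(conn, s) for s in ("scheduler-a", "scheduler-b")}
print(results)  # exactly one True
release(conn, "scheduler-a")
```

Because the condition and the write happen in one UPDATE statement, the database guarantees that at most one scheduling application sees `rowcount == 1`, which is what lets a fragment item be assigned to exactly one scheduling thread at a time.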
The control center issues scheduling policies to the registry by calling the RPC service interface provided by the registry, and controls the starting and stopping of scheduling tasks.
The above describes the basic functions of each subsystem of the task scheduling system. On that basis, the workflow of the task scheduling system proceeds as follows:
1. When each execution application in the execution system starts, it registers the relevant task types with the registry; one record is stored in the Work table for each task type.
2. The control center queries the task types registered in the registry and displays them on a Web page. The staff may query the default scheduling policy of the system.
3. A worker may configure a scheduling policy for a newly registered task type and start the task type. The configured scheduling policy is stored in the Schedule table, and each fragment item of the task type is stored as a record in the Server table.
4. The scheduling system starts a plurality of scheduling applications, which register with the registry. The relevant data of each scheduling application is then stored as a record in the Manager table.
5. Each scheduling application uses a preemption thread to try to acquire the data lock in the Schedule global lock table. The scheduling application that finally acquires the data lock starts a scheduling thread for the current fragment item and then releases the data lock; the scheduling applications continue to start scheduling threads for the remaining fragment items by preempting the data lock again.
6. The scheduling thread started in the previous step obtains the available execution applications from the Execute table in the registry, selects a suitable execution application according to a load balancing algorithm to process the corresponding fragment item, and sends the scheduling policy (including data such as the number of fragment items, the scheduling interval, and the maximum data volume acquired at one time) together with the identifier of the corresponding fragment item to the selected execution application. When the execution application receives a task to be executed, it can acquire the corresponding fragment data and then process it.
Through the above steps, highly available scheduling and execution of tasks can be achieved. It will be appreciated that when an execution application becomes unavailable, the scheduling thread of each fragment item it was processing selects another application from the currently available execution applications to process that fragment item. When a scheduling application becomes unavailable, its scheduling threads and the corresponding fragment item records are removed from the Server table; the remaining scheduling applications can then reassign scheduling threads to those fragment items through the data lock preemption mechanism.
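The dispatch and failover behaviour of step 6 and the paragraph above can be sketched as follows. Round-robin is used here as one possible load balancing algorithm (the patent does not fix a specific one), unavailable executors (for example, ones whose heartbeats have stopped) are skipped, and all names and policy fields are illustrative assumptions:

```python
import itertools

def make_dispatcher(executors):
    """Build a dispatch function that round-robins shards over available executors."""
    rr = itertools.cycle(range(len(executors)))
    def dispatch(shard_id, policy):
        for _ in executors:                      # at most one full pass over the pool
            ex = executors[next(rr)]
            if ex["available"]:                  # skip executors marked unavailable
                return {"target": ex["name"],
                        "shard_id": shard_id,
                        "shard_count": policy["shard_count"],
                        "interval_s": policy["interval_s"],
                        "max_batch": policy["max_batch"]}
        raise RuntimeError("no execution application available")
    return dispatch

executors = [{"name": "exec-1", "available": True},
             {"name": "exec-2", "available": False},  # e.g. heartbeats stopped
             {"name": "exec-3", "available": True}]
policy = {"shard_count": 3, "interval_s": 60, "max_batch": 500}

dispatch = make_dispatcher(executors)
targets = [dispatch(shard, policy)["target"] for shard in range(3)]
print(targets)  # exec-2 is skipped: ['exec-1', 'exec-3', 'exec-1']
```

The returned dictionary stands in for the RPC request body carrying the scheduling policy and the fragment item identifier to the selected execution application.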
Thus, embodiments of the invention provide a complete distributed, highly available task scheduling system supporting multi-task scheduling, task fragmentation, and batch processing. Compared with the prior art, the invention implements the registry service on a database and, through database and table sharding, supports massive Internet-scale concurrent scheduling scenarios, solving the problem that the processing capacity of a centralized ZooKeeper registry is limited. Meanwhile, the invention does not rely on ZooKeeper TCP connections as the basis for re-fragmenting scheduling tasks, which improves the stability of the scheduling center. Finally, the task scheduling logic and the service processing logic, which existing schemes handle together, are placed in two independent clusters, enabling flexible capacity expansion of both scheduling nodes and service nodes and improving the flexibility of the system.
Although the task scheduling method of the present invention is described above by taking the task scheduling system shown in fig. 2 as an example, this does not limit the application scenario of the present invention.
It should be noted that, for the convenience of description, the foregoing method embodiments are described as a series of acts, but those skilled in the art will appreciate that the present invention is not limited by the order of acts described, and that some steps may in fact be performed in other orders or concurrently. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no acts or modules are necessarily required to implement the invention.
To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the task scheduling system shown in FIG. 2 is further described below.
The task scheduling system provided by an embodiment of the invention may comprise: at least one scheduling system in which scheduling applications are deployed, and at least one execution system in which execution applications are deployed. The scheduling application can be used to start a scheduling thread and assign the scheduling thread to a fragment item of the current task type. The scheduling thread can be used to select an execution application to process the fragment item, and to send the scheduling policy preset for the current task type and the identifier of the fragment item to the selected execution application. The execution application can be used to: when a task to be executed belonging to the current task type is received, acquire the service data of the fragment item of the task to be executed according to the scheduling policy and the identifier of the fragment item, so as to process that fragment item of the task to be executed.
In an embodiment of the present invention, the system may further comprise: a registry, used to register the task types corresponding to the execution applications when the execution applications and the scheduling applications are started, and to determine whether each execution application and each scheduling application is available through the heartbeat signals they send; and a control center, used to set scheduling policies for the task types registered in the registry, the scheduling policy comprising the number of fragment items of the task type, the scheduling interval, and the maximum data volume acquired at one time.
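The heartbeat-based availability check described above can be sketched as follows; the 30-second timeout and all identifiers are assumptions, since the patent does not specify concrete values:

```python
import time

HEARTBEAT_TIMEOUT_S = 30.0  # assumed timeout; not specified by the patent

class Registry:
    """Tracks the last heartbeat per application and derives availability from it."""
    def __init__(self):
        self.last_beat = {}  # application id -> timestamp of last heartbeat

    def heartbeat(self, app_id, now=None):
        self.last_beat[app_id] = time.monotonic() if now is None else now

    def is_available(self, app_id, now=None):
        now = time.monotonic() if now is None else now
        beat = self.last_beat.get(app_id)
        return beat is not None and now - beat <= HEARTBEAT_TIMEOUT_S

reg = Registry()
reg.heartbeat("exec-1", now=100.0)
reg.heartbeat("sched-1", now=120.0)
print(reg.is_available("exec-1", now=125.0))   # True  (25 s since last beat)
print(reg.is_available("exec-1", now=140.0))   # False (40 s, past the timeout)
print(reg.is_available("sched-1", now=140.0))  # True
```

An application that misses its heartbeat window is treated as offline, which is what triggers the reselection and lock re-preemption behaviour described earlier; unlike a TCP connection, a missed heartbeat reflects that the application has actually stopped doing work.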
As a preferred approach, the plurality of scheduling applications of the scheduling system may be used to: for any fragment item of the current task type, attempt to acquire a preset data lock using their started preemption threads. The scheduling application that acquires the data lock may be used to start a scheduling thread, assign it to the fragment item, and release the data lock. The scheduling thread may further be used to obtain the currently available execution applications from the registry and determine, according to a load balancing policy, the execution application that processes the corresponding fragment item. The execution application may further be used to: in the data table where the service data of the task to be executed resides, take each record's primary key modulo the number of fragment items in the scheduling policy, and determine, as the service data of the fragment item of the task to be executed, the records whose remainder equals the identifier of the fragment item, up to the maximum data volume acquired at one time in the scheduling policy.
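The remainder (modulo) operation described above, by which an execution application selects its fragment's records, can be sketched as follows; the in-memory record list stands in for the data table, and all names are illustrative:

```python
def fetch_shard_records(records, shard_count, shard_id, max_batch):
    """Select this fragment's records: primary key modulo shard_count equals shard_id,
    capped at the policy's maximum batch size."""
    matched = [r for r in records if r["id"] % shard_count == shard_id]
    return matched[:max_batch]

# A stand-in for the data table, keyed by an integer primary key.
table = [{"id": pk, "payload": f"order-{pk}"} for pk in range(1, 11)]

# 3 fragment items, this thread handles fragment 1, at most 2 records per pull:
batch = fetch_shard_records(table, shard_count=3, shard_id=1, max_batch=2)
print([r["id"] for r in batch])  # [1, 4] -- keys 1, 4, 7, 10 have remainder 1; capped at 2
```

In a real deployment this filter would be a SQL predicate such as `WHERE pk % :shard_count = :shard_id LIMIT :max_batch` rather than an in-memory scan; the point is that the modulo partition lets every execution application derive its own disjoint slice of the table from just the policy and the fragment identifier.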
Preferably, in an embodiment of the present invention, the registry may further be configured to: store the data lock and its current state; store the correspondence between task types and implementation interfaces; store the correspondence between task types and scheduling policies; store the correspondence between execution applications, server addresses, and execution application running states; store the correspondence between scheduling applications, server addresses, scheduling threads, and scheduling thread running states; and store the correspondence between task types, fragment item identifiers, scheduling threads, and the current states of the fragment items.
The technical effects of this system embodiment are the same as those set forth above for the method embodiment and are not repeated here.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform steps comprising: allocating a scheduling thread of a scheduling application to each fragment item among at least one fragment item of the current task type, the scheduling thread being used to select an execution application that processes the corresponding fragment item and to send a scheduling policy preset for the current task type and the identifier of the fragment item to the selected execution application; and, when a task to be executed belonging to the current task type is received, the execution application acquiring the service data of the fragment item of the task to be executed according to the scheduling policy and the identifier of the fragment item, so as to process the fragment item of the task to be executed.
The technical effects of this computer-readable-medium embodiment are the same as those set forth above for the method embodiment and are not repeated here.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method for task scheduling, comprising:
allocating a scheduling thread of a scheduling application to each sharding item among at least one sharding item of the current task type; the scheduling thread is used for selecting an execution application that processes the corresponding sharding item, and for sending a scheduling policy preset for the current task type and the identifier of the sharding item to the selected execution application;
when receiving a task to be executed belonging to a current task type: and the execution application acquires the service data of the fragment item of the task to be executed according to the scheduling strategy and the identifier of the fragment item so as to process the fragment item of the task to be executed.
2. The method of claim 1, further comprising:
when an execution application is started, registering a task type corresponding to the execution application in a registration center, and setting a scheduling strategy for the registered task type; the scheduling strategy comprises the following steps: the number of the fragmentation items of the task type, the scheduling interval and the maximum data volume acquired at one time.
3. The method of claim 1,
the allocating a scheduling thread of a scheduling application to each sharded item in at least one sharded item of the current task type specifically includes: for any sharded item of the current task type: the preempting threads of the plurality of scheduling applications try to acquire a preset data lock, the scheduling application acquiring the data lock starts a scheduling thread, and the scheduling thread is distributed to the fragment item; after the scheduling application starts the scheduling thread, releasing the data lock; and
the method further comprises: and when the scheduling application corresponding to the scheduling thread is not available, the scheduling thread is reallocated for the fragment item corresponding to the scheduling thread.
4. The method according to claim 2, wherein the scheduling thread selects an executing application that processes the corresponding sharded item according to the following steps:
acquiring currently available execution application from a registry; the registry determines whether the execution application is available or not through a heartbeat signal sent by each execution application;
and determining the executive application for processing the fragmentation item from the currently available executive applications according to the load balancing strategy.
5. The method according to claim 2, wherein the executing application obtains the service data of the sharding item of the task to be executed according to the following steps:
and in a data table where the service data of the task to be executed is located, performing remainder operation on the number of the fragmentation items in the scheduling policy by using the main key data of each record, and determining the record with the remainder result being the identifier of the fragmentation item and the number not greater than the maximum data volume acquired at one time in the scheduling policy as the service data of the fragmentation item of the task to be executed.
6. The method according to any one of claims 1-5, wherein the method further comprises:
when the quantity of the service data of the fragmentation item of the task to be executed, which is acquired by the execution application, is less than the total quantity of the service data of the fragmentation item of the task to be executed, the execution application for processing the fragmentation item is reselected, and the scheduling strategy of the current task type and the identifier of the fragmentation item are sent to the execution application;
when another task to be executed belonging to the current task type is received, the execution application for processing each fragment item is reselected, and the scheduling strategy of the current task type and the identifier of the corresponding fragment item are sent to the execution application; and
the scheduling application and the executing application are disposed in different computer clusters.
7. A task scheduling system, comprising: at least one scheduling system in which scheduling applications are deployed and at least one execution system in which execution applications are deployed; wherein:
the scheduling application is used for starting a scheduling thread and distributing the scheduling thread to a fragment item of the current task type; the scheduling thread is used for selecting an execution application for processing the fragment item and sending a scheduling strategy preset for the current task type and the identifier of the fragment item to the selected execution application;
the execution application is to: and when a task to be executed belonging to the current task type is received, acquiring the service data of the fragment item of the task to be executed according to the scheduling strategy and the identifier of the fragment item so as to process the fragment item of the task to be executed.
8. The system of claim 7, further comprising:
the registration center is used for realizing the registration of the task type corresponding to the execution application and the calling application when the execution application and the calling application are started; determining whether each executing application is available through a heartbeat signal sent by the executing application; determining whether each scheduling application is available through a heartbeat signal sent by the scheduling application;
the control center is used for setting a scheduling strategy for the task types registered in the registration center; the scheduling strategy comprises the following steps: the number of the fragmentation items of the task type, the scheduling interval and the maximum data volume acquired at one time.
9. The system of claim 8,
a plurality of scheduling applications of the scheduling system are for: for any fragment item of the current task type, a plurality of started preemption threads are utilized to try to acquire a preset data lock;
the scheduling application that acquires the data lock is to: starting a scheduling thread, distributing the scheduling thread to the fragment item, and releasing a data lock;
the dispatch thread is further to: acquiring currently available execution applications from a registry, and determining the execution applications for processing the corresponding fragmentation items according to a load balancing strategy; and
the execution application is further to: and in a data table where the service data of the task to be executed is located, performing remainder operation on the number of the fragmentation items in the scheduling policy by using the main key data of each record, and determining the record with the remainder result being the identifier of the fragmentation item and the number not greater than the maximum data volume acquired at one time in the scheduling policy as the service data of the fragmentation item of the task to be executed.
10. The system of claim 9, wherein the registry is further configured to:
storing the data lock and its current state;
storing the corresponding relation between the task type and the implementation interface;
storing the corresponding relation between the task type and the scheduling strategy;
storing the corresponding relation between the execution application, the server address and the execution application running state;
storing the corresponding relation of the scheduling application, the server address, the scheduling thread and the scheduling thread running state; and
and storing the corresponding relation between the task type, the fragment item identification, the scheduling thread and the current state of the fragment item.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN201910019054.5A 2019-01-09 2019-01-09 Task scheduling method and system Pending CN111427670A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910019054.5A CN111427670A (en) 2019-01-09 2019-01-09 Task scheduling method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910019054.5A CN111427670A (en) 2019-01-09 2019-01-09 Task scheduling method and system

Publications (1)

Publication Number Publication Date
CN111427670A true CN111427670A (en) 2020-07-17

Family

ID=71545980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910019054.5A Pending CN111427670A (en) 2019-01-09 2019-01-09 Task scheduling method and system

Country Status (1)

Country Link
CN (1) CN111427670A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342508A (en) * 2021-07-07 2021-09-03 湖南快乐阳光互动娱乐传媒有限公司 Task scheduling method and device
CN113742045A (en) * 2021-09-15 2021-12-03 上海淇玥信息技术有限公司 Distributed task processing method and device and electronic equipment
CN113986507A (en) * 2021-11-01 2022-01-28 佛山技研智联科技有限公司 Job scheduling method and device, computer equipment and storage medium
WO2022032532A1 (en) * 2020-08-12 2022-02-17 Alibaba Group Holding Limited Sharding for workflow applications in serverless architectures
CN116860421A (en) * 2023-09-05 2023-10-10 中信消费金融有限公司 Task processing method and task processing system


Similar Documents

Publication Publication Date Title
CN111427670A (en) Task scheduling method and system
US10942795B1 (en) Serverless call distribution to utilize reserved capacity without inhibiting scaling
US11119826B2 (en) Serverless call distribution to implement spillover while avoiding cold starts
CN111338773B (en) Distributed timing task scheduling method, scheduling system and server cluster
CN106406983B (en) Task scheduling method and device in cluster
US9577961B2 (en) Input/output management in a distributed strict queue
US10686728B2 (en) Systems and methods for allocating computing resources in distributed computing
US10200295B1 (en) Client selection in a distributed strict queue
EP4066112A1 (en) Serverless call distribution to utilize reserved capacity without inhibiting scaling
CA3168286A1 (en) Data flow processing method and system
JP6881575B2 (en) Resource allocation systems, management equipment, methods and programs
US20100138540A1 (en) Method of managing organization of a computer system, computer system, and program for managing organization
US9584593B2 (en) Failure management in a distributed strict queue
CN107402956B (en) Data processing method and device for large task and computer readable storage medium
US9591101B2 (en) Message batching in a distributed strict queue
WO2021057514A1 (en) Task scheduling method and apparatus, computer device, and computer readable medium
WO2016061935A1 (en) Resource scheduling method, device and computer storage medium
CN111641515A (en) VNF life cycle management method and device
CN111459639A (en) Distributed task management platform and method supporting global multi-machine-room deployment
US7114156B2 (en) System and method for processing multiple work flow requests from multiple users in a queuing system
CN111597033A (en) Task scheduling method and device
US9577878B2 (en) Geographic awareness in a distributed strict queue
CN101778131A (en) Data synchronization system
CN107025257A (en) A kind of transaction methods and device
CN112698929A (en) Information acquisition method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination