CN111897638B - Distributed task scheduling method and system

Info

Publication number: CN111897638B
Application number: CN202010732336.2A
Authority: CN
Inventors: 黄强; 曾耀武
Original assignee: Guangzhou Huya Technology Co Ltd
Current assignee: Guangzhou Huya Technology Co Ltd
Priority date: 2020-07-27
Filing date: 2020-07-27
Publication date: 2024-04-19
Anticipated expiration: 2040-07-27
Also published as: CN111897638A

Abstract

An embodiment of the invention discloses a distributed task scheduling method and system. The method is performed by a working node in a distributed scheduling system that includes a plurality of master nodes, and comprises: periodically sending a parameter acquisition request to at least one master node, and receiving and locally storing the task type and the task load total fed back by that master node; when a new task processing condition is detected, calculating a requested task number from the number of tasks currently being processed and the task load total, and constructing a task acquisition request from the requested task number and the task type; and sending the task acquisition request to a first target master node determined from the plurality of master nodes, the task acquisition request instructing the first target master node to acquire and feed back tasks matching the requested task number and the task type. In this technical scheme, the working node actively requests tasks of a given type and number from a master node, which reduces the computation performed by the master node and increases the task concurrency the system can support.

Description

Distributed task scheduling method and system

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to a distributed task scheduling method and system.

Background

Currently, when developing an application program, a user needs to run the application program in parallel on a server cluster through a task scheduling system in order to complete application development.

In the prior art, a commonly used task scheduling system is YARN, which has a master-slave structure. However, YARN is difficult to use and is unfriendly to less experienced users. In addition, for application programs with relatively simple business logic, the development workload is large and the development efficiency is low. Moreover, because YARN must apply for resources before starting a task, the computational load on the master node is heavy, so high-concurrency requirements cannot be met, and a master node failure may bring down the whole system.

Disclosure of Invention

The embodiments of the invention provide a distributed task scheduling method and system in which a working node actively requests tasks of a given type and number from a master node, thereby achieving high-concurrency task scheduling.

In a first aspect, an embodiment of the present invention provides a distributed task scheduling method, where the method is performed by a working node included in a distributed scheduling system, the distributed scheduling system includes a plurality of master nodes, and the method includes:

periodically sending a parameter acquisition request to at least one master node, and receiving and locally storing the task type and the task load total fed back by the master node;

When a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in processing and the total task load, and constructing a task acquisition request according to the number of requested tasks and the task type;

And sending a task acquisition request to a first target master node determined from the plurality of master nodes, wherein the task acquisition request is used for indicating the first target master node to acquire tasks matched with the number of the requested tasks and the task types for feedback.

Optionally, the method further comprises:

When the task state reporting condition is detected, task state information of at least one currently processed task is acquired, wherein the task state information comprises: mapping relation between task identification and task state;

And sending task state information to a second target master node determined in the plurality of master nodes, wherein the task state information is used for indicating the second target master node to store the received task state information.

Optionally, after sending the task acquisition request to the first target master node determined among the plurality of master nodes, the method further includes:

if the task fed back by the first target master node is not received within the first waiting time period, determining a new target master node in the plurality of master nodes again, and sending a task acquisition request to the new target master node again; and/or

If the task state updating success response fed back by the second target master node is not received within the second waiting time period, determining a new target master node in the plurality of master nodes again, and sending the current task state information to the new target master node again.

In a second aspect, an embodiment of the present invention further provides a distributed task scheduling method, where the method is performed by a master node included in a distributed scheduling system, the distributed scheduling system includes a plurality of master nodes, and the method includes:

Querying a metadata base according to a parameter acquisition request sent by a first target working node, acquiring the task type and the task load total matching the parameter acquisition request, and feeding them back to the first target working node;

The metadata base stores the mapping relation between the working nodes and the node parameters, wherein the mapping relation is written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and task load total;

Extracting a target task request number and a target task type matched with the task acquisition request according to the task acquisition request sent by the second target working node;

and acquiring tasks matching the target task request number and the target task type, and feeding them back to the second target working node.

Optionally, acquiring tasks matching the target task request number and the target task type and feeding them back to the second target working node includes:

Inquiring a state database according to the target task type, and acquiring a target task list to be allocated, which is matched with the target task type; the state database stores task identification lists to be allocated, which correspond to the task types respectively;

Acquiring a task identification set matched with the target task request number from a target task list to be allocated;

Requesting metadata of the tasks matching the task identification set from the metadata base, and feeding back the metadata of the tasks to the second target working node, wherein the metadata base stores the mapping relation between the task identifications and the metadata of the tasks.

Optionally, the method further comprises:

Inquiring a state database according to state report information sent by a third target working node, and acquiring a target current execution task identification set and a target allocated task state set matched with the third target working node;

The state database stores a current execution task identification set and an allocated task state set which correspond to each working node respectively;

Extracting task identifiers with the task state being finished from the state report information, and removing the task identifiers from the target current execution task identifier set;

Updating a target assigned task state set according to task states of all task identifiers included in the state reporting information, and adding a heartbeat time stamp of the state reporting information into an updating result;

And/or

After requesting metadata of the task matching the task identification set from the metadata base and feeding back the metadata of the task to the second target working node, the method further comprises:

And updating the current execution task identification set and the assigned task state set corresponding to the second target working node according to the task identification set.

In a third aspect, an embodiment of the present invention further provides a distributed task scheduling system, including: the system comprises a plurality of working nodes, a plurality of master nodes, a state database and a metadata base, wherein the master nodes are respectively in communication connection with the working nodes, the state database and the metadata base, and the master nodes are respectively provided with a plurality of communication interfaces, wherein:

the working node is used for executing the distributed task scheduling method applied to the working node, which is provided by any embodiment of the invention;

A master node for executing the distributed task scheduling method applied to the master node as provided in any embodiment of the present invention;

the state database is used for storing a task identification list to be allocated corresponding to each task type, and a current execution task identification set and an allocated task state set corresponding to each working node;

the metadata base is used for storing the mapping relation between the task identification and the metadata of the task and the mapping relation between the working node and the node parameter, and the node parameter comprises: task type and task load total.

Optionally, the system further comprises: a front-end operation platform, which is in communication connection with the metadata base and the state database respectively;

The front-end operation platform is used for generating metadata of a plurality of tasks matched with the task parameters according to the task parameters configured in the visual task operation interface by a user, and storing the corresponding relation between the metadata of the tasks and the task identifications into the metadata base;

According to at least one task type configured by a user on a task operation interface, determining task types corresponding to the tasks respectively, and storing the corresponding relation between the task types and the task identifications in a state database.

Optionally, the front-end operation platform is further configured to:

Generating a mapping relation between the working node and the node parameter according to node parameter configuration information configured by a user on a task operation interface, and storing the mapping relation in a metadata base;

wherein the node parameters include: task type and task load total.

Optionally, the front-end operation platform is further configured to:

responding to a state query request for a target task input by a user in a task operation interface, acquiring target metadata matched with the target task from a metadata base, and acquiring a current task state matched with the target task and a heartbeat time stamp in the latest update state from each assigned task state set of a state database to perform visual display; and/or

The front-end operating platform is also for: and responding to a state query request input by a user in a task operation interface and aiming at a target working node, acquiring a current execution task identification set matched with the target working node and an allocated task state set from a state database, and performing visual display.

The embodiments of the invention provide a distributed task scheduling method and system. A working node periodically sends a parameter acquisition request to at least one master node, and receives and locally stores the task type and the task load total fed back by the master node. When a new task processing condition is detected, the working node calculates a requested task number from the number of tasks currently being processed and the task load total, and constructs a task acquisition request from the requested task number and the task type. The task acquisition request is sent to a first target master node determined from the plurality of master nodes, and instructs the first target master node to acquire and feed back tasks matching the requested task number and the task type. This solves the prior-art problem that high-concurrency requirements cannot be met because the master node performs a large amount of computation when it actively distributes tasks to working nodes: since the working node actively requests tasks of a given type and number from the master node, the computation performed by the master node is reduced and the task concurrency supported by the system is improved.

Drawings

FIG. 1 is a flow chart of a distributed task scheduling method according to a first embodiment of the present invention;

FIG. 2 is a flow chart of a distributed task scheduling method in a second embodiment of the present invention;

FIG. 3 is a timing diagram of a distributed task scheduling method in accordance with a third embodiment of the present invention;

FIG. 4 is a schematic diagram of a distributed task scheduler according to a fourth embodiment of the present invention;

FIG. 5 is a schematic diagram of a distributed task scheduler according to a fifth embodiment of the present invention;

FIG. 6 is a schematic diagram of a distributed task scheduling system according to a sixth embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an electronic device in a seventh embodiment of the present invention.

Detailed Description

The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.

Example 1

Fig. 1 is a flowchart of a distributed task scheduling method according to a first embodiment of the present invention, where the present embodiment may be applied to a case where a working node actively requests a task from a master node to implement high-concurrency task scheduling, the method may be performed by a distributed task scheduling device, and the device may be implemented by software and/or hardware and may be generally integrated in the working node included in the distributed scheduling system. As shown in fig. 1, the method is performed by a working node included in a distributed scheduling system including a plurality of master nodes, the method comprising:

Step 110, periodically sending a parameter acquisition request to at least one master node, and receiving the task type and the task load total fed back by the master node for local storage.

It should be noted that, the master node and the working node provided by the embodiment of the present invention may be servers, a master-slave structure is provided between the master node and the working node, each working node may communicate with any master node, and each master node may also provide services for any working node that communicates with the master node.

In this embodiment, a working node does not know its own node parameters at initialization, and the node parameters of each working node may be updated over time. So that the working node can tell a master node which task type and how many tasks it can process when it requests tasks, the working node periodically selects at least one master node from the plurality of master nodes and establishes a communication connection with it, sends a parameter acquisition request carrying its own node identifier to the master node with which the connection was successfully established, then receives the task type and the task load total fed back by the master node in response to the parameter acquisition request, and stores them locally.

In this embodiment, a given working node may randomly select one master node from the plurality of master nodes to connect to. For example, working node A randomly selects master node B and sends it a communication connection establishment request. If master node B successfully establishes a communication connection with working node A, master node B is available or idle; if the connection fails, master node B is currently unavailable. In the latter case, working node A again randomly selects one of the master nodes and sends it a communication connection establishment request, repeating until a communication connection is established with an available master node. The working node then sends a parameter acquisition request to the selected available master node over the established connection to acquire its own node parameters.
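As a non-limiting illustration, the random selection of an available master node described above can be sketched as follows. The function name `pick_available_master` and the `try_connect` callback are hypothetical and stand in for whatever connection mechanism an implementation uses. The sketch also assumes that each master node is tried at most once per round, whereas the embodiment only requires that selection be repeated until a connection succeeds.

```python
import random
from typing import Callable, Optional, Sequence


def pick_available_master(
    masters: Sequence[str],
    try_connect: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return a randomly chosen master that accepts a connection, or None.

    `try_connect` is a hypothetical callback that returns True when a
    communication connection to the given master node is established.
    """
    rng = rng or random.Random()
    candidates = list(masters)
    rng.shuffle(candidates)  # random order; each master is tried at most once
    for master in candidates:
        if try_connect(master):
            return master  # this master is available
    return None  # no master accepted a connection in this round


# Example: only master-c is reachable, so it is always the one selected.
reachable = {"master-b": False, "master-c": True, "master-d": False}
chosen = pick_available_master(list(reachable), lambda m: reachable[m])
```

A caller that receives `None` may simply wait and start another round, which matches the repeat-until-connected behaviour of the embodiment.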

Step 120, when a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in process and the task load total, and constructing a task acquisition request according to the number of requested tasks and the task type.

In this embodiment, after the working node has obtained its node parameters, if it detects a new task processing condition (for example, a currently processed task has finished, or a currently processed task has entered an end state and needs no further processing), it calculates the number of tasks it can currently accept, that is, the requested task number, from the task load total in the node parameters and the number of tasks it is currently processing, and generates a task acquisition request from the task type and the requested task number.
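As a non-limiting illustration, one way to compute the requested task number is to take the difference between the task load total and the number of tasks in process. The embodiment does not fix the formula, so the subtraction below is an assumption, as are the names `TaskAcquisitionRequest` and `build_task_request`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskAcquisitionRequest:
    """Hypothetical request payload: who is asking, for what type, how many."""
    node_id: str
    task_type: str
    requested_count: int


def build_task_request(
    node_id: str, task_type: str, task_load_total: int, in_progress: int
) -> Optional[TaskAcquisitionRequest]:
    """Build a task acquisition request, or return None if the node is full.

    Assumption: free capacity = task load total - tasks currently in process.
    """
    requested = max(0, task_load_total - in_progress)
    if requested == 0:
        return None  # no spare capacity, so nothing is requested
    return TaskAcquisitionRequest(node_id, task_type, requested)


# A node configured for 8 parallel tasks that is running 5 asks for 3 more.
request = build_task_request("worker-a", "batch", task_load_total=8, in_progress=5)
```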

Step 130, sending a task acquisition request to a first target master node determined from the plurality of master nodes, wherein the task acquisition request is used for indicating the first target master node to acquire tasks matched with the requested task number and task types for feedback.

For the same working node, the master node to which the parameter acquisition request is sent and the first target master node to which the task acquisition request is sent need not be the same master node.

In this embodiment, the working node actively requests a specified number of tasks of a specified type from the master node, so the master node can directly select that number of target tasks of that type from the tasks to be processed and feed them back to the working node. The master node does not need to compute, for each task to be processed, which working node is able to process it. This reduces the computation performed by the master node and makes high task concurrency easier to achieve.

Optionally, the method may further include: when the task state reporting condition is detected, task state information of at least one currently processed task is acquired, wherein the task state information comprises: mapping relation between task identification and task state; and sending task state information to a second target master node determined in the plurality of master nodes, wherein the task state information is used for indicating the second target master node to store the received task state information.

In this embodiment, while a working node is processing tasks, if a task state reporting condition is detected (for example, the task state reporting time is reached), the working node obtains the task identifiers of the currently processed tasks and the task state data generated during processing to form task state information, randomly selects at least one master node from the plurality of master nodes to establish a connection with, and sends the task state information to the second target master node with which the connection succeeded, so that the second target master node stores and updates the task state information.

In this embodiment, when the working node reports task state information, only the task identifiers and the task state data need to be reported. This keeps the data transmission lightweight and keeps the task state information that the master node must store and update simple, which facilitates high-concurrency task scheduling.
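As a non-limiting illustration, a lightweight task state report could be serialized as shown below. The JSON encoding and the field names `node_id`, `heartbeat_ts` and `task_states` are assumptions made for this sketch. The embodiment only specifies that the report carries the mapping between task identifiers and task states, and that the master node can obtain a node identifier and a heartbeat time stamp from it.

```python
import json
import time
from typing import Mapping, Optional


def build_status_report(
    node_id: str, task_states: Mapping[str, str], now: Optional[float] = None
) -> str:
    """Serialize a task state report as JSON (hypothetical wire format)."""
    report = {
        "node_id": node_id,
        # Time at which the report was produced, used as the heartbeat time stamp.
        "heartbeat_ts": time.time() if now is None else now,
        # Only the task identifier -> task state mapping is reported.
        "task_states": dict(task_states),
    }
    return json.dumps(report, sort_keys=True)


payload = build_status_report(
    "worker-a", {"t1": "running", "t2": "finished"}, now=100.0
)
```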

Optionally, after sending the task acquisition request to the first target master node determined among the plurality of master nodes, the method may further include: if the task fed back by the first target master node is not received within the first waiting time period, determining a new target master node in the plurality of master nodes again, and sending a task acquisition request to the new target master node again; and/or if the task state updating success response fed back by the second target master node is not received within the second waiting time, determining a new target master node in the plurality of master nodes again, and sending the current task state information to the new target master node again.

In this embodiment, after the working node sends a task acquisition request to the first target master node with which it has established a communication connection, if no task fed back by the first target master node is received within the first waiting duration, the first target master node is considered to have failed and the connection to it is closed. The working node then randomly selects a master node from the other master nodes and establishes a communication connection with it, takes the successfully connected master node as the new target master node, and resends the task acquisition request to the new target master node in order to acquire tasks through it.

Similarly, after the working node sends task state information to the second target master node, if no task state update success response fed back by the second target master node is received within the second waiting duration, the second target master node is considered to have failed and the connection to it is closed. The working node then randomly selects a master node from the other master nodes and establishes a communication connection with it, takes the successfully connected master node as the new target master node, and resends the task state information to the new target master node so that the task state information is stored through it. The first waiting duration and the second waiting duration can be adjusted according to service requirements and are not specifically limited here.

In this embodiment, if a master node fails while a working node is communicating with it, the working node can promptly resend the request to a new master node, which helps achieve high-concurrency task scheduling.
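As a non-limiting illustration, the timeout-and-retry behaviour described above can be sketched as follows. The helper `send_with_failover` and its `send` callback are hypothetical. The callback is assumed to raise the built-in `TimeoutError` when no response arrives within the waiting duration, and the sketch gives up after every master node has been tried once, which is a simplification.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def send_with_failover(
    masters: Sequence[str],
    send: Callable[[str, float], T],
    timeout_s: float,
    rng: Optional[random.Random] = None,
) -> T:
    """Send a request to a random master; on timeout, retry on another one.

    `send(master, timeout_s)` is a hypothetical callback that returns the
    master's response or raises TimeoutError after `timeout_s` seconds.
    """
    rng = rng or random.Random()
    remaining = list(masters)
    rng.shuffle(remaining)
    for master in remaining:
        try:
            return send(master, timeout_s)
        except TimeoutError:
            continue  # treat this master as failed and try a different one
    raise RuntimeError("no master node responded within the waiting duration")


def fake_send(master: str, timeout_s: float) -> list:
    """Stand-in transport: master m1 is down, every other master answers."""
    if master == "m1":
        raise TimeoutError
    return ["task-1"]


tasks = send_with_failover(["m1", "m2", "m3"], fake_send, timeout_s=0.5)
```

The same helper can serve both the task acquisition request (first waiting duration) and the task state report (second waiting duration) by passing a different `send` callback and `timeout_s`.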

According to the technical scheme of this embodiment, a working node periodically sends a parameter acquisition request to at least one master node, and receives and locally stores the task type and the task load total fed back by the master node. When a new task processing condition is detected, the working node calculates a requested task number from the number of tasks in process and the task load total, and constructs a task acquisition request from the requested task number and the task type. The task acquisition request is sent to a first target master node determined from the plurality of master nodes, and instructs the first target master node to acquire and feed back tasks matching the requested task number and the task type. This solves the prior-art problem that high-concurrency requirements cannot be met because the master node performs a large amount of computation when it actively distributes tasks to working nodes: since the working node actively requests tasks of a given type and number from the master node, the computation performed by the master node is reduced and the task concurrency supported by the system is improved.

Example 2

Fig. 2 is a flowchart of a distributed task scheduling method according to a second embodiment of the present invention. This embodiment is applicable to the case where a master node schedules tasks according to the task requests of working nodes in order to achieve high-concurrency task scheduling. The method may be performed by a distributed task scheduling device, and the device may be implemented by software and/or hardware and may generally be integrated in a master node included in the distributed scheduling system. As shown in fig. 2, the method is performed by a master node included in a distributed scheduling system including a plurality of master nodes, the method comprising:

Step 210, querying a metadata base according to the parameter acquisition request sent by the first target working node, acquiring the task type and the task load total matching the parameter acquisition request, and feeding them back to the first target working node.

The metadata base stores the mapping relation between the working nodes and the node parameters, wherein the mapping relation is written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and task load total.

In this embodiment, the mapping relationship between the working node and the node parameter is written into the metadata base through the visual task operation interface provided by the front-end operation platform. The working nodes are uniformly managed and divided by configuring node parameters such as task types, task load total amounts and the like which can be processed for each working node, so that the working nodes can actively acquire tasks for processing after initialization, and a master node is not required to search the working nodes which can be processed for each task to be processed.

In this embodiment, after receiving a parameter acquisition request sent by a first target working node, a master node may extract the node identifier of the first target working node from the parameter acquisition request, query the metadata base for the mapping relationship matching that node identifier, determine the node parameters of the first target working node from the mapping relationship, and feed the node parameters back to the first target working node. The node parameters comprise a task type and a task load total. The task type represents the type of tasks that the working node can process; task types may be divided according to the concurrency of the tasks, the amount of resources the tasks occupy, or other factors. The task load total represents the maximum number of tasks that the working node can process in parallel.
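As a non-limiting illustration, the master-side handling of a parameter acquisition request reduces to a keyed lookup. In the sketch below the metadata base is modelled as an in-memory dictionary, and the names `NodeParams` and `handle_param_request` are hypothetical. An actual metadata base would be a persistent store written through the front-end operation platform.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class NodeParams:
    """Node parameters configured by the user for one working node."""
    task_type: str        # type of task the node is allowed to process
    task_load_total: int  # maximum number of tasks processed in parallel


def handle_param_request(
    metadata_base: Mapping[str, NodeParams], request: Mapping[str, str]
) -> Optional[NodeParams]:
    """Look up the node parameters for the node identifier in the request.

    Returns None when no mapping has been configured for that node.
    """
    node_id = request["node_id"]
    return metadata_base.get(node_id)


# In-memory stand-in for the metadata base (an assumption for this sketch).
metadata_base: Dict[str, NodeParams] = {
    "worker-a": NodeParams(task_type="batch", task_load_total=8),
}
params = handle_param_request(metadata_base, {"node_id": "worker-a"})
```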

Step 220, extracting the target task request number and the target task type matched with the task acquisition request according to the task acquisition request sent by the second target working node.

In this embodiment, after receiving a task acquisition request sent by a second target working node, the master node extracts node parameters corresponding to the second target working node from the task acquisition request, and determines a task type and a task request number required by the second target working node.

Step 230, acquiring tasks matching the target task request number and the target task type, and feeding them back to the second target working node.

Optionally, acquiring tasks matching the target task request number and the target task type and feeding them back to the second target working node may include: querying a state database according to the target task type, and acquiring a target task list to be allocated that matches the target task type, wherein the state database stores the task identification lists to be allocated corresponding to the respective task types; acquiring, from the target task list to be allocated, a task identification set matching the target task request number; and requesting metadata of the tasks matching the task identification set from a metadata base, and feeding back the metadata of the tasks to the second target working node, wherein the metadata base stores the mapping relation between the task identifications and the metadata of the tasks.

In this embodiment, after extracting the target task type and the target task request number from the received task acquisition request, the master node may look up in the state database the target task list to be allocated that matches the target task type. This list contains a plurality of task identifiers in a ready state. The master node selects from the list as many task identifiers as the target task request number to serve as target task identifiers, then looks up in the metadata base the task metadata corresponding to each target task identifier, and returns the task metadata to the second target working node, so that the second target working node can process the corresponding tasks according to the task metadata it receives.
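As a non-limiting illustration, the selection of tasks for a task acquisition request can be sketched as follows. Both databases are modelled as in-memory dictionaries, and the function name `assign_tasks` is hypothetical. The sketch also assumes that selected identifiers are taken from the front of the list and removed from it, which the embodiment does not state explicitly.

```python
from collections import deque
from typing import Deque, Dict, Mapping


def assign_tasks(
    pending_by_type: Dict[str, Deque[str]],
    metadata_base: Mapping[str, dict],
    task_type: str,
    requested_count: int,
) -> Dict[str, dict]:
    """Pick up to `requested_count` ready tasks of `task_type`.

    Returns a mapping of task identifier -> task metadata to feed back to
    the requesting working node. Fewer tasks are returned when the
    to-be-allocated list is shorter than the request.
    """
    pending = pending_by_type.get(task_type, deque())
    count = min(requested_count, len(pending))
    # Assumption: assigned identifiers leave the to-be-allocated list.
    task_ids = [pending.popleft() for _ in range(count)]
    return {task_id: metadata_base[task_id] for task_id in task_ids}


# In-memory stand-ins for the state database and the metadata base.
pending_by_type = {"batch": deque(["t1", "t2", "t3"])}
task_metadata = {"t1": {"cmd": "a"}, "t2": {"cmd": "b"}, "t3": {"cmd": "c"}}
assigned = assign_tasks(pending_by_type, task_metadata, "batch", 2)
```

No per-node computation takes place on the master side: the request already carries the task type and count, so the work is a list lookup followed by metadata lookups.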

In this embodiment, only the task identifiers to be allocated corresponding to the task types are stored in the state database, and the stored data are relatively simple, which belongs to lightweight data storage and is conducive to achieving high concurrency scheduling of tasks. And the master node can directly distribute the tasks to be processed matched with the task acquisition request to the working nodes by receiving the task acquisition request actively sent by the working nodes, and can distribute the tasks to be processed without polling each working node to perform related calculation, so that the calculated amount of the master node is reduced, and high concurrency of the tasks can be easily realized.

Optionally, the method may further include: inquiring a state database according to state report information sent by a third target working node, and acquiring a target current execution task identification set and a target allocated task state set matched with the third target working node; the state database stores a current execution task identification set and an allocated task state set which correspond to each working node respectively; extracting task identifiers with the task state being finished from the state report information, and removing the task identifiers from the target current execution task identifier set; updating a target assigned task state set according to task states of all task identifiers included in the state reporting information, and adding a heartbeat time stamp of the state reporting information into an updating result; and/or

After requesting metadata of the task matching the task identification set from the metadata base and feeding back the metadata of the task to the second target working node, the method may further include: and updating the current execution task identification set and the assigned task state set corresponding to the second target node according to the task identification set.

In this embodiment, the state database further stores a current execution task identifier set and an allocated task state set corresponding to each working node, where the task identifier of the current processing task of each working node and task state information generated in the task processing process and reported by each working node are recorded respectively. After receiving the status report information sent by the third target working node, the master node acquires the node identification of the third target working node from the status report information, queries a status database according to the node identification, acquires a target current execution task identification set and a target assigned task status set matched with the third target working node, extracts the task status of each task identification from the status report information, and removes the task identification of the ending status from the target current execution task identification set, thereby indicating that the task is completed or that the task is not required to be processed. And updating task state information corresponding to each task identifier contained in the state report information into a target assigned task state set, adding a heartbeat time stamp of the state report information into an update result, and recording the execution condition of each task in each time period.
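As a non-limiting illustration, the update performed by the master node on receiving a state report can be sketched as follows. The state database is modelled as two in-memory dictionaries keyed by node identifier. The report layout (`node_id`, `heartbeat_ts`, `task_states`), the function name `apply_status_report` and the state label `"finished"` are assumptions made for this sketch.

```python
from typing import Dict, Set


def apply_status_report(
    executing: Dict[str, Set[str]],
    assigned_states: Dict[str, Dict[str, dict]],
    report: dict,
    finished_state: str = "finished",
) -> None:
    """Apply one state report to the per-node sets in the state database."""
    node_id = report["node_id"]
    running = executing.setdefault(node_id, set())
    states = assigned_states.setdefault(node_id, {})
    for task_id, state in report["task_states"].items():
        if state == finished_state:
            # Ended tasks leave the current execution task identifier set.
            running.discard(task_id)
        # Every reported task has its state refreshed, stamped with the
        # heartbeat time stamp carried by the report.
        states[task_id] = {"state": state, "heartbeat_ts": report["heartbeat_ts"]}


executing = {"worker-a": {"t1", "t2"}}
assigned_states: Dict[str, Dict[str, dict]] = {"worker-a": {}}
apply_status_report(
    executing,
    assigned_states,
    {
        "node_id": "worker-a",
        "heartbeat_ts": 100.0,
        "task_states": {"t1": "running", "t2": "finished"},
    },
)
```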

In this embodiment, in order to record the task currently processed by each working node and the task allocation situation in the system, after feeding back the metadata of the task matched with the task identifier set to the second target working node, the master node updates each task identifier included in the task identifier set to the currently executed task identifier set and the allocated task state set corresponding to the second target node, so as to record the task identifier of the task currently processed by the second target node and the task state information generated by the second target node in the task processing process.

In this embodiment, the state database stores the current execution task identifier set and the assigned task state set corresponding to each working node. Because the content stored in the state database is simple and lightweight, the master node has little work to do when a working node reports task state information, which helps achieve high-concurrency task scheduling.

According to the technical scheme of the embodiment of the invention, a master node queries a metadata base according to a parameter acquisition request sent by a first target working node, and acquires the task type and the total task load matched with the parameter acquisition request to feed back to the first target working node; the metadata base stores the mapping relation between the working nodes and the node parameters, and the mapping relation is written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and task load total; the master node extracts a target task request number and a target task type matched with a task acquisition request sent by a second target working node, and acquires the tasks matched with the target task request number and the target task type to feed back to the second target node. This solves the problem in the prior art that the high concurrency requirement cannot be met because the calculation amount is large when the master node actively distributes tasks to the working nodes. Because the working node actively requests the task type and the task number from the master node, the calculation amount of the master node is reduced and the task concurrency supported by the system is improved. In addition, simplifying the data stored in the state database makes task state reporting lightweight, which improves the communication speed and further improves the task concurrency supported by the system.

Example III

Fig. 3 is a timing diagram of a distributed task scheduling method in a third embodiment of the present invention, which can be combined with each of the alternatives in the above embodiments. Specifically, referring to fig. 3, the method may include the steps of:

Firstly, on a visual task operation interface provided by a front-end operation platform, a user configures initialization task parameters for a specified task, such as the working node that processes the task, the task types that the working node can process, the task load, and the like.

Then, the front-end operation platform sets the task to a ready state, and stores the task identifier of the task and the corresponding task state in a state database. After receiving the confirmation response returned by the state database, the front-end operation platform constructs the corresponding task metadata according to the initialized task parameters configured by the user for the task, such as the acquisition address of the task, the task type and the like, and stores the task metadata in the metadata base. After receiving the confirmation response returned by the metadata base, the front-end operation platform adds the task identifier of the task to the task identifier list to be allocated that matches the task type in the state database, where the task waits to be processed by a working node. This completes the task online process.

After a user submits a task, a plurality of subtasks are generated, for example, when a script of a crawling webpage is utilized to capture target entries, links of various new entries are captured from a classification entry of the target webpage, each link corresponds to a page to be captured, so that a plurality of subtasks are generated, and each subtask waiting to be executed is added to a task identification list waiting to be allocated in a state database to wait for processing.
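The task online process and the queuing of subtasks can be sketched as follows. This is only an illustrative sketch: `state_db`, `metadata_db`, `bring_task_online`, the field names and the example URLs are hypothetical, and in-memory dictionaries stand in for the two databases, so the confirmation responses are implicit.

```python
state_db = {"task_state": {}, "pending": {}}   # hypothetical state database
metadata_db = {}                               # hypothetical metadata base

def bring_task_online(task_id, params):
    """Front-end platform steps for putting one task online."""
    # 1. Mark the task as ready in the state database.
    state_db["task_state"][task_id] = "ready"
    # 2. Build the task metadata from the configured parameters and store it.
    metadata_db[task_id] = {"address": params["address"], "type": params["type"]}
    # 3. Queue the identifier on the to-be-allocated list for its task type.
    state_db["pending"].setdefault(params["type"], []).append(task_id)

# A crawl job that fans out into three subtasks, one per captured link.
for i in range(3):
    bring_task_online(
        f"crawl-{i}",
        {"address": f"https://example.com/page/{i}", "type": "crawl"},
    )
```

Keeping one to-be-allocated list per task type means a master node can later answer a request for a given type without scanning unrelated tasks.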

Then, the working node randomly selects a master node from a plurality of master nodes to establish connection, and applies for registration to the successfully connected master node to obtain node parameters set by a user for the node, for example, task types and task load total amounts which can be processed by the working node.

After the master node is started, an interface is started to receive a request sent by the working node, and if the master node is idle and available, the working node can successfully establish connection with the master node. And when the master node fails, the connection between the working node and the failed master node is automatically disconnected, and the working node reselects other idle master nodes for communication.
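A minimal sketch of how a working node might pick a master node at random and fall back to another one when a connection fails is given below. The exception `MasterUnavailable`, the callback `connect` and the master names are assumptions for illustration; the embodiment does not prescribe a particular connection API.

```python
import random

class MasterUnavailable(Exception):
    """Raised by the hypothetical connect() when a master cannot be reached."""

def connect_to_any_master(masters, connect, rng=random):
    """Try the masters in random order and return the first successful connection."""
    candidates = list(masters)
    rng.shuffle(candidates)
    for master in candidates:
        try:
            return master, connect(master)
        except MasterUnavailable:
            continue   # failed or busy master: try the next candidate
    raise MasterUnavailable("no master node is reachable")

def fake_connect(master):
    if master == "master-a":           # simulate a failed master node
        raise MasterUnavailable(master)
    return f"connection-to-{master}"

chosen, conn = connect_to_any_master(
    ["master-a", "master-b", "master-c"], fake_connect
)
```

Because any master node can serve any working node, the working node needs no knowledge of which master node is "its" master node; it only needs the list of candidates.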

Then, the working node again randomly selects a master node from the plurality of master nodes to establish a connection, and sends a task acquisition request carrying the task type and the task request number to the successfully connected master node. The master node acquires the target task identifiers matched with the received task acquisition request from the task identifier list to be allocated in the state database, acquires the target metadata matched with each target task identifier from the metadata base, and returns the target metadata to the working node that sent the task acquisition request.
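The handling of a task acquisition request on the master node side can be sketched as follows. The function `handle_task_request`, the request fields, the state label "assigned" and the dictionary layout are hypothetical, and plain dictionaries stand in for the state database and the metadata base.

```python
def handle_task_request(request, state_db, metadata_db):
    """Return metadata for up to request['count'] pending tasks of the requested type."""
    pending = state_db["pending"].get(request["type"], [])
    count = max(0, request["count"])
    target_ids = pending[:count]
    del pending[:count]                      # these identifiers are no longer pending
    # Record the allocation for the requesting working node.
    running = state_db["running"].setdefault(request["node_id"], set())
    assigned = state_db["assigned"].setdefault(request["node_id"], {})
    for task_id in target_ids:
        running.add(task_id)
        assigned[task_id] = {"state": "assigned"}
    return [metadata_db[task_id] for task_id in target_ids]

state_db = {"pending": {"crawl": ["t1", "t2", "t3"]}, "running": {}, "assigned": {}}
metadata_db = {
    t: {"id": t, "address": f"https://example.com/{t}"} for t in ["t1", "t2", "t3"]
}
result = handle_task_request(
    {"node_id": "worker-1", "type": "crawl", "count": 2}, state_db, metadata_db
)
```

The master node performs only a list slice and a few lookups per request; it never has to decide which working node suits which task, since the request already carries the type and the count.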

Next, the working node acquires the target task according to the task acquisition address in the target metadata, and processes the target task according to the task detail information in the target metadata. The working node periodically selects a master node to establish a connection, and feeds back the task state information generated by each currently processed task in the current period to the successfully connected master node. The master node stores the received task state information in the state database to update the task state information. If the master node determines from the task state information that the current task is in an ended state, the master node sends a task ending instruction to the working node that sent the task state information, so as to terminate the processing of the current task by that working node.

After finishing the current task, the working node triggers a new task processing condition, randomly selects a master node from a plurality of master nodes again to establish connection, and sends a task acquisition request to the successfully connected master node so as to continue processing the task.
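As an illustration of how the working node might construct the task acquisition request once a new task processing condition is triggered, the sketch below assumes that the requested task number is the total task load minus the number of tasks in processing. This passage does not state the exact formula, so that rule, the function name `build_task_request` and the request fields are assumptions.

```python
def build_task_request(node_id, task_type, total_load, in_process):
    """Build a task acquisition request, or return None when the node is already full."""
    requested = total_load - in_process   # assumed rule: fill the remaining capacity
    if requested <= 0:
        return None
    return {"node_id": node_id, "type": task_type, "count": requested}

request = build_task_request("worker-1", "crawl", total_load=10, in_process=7)
full = build_task_request("worker-1", "crawl", total_load=5, in_process=5)
```

Under this assumption, a working node configured with a total load of 10 that is processing 7 tasks asks for 3 more, and a fully loaded node sends no request at all.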

In this embodiment, for the case that the user places the task offline, first, after the user places the current task offline on the front-end operation platform, the front-end operation platform changes the task state of the current task in the state database into an end state, sends a task end instruction to a working node that processes the current task, and moves the task identifier of the current task to an end queue, thereby completing task offline.

The front-end operation platform and the plurality of master nodes share the same state database.

After receiving the task ending instruction, the working node randomly selects a master node from the plurality of master nodes to establish a connection, and sends the current task state information to the successfully connected master node. The master node stores the received current task state information in the state database and sends the task ending instruction to the working node.

And finally, the master node respectively acquires effective information corresponding to the current task from the state database and the metadata database according to the task identification of the current task, generates an effective task record, stores the effective task record in the display list so as to enable a user to check the running state information of the current task, and then deletes the data of the current task stored in the state database and the metadata database so as to reduce memory occupation.
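The final archiving step can be sketched as below. The function `archive_finished_task`, the record fields and the in-memory stand-ins for the state database, the metadata base and the display list are hypothetical; which fields count as effective information is not specified in this passage.

```python
def archive_finished_task(task_id, state_db, metadata_db, display_list):
    """Copy the useful fields of a finished task into the display list, then purge it."""
    record = {
        "task_id": task_id,
        "final_state": state_db["task_state"][task_id],
        "metadata": dict(metadata_db[task_id]),
    }
    display_list.append(record)
    # Delete the live data only after the record has been stored.
    del state_db["task_state"][task_id]
    del metadata_db[task_id]
    return record

state_db = {"task_state": {"t1": "ended", "t2": "running"}}
metadata_db = {"t1": {"type": "crawl"}, "t2": {"type": "crawl"}}
display_list = []
archive_finished_task("t1", state_db, metadata_db, display_list)
```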

Example IV

Fig. 4 is a schematic structural diagram of a distributed task scheduling device in a fourth embodiment of the present invention. The embodiment can be applied to the situation that the working node actively requests tasks from the main node to realize high-concurrency scheduling of the tasks, and the device can be realized by software and/or hardware and can be generally integrated in the working node included in the distributed scheduling system. As shown in fig. 4, the apparatus includes:

The parameter obtaining module 410 is configured to periodically send a parameter obtaining request to at least one master node, and receive a task type and a task load total amount fed back by the master node for local storage;

The construction module 420 is configured to calculate, when a new task processing condition is detected, a requested task number according to the in-process task number and a task load total amount, and construct a task acquisition request according to the requested task number and a task type;

The sending module 430 is configured to send a task acquisition request to a first target master node determined among the multiple master nodes, where the task acquisition request is used to instruct the first target master node to acquire a task that matches the number of requested tasks and the task type for feedback.

Optionally, the device further comprises: a state reporting module, configured to acquire task state information of at least one currently processed task when a task state reporting condition is detected, wherein the task state information comprises: a mapping relation between task identifiers and task states; and to send the task state information to a second target master node determined among the plurality of master nodes, wherein the task state information is used for instructing the second target master node to store the received task state information.

Optionally, the device further comprises: a retransmission module, configured to, after a task acquisition request is sent to a first target master node determined among the plurality of master nodes, re-determine a new target master node among the plurality of master nodes if a task fed back by the first target master node is not received within a first waiting time period, and re-send the task acquisition request to the new target master node; and/or

The distributed task scheduling device provided by the embodiment of the invention can execute the distributed task scheduling method applied to the working node provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Example V

Fig. 5 is a schematic structural diagram of a distributed task scheduling device in a fifth embodiment of the present invention. The embodiment is applicable to the situation that the master node performs task scheduling according to task requests of the working nodes to achieve high-concurrency task scheduling, and the device can be implemented by software and/or hardware and can be generally integrated in the master node included in the distributed scheduling system. As shown in fig. 5, the apparatus includes:

The parameter feedback module 510 is configured to query the metadata database according to a parameter acquisition request sent by the first target working node, and acquire a task type and a task load total amount matched with the parameter acquisition request to feed back to the first target working node;

The parameter extraction module 520 is configured to extract, according to the task acquisition request sent by the second target working node, a target task request number and a target task type that are matched with the task acquisition request;

The task feedback module 530 is configured to acquire the tasks matching the target task request number and the target task type, and feed the tasks back to the second target node.

Optionally, the task feedback module 530 is specifically configured to: inquiring a state database according to the target task type, and acquiring a target task list to be allocated, which is matched with the target task type; the state database stores task identification lists to be allocated, which correspond to the task types respectively; acquiring a task identification set matched with the target task request number from a target task list to be allocated; requesting metadata of the tasks matched with the task identification set from a metadata base, feeding back the metadata of the tasks to the second target working node, and storing a mapping relation between the task identifications and the metadata of the tasks in the metadata base.

Optionally, the device further comprises: a first updating module, configured to query the state database according to the state report information sent by a third target working node, and acquire a target current execution task identification set and a target allocated task state set which are matched with the third target working node; the state database stores a current execution task identification set and an allocated task state set which correspond to each working node respectively; extract the task identifiers whose task state is ended from the state report information, and remove these task identifiers from the target current execution task identifier set; update the target assigned task state set according to the task states of all task identifiers included in the state report information, and add the heartbeat timestamp of the state report information to the update result; and/or

The second updating module is used for updating the current execution task identification set and the allocated task state set corresponding to the second target node according to the task identification set after requesting the metadata of the task matched with the task identification set from the metadata base and feeding back the metadata of the task to the second target working node.

The distributed task scheduling device provided by the embodiment of the invention can execute the distributed task scheduling method applied to the main node provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Example VI

Fig. 6 is a schematic structural diagram of a distributed task scheduling system in a sixth embodiment of the present invention, where the present embodiment is applicable to a case of performing high concurrency scheduling on tasks. As shown in fig. 6, the system includes: a plurality of worker nodes 610, a plurality of master nodes 620, a state database 630, and a metadata database 640, the master nodes 620 being communicatively coupled to each of the worker nodes 610, the state database 630, and the metadata database 640, respectively, wherein:

a working node 610 for executing the distributed task scheduling method applied to the working node as provided in any embodiment of the present invention;

a master node 620 for performing the distributed task scheduling method applied to the master node as provided in any embodiment of the present invention;

a state database 630, configured to store the task identifier lists to be allocated corresponding to each task type, and the current execution task identifier set and the allocated task state set corresponding to each working node;

The metadata base 640 is configured to store a mapping relationship between a task identifier and metadata of a task, and a mapping relationship between a working node and a node parameter, where the node parameter includes: task type and task load total.

In this embodiment, by setting a plurality of master nodes, and each master node can provide services for any working node that communicates with the master node, the situation that the whole system is paralyzed and cannot work normally due to the failure of a single master node can be avoided, and the stability of the system is improved.

Optionally, the working node 610 is configured to periodically send a parameter acquisition request to at least one master node, and receive a task type and a task load total amount fed back by the master node for local storage; when a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in processing and the total task load, and constructing a task acquisition request according to the number of requested tasks and the task type; and sending the task acquisition request to a first target master node determined in the plurality of master nodes, wherein the task acquisition request is used for indicating the first target master node to acquire tasks matched with the requested task number and the task type for feedback.

Optionally, the working node 610 is further configured to: when a task state reporting condition is detected, task state information of at least one currently processed task is acquired, wherein the task state information comprises: mapping relation between task identification and task state; and sending the task state information to a second target master node determined in the plurality of master nodes, wherein the task state information is used for indicating the second target master node to store the received task state information.

Optionally, the working node 610 is further configured to: after the task acquisition request is sent to a first target master node determined in the plurality of master nodes, if a task fed back by the first target master node is not received within a first waiting time period, determining a new target master node in the plurality of master nodes again, and sending the task acquisition request to the new target master node again; and/or

If the task state updating successful response fed back by the second target master node is not received within the second waiting time period, determining a new target master node in the plurality of master nodes again, and retransmitting current task state information to the new target master node.
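The retransmission behaviour described in the two optional clauses above can be sketched as follows. The function `send_with_retry`, the `send` callback, the attempt limit and the rule that a master node which timed out is not chosen again are assumptions for illustration; the embodiment only states that a new target master node is re-determined after the waiting period elapses.

```python
import random

def send_with_retry(masters, send, request, timeout_s, max_attempts=3, rng=random):
    """Send a request to a random master; on timeout, pick another master and resend."""
    remaining = list(masters)
    for _ in range(max_attempts):
        if not remaining:
            break
        master = rng.choice(remaining)
        try:
            return master, send(master, request, timeout_s)
        except TimeoutError:
            remaining.remove(master)   # assumed: do not retry the silent master
    raise TimeoutError("no master node answered within the waiting period")

def fake_send(master, request, timeout_s):
    if master == "master-a":
        raise TimeoutError(master)     # simulate a master that never answers
    return {"tasks": ["t1"]}

master, reply = send_with_retry(
    ["master-a", "master-b"], fake_send, {"type": "crawl", "count": 1}, timeout_s=2.0
)
```

The same helper could be reused for state reports by passing the second waiting time period as `timeout_s`.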

Optionally, the master node 620 is configured to query a metadata base according to a parameter acquisition request sent by a first target working node, and acquire a task type and a total task load amount matched with the parameter acquisition request to feed back to the first target working node; the method comprises the steps that a mapping relation between a working node and a node parameter is stored in a metadata base, and the mapping relation is written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and task load total; extracting a target task request number and a target task type matched with a task acquisition request according to the task acquisition request sent by a second target working node; and acquiring the task matching with the target task request number and the target task type and feeding back the task to the second target node.

Optionally, the master node 620 is specifically configured to: inquiring a state database according to the target task type, and acquiring a target task list to be allocated, which is matched with the target task type; the state database stores task identification lists to be allocated, which correspond to the task types respectively; acquiring a task identification set matched with the target task request number from the target task list to be allocated; requesting metadata of the tasks matched with the task identification set from a metadata base, feeding back the metadata of the tasks to the second target working node, and storing a mapping relation between the task identifications and the metadata of the tasks in the metadata base.

Optionally, the master node 620 is further configured to: inquiring the state database according to state report information sent by a third target working node, and acquiring a target current execution task identification set and a target allocated task state set which are matched with the third target working node; the state database stores a current execution task identification set and an allocated task state set which correspond to each working node respectively; extracting task identifiers with the task state being finished from the state report information, and removing the task identifiers from the target current execution task identifier set; updating the target assigned task state set according to the task state of each task identifier included in the state report information, and adding a heartbeat time stamp of the state report information into an updating result; and/or

The master node 620 is further configured to: and after requesting metadata of the task matched with the task identification set from a metadata base and feeding back the metadata of the task to the second target working node, updating the current execution task identification set and the assigned task state set corresponding to the second target node according to the task identification set.

In this embodiment, the state database 630 includes a task identifier list to be allocated for each task type, and each task identifier list to be allocated stores the task identifiers of the tasks to be processed for the specified task type, where the task identifiers are used to distinguish different tasks. The metadata base 640 is configured to store the metadata of the task corresponding to the task identifier of each task to be processed and the node parameters corresponding to the node identifier of each working node. The metadata of a task may include the initialized task parameters corresponding to the task identifier, such as a task acquisition address and a task type, and may also include task submission information, such as the time when the user submitted the task.

In this embodiment, in order to enable each of the master nodes 620 to provide services for the working nodes 610, the plurality of master nodes 620 are configured to share the state database 630 and the metadata database 640, so that each of the master nodes 620 may obtain relevant data from the state database and/or the metadata database according to the task identifier.

In this embodiment, after extracting the target task type and the target task request number from the received task acquisition request, the master node 620 may search the state database 630 for the target task list to be allocated that matches the target task type, where the target task list to be allocated includes a plurality of task identifiers in a ready state. The master node 620 selects from the target task list to be allocated a number of task identifiers equal to the target task request number as the target task identifiers, then searches the metadata database 640 for the metadata of the tasks corresponding to the target task identifiers, and returns the metadata of the tasks to the second target working node, so that the second target working node obtains the corresponding tasks according to the metadata and processes them.

Optionally, the system further comprises: a front-end operation platform 650, where the front-end operation platform 650 is communicatively coupled to the metadata database 640 and the state database 630, respectively;

The front-end operation platform 650 is configured to generate metadata of a plurality of tasks matched with the task parameters according to the task parameters configured in the visualized task operation interface by the user, and store the correspondence between the metadata of the tasks and the task identifiers in the metadata database; according to at least one task type configured by a user on a task operation interface, determining task types corresponding to the tasks respectively, and storing the corresponding relation between the task types and the task identifications in a state database.

In this embodiment, the front-end operation platform 650 is user-oriented and provides a visual task operation interface for the user. The user can complete the relevant configuration of a task, for example, setting a task type or selecting a suitable working node to process the task, with simple click operations on the task operation interface. The complicated internal functions of the distributed task scheduling system are hidden from the user, which makes the system easy to use.

In this embodiment, the front-end operation platform 650 may obtain, through the visual task operation interface, the initialized task parameters configured by the user for an online task, for example, the acquisition address of the task, the task type of the task (for example, a first type), the working nodes that process the task (for example, working nodes D to F), and the like. The front-end operation platform 650 then constructs the metadata of the corresponding task according to the task parameters configured by the user and information such as the time when the user submitted the task, and stores the metadata of the task and the corresponding task identifier in the metadata base 640. Meanwhile, to make it easy for the master node to query the tasks to be processed for a specified task type, after the user brings the task online, the task type corresponding to each task is determined, and the correspondence between the task type and the task identifier is stored in the state database 630.

Optionally, the front-end operating platform 650 is further configured to: generating a mapping relation between the working node and the node parameter according to node parameter configuration information configured by a user on a task operation interface, and storing the mapping relation in a metadata base; wherein the node parameters include: task type and task load total.

In this embodiment, when configuring the initialized task parameters for a task through the visualized task operation interface, the user may select a designated working node to process the task and configure node parameters for the selected working node, for example, the task types that the working node can process, the total task load, and the like. In this way the working nodes are uniformly managed and divided, each working node can acquire tasks for processing after initialization, and the master node does not have to search for a working node capable of processing each task.

Optionally, the front-end operating platform 650 is further configured to: responding to a state query request for a target task input by a user in a task operation interface, acquiring target metadata matched with the target task from a metadata base, and acquiring a current task state matched with the target task and a heartbeat time stamp in the latest update state from each assigned task state set of a state database to perform visual display; and/or

The front-end operations platform 650 is also configured to: and responding to a state query request input by a user in a task operation interface and aiming at a target working node, acquiring a current execution task identification set matched with the target working node and an allocated task state set from a state database, and performing visual display.

In this embodiment, when the front-end operation platform 650 detects a state query request of a user for a target task through a visual task operation interface, a target task identifier may be extracted from the state query request, target metadata matched with the target task may be obtained from a metadata base according to the target task identifier, and a heartbeat timestamp of a current task state and a latest update state matched with the target task identifier may be obtained from each assigned task state set of the state database, and the obtained data may be displayed to the user through the visual task operation interface.
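A minimal sketch of the task state query is given below. The function `query_task_status`, the dictionary layout and the field names are assumptions; the sketch scans each allocated task state set and keeps the entry with the most recent heartbeat timestamp.

```python
def query_task_status(task_id, state_db, metadata_db):
    """Collect the metadata and the most recently reported state of one task."""
    latest = None
    for node_id, assigned in state_db["assigned"].items():
        entry = assigned.get(task_id)
        if entry is not None and (
            latest is None or entry["heartbeat"] > latest["heartbeat"]
        ):
            latest = {"node_id": node_id, **entry}
    return {"metadata": metadata_db.get(task_id), "status": latest}

state_db = {"assigned": {
    "worker-1": {"t1": {"state": "running", "heartbeat": 100.0}},
    "worker-2": {"t1": {"state": "ended", "heartbeat": 160.0}},
}}
metadata_db = {"t1": {"type": "crawl"}}
view = query_task_status("t1", state_db, metadata_db)
```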

In this embodiment, when the front-end operation platform 650 detects a status query request input by a user for a target working node through a visual task operation interface, a current execution task identifier set and an allocated task status set matched with the target working node may be obtained from a status database according to a node identifier extracted from the status query request, and the identifier of a current processing task in the current execution task identifier set and task status information generated in a processing process of each task included in the allocated task status set are visually displayed.

In this embodiment, the state database 630 may be a Redis database. The Redis database stores data in memory in the form of key-value pairs, and either periodically writes updated data to disk or appends modification operations to an append-only log file, and on this basis supports master-slave synchronization. Because task state information places high demands on storage performance while task metadata places lower demands on it, the two are stored separately: the task state information is stored in the Redis database, and the task metadata is stored in the metadata database.
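One possible way to map the three kinds of state data onto Redis data types is sketched below. The key names and the choice of list, set and hash are assumptions made for illustration; the embodiment does not specify a key layout, and the sketch only builds key names without contacting a Redis server.

```python
# All key names below are hypothetical; the embodiment does not specify them.
def pending_key(task_type):
    """Key of a Redis list holding task identifiers to be allocated for one task type."""
    return f"pending:{task_type}"

def running_key(node_id):
    """Key of a Redis set holding the task identifiers a working node is executing."""
    return f"running:{node_id}"

def assigned_key(node_id):
    """Key of a Redis hash mapping task identifier to its latest reported state."""
    return f"assigned:{node_id}"

layout = {
    pending_key("crawl"): "list",
    running_key("worker-1"): "set",
    assigned_key("worker-1"): "hash",
}
```

Under this layout every master node addresses the same keys, which matches the requirement that the master nodes share one state database.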

In the embodiment, by using the Redis database as a caching tool, the effect that a plurality of main nodes and working nodes share the cache is achieved, distributed task scheduling is further achieved, and concurrency of task scheduling is improved.

According to the distributed task scheduling system provided by the embodiment of the invention, one master node is randomly selected from a plurality of master nodes to be connected through the working node, the task acquisition request is sent to the master node successfully connected, the target task matched with the received task acquisition request is acquired through the master node, and the target task is returned to the working node sending the task acquisition request, so that the problems that the calculated amount is large and the high concurrency requirement cannot be met when the master node actively distributes tasks for the working node in the prior art are solved, the task types and the task number are actively requested to the master node through the working node, the calculated amount of the master node is reduced, and the task concurrency degree supported by the system is improved.

Example VII

Fig. 7 is a schematic structural diagram of a node device in a seventh embodiment of the present invention. Fig. 7 illustrates a block diagram of an exemplary node device 12 suitable for use in implementing embodiments of the present invention. The node device 12 shown in fig. 7 is only an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 7, node device 12 is in the form of a general purpose computing device. The components of node device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Node device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by node device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Node device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 7, commonly referred to as a "hard disk drive"). Although not shown in fig. 7, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.

Node device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with node device 12, and/or any devices (e.g., network card, modem, etc.) that enable node device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Also, node device 12 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, via network adapter 20. As shown, network adapter 20 communicates with other modules of node device 12 via bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with node device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, to implement the distributed task scheduling method provided by the embodiment of the present invention.

That is, the processing unit implements a distributed task scheduling method performed by a working node included in a distributed scheduling system, the distributed scheduling system including a plurality of master nodes, the method including:

Or the processing unit implements a distributed task scheduling method performed by a master node included in a distributed scheduling system, the distributed scheduling system including a plurality of master nodes, the method including:

Example eight

The eighth embodiment of the invention also discloses a computer storage medium storing a computer program that, when executed by a processor, implements a distributed task scheduling method.

The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present invention may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims

1. A distributed task scheduling method, wherein the method is performed by a working node included in a distributed scheduling system including a plurality of master nodes, the method comprising:

wherein the new task processing condition includes: a currently processed task has been completed, or a currently processed task has entered an ending state in which processing does not continue;

sending the task acquisition request to a first target master node determined among the plurality of master nodes, wherein the task acquisition request is used to instruct the first target master node to acquire tasks matching the requested task number and the task type and to feed them back;

the method further comprises the steps of:

when a task state reporting condition is detected, acquiring task state information of at least one currently processed task, wherein the task state information comprises: a mapping relation between task identification and task state;

and sending the task state information to a second target master node determined among the plurality of master nodes, wherein the task state information is used to instruct the second target master node to store the received task state information.
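The state report of claim 1 can be pictured with a short Python sketch. The `TaskState` values, the `build_state_report` helper, the `worker` field and the JSON encoding are all assumptions chosen for illustration; the claim itself only requires a mapping relation between task identification and task state.

```python
import json
from enum import Enum
from typing import Dict


class TaskState(str, Enum):
    # Hypothetical state names; the claims do not enumerate task states.
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


def build_state_report(worker_id: str, tasks: Dict[str, TaskState]) -> str:
    """Serialize the task-id -> task-state mapping for one working node."""
    payload = {
        "worker": worker_id,  # assumed field identifying the reporting node
        "states": {task_id: state.value for task_id, state in tasks.items()},
    }
    return json.dumps(payload, sort_keys=True)
```

A master node receiving such a report would only need to decode it and store the mapping, which is consistent with the low per-request computation the method aims for.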

2. The method of claim 1, further comprising, after sending the task acquisition request to a first target master node determined among the plurality of master nodes:

if the task fed back by the first target master node is not received within the first waiting time period, determining a new target master node in the plurality of master nodes again, and sending the task acquisition request to the new target master node again; and/or
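The timeout branch of claim 2 might look like the following Python sketch. The function name, the `send_request` callback, the attempt limit, and the choice to exclude a master node that stayed silent from re-selection are assumptions made for illustration; the claim only states that a new target master node is determined and the request is sent again.

```python
import random
from typing import Callable, List, Optional, Sequence


def request_tasks_with_retry(
    masters: Sequence[str],
    send_request: Callable[[str, float], Optional[List[str]]],
    first_wait_seconds: float,
    max_attempts: int = 3,
    rng: Optional[random.Random] = None,
) -> Optional[List[str]]:
    """Send the request to a target master; on timeout, pick a new one."""
    if rng is None:
        rng = random.Random()
    remaining = list(masters)
    for _ in range(max_attempts):
        if not remaining:
            break
        target = rng.choice(remaining)
        # send_request is assumed to return None when no task is fed back
        # within first_wait_seconds.
        tasks = send_request(target, first_wait_seconds)
        if tasks is not None:
            return tasks
        remaining.remove(target)  # assumption: skip the silent master next time
    return None
```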

3. A distributed task scheduling method, wherein the method is performed by a master node included in a distributed scheduling system including a plurality of master nodes, the method comprising:

querying a metadata base according to a parameter acquisition request sent by a first target working node, acquiring the task type and the total task load matching the parameter acquisition request, and feeding them back to the first target working node;

wherein the metadata base stores a mapping relation between working nodes and node parameters, the mapping relation being written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and total task load;

extracting a target task request number and a target task type matched with a task acquisition request according to the task acquisition request sent by a second target working node;

acquiring tasks matching the target task request number and the target task type and feeding them back to the second target node;

wherein the acquiring tasks matching the target task request number and the target task type and feeding them back to the second target node comprises:

querying a state database according to the target task type, and acquiring a target task list to be allocated that matches the target task type, wherein the state database stores a task identification list to be allocated corresponding to each task type.

4. A method according to claim 3, wherein the acquiring tasks matching the target task request number and the target task type and feeding them back to the second target node comprises: acquiring a task identification set matching the target task request number from the target task list to be allocated;
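The lookup in claims 3 and 4 can be sketched in Python as follows. An in-memory dictionary stands in for the state database, and the helper name, the first-in-first-out order, and the removal of the returned identifications from the pending list are assumptions, not details stated in the claims.

```python
from collections import deque
from typing import Deque, Dict, List


def take_pending_task_ids(
    pending_by_type: Dict[str, Deque[str]],
    task_type: str,
    requested: int,
) -> List[str]:
    """Remove and return up to `requested` task ids of the given type."""
    queue = pending_by_type.get(task_type)
    if queue is None or requested <= 0:
        return []
    taken: List[str] = []
    while queue and len(taken) < requested:
        taken.append(queue.popleft())  # assumption: oldest pending task first
    return taken
```

Because the master node only filters by task type and pops the requested number of identifications, the per-request work stays small even when many working nodes poll concurrently.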

5. The method as recited in claim 4, further comprising:

querying the state database according to state report information sent by a third target working node, and acquiring a target current execution task identification set and a target allocated task state set which are matched with the third target working node;

updating the target allocated task state set according to the task state of each task identifier included in the state report information, and adding a heartbeat time stamp of the state report information into the updating result;

and/or

after requesting metadata of the task matching the task identification set from a metadata base and feeding back the metadata of the task to the second target working node, the method further comprises:

6. A distributed task scheduling system, comprising: a plurality of working nodes, a plurality of master nodes, a state database and a metadata base, wherein each master node is in communication connection with the working nodes, the state database and the metadata base, respectively, and wherein:

each working node is configured to perform the method of any of claims 1-2;

each master node is configured to perform the method of any of claims 3-5;

the state database is used for storing a task identification list to be allocated corresponding to each task type, and a current execution task identification set and an allocated task state set corresponding to each working node;

the metadata base is used for storing a mapping relation between a task identifier and metadata of a task and a mapping relation between a working node and a node parameter, and the node parameter comprises: task type and total task load.
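The two stores of claim 6 can be modelled, purely for illustration, as Python dataclasses. All class and field names are hypothetical, plain dictionaries stand in for real databases, and the `apply_report` method is one possible reading of the heartbeat update described in claim 5.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class NodeParams:
    task_type: str
    total_task_load: int


@dataclass
class MetadataBase:
    # task id -> metadata of the task
    task_metadata: Dict[str, dict] = field(default_factory=dict)
    # working node id -> node parameters
    node_params: Dict[str, NodeParams] = field(default_factory=dict)


@dataclass
class StateDatabase:
    # task type -> task ids waiting to be allocated
    pending_by_type: Dict[str, List[str]] = field(default_factory=dict)
    # working node id -> ids of tasks it is currently executing
    executing_by_worker: Dict[str, Set[str]] = field(default_factory=dict)
    # working node id -> task id -> latest state record
    states_by_worker: Dict[str, Dict[str, dict]] = field(default_factory=dict)

    def apply_report(
        self,
        worker_id: str,
        states: Dict[str, str],
        heartbeat: Optional[float] = None,
    ) -> None:
        """Store reported task states together with a heartbeat time stamp."""
        stamp = time.time() if heartbeat is None else heartbeat
        records = self.states_by_worker.setdefault(worker_id, {})
        for task_id, state in states.items():
            records[task_id] = {"state": state, "heartbeat": stamp}
```

Keeping node parameters in the metadata base and fast-changing task state in the state database mirrors the split described in the claim, so either store can be scaled or queried independently.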

7. The system of claim 6, further comprising: a front-end operation platform in communication connection with the metadata base and the state database, respectively;

8. The system of claim 7, wherein the front-end operations platform is further configured to:

generating a mapping relation between a working node and node parameters according to node parameter configuration information configured by a user on the task operation interface, and storing the mapping relation in the metadata base;

wherein the node parameters include: task type and total task load.

9. The system of claim 7, wherein the front-end operations platform is further configured to:

responding to a state query request for a target task input by a user in the task operation interface, acquiring target metadata matching the target task from the metadata base, acquiring, from the allocated task state sets of the state database, the current task state matching the target task and the heartbeat time stamp of its most recent state update, and performing visual display; and/or

responding to a state query request for a target working node input by a user in the task operation interface, acquiring the current execution task identification set and the allocated task state set matching the target working node from the state database, and performing visual display.