CN111897638A - Distributed task scheduling method and system - Google Patents

Publication number
CN111897638A
CN111897638A CN202010732336.2A CN202010732336A CN111897638A CN 111897638 A CN111897638 A CN 111897638A CN 202010732336 A CN202010732336 A CN 202010732336A CN 111897638 A CN111897638 A CN 111897638A
Authority
CN
China
Prior art keywords
task
node
target
state
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010732336.2A
Other languages
Chinese (zh)
Other versions
CN111897638B (en)
Inventor
黄强
曾耀武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Technology Co Ltd
Original Assignee
Guangzhou Huya Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Technology Co Ltd
Priority to CN202010732336.2A
Publication of CN111897638A
Application granted
Publication of CN111897638B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Abstract

An embodiment of the invention discloses a distributed task scheduling method and system. The method is executed by a working node in a distributed scheduling system that includes a plurality of master nodes, and comprises: periodically sending a parameter acquisition request to at least one master node, and receiving and locally storing the task type and total task load fed back by that master node; when a new-task processing condition is detected, calculating the number of tasks to request from the number of tasks currently being processed and the total task load, and constructing a task acquisition request from the requested task count and the task type; and sending the task acquisition request to a first target master node determined among the plurality of master nodes, the request instructing the first target master node to obtain and feed back tasks matching the requested task count and task type. Because the working node actively requests tasks of a given type and quantity from the master node, the computational load on the master node is reduced and the task concurrency the system can support is increased.

Description

Distributed task scheduling method and system
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a distributed task scheduling method and system.
Background
At present, when a user develops an application, the application typically needs to be run in parallel on a server cluster through a task scheduling system.
In the prior art, a commonly used task scheduling system is YARN, which has a master-slave structure. However, YARN is difficult to use and is unfriendly to users with limited development experience. Moreover, because YARN must apply for resources before starting a task, the master node carries a heavy computational load; this not only fails to meet high-concurrency requirements, but a failure of the master node can also bring down the whole system.
Disclosure of Invention
Embodiments of the invention provide a distributed task scheduling method and system in which a working node actively requests tasks of a given type and quantity from a master node, so that highly concurrent task scheduling can be achieved.
In a first aspect, an embodiment of the present invention provides a distributed task scheduling method, where the method is executed by a working node included in a distributed scheduling system, the distributed scheduling system includes a plurality of master nodes, and the method includes:
periodically sending a parameter acquisition request to at least one master node, and receiving and locally storing the task type and total task load fed back by the master node;
when a new-task processing condition is detected, calculating the number of tasks to request from the number of tasks being processed and the total task load, and constructing a task acquisition request from the requested task count and the task type;
and sending the task acquisition request to a first target master node determined among the plurality of master nodes, wherein the task acquisition request instructs the first target master node to obtain and feed back tasks matching the requested task count and the task type.
Optionally, the method further includes:
when a task state reporting condition is detected, acquiring task state information of at least one task currently being processed, wherein the task state information comprises a mapping between task identifiers and task states;
and sending the task state information to a second target master node determined among the plurality of master nodes, wherein the task state information instructs the second target master node to store the received task state information.
Optionally, after sending the task acquisition request to the first target master node determined among the plurality of master nodes, the method further includes:
if the task fed back by the first target master node is not received within a first waiting time, determining a new target master node among the plurality of master nodes and re-sending the task acquisition request to the new target master node; and/or
if a task state update success response fed back by the second target master node is not received within a second waiting time, determining a new target master node among the plurality of master nodes and re-sending the current task state information to the new target master node.
In a second aspect, an embodiment of the present invention further provides a distributed task scheduling method, where the method is executed by a master node included in a distributed scheduling system, the distributed scheduling system includes a plurality of master nodes, and the method includes:
querying a metadata database according to a parameter acquisition request sent by a first target working node, obtaining the task type and total task load matching the parameter acquisition request, and feeding them back to the first target working node;
wherein the metadata database stores a mapping between working nodes and node parameters, the mapping is written into the metadata database by a user through a front-end operation platform, and the node parameters include: task type and total task load;
extracting, from a task acquisition request sent by a second target working node, a target task request number and a target task type;
and obtaining tasks matching the target task request number and the target task type, and feeding them back to the second target working node.
Optionally, obtaining tasks matching the target task request number and the target task type and feeding them back to the second target working node includes:
querying a state database according to the target task type, and obtaining a target to-be-allocated task list matching the target task type, wherein the state database stores a list of to-be-allocated task identifiers for each task type;
obtaining, from the target to-be-allocated task list, a set of task identifiers whose size matches the target task request number;
and requesting, from the metadata database, the task metadata matching the set of task identifiers, and feeding the task metadata back to the second target working node, wherein the metadata database stores a mapping between task identifiers and task metadata.
Optionally, the method further includes:
querying the state database according to status report information sent by a third target working node, and obtaining the target currently-executing task identifier set and the target allocated-task state set that match the third target working node;
wherein the state database stores, for each working node, a currently-executing task identifier set and an allocated-task state set;
extracting from the status report information each task identifier whose task state is finished, and removing each such task identifier from the target currently-executing task identifier set;
updating the target allocated-task state set according to the task state of each task identifier included in the status report information, and adding the heartbeat timestamp of the status report information to the update result;
and/or
after the task metadata matching the set of task identifiers is requested from the metadata database and fed back to the second target working node, the method further includes:
updating, according to the set of task identifiers, the currently-executing task identifier set and the allocated-task state set corresponding to the second target working node.
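The two bookkeeping steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the state database is modelled as an in-memory dictionary, and the names `WorkerState`, `apply_status_report`, `record_allocation`, and the state strings `"finished"` and `"allocated"` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

FINISHED = "finished"    # hypothetical label for a completed task
ALLOCATED = "allocated"  # hypothetical label for a freshly assigned task


@dataclass
class WorkerState:
    """Per-working-node record kept in the (assumed in-memory) state database."""
    running: Set[str] = field(default_factory=set)              # currently-executing task ids
    task_states: Dict[str, str] = field(default_factory=dict)   # allocated-task state set
    last_heartbeat: float = 0.0                                 # timestamp of the latest report


def apply_status_report(state_db: Dict[str, WorkerState], node_id: str,
                        reported: Dict[str, str], heartbeat_ts: float) -> WorkerState:
    """Apply a status report: drop finished ids, update states, record the heartbeat."""
    worker = state_db.setdefault(node_id, WorkerState())
    finished = {task_id for task_id, state in reported.items() if state == FINISHED}
    worker.running -= finished
    worker.task_states.update(reported)
    worker.last_heartbeat = heartbeat_ts
    return worker


def record_allocation(state_db: Dict[str, WorkerState], node_id: str,
                      task_ids: Iterable[str]) -> WorkerState:
    """Record tasks that were just handed to a working node."""
    worker = state_db.setdefault(node_id, WorkerState())
    for task_id in task_ids:
        worker.running.add(task_id)
        worker.task_states[task_id] = ALLOCATED
    return worker
```

A real deployment would presumably perform these updates atomically in the shared state database, since several master nodes may serve the same working node.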
In a third aspect, an embodiment of the present invention further provides a distributed task scheduling system, including: a plurality of working nodes, a plurality of master nodes, a state database and a metadata database, wherein each master node is communicatively connected to the working nodes, the state database and the metadata database, respectively, and wherein:
the working node is used for executing the distributed task scheduling method applied to the working node provided by any embodiment of the invention;
the master node is used for executing the distributed task scheduling method applied to the master node provided by any embodiment of the invention;
the state database is used for storing the list of to-be-allocated task identifiers for each task type, and the currently-executing task identifier set and allocated-task state set for each working node;
the metadata database is used for storing the mapping between task identifiers and task metadata and the mapping between working nodes and node parameters, the node parameters including: task type and total task load.
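To make the division of data between the two stores concrete, the following Python sketch models them as plain in-memory containers. It is an assumption-laden illustration only: the patent does not specify a storage engine or schema, and the class and field names (`MetadataDB`, `StateDB`, `NodeParams`, `pending_by_type`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class NodeParams:
    """Node parameters configured per working node."""
    task_type: str   # type of task the working node may process
    total_load: int  # total task load of the working node


@dataclass
class MetadataDB:
    """Task id -> task metadata, and working node id -> node parameters."""
    task_metadata: Dict[str, dict] = field(default_factory=dict)
    node_params: Dict[str, NodeParams] = field(default_factory=dict)


@dataclass
class StateDB:
    """Lightweight scheduling state: only identifiers and task states."""
    pending_by_type: Dict[str, List[str]] = field(default_factory=dict)      # to-be-allocated ids per task type
    running_by_node: Dict[str, Set[str]] = field(default_factory=dict)       # currently-executing ids per node
    states_by_node: Dict[str, Dict[str, str]] = field(default_factory=dict)  # allocated-task states per node
```

For example, registering a working node and one pending task might look like `meta.node_params["worker-a"] = NodeParams("transcode", 8)` followed by `state.pending_by_type.setdefault("transcode", []).append("t1")`, where the identifiers are made up for illustration.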
Optionally, the system further includes: a front-end operation platform communicatively connected to the metadata database and the state database, respectively;
the front-end operation platform is used for generating, according to task parameters configured by a user in a visual task operation interface, metadata of a plurality of tasks matching the task parameters, and storing the correspondence between the task metadata and the task identifiers in the metadata database;
and for determining the task type of each task according to at least one task type configured by the user in the task operation interface, and storing the correspondence between task types and task identifiers in the state database.
Optionally, the front-end operation platform is further configured to:
generate a mapping between working nodes and node parameters according to node parameter configuration information configured by the user in the task operation interface, and store the mapping in the metadata database;
wherein the node parameters include: task type and total task load.
Optionally, the front-end operation platform is further configured to:
in response to a state query request for a target task entered by the user in the task operation interface, obtain target metadata matching the target task from the metadata database, obtain the current task state matching the target task and the heartbeat timestamp of its latest update from the allocated-task state sets of the state database, and display them visually; and/or
the front-end operation platform is further configured to: in response to a state query request for a target working node entered by the user in the task operation interface, obtain the currently-executing task identifier set and the allocated-task state set matching the target working node from the state database, and display them visually.
Embodiments of the invention provide a distributed task scheduling method and system. A working node periodically sends a parameter acquisition request to at least one master node, and receives and locally stores the task type and total task load fed back by that master node. When a new-task processing condition is detected, the working node calculates the number of tasks to request from the number of tasks being processed and the total task load, and constructs a task acquisition request from the requested task count and the task type. It then sends the task acquisition request to a first target master node determined among the plurality of master nodes, the request instructing the first target master node to obtain and feed back tasks matching the requested task count and task type. This addresses the prior-art problem that a master node which actively distributes tasks to working nodes bears a heavy computational load and cannot meet high-concurrency requirements: because the working node actively requests tasks of a given type and quantity, the computational load on the master node is reduced and the task concurrency supported by the system is increased.
Drawings
Fig. 1 is a flowchart of a distributed task scheduling method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a distributed task scheduling method according to a second embodiment of the present invention;
Fig. 3 is a timing diagram of a distributed task scheduling method according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a distributed task scheduling apparatus according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a distributed task scheduling apparatus according to a fifth embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a distributed task scheduling system according to a sixth embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device according to a seventh embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a distributed task scheduling method in an embodiment of the present invention, where this embodiment is applicable to a case where a work node actively requests a task from a master node to implement task-oriented high-concurrency scheduling, and the method may be executed by a distributed task scheduling apparatus, where the apparatus may be implemented by software and/or hardware, and may generally be integrated in a work node included in a distributed scheduling system. As shown in fig. 1, the method is performed by a work node included in a distributed scheduling system, the distributed scheduling system including a plurality of master nodes, and the method includes:
Step 110: periodically sending a parameter acquisition request to at least one master node, and receiving and locally storing the task type and total task load fed back by the master node.
It should be noted that the master node and the working nodes provided in the embodiment of the present invention may be servers, and the master node and the working nodes have a master-slave structure, and each working node may communicate with any master node, and each master node may also provide services to any working node communicating with the master node.
In this embodiment, a working node does not know its own node parameters at initialization, and the node parameters of each working node may be updated over time. So that the working node can tell the master node which task type and how many tasks it can process when it requests tasks, the working node periodically selects at least one master node from the plurality of master nodes and establishes a communication connection with it, sends a parameter acquisition request carrying the working node's node identifier to the master node with which the connection was successfully established, and then receives and locally stores the task type and total task load that the master node feeds back in response to the request.
In this embodiment, a given working node may select one master node at random from the plurality of master nodes to connect to. For example, working node A randomly selects master node B and sends it a connection establishment request. If the connection between master node B and working node A is established successfully, master node B is available or idle; if the connection fails, master node B is currently unavailable. In that case, working node A randomly selects a master node again and sends a connection establishment request, repeating until a connection is established with an available master node. The working node then sends the parameter acquisition request to the selected master node over the established connection to obtain its own node parameters.
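The random selection loop described above might look like the following Python sketch. It is a minimal illustration, not the patented implementation: `pick_available_master` and the `try_connect` callback are hypothetical names, and the `max_attempts` cap is an added assumption (the text simply retries until an available master node is found).

```python
import random
from typing import Callable, Optional, Sequence


def pick_available_master(
    masters: Sequence[str],
    try_connect: Callable[[str], bool],
    max_attempts: int = 10,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Randomly pick master nodes until one accepts a connection.

    Returns the address of the first master that connects, or None if
    every one of the max_attempts random picks fails.
    """
    rng = rng or random.Random()
    for _ in range(max_attempts):
        candidate = rng.choice(masters)
        if try_connect(candidate):  # a successful connection means the master is available
            return candidate
    return None
```

Because the choice is random on every attempt, the same unavailable master node may be picked more than once; a variant that excludes failed masters would converge faster at the cost of extra bookkeeping.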
Step 120: when a new-task processing condition is detected, calculating the number of tasks to request from the number of tasks being processed and the total task load, and constructing a task acquisition request from the requested task count and the task type.
In this embodiment, after obtaining its node parameters, the working node watches for a new-task processing condition, for example that a currently processed task has completed, or that it has reached an end state and needs no further processing. When such a condition is detected, the working node calculates how many more tasks it can currently process, that is, the requested task count, from the total task load in its node parameters and the number of tasks it is currently processing, and generates a task acquisition request from the task type and the requested task count.
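As a rough Python illustration of this calculation, assume the requested count is simply the total task load minus the number of tasks in progress (the text does not give an explicit formula, so this subtraction is an assumption). The names `TaskAcquisitionRequest`, `requested_task_count`, and `build_request` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskAcquisitionRequest:
    node_id: str          # identifier of the requesting working node (assumed field)
    task_type: str        # task type the node is allowed to process
    requested_count: int  # how many tasks the node asks for


def requested_task_count(total_load: int, in_progress: int) -> int:
    """Spare capacity of the node, never negative."""
    return max(total_load - in_progress, 0)


def build_request(node_id: str, task_type: str,
                  total_load: int, in_progress: int) -> Optional[TaskAcquisitionRequest]:
    """Build a request, or return None when the node has no spare capacity."""
    count = requested_task_count(total_load, in_progress)
    if count == 0:
        return None
    return TaskAcquisitionRequest(node_id, task_type, count)
```

For instance, a node with a total task load of 8 and 5 tasks in progress would request 3 tasks under this assumption.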
Step 130: sending the task acquisition request to a first target master node determined among the plurality of master nodes, where the task acquisition request instructs the first target master node to obtain and feed back tasks matching the requested task count and task type.
For the same working node, the master node to which the parameter acquisition request is sent and the first target master node to which the task acquisition request is sent need not be the same master node.
In this embodiment, the working node actively requests a specified number of tasks of a specified type from the master node, so the master node can directly select that number of target tasks of that type from the pending tasks and feed them back to the working node. The master node does not need to compute, for each pending task, which working node can process it, which reduces the computational load on the master node and makes high task concurrency easier to achieve.
Optionally, the method may further include: when a task state reporting condition is detected, acquiring task state information of at least one task currently being processed, wherein the task state information comprises a mapping between task identifiers and task states; and sending the task state information to a second target master node determined among the plurality of master nodes, wherein the task state information instructs the second target master node to store the received task state information.
In this embodiment, while processing tasks, if the working node detects a task state reporting condition, for example that the task state reporting time has been reached, it collects the task identifiers of the tasks currently being processed and the task state data generated during processing to form the task state information. It then randomly selects at least one master node from the plurality of master nodes to connect to, and sends the task state information to the second target master node with which the connection succeeds, so that the second target master node stores and updates the task state information.
In this embodiment, the working node only needs to report task identifiers and task state data, which keeps data transmission lightweight and keeps the task state information that the master node must store and update compact, both of which favor highly concurrent task scheduling.
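A lightweight status report of this kind could be serialized as in the Python sketch below. The JSON encoding and the field names `node_id`, `heartbeat_ts`, and `task_states` are assumptions made for illustration; the text only requires a mapping between task identifiers and task states, and the heartbeat timestamp is mentioned on the master-node side.

```python
import json
import time
from typing import Dict, Optional


def build_status_report(node_id: str, task_states: Dict[str, str],
                        now: Optional[float] = None) -> str:
    """Serialize a status report carrying only task ids and their states."""
    payload = {
        "node_id": node_id,                                   # assumed: identifies the reporter
        "heartbeat_ts": time.time() if now is None else now,  # assumed: report timestamp
        "task_states": dict(task_states),                     # task id -> task state
    }
    return json.dumps(payload, sort_keys=True)
```

Keeping the payload to identifiers and states, rather than full task metadata, is what keeps each report small.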
Optionally, after sending the task acquisition request to the first target master node determined among the plurality of master nodes, the method may further include: if the task fed back by the first target master node is not received within a first waiting time, determining a new target master node among the plurality of master nodes and re-sending the task acquisition request to the new target master node; and/or if a task state update success response fed back by the second target master node is not received within a second waiting time, determining a new target master node among the plurality of master nodes and re-sending the current task state information to the new target master node.
In this embodiment, after the working node sends the task acquisition request to the first target master node with which a connection was successfully established, if no task fed back by the first target master node is received within the first waiting time, the first target master node is considered to have failed and the connection between the working node and the first target master node is closed. The working node then randomly selects another master node from the remaining master nodes, establishes a connection, takes the successfully connected master node as the new target master node, and re-sends the task acquisition request to it in order to obtain tasks through the new target master node.
Similarly, after the working node sends the task state information to the second target master node, if no task state update success response is received from the second target master node within the second waiting time, the second target master node is considered to have failed and the connection between the working node and the second target master node is closed. The working node then randomly selects another master node from the remaining master nodes, establishes a connection, takes the successfully connected master node as the new target master node, and re-sends the task state information to it so that the new target master node stores it. The first waiting time and the second waiting time may be adjusted according to service requirements and are not specifically limited.
In this embodiment, in response to a failure of the master node during communication between the working node and the master node, the working node may retransmit a request to a new master node in time, which is helpful for implementing high-concurrency scheduling of tasks.
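The timeout-and-failover behaviour can be sketched in Python as below. This is an illustrative sketch under assumptions: the `send` callback is hypothetical and is expected to raise `TimeoutError` or `ConnectionError` when a master node does not answer within the waiting time, and trying each master node at most once (by shuffling the list) is a simplification of the random re-selection described in the text.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def send_with_failover(
    masters: Sequence[str],
    send: Callable[[str, float], T],
    timeout_s: float,
    rng: Optional[random.Random] = None,
) -> T:
    """Send a request to a random master; on timeout, move on to another one."""
    rng = rng or random.Random()
    candidates = list(masters)
    rng.shuffle(candidates)  # random order, each master tried at most once
    for master in candidates:
        try:
            return send(master, timeout_s)
        except (TimeoutError, ConnectionError):
            continue  # treat this master as failed and pick another
    raise RuntimeError("no master node responded within the waiting time")
```

The same helper could serve both the task acquisition request (first waiting time) and the status report (second waiting time) by passing a different `send` callback and `timeout_s`.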
According to the technical solution of this embodiment, a working node periodically sends a parameter acquisition request to at least one master node, and receives and locally stores the task type and total task load fed back by that master node. When a new-task processing condition is detected, the working node calculates the number of tasks to request from the number of tasks being processed and the total task load, and constructs a task acquisition request from the requested task count and the task type. It then sends the task acquisition request to a first target master node determined among the plurality of master nodes, the request instructing the first target master node to obtain and feed back tasks matching the requested task count and task type. This addresses the prior-art problem that a master node which actively distributes tasks to working nodes bears a heavy computational load and cannot meet high-concurrency requirements: because the working node actively requests tasks of a given type and quantity, the computational load on the master node is reduced and the task concurrency supported by the system is increased.
Example two
Fig. 2 is a flowchart of a distributed task scheduling method in an embodiment of the present invention, where this embodiment is applicable to a case where a master node performs task scheduling according to a task request of a work node to implement task highly-concurrent scheduling, and the method may be executed by a distributed task scheduling apparatus, where the apparatus may be implemented by software and/or hardware, and may generally be integrated in the master node included in a distributed scheduling system. As shown in fig. 2, the method is performed by a master node included in a distributed scheduling system, the distributed scheduling system including a plurality of master nodes, and the method includes:
Step 210: querying a metadata database according to the parameter acquisition request sent by a first target working node, obtaining the task type and total task load matching the parameter acquisition request, and feeding them back to the first target working node.
The metadata database stores a mapping between working nodes and node parameters, and the mapping is written into the metadata database by a user through a front-end operation platform; the node parameters include: task type and total task load.
In this embodiment, the mapping between working nodes and node parameters is written into the metadata database by the user through a visual task operation interface provided by the front-end operation platform. Configuring node parameters such as the processable task type and the total task load for each working node allows the working nodes to be managed and partitioned uniformly, so that after initialization each working node can actively fetch tasks to process, and the master node does not need to search for a suitable working node for each pending task.
In this embodiment, after receiving the parameter acquisition request sent by the first target working node, the master node may extract the node identifier of the first target working node from the request, query the metadata database for the mapping matching that node identifier, determine the node parameters of the first target working node from the mapping, and feed them back to the first target working node. The node parameters comprise the task type and the total task load. The task type indicates the type of task the working node can process; task types may be divided according to the concurrency of the tasks, the amount of resources the tasks occupy, or other factors. The total task load indicates the maximum number of tasks the working node can process in parallel.
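On the master-node side, the lookup described above reduces to a keyed query. The Python sketch below assumes the request is a dictionary with a `node_id` field and that the node-parameter mapping is an in-memory dictionary; both the request shape and the names `NodeParams` and `handle_parameter_request` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class NodeParams:
    task_type: str   # type of task the working node can process
    total_load: int  # maximum number of tasks processed in parallel


def handle_parameter_request(request: Dict[str, str],
                             node_params: Dict[str, NodeParams]) -> Optional[NodeParams]:
    """Extract the node identifier and return its configured parameters, if any."""
    node_id = request.get("node_id")
    if node_id is None:
        return None  # malformed request: no node identifier
    return node_params.get(node_id)  # None when the node was never configured
```

Returning `None` for an unknown node is a design choice of this sketch; the text does not say how an unconfigured working node is handled.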
Step 220: extracting, from the task acquisition request sent by a second target working node, the target task request number and the target task type.
In this embodiment, after receiving the task acquisition request sent by the second target working node, the master node extracts from the request the task type and the number of tasks that the second target working node is requesting.
Step 230: obtaining tasks matching the target task request number and the target task type, and feeding them back to the second target working node.
Optionally, obtaining the task feedback matched with the target task request number and the target task type to the second target node may include: inquiring a state database according to the target task type, and acquiring a target task list to be distributed, which is matched with the target task type; a task identifier list to be distributed corresponding to each task type is stored in the state database; acquiring a task identifier set matched with the target task request number in a target task list to be distributed; and requesting the metadata of the task matched with the task identifier set from a metadata database, and feeding the metadata of the task back to the second target working node, wherein the metadata database stores the mapping relation between the task identifier and the metadata of the task.
In this embodiment, after extracting the target task type and the target task request number from the received task obtaining request, the master node may search the state database, according to the target task type, for the target to-be-allocated task list matching the target task type, where the target to-be-allocated task list includes a plurality of task identifiers in the ready state. The master node selects from the target to-be-allocated task list a number of task identifiers equal to the target task request number as target task identifiers, then looks up in the metadata database the task metadata corresponding to each target task identifier, and returns the task metadata to the second target working node, so that the second target working node obtains and processes the corresponding tasks according to the task metadata.
In this embodiment, the state database stores only the to-be-allocated task identifiers corresponding to each task type. The stored data is therefore compact, which constitutes lightweight data storage and facilitates high-concurrency task scheduling. In addition, by receiving task obtaining requests actively sent by the working nodes, the master node can directly distribute the matching to-be-processed tasks to the working nodes without polling the working nodes and performing the related calculation, so that the amount of calculation of the master node is reduced and high task concurrency is easier to achieve.
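The task distribution described above may, for example, be sketched as follows. The `StateDB` class, its method names, and the request fields `task_type` and `count` are hypothetical. Removing the selected identifiers from the to-be-allocated list is also an assumption of this sketch, since the embodiment only states that the identifiers are selected.

```python
from collections import deque


class StateDB:
    """Hypothetical state database: task type -> queue of ready task identifiers."""

    def __init__(self):
        self.pending = {}

    def add_pending(self, task_type, task_id):
        self.pending.setdefault(task_type, deque()).append(task_id)

    def take_pending(self, task_type, count):
        # Take at most `count` identifiers from the list for this task type.
        queue = self.pending.get(task_type, deque())
        taken = []
        while queue and len(taken) < count:
            taken.append(queue.popleft())
        return taken


def handle_task_request(state_db, metadata_db, request):
    """Return the metadata of up to `count` ready tasks of the requested type."""
    task_ids = state_db.take_pending(request["task_type"], request["count"])
    # metadata_db maps task identifier -> task metadata.
    return [metadata_db[task_id] for task_id in task_ids]
```

In this sketch the master node performs no per-worker computation: it only reads the list for one task type and looks up metadata by identifier.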
Optionally, the method may further include: inquiring a state database according to state reporting information sent by a third target working node, and acquiring a target current execution task identifier set and a target distributed task state set which are matched with the third target working node; the state database stores a current execution task identification set and an allocated task state set which respectively correspond to each working node; extracting each task identifier of which the task state is finished from the state reporting information, and removing each task identifier from a target current execution task identifier set; updating a target distributed task state set according to the task state of each task identifier included in the state report information, and adding a heartbeat timestamp of the state report information into an updating result; and/or
After the metadata of the task matched with the task identification set is requested from the metadata database and fed back to the second target work node, the method may further include: and updating the current execution task identifier set and the distributed task state set corresponding to the second target node according to the task identifier set.
In this embodiment, the state database further stores a currently executed task identifier set and an allocated task state set corresponding to each working node, which record the task identifiers of the tasks currently processed by each working node and the task state information generated while processing those tasks and reported by each working node. After receiving the state reporting information sent by the third target working node, the master node obtains the node identifier of the third target working node from the state reporting information, queries the state database according to the node identifier, and obtains the target currently executed task identifier set and the target allocated task state set matching the third target working node. The master node then extracts the task state of each task identifier from the state reporting information and removes the task identifiers in the end state from the target currently executed task identifier set, the end state indicating that the task has been completely processed or no longer needs to be processed. The master node updates the task state information corresponding to each task identifier included in the state reporting information into the target allocated task state set, and adds the heartbeat timestamp of the state reporting information to the update result, so as to record the execution status of each task in each time period.
In this embodiment, in order to record the tasks currently processed by each working node and the assignment of the tasks in the system, after feeding back the metadata of the tasks matched with the task identifier sets to the second target working node, the master node updates each task identifier included in the task identifier sets to the currently executed task identifier set and the assigned task state set corresponding to the second target node, so as to record the task identifier of the task currently processed by the second target node and the task state information generated by the second target node in the process of processing the task.
In this embodiment, the state database stores the current execution task identifier set and the assigned task state set corresponding to each working node, and the contents stored in the state database are very simplified, and belong to lightweight data storage.
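The two updates described above may be illustrated with the following minimal sketch. The dictionary layout of the state database, the report fields `node_id`, `timestamp` and `task_states`, and the state names `"running"` and `"ended"` are assumptions for illustration, as is storing the heartbeat timestamp alongside each task state.

```python
ENDED = "ended"  # hypothetical name for the end state


def record_assignment(state_db, node_id, task_ids, status="running"):
    """After task metadata is fed back to a working node, record what it holds."""
    state_db["running"].setdefault(node_id, set()).update(task_ids)
    states = state_db["states"].setdefault(node_id, {})
    for task_id in task_ids:
        states[task_id] = {"status": status}


def handle_status_report(state_db, report):
    """Apply one state report sent by a working node."""
    node_id = report["node_id"]
    running = state_db["running"].setdefault(node_id, set())
    states = state_db["states"].setdefault(node_id, {})
    for task_id, status in report["task_states"].items():
        if status == ENDED:
            # Ended tasks leave the currently executed task identifier set.
            running.discard(task_id)
        # Every reported state is stored together with the heartbeat timestamp.
        states[task_id] = {"status": status, "heartbeat": report["timestamp"]}
```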
According to the technical solution of this embodiment of the invention, the master node queries the metadata database according to a parameter obtaining request sent by a first target working node, obtains the task type and the total task load matching the parameter obtaining request, and feeds them back to the first target working node. The metadata database stores a mapping relationship between working nodes and node parameters, the mapping relationship being written into the metadata database by a user through the front-end operation platform, and the node parameters include the task type and the total task load. According to a task obtaining request sent by a second target working node, the master node extracts the target task request number and the target task type matching the task obtaining request, obtains tasks matching the target task request number and the target task type, and feeds them back to the second target working node. This solves the problem in the prior art that, when the master node actively distributes tasks to the working nodes, the amount of calculation is large and high-concurrency requirements cannot be met. Because the working nodes actively request tasks of a given type and number from the master node, the amount of calculation of the master node is reduced and the task concurrency supported by the system is improved. In addition, by simplifying the data stored in the state database, task state reporting is made lightweight, which improves the communication speed and further improves the task concurrency supported by the system.
EXAMPLE III
Fig. 3 is a timing chart of a distributed task scheduling method in a third embodiment of the present invention, which may be combined with various alternatives in the above embodiments. Specifically, referring to fig. 3, the method may include the steps of:
First, a user configures initialization task parameters for a specified task on a visual task operation interface provided by a front-end operation platform, for example, the working node that is to process the task, the task type that the working node can process, and the task load.
Then, the front-end operating platform sets the task to the ready state and stores the task identifier of the task and the corresponding task state in the state database. After receiving a confirmation response returned by the state database, the front-end operating platform constructs the corresponding task metadata according to the initialization task parameters configured for the task by the user, such as the acquisition address of the task and the task type, and stores the task metadata in the metadata database. After receiving a confirmation response returned by the metadata database, the front-end operating platform adds the task identifier of the task to the to-be-allocated task identifier list in the state database that matches the task type, where the task waits to be processed by a working node. This completes the task online process.
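As a non-limiting sketch, the three ordered writes of the task online process may be expressed as follows. Both databases are modeled as plain dictionaries, and the confirmation responses are modeled simply by each write returning before the next begins; the function name and the field names `fetch_address`, `task_type` and `submitted_at` are hypothetical.

```python
def bring_task_online(state_db, metadata_db, task_id, params, submitted_at):
    """Register a task so that working nodes can later request it."""
    # 1. Mark the task as ready in the state database.
    state_db["task_state"][task_id] = "ready"

    # 2. Build the task metadata from the user-configured parameters
    #    and store it in the metadata database.
    metadata_db[task_id] = {
        "fetch_address": params["fetch_address"],
        "task_type": params["task_type"],
        "submitted_at": submitted_at,
    }

    # 3. Only then expose the task identifier in the to-be-allocated list
    #    for its task type, so a worker never sees an identifier without metadata.
    state_db["pending"].setdefault(params["task_type"], []).append(task_id)
```

Adding the identifier to the to-be-allocated list last means that any identifier a master node selects already has metadata available.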
For example, when a web-crawling script is used to crawl a target entry, links to new entries are captured from the classified entries of the target web page. Each link corresponds to a page that needs to be crawled, so a plurality of subtasks are generated, and each subtask to be executed is added to the to-be-allocated task identifier list in the state database to await processing.
Then, the working node randomly selects a master node from the plurality of master nodes to establish a connection, applies for registration to the successfully connected master node, and obtains node parameters set by the user for the working node, such as a task type and a total task load that the working node can process.
After a master node is started, it opens an interface to receive requests sent by the working nodes; if the master node is idle and available, a working node can successfully establish a connection with it. When a master node fails, the connection between the working node and the failed master node is automatically disconnected, and the working node selects another idle master node for communication.
Then, the working node again randomly selects a master node from the plurality of master nodes to establish a connection, and sends a task obtaining request carrying the task type and the task request number to the successfully connected master node. The master node obtains, from the to-be-allocated task identifier list in the state database, target task identifiers matching the received task obtaining request, obtains target metadata matching each target task identifier from the metadata database, and returns the target metadata to the working node that sent the task obtaining request.
Then, the working node obtains the target task according to the task acquisition address in the target metadata and processes the target task according to the task details in the target metadata. The working node periodically selects a master node to establish a connection, and feeds back to the successfully connected master node the task state information generated in the current period by each currently processed task. The master node stores the received task state information in the state database to update the task state information; if the master node determines from the task state information that the current task is in the end state, it sends a task end instruction to the working node that sent the task state information, to terminate that working node's processing of the current task.
After finishing the current task, the working node triggers a new task processing condition, randomly selects a main node from the plurality of main nodes again to establish connection, and sends a task acquisition request to the successfully connected main node to continue processing the task.
In this embodiment, for the case that the user takes the task offline, first, after the user takes the current task offline on the front-end operating platform, the front-end operating platform changes the task state of the current task in the state database to the end state, sends a task end instruction to the working node that processes the current task, and moves the task identifier of the current task to the end queue, thereby completing the task offline.
The front-end operation platform and the main nodes share the same state database.
After receiving the task end instruction, the working node randomly selects a master node from the plurality of master nodes to establish a connection and sends the current task state information to the successfully connected master node; the master node writes the received current task state information to the state database and sends a task end instruction to the working node.
Finally, the master node obtains, according to the task identifier of the current task, the valid information corresponding to the current task from the state database and the metadata database, generates a valid task record, and stores the record in a display list so that the user can view the running state information of the current task. The master node then deletes the data of the current task stored in the state database and the metadata database to reduce memory occupation.
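The final archiving step may, for instance, be sketched as follows. The dictionary-based databases, the list used as the display list, and the record fields are assumptions for illustration; the embodiment does not specify which fields make up the valid task record.

```python
def archive_task(state_db, metadata_db, display_list, task_id):
    """Build a display record for an ended task, then delete its live data."""
    record = {
        "task_id": task_id,
        # pop() both reads the value and removes it from the database.
        "final_state": state_db["task_state"].pop(task_id),
        "metadata": metadata_db.pop(task_id),
    }
    display_list.append(record)
    return record
```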
EXAMPLE IV
Fig. 4 is a schematic structural diagram of a distributed task scheduling apparatus in a fourth embodiment of the present invention. The present embodiment is applicable to a case where the work node actively requests the task from the master node to implement task-oriented high-concurrency scheduling, and the apparatus may be implemented by software and/or hardware, and may be generally integrated in the work node included in the distributed scheduling system. As shown in fig. 4, the apparatus includes:
a parameter obtaining module 410, configured to periodically send a parameter obtaining request to at least one master node, and to receive and locally store the task type and the total task load fed back by the master node;
a constructing module 420, configured to calculate the number of requested tasks according to the number of tasks in processing and the total amount of task loads when a new task processing condition is detected, and construct a task obtaining request according to the number of requested tasks and the task type;
a sending module 430, configured to send a task obtaining request to a first target master node determined among the multiple master nodes, where the task obtaining request is used to instruct the first target master node to obtain a task feedback matching the requested task number and task type.
According to the technical solution of this embodiment of the invention, a working node periodically sends a parameter obtaining request to at least one master node, and receives and locally stores the task type and the total task load fed back by the master node. When a new task processing condition is detected, the working node calculates the number of requested tasks according to the number of tasks in processing and the total task load, and constructs a task obtaining request according to the number of requested tasks and the task type. The working node sends the task obtaining request to a first target master node determined among the plurality of master nodes, the task obtaining request being used to instruct the first target master node to obtain and feed back tasks matching the number of requested tasks and the task type. This solves the problem in the prior art that, when the master node actively distributes tasks to the working nodes, the amount of calculation is large and high-concurrency requirements cannot be met. Because the working nodes actively request tasks of a given type and number from the master node, the amount of calculation of the master node is reduced and the task concurrency supported by the system is improved.
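One possible way for the constructing module to derive the number of requested tasks is sketched below. The sketch assumes that the number of requested tasks is the total task load minus the number of tasks in processing; this section states only that the number is calculated from these two quantities. The function name and request fields are hypothetical.

```python
def build_task_request(task_type, total_task_load, in_progress):
    """Return a task obtaining request, or None when the worker has no free capacity."""
    # Assumed rule: request exactly the remaining parallel capacity.
    count = max(total_task_load - in_progress, 0)
    if count == 0:
        return None
    return {"task_type": task_type, "count": count}
```

For example, a working node with a total task load of 8 and 3 tasks in processing would request 5 tasks under this assumption.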
Optionally, the apparatus further includes: a state reporting module, configured to obtain, when a task state reporting condition is detected, task state information of at least one currently processed task, where the task state information includes a mapping relationship between task identifiers and task states; and to send the task state information to a second target master node determined among the plurality of master nodes, where the task state information is used to instruct the second target master node to store the received task state information.
Optionally, the apparatus further includes: a retransmission module, configured to, after the task obtaining request is sent to the first target master node determined among the plurality of master nodes, re-determine a new target master node among the plurality of master nodes and resend the task obtaining request to the new target master node if no task fed back by the first target master node is received within a first waiting time; and/or
if no task state update success response fed back by the second target master node is received within a second waiting time, re-determine a new target master node among the plurality of master nodes and resend the current task state information to the new target master node.
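The retransmission behaviour may be sketched, without limitation, as follows. The `send` callable, the `MasterTimeout` exception it raises when the waiting time elapses, the `max_attempts` limit, and the preference for master nodes not yet tried are all assumptions of this sketch rather than features stated in the embodiment.

```python
import random


class MasterTimeout(Exception):
    """Raised by the transport when no reply arrives within the waiting time."""


def send_with_failover(masters, send, payload, max_attempts=3):
    """Send `payload` to a randomly chosen master, re-selecting on timeout."""
    tried = []
    for _ in range(max_attempts):
        # Prefer masters not tried yet; fall back to all masters if exhausted.
        untried = [m for m in masters if m not in tried]
        master = random.choice(untried or list(masters))
        tried.append(master)
        try:
            return master, send(master, payload)
        except MasterTimeout:
            continue  # re-determine a new target master node and resend
    raise MasterTimeout(f"no master answered after {max_attempts} attempts")
```

The same helper can carry either a task obtaining request or task state information, since both follow the same re-selection rule.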
The distributed task scheduling device provided by the embodiment of the invention can execute the distributed task scheduling method applied to the working node provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
EXAMPLE V
Fig. 5 is a schematic structural diagram of a distributed task scheduling apparatus in a fifth embodiment of the present invention. The present embodiment is applicable to the case where the master node performs task scheduling according to the task request of the work node to implement task highly-concurrent scheduling, and the apparatus may be implemented by software and/or hardware, and may generally be integrated in the master node included in the distributed scheduling system. As shown in fig. 5, the apparatus includes:
a parameter feedback module 510, configured to query the metadata database according to a parameter obtaining request sent by a first target working node, obtain the task type and the total task load matching the parameter obtaining request, and feed them back to the first target working node;
wherein the metadata database stores a mapping relationship between working nodes and node parameters, the mapping relationship being written into the metadata database by a user through the front-end operation platform, and the node parameters include: task type and total task load;
a parameter extraction module 520, configured to extract, according to a task obtaining request sent by a second target working node, the target task request number and the target task type matching the task obtaining request;
and a task feedback module 530, configured to obtain tasks matching the target task request number and the target task type and feed them back to the second target working node.
According to the technical solution of this embodiment of the invention, the master node queries the metadata database according to a parameter obtaining request sent by a first target working node, obtains the task type and the total task load matching the parameter obtaining request, and feeds them back to the first target working node. The metadata database stores a mapping relationship between working nodes and node parameters, the mapping relationship being written into the metadata database by a user through the front-end operation platform, and the node parameters include the task type and the total task load. According to a task obtaining request sent by a second target working node, the master node extracts the target task request number and the target task type matching the task obtaining request, obtains tasks matching the target task request number and the target task type, and feeds them back to the second target working node. This solves the problem in the prior art that, when the master node actively distributes tasks to the working nodes, the amount of calculation is large and high-concurrency requirements cannot be met. Because the working nodes actively request tasks of a given type and number from the master node, the amount of calculation of the master node is reduced and the task concurrency supported by the system is improved. In addition, by simplifying the data stored in the state database, task state reporting is made lightweight, which improves the communication speed and further improves the task concurrency supported by the system.
Optionally, the task feedback module 530 is specifically configured to: inquiring a state database according to the target task type, and acquiring a target task list to be distributed, which is matched with the target task type; a task identifier list to be distributed corresponding to each task type is stored in the state database; acquiring a task identifier set matched with the target task request number in a target task list to be distributed; and requesting the metadata of the task matched with the task identifier set from a metadata database, and feeding the metadata of the task back to the second target working node, wherein the metadata database stores the mapping relation between the task identifier and the metadata of the task.
Optionally, the apparatus further includes: a first updating module, configured to query the state database according to state reporting information sent by a third target working node, and obtain the target currently executed task identifier set and the target allocated task state set matching the third target working node, wherein the state database stores a currently executed task identifier set and an allocated task state set corresponding to each working node; extract from the state reporting information each task identifier whose task state is the end state, and remove each such task identifier from the target currently executed task identifier set; and update the target allocated task state set according to the task state of each task identifier included in the state reporting information, and add the heartbeat timestamp of the state reporting information to the update result; and/or
a second updating module, configured to update, according to the task identifier set, the currently executed task identifier set and the allocated task state set corresponding to the second target working node after the metadata of the tasks matching the task identifier set is requested from the metadata database and fed back to the second target working node.
The distributed task scheduling device provided by the embodiment of the invention can execute the distributed task scheduling method applied to the main node provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
EXAMPLE VI
Fig. 6 is a schematic structural diagram of a distributed task scheduling system in a sixth embodiment of the present invention, which is applicable to high-concurrency scheduling of tasks. As shown in fig. 6, the system includes: a plurality of working nodes 610, a plurality of master nodes 620, a state database 630 and a metadata database 640, wherein each master node 620 is communicatively connected to the working nodes 610, the state database 630 and the metadata database 640, respectively, and wherein:
a worker node 610 configured to perform a distributed task scheduling method applied to the worker node according to any embodiment of the present invention;
the master node 620 is configured to execute the distributed task scheduling method applied to the master node according to any embodiment of the present invention;
a state database 630, configured to store a list of task identifiers to be allocated corresponding to each task type, a set of currently-executed task identifiers corresponding to each work node, and a set of allocated task states;
the metadata database 640 is used for storing mapping relations between task identifications and metadata of tasks, and mapping relations between work nodes and node parameters, wherein the node parameters include: task type and total amount of task load.
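For illustration only, the contents of the two databases listed above may be modeled with the following data structures. The class and field names are hypothetical, and a real deployment would typically use external stores rather than in-process objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class StateDatabase:
    # task type -> to-be-allocated task identifier list
    pending_by_type: Dict[str, List[str]] = field(default_factory=dict)
    # working node identifier -> currently executed task identifier set
    running_by_node: Dict[str, Set[str]] = field(default_factory=dict)
    # working node identifier -> {task identifier: task state}
    states_by_node: Dict[str, Dict[str, str]] = field(default_factory=dict)


@dataclass
class MetadataDatabase:
    # task identifier -> task metadata (e.g. acquisition address, task type)
    task_metadata: Dict[str, dict] = field(default_factory=dict)
    # working node identifier -> node parameters (task type, total task load)
    node_params: Dict[str, dict] = field(default_factory=dict)
```

In this model the state database holds only identifiers and states, while the bulkier task metadata is kept in the metadata database.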
In this embodiment, by setting a plurality of master nodes, each master node can provide service to any working node communicating with the master node, thereby avoiding the situation that the whole system is paralyzed and cannot work normally due to the failure of a single master node, and improving the stability of the system.
Optionally, the working node 610 is configured to periodically send a parameter obtaining request to at least one master node, and receive a task type and a total task load amount fed back by the master node to perform local storage; when a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in processing and the total amount of task loads, and constructing a task obtaining request according to the number of requested tasks and the task type; and sending the task obtaining request to a first target main node determined in the plurality of main nodes, wherein the task obtaining request is used for indicating the first target main node to obtain and feed back tasks matched with the requested task number and the task type.
Optionally, the working node 610 is further configured to: when a task state reporting condition is detected, task state information of at least one task currently processed is acquired, wherein the task state information comprises: mapping relation between task identification and task state; and sending the task state information to a second target main node determined in the plurality of main nodes, wherein the task state information is used for indicating the second target main node to store the received task state information.
Optionally, the working node 610 is further configured to: after the task acquisition request is sent to a first target main node determined in the main nodes, if a task fed back by the first target main node is not received within a first waiting time, a new target main node is determined in the main nodes again, and the task acquisition request is sent to the new target main node again; and/or
And if the task state updating success response fed back by the second target main node is not received within a second waiting time, re-determining a new target main node in the plurality of main nodes, and re-sending the current task state information to the new target main node.
Optionally, the master node 620 is configured to query a metadata database according to a parameter obtaining request sent by a first target working node, and obtain a task type and a total task load amount that are matched with the parameter obtaining request and feed back the task type and the total task load amount to the first target working node; the method comprises the steps that a metadatabase stores a mapping relation between working nodes and node parameters, wherein the mapping relation is written into the metadatabase by a user through a front-end operation platform; the node parameters include: task type and total task load; according to a task obtaining request sent by a second target working node, extracting the target task request number and the target task type matched with the task obtaining request; and acquiring the task feedback matched with the target task request number and the target task type to the second target node.
Optionally, the master node 620 is specifically configured to: querying a state database according to the target task type, and acquiring a target task list to be distributed, which is matched with the target task type; a task identifier list to be distributed corresponding to each task type is stored in the state database; acquiring a task identifier set matched with the target task request number in the target task list to be distributed; and requesting metadata of the task matched with the task identification set from a metadata database, and feeding the metadata of the task back to the second target working node, wherein the metadata database stores the mapping relation between the task identification and the metadata of the task.
Optionally, the master node 620 is further configured to: inquiring the state database according to state reporting information sent by a third target working node, and acquiring a target current execution task identifier set and a target distributed task state set which are matched with the third target working node; the state database stores a current execution task identification set and an allocated task state set which respectively correspond to each working node; extracting each task identifier of which the task state is finished from the state reporting information, and removing each task identifier from the target current execution task identifier set; updating the target distributed task state set according to the task state of each task identifier included in the state reporting information, and adding a heartbeat timestamp of the state reporting information into an updating result; and/or
The master node 620 is further configured to: and after the metadata of the task matched with the task identification set is requested from a metadata base and fed back to the second target working node, updating a currently executed task identification set and an allocated task state set corresponding to the second target node according to the task identification set.
In this embodiment, the status database 630 includes a list of task identifiers to be allocated corresponding to each task type, where each list of task identifiers to be allocated stores task identifiers of tasks to be processed corresponding to a specified task type, where the task identifiers are used to distinguish different tasks. The metadata base 640 is configured to store metadata of a task corresponding to a task identifier of a task to be processed, and node parameters corresponding to node identifiers of work nodes, where the metadata of the task may include task parameters of initialization configuration such as a task acquisition address and a task type corresponding to the task identifier, and may also include task submission information such as time for a user to submit the task.
In this embodiment, in order to enable each master node 620 to provide services for the working nodes 610, the plurality of master nodes 620 are configured to share the state database 630 and the metadata database 640, so that each master node 620 can obtain the relevant data from the state database and/or the metadata database according to the task identifier.
In this embodiment, after extracting the target task type and the target task request number from the received task obtaining request, the master node 620 may search the state database 630, according to the target task type, for the target to-be-allocated task list matching the target task type, where the target to-be-allocated task list includes a plurality of task identifiers in the ready state. The master node 620 selects from the target to-be-allocated task list a number of task identifiers equal to the target task request number as target task identifiers, then looks up in the metadata database 640 the metadata of the tasks corresponding to the target task identifiers, and returns the metadata to the second target working node, so that the second target working node obtains and processes the corresponding tasks according to the task metadata.
In this embodiment, the state database stores only the to-be-allocated task identifiers corresponding to each task type. The stored data is therefore compact, which constitutes lightweight data storage and facilitates high-concurrency task scheduling. In addition, by receiving task obtaining requests actively sent by the working nodes, the master node can directly distribute the matching to-be-processed tasks to the working nodes without polling the working nodes and performing the related calculation, so that the amount of calculation of the master node is reduced and high task concurrency is easier to achieve.
Optionally, the system further includes: a front-end operating platform 650, which is communicatively connected to the metadata database 640 and the state database 630, respectively;
the front-end operating platform 650 is configured to generate metadata of the plurality of tasks matching the task parameters according to the task parameters configured in the visual task operating interface by the user, and store the corresponding relationship between the metadata of the tasks and the task identifiers in the metadata database; and determining task types respectively corresponding to the tasks according to at least one task type configured on the task operation interface by the user, and storing the corresponding relation between the task types and the task identifiers in a state database.
In this embodiment, the front-end operating platform 650 is user-facing and provides a visual task operating interface. Through simple click operations on this interface, the user can configure a task, for example by setting the task type or selecting suitable work nodes to process it. The platform also hides the complex functions of the distributed task scheduling system from the user, so the user does not need to develop against them, which makes the system easy to use.
In this embodiment, the front-end operating platform 650 may obtain, through the visual task operating interface, the initialization task parameters that the user configures for a task being brought online, for example the address from which the task is obtained, the task type (a first type of task), and the work nodes D-F that are to process it. It then constructs the metadata of the task from the user-configured task parameters and information such as the time at which the user submitted the task, and stores the metadata together with the task identifier in the metadata database 640. Meanwhile, so that the master node can query the to-be-processed tasks corresponding to a specified task type, after the user brings the tasks online the platform determines the task type of each task and stores the correspondence between task types and task identifiers in the state database 630.
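A minimal sketch of this submission flow follows. It assumes dictionaries in place of the metadata database 640 and the state database 630; the function name `submit_task`, the metadata field names, and the use of a random UUID as task identifier are hypothetical choices made for illustration only.

```python
import time
import uuid


def submit_task(task_params, task_type, metadata_db, state_db, now=None):
    """Record a newly submitted task in both stores and return its ID.

    metadata_db: task ID -> metadata (stand-in for the metadata database).
    state_db: task type -> list of to-be-allocated task IDs
              (stand-in for the state database).
    """
    task_id = uuid.uuid4().hex  # hypothetical identifier scheme
    submitted_at = time.time() if now is None else now
    # Metadata is built from the user-configured parameters plus the
    # submission time.
    metadata_db[task_id] = {
        **task_params,
        "task_type": task_type,
        "submitted_at": submitted_at,
    }
    # The state store keeps only the identifier, grouped by task type.
    state_db.setdefault(task_type, []).append(task_id)
    return task_id


metadata_db, state_db = {}, {}
tid = submit_task(
    {"address": "store://video-1", "nodes": ["D", "E", "F"]},
    "type-1",
    metadata_db,
    state_db,
    now=100.0,
)
```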
Optionally, the front-end operating platform 650 is further configured to: generate a mapping relation between the work nodes and node parameters according to the node parameter configuration information configured by the user on the task operation interface, and store the mapping relation in the metadata database; wherein the node parameters include: task type and total task load.
In this embodiment, when configuring initialization task parameters for a task through the visual task operation interface, the user may select designated work nodes to process the task and configure node parameters for the selected work nodes, for example the task types each work node can process and its total task load. The work nodes can thus be managed and partitioned in a unified way: after initialization, each work node acquires tasks to process on its own, and the master node does not have to search for a work node capable of processing each task.
Optionally, the front-end operating platform 650 is further configured to: in response to a state query request for a target task input by the user in the task operation interface, acquire target metadata matching the target task from the metadata database, acquire the current task state matching the target task and the heartbeat timestamp of its latest state update from the allocated task state sets of the state database, and display them visually; and/or
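The node-parameter mapping can be pictured as a small lookup table that the front-end writes and the master node reads when a work node asks for its parameters. The sketch below assumes a dictionary in place of the metadata database; `configure_node`, `get_node_params`, and the field names are hypothetical.

```python
def configure_node(node_id, task_type, total_load, node_params):
    """Store the node parameters a user configured for one work node."""
    if total_load < 0:
        raise ValueError("total task load must not be negative")
    node_params[node_id] = {"task_type": task_type, "total_load": total_load}


def get_node_params(node_id, node_params):
    """Answer a parameter acquisition request; None if the node is unknown."""
    params = node_params.get(node_id)
    # Return a copy so callers cannot modify the stored mapping by accident.
    return dict(params) if params is not None else None


node_params = {}  # stand-in for the mapping held in the metadata database
configure_node("worker-D", "type-1", 8, node_params)
reply = get_node_params("worker-D", node_params)
```

Because each work node learns its task type and total load from this mapping, the master node only has to serve lookups rather than match every task against every node.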
The front-end operations platform 650 is also used to: and responding to a state query request input by a user in a task operation interface and aiming at the target working node, acquiring a current execution task identification set and an allocated task state set matched with the target working node from a state database, and performing visual display.
In this embodiment, when the front-end operating platform 650 detects a state query request of a user for a target task through a visual task operating interface, the front-end operating platform may extract a target task identifier from the state query request, obtain target metadata matched with the target task from a metadata base according to the target task identifier, obtain a current task state matched with the target task identifier and a heartbeat timestamp of a latest update state from each assigned task state set of the state database, and display the obtained data to the user through the visual task operating interface.
In this embodiment, when the front-end operating platform 650 detects, through the visual task operating interface, a status query request input by a user for a target work node, the current execution task identifier set and the assigned task status set that are matched with the target work node may be obtained from the status database according to a node identifier extracted from the status query request, and the identifier of the current processing task in the current execution task identifier set and the task status information generated in the processing process of each task included in the assigned task status set are visually displayed.
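The two query paths above can be sketched as follows. The sketch assumes that the state database exposes, per work node, a set of currently executing task identifiers and a map from task identifier to state and heartbeat timestamp; these structures, the function names, and the field names are illustrative assumptions.

```python
def query_task_status(task_id, metadata_db, assigned_states):
    """Look up one task's metadata, current state and last heartbeat.

    assigned_states: worker ID -> {task ID -> {"state", "heartbeat"}}.
    Returns None when no work node has a record for the task.
    """
    for worker_id, states in assigned_states.items():
        if task_id in states:
            entry = states[task_id]
            return {
                "metadata": metadata_db.get(task_id),
                "worker": worker_id,
                "state": entry["state"],
                "heartbeat": entry["heartbeat"],
            }
    return None


def query_worker_status(worker_id, executing, assigned_states):
    """Return the executing task IDs and assigned-task states of one node."""
    return {
        "executing": sorted(executing.get(worker_id, set())),
        "assigned": dict(assigned_states.get(worker_id, {})),
    }


# Hypothetical sample data.
metadata_db = {"t1": {"address": "store://t1"}}
executing = {"worker-D": {"t1"}}
assigned_states = {
    "worker-D": {"t1": {"state": "running", "heartbeat": 1700000000}}
}
task_view = query_task_status("t1", metadata_db, assigned_states)
worker_view = query_worker_status("worker-D", executing, assigned_states)
```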
In this embodiment, the state database 630 may be a Redis database. Redis stores data in memory as key-value pairs and either periodically writes updated data to disk or appends modification operations to an append-only log file, and implements master-slave synchronization on that basis. Because task state information places high performance demands on storage while task metadata places low ones, the two are stored separately: task state information in the Redis database and task metadata in the metadata database.
In this embodiment, the Redis database serves as a caching tool shared by the plurality of master nodes and work nodes, which enables distributed task scheduling and improves the concurrency of task scheduling.
In the distributed task scheduling system provided by this embodiment of the invention, each work node randomly selects one of the plurality of master nodes to connect to and sends its task acquisition request to the master node it connected to successfully; that master node obtains the target tasks matching the received request and returns them to the work node that sent the request.
Example seven
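The random selection of a master node, together with the re-selection on timeout recited in claim 3, can be sketched as below. The function `request_with_failover`, the `send` callback, and the policy of shuffling the master list once and trying each candidate in turn are assumptions made for illustration; the embodiment only states that a master node is chosen at random and that a new one is determined when no reply arrives within the waiting time.

```python
import random


def request_with_failover(masters, send, max_attempts=3, rng=random):
    """Send a request to a randomly chosen master node, retrying on timeout.

    send(master) is a caller-supplied function that returns the reply or
    raises TimeoutError when no reply arrives within the waiting time.
    Returns (master, reply) for the first master that answers.
    """
    candidates = list(masters)
    rng.shuffle(candidates)  # random order is equivalent to a random pick
    for master in candidates[:max_attempts]:
        try:
            return master, send(master)
        except TimeoutError:
            continue  # determine a new target master node and resend
    raise RuntimeError("no master node responded")


def fake_send(master):
    """Hypothetical transport: master-1 never answers, the others do."""
    if master == "master-1":
        raise TimeoutError("no reply within the waiting time")
    return {"tasks": ["t1"]}


master, reply = request_with_failover(["master-1", "master-2"], fake_send)
```

Since any master node can serve any work node, a work node in this sketch needs no knowledge of which master node is healthy before it sends a request.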
Fig. 7 is a schematic structural diagram of a node device in the seventh embodiment of the present invention. FIG. 7 illustrates a block diagram of an exemplary node device 12 suitable for use in implementing embodiments of the present invention. The node device 12 shown in fig. 7 is only an example, and should not bring any limitation to the function and the scope of use of the embodiment of the present invention.
As shown in FIG. 7, node device 12 is in the form of a general purpose computing device. The components of node device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Node device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by node device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. Node devices 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 7, and commonly referred to as a "hard drive"). Although not shown in FIG. 7, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Node device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the node device 12, and/or with any devices (e.g., network card, modem, etc.) that enable the node device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, node device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via network adapter 20. As shown, the network adapter 20 communicates with the other modules of the node device 12 via the bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with node device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, to implement the distributed task scheduling method provided by the embodiment of the present invention.
Namely: a distributed task scheduling method is realized, the method is executed by a working node in a distributed scheduling system, the distributed scheduling system comprises a plurality of main nodes, and the method comprises the following steps:
sending a parameter acquisition request to at least one main node periodically, and receiving a task type and a task load total amount fed back by the main node for local storage;
when a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in processing and the total amount of task loads, and constructing a task acquisition request according to the number of requested tasks and the type of the tasks;
and sending a task obtaining request to a first target main node determined in the plurality of main nodes, wherein the task obtaining request is used for indicating the first target main node to obtain and feed back tasks matched with the requested task number and the task type.
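The work-node steps above can be sketched as a small request builder. The text says only that the number of requested tasks is calculated from the number of tasks in processing and the total task load; taking the difference between the two, as done here, is an assumption, and `build_task_request` and its field names are hypothetical.

```python
def build_task_request(task_type, total_load, in_progress):
    """Build a task acquisition request, or None if the node is full.

    total_load: total task load fed back by the master node and stored locally.
    in_progress: number of tasks the work node is currently processing.
    """
    # Assumed rule: request exactly the spare capacity, never a negative count.
    requested = max(total_load - in_progress, 0)
    if requested == 0:
        return None
    return {"task_type": task_type, "request_count": requested}


request = build_task_request("type-1", total_load=8, in_progress=5)
```

Under this assumed rule a work node never asks for more tasks than it can hold, so the master node does not need to track node capacity itself.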
Or, a distributed task scheduling method is realized, the method is executed by a main node included in a distributed scheduling system, the distributed scheduling system includes a plurality of main nodes, and the method includes:
querying a metadata database according to a parameter acquisition request sent by a first target working node, acquiring the task type and the total task load matching the parameter acquisition request, and feeding them back to the first target working node;
the metadata base stores a mapping relation between the working nodes and the node parameters, and the mapping relation is written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and total task load;
extracting a target task request number and a target task type matched with the task acquisition request according to the task acquisition request sent by the second target working node;
and acquiring the tasks matching the target task request number and the target task type, and feeding them back to the second target working node.
Example eight
The eighth embodiment of the invention also discloses a computer storage medium, wherein a computer program is stored on the computer storage medium, and the program is executed by a processor to realize the distributed task scheduling method.
Namely: a distributed task scheduling method is realized, the method is executed by a working node in a distributed scheduling system, the distributed scheduling system comprises a plurality of main nodes, and the method comprises the following steps:
sending a parameter acquisition request to at least one main node periodically, and receiving a task type and a task load total amount fed back by the main node for local storage;
when a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in processing and the total amount of task loads, and constructing a task acquisition request according to the number of requested tasks and the type of the tasks;
and sending a task obtaining request to a first target main node determined in the plurality of main nodes, wherein the task obtaining request is used for indicating the first target main node to obtain and feed back tasks matched with the requested task number and the task type.
Or, a distributed task scheduling method is realized, the method is executed by a main node included in a distributed scheduling system, the distributed scheduling system includes a plurality of main nodes, and the method includes:
querying a metadata database according to a parameter acquisition request sent by a first target working node, acquiring the task type and the total task load matching the parameter acquisition request, and feeding them back to the first target working node;
the metadata base stores a mapping relation between the working nodes and the node parameters, and the mapping relation is written into the metadata base by a user through a front-end operation platform; the node parameters include: task type and total task load;
extracting a target task request number and a target task type matched with the task acquisition request according to the task acquisition request sent by the second target working node;
and acquiring the tasks matching the target task request number and the target task type, and feeding them back to the second target working node.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A distributed task scheduling method, wherein the method is executed by a work node included in a distributed scheduling system, the distributed scheduling system including a plurality of master nodes, the method comprising:
sending a parameter acquisition request to at least one main node periodically, and receiving a task type and a task load total amount fed back by the main node for local storage;
when a new task processing condition is detected, calculating the number of requested tasks according to the number of tasks in processing and the total amount of task loads, and constructing a task obtaining request according to the number of requested tasks and the task type;
and sending the task obtaining request to a first target main node determined in the plurality of main nodes, wherein the task obtaining request is used for indicating the first target main node to obtain and feed back tasks matched with the requested task number and the task type.
2. The method of claim 1, further comprising:
when a task state reporting condition is detected, task state information of at least one task currently processed is acquired, wherein the task state information comprises: mapping relation between task identification and task state;
and sending the task state information to a second target main node determined in the plurality of main nodes, wherein the task state information is used for indicating the second target main node to store the received task state information.
3. The method of claim 1 or 2, further comprising, after sending the task fetch request to a first target master node determined among the plurality of master nodes:
if the task fed back by the first target main node is not received within a first waiting time, determining a new target main node in the plurality of main nodes again, and sending the task acquisition request to the new target main node again; and/or
And if the task state updating success response fed back by the second target main node is not received within a second waiting time, re-determining a new target main node in the plurality of main nodes, and re-sending the current task state information to the new target main node.
4. A distributed task scheduling method, wherein the method is executed by a master node included in a distributed scheduling system, the distributed scheduling system including a plurality of master nodes, the method comprising:
according to a parameter acquisition request sent by a first target working node, querying a metadata base, and acquiring a task type and a task load total amount which are matched with the parameter acquisition request and feeding back the task type and the task load total amount to the first target working node;
wherein the metadata database stores a mapping relation between working nodes and node parameters, the mapping relation being written into the metadata database by a user through a front-end operation platform; the node parameters include: task type and total task load;
according to a task obtaining request sent by a second target working node, extracting the target task request number and the target task type matched with the task obtaining request;
and acquiring the tasks matching the target task request number and the target task type and feeding them back to the second target working node.
5. The method of claim 4, wherein acquiring the tasks matching the target task request number and the target task type and feeding them back to the second target working node comprises:
querying a state database according to the target task type, and acquiring a target task list to be distributed, which is matched with the target task type; a task identifier list to be distributed corresponding to each task type is stored in the state database;
acquiring a task identifier set matched with the target task request number in the target task list to be distributed;
and requesting metadata of the task matched with the task identification set from a metadata database, and feeding the metadata of the task back to the second target working node, wherein the metadata database stores the mapping relation between the task identification and the metadata of the task.
6. The method of claim 5, further comprising:
inquiring the state database according to state reporting information sent by a third target working node, and acquiring a target current execution task identifier set and a target distributed task state set which are matched with the third target working node;
the state database stores a current execution task identification set and an allocated task state set which respectively correspond to each working node;
extracting each task identifier of which the task state is finished from the state reporting information, and removing each task identifier from the target current execution task identifier set;
updating the target distributed task state set according to the task state of each task identifier included in the state reporting information, and adding a heartbeat timestamp of the state reporting information into an updating result;
and/or
After the metadata of the task matched with the task identification set is requested from a metadata base and fed back to the second target work node, the method further comprises the following steps:
and updating the current execution task identifier set and the distributed task state set corresponding to the second target node according to the task identifier set.
7. A distributed task scheduling system, comprising: the system comprises a plurality of working nodes, a plurality of main nodes, a state database and a metadata database, wherein the main nodes are respectively in communication connection with the working nodes, the state database and the metadata database, and the system comprises:
the working node for performing the method of any one of claims 1-3;
the master node for performing the method of any one of claims 4-6;
the state database is used for storing a task identifier list to be distributed corresponding to each task type, a current execution task identifier set corresponding to each working node and a distributed task state set;
the metadata base is used for storing mapping relations between task identifications and metadata of tasks and mapping relations between working nodes and node parameters, and the node parameters comprise: task type and total amount of task load.
8. The system of claim 7, further comprising: the front-end operation platform is in communication connection with the metadata database and the state database respectively;
the front-end operating platform is used for generating metadata of a plurality of tasks matched with the task parameters according to the task parameters configured in the visual task operating interface by the user, and storing the corresponding relationship between the metadata of the tasks and the task identifiers in a metadata database;
and determining task types respectively corresponding to the tasks according to at least one task type configured on a task operation interface by a user, and storing the corresponding relation between the task types and the task identifiers in a state database.
9. The system of claim 8, wherein the front-end operations platform is further configured to:
generating a mapping relation between a working node and a node parameter according to the node parameter configuration information configured by the user on the task operation interface, and storing the mapping relation in the metadata base;
wherein the node parameters include: task type and total amount of task load.
10. The system of claim 8, wherein the front-end operations platform is further configured to:
responding to a state query request aiming at a target task input in the task operation interface by a user, acquiring target metadata matched with the target task from the metadata database, acquiring a current task state matched with the target task and a heartbeat timestamp in the latest updating state from each distributed task state set of the state database, and visually displaying the current task state and the heartbeat timestamp; and/or
The front-end operating platform is further configured to: and responding to a state query request input by a user in the task operation interface and aiming at a target working node, and acquiring a current execution task identification set and an allocated task state set matched with the target working node from the state database for visual display.
CN202010732336.2A 2020-07-27 Distributed task scheduling method and system Active CN111897638B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010732336.2A CN111897638B (en) 2020-07-27 Distributed task scheduling method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010732336.2A CN111897638B (en) 2020-07-27 Distributed task scheduling method and system

Publications (2)

Publication Number Publication Date
CN111897638A true CN111897638A (en) 2020-11-06
CN111897638B CN111897638B (en) 2024-04-19

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant