CN111708627A - Task scheduling method and device based on distributed scheduling framework - Google Patents


Publication number
CN111708627A
Authority
CN
China
Prior art keywords
task
fragmentation
execution
tasks
scheduling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010575887.2A
Other languages
Chinese (zh)
Other versions
CN111708627B (en)
Inventor
吴永晖
Current Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202010575887.2A
Publication of CN111708627A
Application granted
Publication of CN111708627B
Legal status: Active


Classifications

    • G06F9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/5038: Allocation of resources to service a request, the resource being a machine, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/505: Allocation of resources to service a request, the resource being a machine, considering the load
    • G06F9/5083: Techniques for rebalancing the load in a distributed system
    • G06F2209/484: Indexing scheme relating to G06F9/48: Precedence
    • G06F2209/5021: Indexing scheme relating to G06F9/50: Priority
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present application relates to the field of big data technologies, and in particular to a task scheduling method and apparatus based on a distributed scheduling framework. The method comprises the following steps: receiving a task scheduling request, wherein the request carries an identifier of a task to be scheduled; acquiring the task size of the task to be scheduled corresponding to the identifier; performing fragmentation processing on the task to be scheduled according to the task size to obtain fragmentation tasks, and acquiring the priority of each fragmentation task according to preset logic; calculating the task load rate of each execution node in the distributed scheduling framework; and allocating each fragmentation task to the execution nodes according to the task load rates and the priorities, so as to instruct each execution node to perform task scheduling on its allocated fragmentation tasks. By adopting this method, task scheduling efficiency can be improved. In addition, the invention also relates to blockchain technology: the working state of each execution node is stored in a blockchain.

Description

Task scheduling method and device based on distributed scheduling framework
Technical Field
The present application relates to the field of big data technologies, and in particular, to a task scheduling method and apparatus based on a distributed scheduling framework.
Background
In the field of database asynchronous scheduling, an asynchronous scheduling framework typically triggers jobs according to a configured time expression. As a result, a scheduling run is still launched whenever the time condition is met, even when no task is pending, which wastes system computing resources.
In the conventional technology, task scheduling can be performed in a distributed environment, but a given task must be restricted to run on a single node during scheduling, and this mode of computation severely under-utilizes the computing capacity of the distributed environment.
Disclosure of Invention
In view of the above, it is desirable to provide a task scheduling method and device based on a distributed scheduling framework, which can improve task scheduling efficiency.
A task scheduling method based on a distributed scheduling framework comprises the following steps:
receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled;
acquiring the task size of a task to be scheduled corresponding to the task identifier to be scheduled;
performing fragmentation processing on a task to be scheduled according to the size of the task to obtain fragmentation tasks, and acquiring the priority of each fragmentation task according to preset logic;
calculating the task load rate of each execution node in the distributed scheduling framework;
and distributing each fragmentation task to each execution node according to the task load rate and the priority so as to instruct each execution node to perform task scheduling on the distributed fragmentation tasks.
In one embodiment, calculating the task load rate of each executing node in the distributed scheduling framework includes:
acquiring task states corresponding to all fragmentation tasks in execution nodes in a distributed scheduling framework, wherein the task states comprise a completed state and an uncompleted state;
acquiring a first number of fragmentation tasks in a finished state and a second number of fragmentation tasks in an unfinished state;
calculating a ratio of the first quantity to the second quantity;
and obtaining the task load rate of each execution node according to the ratio.
In one embodiment, allocating each fragmentation task to each execution node according to each task load rate and each priority includes:
and sequentially distributing the slicing tasks to the execution nodes with the task load rates from low to high according to the sequence of the priorities from high to low.
In one embodiment, after calculating the task load rate of each execution node in the distributed scheduling framework, the method further includes:
obtaining the calculation performance index of each execution node according to the task load rate;
when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, obtaining the number of fragmentation tasks that the calculation performance index can handle, allocating that number of fragmentation tasks to the execution nodes, and storing the remaining fragmentation tasks in a message queue; and when the task state of a fragmentation task in an execution node changes to the completed state, extracting fragmentation tasks from the message queue and allocating them to the execution nodes, until all fragmentation tasks have been allocated.
In one embodiment, after obtaining the calculation performance index of the execution node according to the task load rate, the method further includes:
when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, adding a preset number of execution nodes according to the calculation performance index;
allocating each fragmentation task to each execution node according to each task load rate, comprising:
and allocating each fragmentation task to the existing execution nodes and the newly added execution nodes according to each task load rate.
In one embodiment, allocating each fragmentation task to each execution node according to each task load rate includes:
acquiring the working state of each execution node, wherein the working state comprises a normal state and a fault state;
and distributing each slicing task to each execution node in a normal state according to the load rate of each task.
In one embodiment, the method further comprises:
establishing a proportional relation according to the task load rate of each execution machine;
and according to the proportional relation, performing fragmentation processing on the task to be scheduled to obtain a fragmentation task, and distributing the fragmentation task to the execution machine for task scheduling.
A task scheduling device based on a distributed scheduling framework comprises:
the request receiving module is used for receiving a task scheduling request, and the task scheduling request carries a task identifier to be scheduled;
the task size obtaining module is used for obtaining the task size of the task to be scheduled corresponding to the task identifier to be scheduled;
the fragment task module is used for carrying out fragment processing on the task to be scheduled according to the size of the task to obtain a fragment task and acquiring the priority of each fragment task according to preset logic;
the load rate calculation module is used for calculating the task load rate of each execution node in the distributed scheduling framework;
and the distribution module is used for distributing each fragmentation task to each execution node according to the task load rate and the priority so as to instruct each execution node to carry out task scheduling on the distributed fragmentation task.
A computer device comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the above method.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
According to the task scheduling method and device based on the distributed scheduling framework, the main node receives a task scheduling request carrying an identifier of a task to be scheduled; acquires the task size of the task to be scheduled corresponding to the identifier; and performs fragmentation processing on the task to be scheduled according to the task size to obtain fragmentation tasks, so that the original scheduling task is decomposed. The main node then calculates the task load rate of each execution node in the distributed scheduling framework and the priority of each fragmentation task, allocates the decomposed fragmentation tasks to the execution nodes according to the task load rates and priorities, and instructs each execution node to schedule its allocated fragmentation tasks according to a preset rule. Because the task to be scheduled is decomposed and distributed to a plurality of execution nodes that schedule it simultaneously, and the fragmentation tasks are executed in order of priority, task scheduling efficiency is improved.
Drawings
FIG. 1 is a diagram illustrating an application scenario of a task scheduling method based on a distributed scheduling framework according to an embodiment;
FIG. 2 is a flowchart illustrating a task scheduling method based on a distributed scheduling framework according to an embodiment;
FIG. 3 is a flowchart illustrating a method for calculating a task load rate of each executing node in a distributed scheduling framework according to an embodiment;
FIG. 4 is a block diagram of a task scheduler based on a distributed scheduling framework in one embodiment;
FIG. 5 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The task scheduling method based on the distributed scheduling framework can be applied to the application environment shown in fig. 1, in which the master node 102 and the execution nodes 103 communicate via a network. The master node 102 receives a task scheduling request carrying an identifier of a task to be scheduled; acquires the task size of the task to be scheduled corresponding to the identifier; performs fragmentation processing on the task to be scheduled according to the task size to obtain fragmentation tasks, and acquires the priority of each fragmentation task according to preset logic; calculates the task load rate of each execution node in the distributed scheduling framework; and allocates each fragmentation task to the execution nodes 103 according to the task load rates and priorities, so as to instruct each execution node 103 to perform task scheduling on its allocated fragmentation tasks.
In one embodiment, as shown in fig. 2, a task scheduling method based on a distributed scheduling framework is provided, which is described by taking the method as an example applied to the master node 102 in fig. 1, and the method includes the following steps:
step 210, receiving a task scheduling request, where the task scheduling request carries an identifier of a task to be scheduled.
Task scheduling refers to the process by which a task acquires computing resources. Various enterprise applications encounter task scheduling requirements, for example computing the points ranking of forum users every morning; in general, doing specific things at specific times.
In particular, the scheduling framework corresponding to task scheduling may belong to a distributed execution framework: a distributed application runs on multiple systems of a network at a given time, coordinating them to complete a particular task quickly and efficiently. The group of systems on which the distributed application runs is called a cluster; each machine running in the cluster is called a node. Nodes can be divided into a master node and execution nodes (worker nodes), and may further include a monitor node. More specifically, the distributed scheduling system is based on the ZooKeeper framework and requires a ZooKeeper cluster environment (at least three nodes), which prevents a single point of failure in the ZooKeeper cluster from directly crashing the entire scheduling system. The ZooKeeper cluster is responsible for electing a master in the distributed scheduling cluster: the elected node becomes the main node, and the other nodes serve as worker nodes, i.e. the execution nodes. The execution nodes can also monitor the health state of the main node, and when the main node fails, a new main node is elected according to a preset rule.
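The election described above follows ZooKeeper's standard leader-election recipe: each node creates an ephemeral sequential znode under an election path, and the node holding the lowest sequence number becomes the master; when its session dies, the survivors re-elect. A minimal in-memory simulation of that rule (the class and method names are illustrative, not from the patent):

```python
class ElectionRegistry:
    """In-memory stand-in for a ZooKeeper election path holding
    ephemeral sequential znodes."""

    def __init__(self):
        self._seq = 0
        self.znodes = {}  # znode name -> node id

    def register(self, node_id):
        # Mimics create("/election/n_", ephemeral=True, sequence=True)
        name = f"n_{self._seq:010d}"
        self._seq += 1
        self.znodes[name] = node_id
        return name

    def master(self):
        # The node with the lowest sequence number is the elected master.
        return self.znodes[min(self.znodes)]

    def crash(self, node_id):
        # Ephemeral znodes vanish when their session dies; the rest re-elect.
        self.znodes = {n: i for n, i in self.znodes.items() if i != node_id}


registry = ElectionRegistry()
for node in ["node-a", "node-b", "node-c"]:  # cluster of at least three nodes
    registry.register(node)

print(registry.master())   # node-a registered first, so it is master
registry.crash("node-a")   # master fails; its ephemeral znode disappears
print(registry.master())   # node-b takes over as the new master
```

In a real deployment this logic is handled by a ZooKeeper client library watching the znode just below its own, rather than polling the full list.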
Specifically, the master node receives a task scheduling request sent by the user terminal, where the task scheduling request may include information of a task to be scheduled, an execution time of the task to be scheduled, a task size of the task to be scheduled, and the like, and then the master node performs scheduling of the task according to the task scheduling request.
Step 220, obtaining the task size of the task to be scheduled corresponding to the task identifier to be scheduled.
Specifically, the master node obtains task attributes of the task to be scheduled, where the task attributes may include information such as a task size and a task priority of the task to be scheduled.
And step 230, performing fragmentation processing on the task to be scheduled according to the size of the task to obtain a fragmentation task, and acquiring the priority of each fragmentation task according to preset logic.
Specifically, the main node extracts the size of the task in the task attribute, and performs fragmentation processing on the task to be scheduled according to the size of the task to obtain a plurality of fragmentation tasks. The fragmentation processing is to divide the task to be scheduled into a plurality of subtasks, for example, when the task to be scheduled is large, the task to be scheduled can be divided into a plurality of subtasks, and then the plurality of subtasks can be respectively allocated to different execution nodes to be processed in parallel, so that the processing efficiency of the server on the task to be scheduled is improved.
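As a sketch of the fragmentation step, splitting by task size can be as simple as cutting the task's record range into sub-ranges no larger than a chosen fragment size (the size rule here is an assumption for illustration; the patent only says the fragment count follows the task size):

```python
def fragment_task(task_size, max_fragment_size):
    """Split a task of `task_size` records into fragmentation tasks of at
    most `max_fragment_size` records each, as (start, end) half-open ranges."""
    fragments = []
    start = 0
    while start < task_size:
        end = min(start + max_fragment_size, task_size)
        fragments.append((start, end))
        start = end
    return fragments


print(fragment_task(10_000, 3_000))
# four fragments: three of 3000 records and one of 1000
```

Each (start, end) pair can then be handed to a different execution node for parallel processing.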
And 240, calculating the task load rate of each execution node in the distributed scheduling framework.
The master node obtains status information of each execution node, wherein the status information may include a task load rate of each execution node. Specifically, when the task load rate of the execution node is high, it indicates that the execution node has a low capability of executing the task, and when the task load rate of the execution node is low, it indicates that the execution node has a high capability of executing the task.
And step 250, distributing each fragmentation task to each execution node according to the task load rate and the priority so as to instruct each execution node to perform task scheduling on the distributed fragmentation tasks.
The main node allocates each fragmentation task to the execution nodes according to the priority of each fragmentation task, so as to instruct each execution node to schedule its allocated fragmentation tasks according to a preset rule. For example, the main node preferentially allocates fragmentation tasks to the execution nodes with lower task load rates, so that the task load rate of each execution node is balanced and computer resources are evenly distributed.
Specifically, the master node is responsible for fragmenting tasks that need distributed scheduling; the number of fragments is determined according to the task size. After fragmentation is completed, the fragmentation task information and priority information are published as ephemeral nodes in ZooKeeper, and the number of fragments allocated to each execution node is determined according to the number of tasks already running on it.
In this embodiment, when the asynchronous scheduling framework is integrated, it can be used in a distributed environment. When task scheduling is performed, instead of scheduling a task on only one node, the task is fragmented according to its size and the fragmentation tasks are evenly distributed to all execution nodes for execution. All execution nodes can then be mobilized to participate in task execution, so that computer resources are used reasonably and task execution efficiency is greatly improved. Furthermore, the allocation of fragmentation tasks is performed by the main node, which is more intelligent than a distributed scheduling system without a main node and avoids contention for tasks, so the fragmentation tasks can be distributed more evenly. Because a task is divided into different fragmentation tasks according to its size, multiple execution nodes can process it in parallel, improving processing efficiency; and because the execution nodes are notified according to the priority of the fragmentation tasks, tasks with high priority can be executed preferentially.
In one embodiment, the task scheduling framework further comprises a monitoring node. The monitoring node monitors the task state of each execution node, and when a newly added fragmentation task on an execution node is detected, acquires the priority of the newly added fragmentation task; the execution node then schedules the newly added fragmentation task as indicated by the priority.
Further, scheduling the newly added fragmentation task as indicated by the priority comprises: when the newly added fragmentation task has the highest priority, instructing the execution node to execute it promptly and setting its execution state to "executing"; and when the newly added fragmentation task has finished, setting its execution state to "completed", so as to prevent the execution node from executing it repeatedly.
In this embodiment, task scheduling according to task priority is supported, so that tasks can be processed in order of priority and task execution capacity is improved.
In one embodiment, as shown in fig. 3, a flowchart of a method for calculating a task load rate of each execution node in a distributed scheduling framework is provided, where the method includes:
step 310, acquiring a task state corresponding to each slicing task in the execution node, where the task state includes a completed state and an uncompleted state.
Specifically, the execution node is configured to execute the fragmentation task allocated by the master node, and the monitoring node may monitor an execution state of the fragmentation task on the execution node in real time and send the execution state to the master node, so that the master node may obtain the execution state of the task on each execution node in real time, where it needs to be noted that the task incomplete state includes a task executing state and a task not yet started to be executed.
In step 320, a first number of fragmentation tasks in a completed state and a second number of fragmentation tasks in an uncompleted state are obtained.
For example, the master node records the number of tasks already distributed to each execution node, the first number of fragmentation tasks in the completed state on each node, and the second number of fragmentation tasks in the uncompleted state.
Step 330, calculate the ratio of the first quantity to the second quantity.
And 340, obtaining the task load rate of each execution node according to the ratio.
The task load rate of each execution node is then obtained from the ratio of the first number to the second number.
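The patent does not spell out the exact mapping from this ratio to a load rate. One plausible reading, sketched below with hypothetical names, is load = uncompleted / (completed + uncompleted), which falls as the completed-to-uncompleted ratio rises:

```python
def task_load_rate(completed, uncompleted):
    """Derive a node's task load rate from its counts of completed and
    uncompleted fragmentation tasks. This mapping is an assumption: it
    decreases as the completed:uncompleted ratio increases, matching the
    patent's intent that a high ratio means a lightly loaded node."""
    total = completed + uncompleted
    if total == 0:
        return 0.0  # an idle node carries no load
    return uncompleted / total


print(task_load_rate(8, 2))  # 0.2: mostly done, lightly loaded
print(task_load_rate(1, 9))  # 0.9: a backlog of unfinished fragments
```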
In one embodiment, allocating each fragmentation task to each execution node according to each task load rate and each priority includes: sequentially allocating the fragmentation tasks, in descending order of priority, to the execution nodes in ascending order of task load rate.
In this embodiment, the fragmentation task with a higher priority is preferentially allocated to the execution node with a lower task load rate, which not only fully utilizes the computing power of the execution node, but also ensures that the task with a higher priority can be preferentially executed, thereby ensuring effective execution of the task.
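The allocation rule described in this embodiment can be sketched as a greedy loop: take fragments from highest to lowest priority and hand each to the currently least-loaded node. The field names and the load-bump constant below are illustrative assumptions:

```python
def allocate(fragments, nodes):
    """Assign fragments (higher `priority` = more urgent) to nodes by their
    `load`: the highest-priority fragment goes to the least-loaded node.
    The per-assignment load bump is an assumed stand-in for the node's
    real load update."""
    by_priority = sorted(fragments, key=lambda f: f["priority"], reverse=True)
    assignment = {}
    for frag in by_priority:
        node = min(nodes, key=lambda n: n["load"])  # least-loaded node first
        assignment[frag["id"]] = node["id"]
        node["load"] += 0.1  # assumed bump so later fragments spread out
    return assignment


nodes = [{"id": "w1", "load": 0.2}, {"id": "w2", "load": 0.25}]
frags = [{"id": "f1", "priority": 1}, {"id": "f2", "priority": 9}]
print(allocate(frags, nodes))  # f2 lands on w1, then f1 goes to w2
```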
In one embodiment, after calculating the task load rate of each execution node in the distributed scheduling framework, the method further includes: obtaining the calculation performance index of each execution node according to the task load rate; when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, obtaining the number of fragmentation tasks that the calculation performance index can handle, allocating that number of fragmentation tasks to the execution nodes, and storing the remaining fragmentation tasks in a message queue; and when the task state of a fragmentation task in an execution node changes to the completed state, extracting fragmentation tasks from the message queue and allocating them to the execution nodes, until all fragmentation tasks have been allocated.
The calculation performance index represents each execution node's computing capacity for fragmentation tasks: the higher the index, the stronger the node's execution capacity. Specifically, the calculation performance index is inversely related to the task load rate. Further, when the load rates of the execution nodes are all high, the master node can throttle the task distribution speed, for example by storing tasks in a message queue in the master node and sending them later. Specifically, when the master node judges that the task load rates of the execution nodes cannot meet the requirement of processing all the fragmentation tasks, the fragmentation tasks that cannot yet be processed are stored in the message queue, and the task execution situation in each execution node is monitored in real time; whenever a fragmentation task reaches the completed state, an appropriate number of fragmentation tasks are extracted from the message queue and allocated to the execution nodes, until all fragmentation tasks in the message queue have been executed. It should be noted that the state of tasks on the execution nodes is monitored by the monitoring node, which reports the monitored task states to the master node, so that the master node can better distribute tasks to the execution nodes according to the received state information.
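The queue-and-drain behaviour above amounts to simple backpressure: dispatch up to capacity, park the overflow, and release one queued fragment whenever a running one completes. A minimal sketch (the capacity model and names are assumptions, not from the patent):

```python
from collections import deque


class Dispatcher:
    """Sketch of the overflow handling described above: fragments beyond
    the nodes' capacity wait in a message queue and are drained as running
    fragments complete."""

    def __init__(self, capacity):
        self.capacity = capacity  # fragments the nodes can run at once
        self.running = set()
        self.queue = deque()

    def submit(self, fragment):
        if len(self.running) < self.capacity:
            self.running.add(fragment)   # dispatch immediately
        else:
            self.queue.append(fragment)  # park the overflow in the queue

    def complete(self, fragment):
        self.running.discard(fragment)
        if self.queue:                   # a slot freed up: drain one fragment
            self.running.add(self.queue.popleft())


d = Dispatcher(capacity=2)
for f in ["f1", "f2", "f3"]:
    d.submit(f)
print(sorted(d.running), list(d.queue))  # ['f1', 'f2'] ['f3']
d.complete("f1")
print(sorted(d.running), list(d.queue))  # ['f2', 'f3'] []
```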
Further, when the master node determines that the calculation performance index of the execution node cannot meet the requirement for processing all the fragmentation tasks, the higher-priority fragmentation task can be preferentially allocated to the execution node to be executed, and the lower-priority fragmentation task is stored in the message queue.
In one embodiment, the method further comprises: establishing a proportional relation according to the task load rate of each execution machine; and according to the proportional relation, performing fragmentation processing on the task to be scheduled to obtain a fragmentation task, and distributing the fragmentation task to the execution machine for task scheduling.
Further, the main node may also construct a proportional relationship from the task load rate of each execution machine, and then fragment the task to be scheduled according to this proportional relationship, so that the task size of each fragmentation task corresponds to the proportional relationship. The fragmentation tasks are then allocated to the corresponding execution machines for task scheduling, for example allocating the larger fragmentation tasks to the execution machines with lower task load rates and the smaller fragmentation tasks to the execution machines with higher task load rates.
In this embodiment, the master node may also fragment the task to be scheduled according to the task load rate of each execution node, obtaining fragmentation tasks that match the performance index of each execution machine, so that the fragmentation tasks allocated to each execution machine exactly fit its performance index, thereby achieving reasonable allocation of fragmentation tasks and improving processing capacity for the task to be scheduled.
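One plausible reading of this proportional fragmentation, sketched with assumed names, weights each machine by its spare capacity (1 minus its load rate), so lightly loaded machines receive larger fragments:

```python
def proportional_fragments(task_size, load_rates):
    """Split `task_size` records across execution machines so each share is
    inversely proportional to the machine's task load rate. The
    inverse-weighting rule is an assumption, one way to realise the
    proportional relationship the embodiment describes."""
    weights = [1.0 - r for r in load_rates]  # spare capacity per machine
    total = sum(weights)
    sizes = [int(task_size * w / total) for w in weights]
    sizes[-1] += task_size - sum(sizes)      # absorb the rounding remainder
    return sizes


print(proportional_fragments(1000, [0.2, 0.6]))  # [666, 334]
```

The machine with load 0.2 has twice the spare capacity of the one with load 0.6, so it receives roughly twice the records.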
In this embodiment, the master node, the monitoring node, and the execution nodes interact to complete task scheduling together: the monitoring node monitors the task execution state of each execution node and reports it to the master node in time, helping the master node distribute the fragmentation tasks reasonably according to the task load rates; and when not all fragmentation tasks can be distributed at once, the master node can store them in a message queue first, preventing the computing capacity of an execution node from degrading because its fragmentation tasks exceed its task load.
In one embodiment, after obtaining the computing performance index of each execution node according to the task load rate, the method further includes: when the computing performance index cannot meet the requirement of processing all the fragmentation tasks, adding a preset number of execution nodes according to the computing performance index. Distributing each fragmentation task to each execution node according to each task load rate then includes: distributing each fragmentation task to the original execution nodes and the newly added execution nodes according to each task load rate.
Specifically, after the master node allocates the fragmentation tasks to the execution nodes, it also calculates the task load rate of each execution node. When a task load rate exceeds the node's capacity to process the tasks to be scheduled, the number of execution nodes may be increased until, after the fragmentation tasks have been allocated, the task load rate of each execution node falls within a preset range and the tasks to be scheduled can be completed.
In this embodiment, the scheduling framework executes in a distributed manner: all nodes participate in the computation, so execution efficiency is high, and the framework scales out easily; when computing capacity is insufficient, it can be raised by adding execution nodes. By ensuring that each execution node can execute its fragmentation tasks normally and efficiently, the efficiency of task execution is improved.
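The scale-out step can be sketched as follows. The load model (a fixed hypothetical load increment per fragmentation task) and the threshold are assumptions for illustration, not part of the embodiment:

```python
def scale_out(load_rates: dict, pending_tasks: int, threshold: float = 0.8) -> dict:
    """Add execution nodes until the projected per-node load falls
    within the preset range (threshold and load model are hypothetical)."""
    LOAD_PER_TASK = 0.1               # assumed load contributed by one fragmentation task
    nodes = dict(load_rates)
    n = 0
    while True:
        projected = (sum(nodes.values()) + pending_tasks * LOAD_PER_TASK) / len(nodes)
        if projected <= threshold:
            return nodes
        n += 1
        nodes[f"new-node-{n}"] = 0.0  # a newly added execution node starts idle
```

Each added node lowers the projected average load (the numerator is unchanged while the denominator grows), so the loop always terminates.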
In one embodiment, allocating each fragmentation task to each execution node according to each task load rate includes: acquiring the working state of each execution node, where the working state is either a normal state or a fault state; distributing each fragmentation task, according to each task load rate, to the execution nodes in the normal state; and storing the working state of each execution node in the blockchain.
To ensure that the fragmentation tasks are executed correctly, the master node monitors the working state of each execution node, which represents the node's health. When the working state of an execution node is the normal state, the node is healthy and can execute the task to be scheduled according to the preset task allocation rule. When the working state is the fault state, the node is unhealthy, and the fragmentation tasks on it need to be reallocated according to the preset rule. Specifically, when the health state of an execution node is a fault, the fragmentation tasks distributed to that node are extracted and redistributed to the execution nodes whose state is healthy.
It is emphasized that, to further ensure the security of the state of each execution node, the working state may also be stored in a node of a blockchain.
In this embodiment, the health state of the execution nodes is monitored in real time so that each execution node can execute its tasks normally; in particular, when an execution node fails, the failed server can be found in time and prevented from affecting normal task execution. Specifically, when an execution node fails, for example when its ephemeral node disappears, the master node collects the incomplete fragmentation tasks that were distributed to it and redistributes them evenly among the execution nodes that have not failed, so that a node failure does not leave fragmentation tasks unprocessed indefinitely.
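The even redistribution described above can be sketched as follows, assuming each node's pending fragmentation tasks are tracked in a plain dictionary (names are hypothetical):

```python
from itertools import cycle

def reallocate_from_failed(assignments: dict, failed: str) -> dict:
    """Evenly redistribute the unfinished fragmentation tasks of a
    failed execution node among the surviving nodes (round-robin)."""
    orphaned = assignments.pop(failed, [])
    survivors = cycle(sorted(assignments))  # deterministic round-robin order
    for task in orphaned:
        assignments[next(survivors)].append(task)
    return assignments

state = {"node-a": ["t1"], "node-b": ["t2"], "node-c": ["t3", "t4", "t5"]}
state = reallocate_from_failed(state, "node-c")
```

A production implementation would also have to skip tasks the failed node had already completed; the sketch assumes all its tasks were unfinished.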
In one embodiment, the execution nodes monitor the health state of the master node, and when the master node fails, a new master node is elected according to a preset rule. In this embodiment there is no single point of failure: the scheduling system is based on ZooKeeper, so failures of the master node and the execution nodes are detected immediately. When the master node fails, a new election is held; when an execution node fails, the master node reallocates its fragmentation tasks to the execution nodes that have not failed.
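The embodiment leaves the preset election rule open. One common ZooKeeper-style convention, simulated here in plain Python without a real ZooKeeper ensemble, elects the live candidate holding the smallest sequence number:

```python
def elect_master(ephemeral_nodes: dict) -> str:
    """Pick the live candidate with the smallest sequence number,
    mimicking ZooKeeper's sequential-ephemeral-node election recipe.

    ephemeral_nodes maps candidate name -> sequence number, or None
    if the candidate's ephemeral node has disappeared (node crashed).
    """
    live = {name: seq for name, seq in ephemeral_nodes.items() if seq is not None}
    if not live:
        raise RuntimeError("no live candidate to elect")
    return min(live, key=live.get)

# the old master crashes: its ephemeral node disappears
candidates = {"exec-1": 3, "exec-2": 1, "old-master": None}
new_master = elect_master(candidates)  # "exec-2"
```

Against a real ensemble, the disappearance of the ephemeral node is what the surviving nodes watch for to trigger this re-election.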
In summary, in the field of asynchronous scheduling based on traditional databases, the best-known framework at present is Quartz. Quartz supports asynchronous scheduling and accepts a cron time expression, which makes it very convenient and popular in enterprise development. Quartz nevertheless has the following disadvantages. It is suited only to single-node computation: even in a distributed environment, the same task must be confined to a single node, or it will be executed repeatedly, which wastes a great deal of computing power. Because it is a single-node scheduling framework, a failure of that node inevitably brings down the whole environment. Its scheduling modes are limited to timed or periodic scheduling; scheduling by task priority is not supported, and the fixed interval between schedules wastes system computing power. Finally, because scheduling is driven purely by the time expression, a scheduling run is triggered whenever the cron expression matches, even when there is no task, again wasting system computing resources.
According to the task scheduling method based on the distributed scheduling framework, the framework executes in a distributed manner: all nodes participate in the computation, so execution is efficient, and the framework scales out easily; when computing capacity is insufficient, it can be raised by adding execution nodes. There is no single point of failure: the scheduling system is based on ZooKeeper, failures of the master node and the execution nodes are detected immediately, a new election is held when the master node fails, and the master node reallocates the fragmentation tasks of a failed execution node to the nodes that have not failed. The distribution of fragmentation tasks is performed entirely by the master node, which is more intelligent than a masterless distributed scheduling system: there is no task contention, the fragmentation tasks can be distributed more evenly, and the execution nodes can be notified according to the priority of the fragmentation tasks so that high-priority tasks are executed first.
It should be understood that although the steps in the flowcharts of figs. 2-3 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not restricted to the order shown and may be performed in other orders. Moreover, at least some of the steps in figs. 2-3 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which need not be performed sequentially but may be performed in turn or alternately with other steps or with sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a task scheduling apparatus based on a distributed scheduling framework, including:
the request receiving module 410 is configured to receive a task scheduling request, where the task scheduling request carries an identifier to be scheduled for a task.
And a task size obtaining module 420, configured to obtain a task size of the task to be scheduled corresponding to the task identifier to be scheduled.
And the fragmentation task module 430 is configured to perform fragmentation processing on the task to be scheduled according to the size of the task to obtain a fragmentation task, and obtain the priority of each fragmentation task according to a preset logic.
And a load rate calculation module 440, configured to calculate a task load rate of each execution node in the distributed scheduling framework.
The allocating module 450 is configured to allocate each fragmentation task to each execution node according to the task load rate and the priority, so as to instruct each execution node to perform task scheduling on the allocated fragmentation task.
In one embodiment, the load rate calculation module 440 includes:
The task state acquisition unit, which is used for acquiring the task state corresponding to each fragmentation task in the execution node, where the task state is either a completed state or an uncompleted state.
The quantity acquiring unit, which is used for acquiring a first quantity of fragmentation tasks in the completed state and a second quantity of fragmentation tasks in the uncompleted state.
The ratio calculation unit, which is used for calculating the ratio of the first quantity to the second quantity.
The load rate calculation unit, which is used for obtaining the task load rate of each execution node according to the ratio.
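The chain of units above can be sketched as follows. The embodiment does not fix how the ratio maps to a load rate, so the mapping below (the pending fraction derived from the completed-to-uncompleted ratio) is only one hypothetical choice:

```python
def task_load_rate(completed: int, uncompleted: int) -> float:
    """Derive a task load rate from the ratio of completed to
    uncompleted fragmentation tasks on one execution node.

    With ratio r = completed / uncompleted, the value 1 / (1 + r)
    equals the fraction of tasks still pending, so a node with many
    uncompleted fragmentation tasks reports a rate close to 1.
    """
    if uncompleted == 0:
        return 0.0  # nothing pending: the node is idle
    ratio = completed / uncompleted
    return 1.0 / (1.0 + ratio)

rate = task_load_rate(completed=6, uncompleted=2)  # 0.25
```

Any monotone mapping from the ratio would serve the allocation logic equally well; only the ordering of nodes by load matters.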
In one embodiment, the allocation module 450 includes:
The first distribution unit, which is used for distributing the fragmentation tasks, in order of priority from high to low, to the execution nodes in order of task load rate from low to high.
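A minimal sketch of this priority-ordered, least-loaded-first distribution, with a hypothetical fixed load increment per assigned task:

```python
def assign_by_priority(tasks: list, load_rates: dict) -> dict:
    """Walk fragmentation tasks from highest priority to lowest and
    hand each one to the currently least-loaded execution node.

    tasks: list of (task_id, priority) pairs; larger priority = more urgent.
    load_rates: node id -> current task load rate, assumed to rise by a
    fixed step per assigned task for the purposes of this sketch.
    """
    LOAD_STEP = 0.1  # hypothetical per-task load increment
    plan = {node: [] for node in load_rates}
    for task_id, _prio in sorted(tasks, key=lambda t: t[1], reverse=True):
        node = min(load_rates, key=load_rates.get)  # least-loaded node now
        plan[node].append(task_id)
        load_rates[node] += LOAD_STEP
    return plan

plan = assign_by_priority([("t1", 5), ("t2", 9)], {"node-a": 0.3, "node-b": 0.1})
```

Because the load rates are updated after every assignment, high-priority tasks consistently land on the nodes with the most spare capacity at that moment.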
In one embodiment, the task scheduling apparatus based on the distributed scheduling framework further includes:
and the index calculation module is used for obtaining the calculation performance index of each execution node according to the task load rate.
The task extraction module, which is used for, when the computing performance index cannot meet the requirement of processing all the fragmentation tasks, acquiring the processing quantity of fragmentation tasks corresponding to the computing performance index, distributing that quantity of fragmentation tasks to the execution nodes, and storing the remaining fragmentation tasks in the message queue; when the task state of a fragmentation task in an execution node becomes the completed state, fragmentation tasks are extracted from the message queue and distributed to the execution nodes, until all the fragmentation tasks have been distributed.
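The overflow behavior of the task extraction module can be sketched as follows; the capacity value and class name are hypothetical:

```python
from collections import deque

class MasterQueue:
    """Sketch of the overflow behavior: fragmentation tasks beyond the
    nodes' processing quantity wait in a message queue and are
    dispatched as earlier fragmentation tasks complete."""

    def __init__(self, capacity: int):
        self.capacity = capacity      # processing quantity the nodes can absorb
        self.running = []             # fragmentation tasks on execution nodes
        self.queue = deque()          # overflow message queue

    def submit(self, task: str) -> None:
        if len(self.running) < self.capacity:
            self.running.append(task)
        else:
            self.queue.append(task)   # store the remainder first

    def complete(self, task: str) -> None:
        self.running.remove(task)
        if self.queue:                # a slot freed: pull from the queue
            self.running.append(self.queue.popleft())

m = MasterQueue(capacity=2)
for t in ("t1", "t2", "t3"):
    m.submit(t)                       # t3 overflows into the queue
m.complete("t1")                      # t3 is dispatched from the queue
```

This is the same back-pressure idea as in the description: the queue keeps an overloaded node set from receiving more fragmentation tasks than its task load allows.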
In one embodiment, the task scheduling apparatus based on the distributed scheduling framework further includes:
and the node adding module is used for adding a preset number of execution nodes according to the calculation performance index when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks.
A slicing task module comprising:
and the second distribution unit is used for distributing each slicing task to each execution node and the newly added execution node according to each task load rate.
In one embodiment, the allocation module includes:
The working state acquisition unit, which is used for acquiring the working state of each execution node, where the working state is either a normal state or a fault state.
The third distribution unit, which is used for distributing each fragmentation task, according to each task load rate, to the execution nodes in the normal state; the working state of each execution node is stored in the blockchain.
For specific limitations of the task scheduling apparatus based on the distributed scheduling framework, reference may be made to the above limitations of the distributed scheduling framework, which are not repeated here. The modules in the task scheduling apparatus can be implemented in whole or in part by software, by hardware, or by a combination thereof. The modules can be embedded, in hardware form, in or be independent of the processor of the computer device, or can be stored, in software form, in the memory of the computer device, so that the processor can invoke them and perform the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 5. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing the relevant data of the tasks to be scheduled. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of task scheduling based on a distributed scheduling framework.
Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the following steps when the processor executes the computer program: receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled; acquiring the task size of a task to be scheduled corresponding to the task identifier to be scheduled; performing fragmentation processing on a task to be scheduled according to the size of the task to obtain fragmentation tasks, and acquiring the priority of each fragmentation task according to preset logic; calculating the task load rate of each execution node in the distributed scheduling framework; and distributing each fragmentation task to each execution node according to the task load rate and the priority so as to instruct each execution node to perform task scheduling on the distributed fragmentation tasks.
In one embodiment, when the processor executes the computer program, the step of calculating the task load rate of each execution node in the distributed scheduling framework is further used for: acquiring the task state corresponding to each fragmentation task in the execution node, where the task state is either a completed state or an uncompleted state; acquiring a first number of fragmentation tasks in the completed state and a second number of fragmentation tasks in the uncompleted state; calculating the ratio of the first number to the second number; and obtaining the task load rate of each execution node according to the ratio.
In one embodiment, when the processor executes the computer program, the step of distributing the fragmentation tasks to the execution nodes according to the task load rates and the priorities is further used for: distributing the fragmentation tasks, in order of priority from high to low, to the execution nodes in order of task load rate from low to high.
In one embodiment, the step after calculating the task load rate of each executing node in the distributed scheduling framework is further performed when the processor executes the computer program: obtaining the calculation performance index of each execution node according to the task load rate; when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, the processing number of the fragmentation tasks corresponding to the calculation performance index is obtained, the fragmentation tasks corresponding to the processing number are distributed to each execution node, the rest fragmentation tasks are stored in a message queue, and when the task state of the fragmentation tasks in the execution nodes corresponds to the finished state, the fragmentation tasks are extracted from the message queue and distributed to the execution nodes until all the fragmentation tasks are distributed to the execution nodes.
In one embodiment, the step after the processor executes the computer program to obtain the calculation performance index of the execution node according to the task load rate is further configured to: when the calculated performance index cannot meet the requirement of processing all the fragmentation tasks, adding a preset number of execution nodes according to the calculated performance index; when the processor executes the computer program, the step of distributing each fragmentation task to each execution node according to each task load rate is further used for: and distributing each fragmentation task to each execution node and the newly added execution node according to the load rate of each task.
In one embodiment, when the processor executes the computer program, the step of distributing each fragmentation task to each execution node according to each task load rate is further used for: acquiring the working state of each execution node, where the working state is either a normal state or a fault state; distributing each fragmentation task, according to each task load rate, to the execution nodes in the normal state; and storing the working state of each execution node in the blockchain.
In one embodiment, when the processor executes the computer program, the processor is further used for: establishing a proportional relation according to the task load rate of each execution node; performing fragmentation processing on the task to be scheduled according to the proportional relation to obtain fragmentation tasks; and distributing the fragmentation tasks to the execution nodes for task scheduling.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor performs the steps of: receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled; acquiring the task size of a task to be scheduled corresponding to the task identifier to be scheduled; performing fragmentation processing on a task to be scheduled according to the size of the task to obtain fragmentation tasks, and acquiring the priority of each fragmentation task according to preset logic; calculating the task load rate of each execution node in the distributed scheduling framework; and distributing each fragmentation task to each execution node according to the task load rate and the priority so as to instruct each execution node to perform task scheduling on the distributed fragmentation tasks.
In one embodiment, when the computer program is executed by the processor, the step of calculating the task load rate of each execution node in the distributed scheduling framework is further used for: acquiring the task state corresponding to each fragmentation task in the execution node, where the task state is either a completed state or an uncompleted state; acquiring a first number of fragmentation tasks in the completed state and a second number of fragmentation tasks in the uncompleted state; calculating the ratio of the first number to the second number; and obtaining the task load rate of each execution node according to the ratio.
In one embodiment, when the computer program is executed by the processor, the step of distributing the fragmentation tasks to the execution nodes according to the task load rates and the priorities is further used for: distributing the fragmentation tasks, in order of priority from high to low, to the execution nodes in order of task load rate from low to high.
In one embodiment, the computer program when executed by the processor performs the steps subsequent to calculating the task load rate for each executing node in the distributed scheduling framework is further configured to: obtaining the calculation performance index of each execution node according to the task load rate; when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, the processing number of the fragmentation tasks corresponding to the calculation performance index is obtained, the fragmentation tasks corresponding to the processing number are distributed to each execution node, the rest fragmentation tasks are stored in a message queue, and when the task state of the fragmentation tasks in the execution nodes corresponds to the finished state, the fragmentation tasks are extracted from the message queue and distributed to the execution nodes until all the fragmentation tasks are distributed to the execution nodes.
In one embodiment, the computer program when executed by the processor further performs the steps after obtaining the computed performance index of the executing node according to the task load rate further: when the calculated performance index cannot meet the requirement of processing all the fragmentation tasks, adding a preset number of execution nodes according to the calculated performance index; when the computer program is executed by the processor, the step of distributing each fragmentation task to each execution node according to each task load rate is further used for: and distributing each fragmentation task to each execution node and the newly added execution node according to the load rate of each task.
In one embodiment, when the computer program is executed by the processor, the step of distributing each fragmentation task to each execution node according to each task load rate is further used for: acquiring the working state of each execution node, where the working state is either a normal state or a fault state; distributing each fragmentation task, according to each task load rate, to the execution nodes in the normal state; and storing the working state of each execution node in the blockchain.
In one embodiment, when the computer program is executed by the processor, the processor is further used for: establishing a proportional relation according to the task load rate of each execution node; performing fragmentation processing on the task to be scheduled according to the proportional relation to obtain fragmentation tasks; and distributing the fragmentation tasks to the execution nodes for task scheduling.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods, in which each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and so on.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and their description is specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A task scheduling method based on a distributed scheduling framework is characterized by comprising the following steps:
receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled;
acquiring the task size of the task to be scheduled corresponding to the task identifier to be scheduled;
according to the size of the task, performing fragmentation processing on the task to be scheduled to obtain a fragmentation task, and according to a preset logic, obtaining the priority of each fragmentation task;
calculating the task load rate of each execution node in the distributed scheduling framework; and
distributing each fragmentation task to each execution node according to the task load rates and the priorities, so as to instruct each execution node to perform task scheduling on the distributed fragmentation tasks.
2. The method of claim 1, wherein calculating the task load rate of each executing node in the distributed scheduling framework comprises:
acquiring a task state corresponding to each fragmentation task in the execution nodes in the distributed scheduling framework, wherein the task state comprises a completed state and an uncompleted state;
acquiring a first number of the fragmentation tasks in the completed state and a second number of the fragmentation tasks in the uncompleted state;
calculating a ratio of the first quantity to the second quantity;
and obtaining the task load rate of each execution node according to the ratio.
3. The method according to claim 1 or 2, wherein the allocating each of the sliced tasks to each of the executing nodes according to each of the task load rates and each of the priorities comprises:
distributing the fragmentation tasks, in order of priority from high to low, to the execution nodes in order of task load rate from low to high.
4. The method of claim 1, wherein after calculating the task load rate of each executing node in the distributed scheduling framework, further comprising:
obtaining the calculation performance index of each execution node according to the task load rate;
when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, acquiring the processing quantity of the fragmentation tasks corresponding to the calculation performance index, distributing the fragmentation tasks corresponding to the processing quantity to each execution node, storing the rest fragmentation tasks to a message queue, and when the task state of the fragmentation tasks in the execution nodes corresponds to a finished state, extracting the fragmentation tasks from the message queue and distributing the fragmentation tasks to the execution nodes until all the fragmentation tasks are distributed to the execution nodes.
5. The method according to claim 1 or 4, wherein after obtaining the calculation performance index of the execution node according to the task load rate, the method further comprises:
when the calculation performance index cannot meet the requirement of processing all the fragmentation tasks, adding a preset number of execution nodes according to the calculation performance index;
the distributing each fragmentation task to each execution node according to each task load rate includes:
and distributing each fragmentation task to each execution node and the newly added execution node according to each task load rate.
6. The method of claim 1, wherein the distributing each of the fragmentation tasks to each of the execution nodes according to each of the task load rates comprises:
acquiring the working state of each execution node, wherein the working state comprises a normal state and a fault state;
distributing each fragmentation task, according to each task load rate, to each execution node in the normal state; and storing the working state of the execution node in the blockchain.
7. The method of any one of claims 1 to 6, further comprising:
establishing a proportional relation according to the task load rate of each execution node;
and performing fragmentation processing on the task to be scheduled according to the proportional relation to obtain fragmentation tasks, and distributing the fragmentation tasks to the execution nodes for task scheduling.
8. A task scheduling apparatus based on a distributed scheduling framework, the apparatus comprising:
the system comprises a request receiving module, a task scheduling module and a task scheduling module, wherein the request receiving module is used for receiving a task scheduling request which carries a task identifier to be scheduled;
a task size obtaining module, configured to obtain a task size of a task to be scheduled corresponding to the task identifier to be scheduled;
the fragmentation task module is used for performing fragmentation processing on the task to be scheduled according to the task size to obtain fragmentation tasks, and for acquiring the priority of each fragmentation task according to preset logic;
the load rate calculation module is used for calculating the task load rate of each execution node in the distributed scheduling framework;
and the distribution module is used for distributing each fragmentation task to each execution node according to the task load rate and the priority so as to instruct each execution node to carry out task scheduling on the distributed fragmentation task.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
CN202010575887.2A 2020-06-22 2020-06-22 Task scheduling method and device based on distributed scheduling framework Active CN111708627B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010575887.2A CN111708627B (en) 2020-06-22 2020-06-22 Task scheduling method and device based on distributed scheduling framework

Publications (2)

Publication Number Publication Date
CN111708627A true CN111708627A (en) 2020-09-25
CN111708627B CN111708627B (en) 2023-06-20

Family

ID=72541930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010575887.2A Active CN111708627B (en) 2020-06-22 2020-06-22 Task scheduling method and device based on distributed scheduling framework

Country Status (1)

Country Link
CN (1) CN111708627B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463390A * 2020-12-11 2021-03-09 厦门市美亚柏科信息股份有限公司 Distributed task scheduling method and device, terminal equipment and storage medium
CN112559179A * 2020-12-15 2021-03-26 建信金融科技有限责任公司 Job processing method and device
CN112559159A * 2021-01-05 2021-03-26 广州华资软件技术有限公司 Task scheduling method based on distributed deployment
CN113342508A * 2021-07-07 2021-09-03 湖南快乐阳光互动娱乐传媒有限公司 Task scheduling method and device
CN114356511A * 2021-08-16 2022-04-15 中电长城网际系统应用有限公司 Task allocation method and system
WO2023134643A1 * 2022-01-11 2023-07-20 中兴通讯股份有限公司 Streaming data processing method and system, node, electronic device, and storage medium
CN117348999A * 2023-12-06 2024-01-05 之江实验室 Service execution system and service execution method
CN117348999B * 2023-12-06 2024-02-23 之江实验室 Service execution system and service execution method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291545A (en) * 2017-08-07 2017-10-24 星环信息科技(上海)有限公司 The method for scheduling task and equipment of multi-user in computing cluster
CN108304255A (en) * 2017-12-29 2018-07-20 北京城市网邻信息技术有限公司 Distributed task dispatching method and device, electronic equipment and readable storage medium storing program for executing
CN110458468A (en) * 2019-08-16 2019-11-15 北京百度网讯科技有限公司 A kind of task processing method, device, electronic equipment and storage medium
CN110609749A (en) * 2019-09-06 2019-12-24 阿里巴巴集团控股有限公司 Distributed task operation method, system and equipment
CN110968420A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Scheduling method and device for multi-crawler platform, storage medium and processor


Similar Documents

Publication Publication Date Title
CN111708627B (en) Task scheduling method and device based on distributed scheduling framework
CN108845884B (en) Physical resource allocation method, device, computer equipment and storage medium
CN107291546B (en) Resource scheduling method and device
CN108632365B (en) Service resource adjusting method, related device and equipment
CN110321223B (en) Data flow dividing method and device for scheduling perception of Coflow collaborative job flow
CN112272203B (en) Cluster service node selection method, system, terminal and storage medium
CN111651285A (en) Batch business data processing method and device, computer equipment and storage medium
CN112162865A (en) Server scheduling method and device and server
CN106406983A (en) Task scheduling method and device in cluster
CN111459641B (en) Method and device for task scheduling and task processing across machine room
CN110209549B (en) Data processing method, related device, related equipment and system
CN111932257B (en) Block chain parallelization processing method and device
CN111338791A (en) Method, device and equipment for scheduling cluster queue resources and storage medium
CN103810045A (en) Resource allocation method, resource manager, resource server and system
CN112579304A (en) Resource scheduling method, device, equipment and medium based on distributed platform
CN110704185B (en) Cluster system fragmentation timing task scheduling method and cluster system
CN110599148A (en) Cluster data processing method and device, computer cluster and readable storage medium
CN112162839A (en) Task scheduling method and device, computer equipment and storage medium
CN110955516B (en) Batch task processing method and device, computer equipment and storage medium
CN112396480B (en) Order business data processing method, system, computer equipment and storage medium
Li et al. MapReduce task scheduling in heterogeneous geo-distributed data centers
CN115562846A (en) Resource scheduling method and device and computing node
CN110298549B (en) Project task processing method, device, equipment and medium for airport construction engineering
CN111258741A (en) Warehouse task execution method, distributed server cluster and computer equipment
Beltrán et al. How to balance the load on heterogeneous clusters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant