CN116302452B - Job scheduling method, system, device, communication equipment and storage medium


Info

Publication number
CN116302452B
Authority
CN
China
Prior art keywords
node
execution
execution step
task manager
independent task
Prior art date
Legal status
Active
Application number
CN202310563072.6A
Other languages
Chinese (zh)
Other versions
CN116302452A (en)
Inventor
张垚
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310563072.6A
Publication of CN116302452A
Application granted
Publication of CN116302452B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/48 - Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 - Task transfer initiation or dispatching
    • G06F9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 - Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application provides a job scheduling method, system, device, communication equipment and storage medium, wherein the job scheduling method comprises the following steps: receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises execution steps; determining the running position of each execution step according to its type, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; and determining a target node corresponding to each execution step according to its resource consumption and running position, and starting the execution step. The application combines the Yarn cluster with independent task manager nodes and determines the running position of each execution step according to its type, thereby retaining the flexible resource-sharing characteristic of Yarn cluster scheduling while allowing resource-sensitive computation steps to use the ample resources of the independent task manager nodes.

Description

Job scheduling method, system, device, communication equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a job scheduling method, system, device, communication apparatus, and storage medium.
Background
Flink is a distributed stream data computing engine. Yarn is the scheduling framework of the big data Hadoop (distributed computing) system and is responsible for unified management of the resources of the whole cluster. Traditional Flink supports running either in fully standalone mode or in Yarn mode. The fully standalone mode starts resident JobManager and TaskManager services on fixed nodes; both the services and the nodes are fixed, so flexibility is insufficient.
Therefore, in order to run Flink jobs better, the related art generally runs them in Yarn mode. In this mode, however, the running position of each TaskManager is uncertain and its life cycle is closely tied to the Flink job: flexibility is sufficient, but the scheduling position of the job cannot be known in advance, so the case where a job execution step depends on specific resources cannot be handled. Even if the Yarn cluster allocates a dedicated queue for the Flink job, some resources on a node, such as disk IO and network bandwidth, are still fiercely contended, which seriously affects the latency and stability of the Flink job.
Disclosure of Invention
The embodiment of the application aims to provide a job scheduling method, system, device, communication equipment and storage medium, so as to solve the technical problem in the prior art that job execution steps which depend on specific resources cannot be scheduled to nodes providing those resources, so that resource contention seriously affects the delay and stability of Flink jobs. The specific technical scheme is as follows:
In a first aspect of the present application, there is provided a job scheduling method, applied to Yarn, the method comprising:
receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
acquiring the resource consumption corresponding to each execution step;
and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
Optionally, the types of the executing step include: an unbound label executing step and a bound label executing step, wherein the bound label executing step is executed on a node corresponding to the independent task manager.
Optionally, the determining the operation position of each execution step according to the type of each execution step includes:
Setting an operation position corresponding to the execution step as the independent task manager under the condition that the type of the execution step is detected to be a binding mark execution step;
and setting the running position corresponding to the execution step as a Yarn cluster under the condition that the type of the execution step is detected to be an unbound label execution step.
Optionally, the resource consumption includes resource consumption corresponding to the unbound label executing step, and resource consumption corresponding to the bound label executing step.
Optionally, the determining, according to the resource consumption and the running position of each execution step, the target node corresponding to each execution step includes:
and under the condition that the resource consumption is detected to meet the preset operation condition, determining the target node corresponding to each execution step according to the operation position of each execution step.
Optionally, the determining, according to the running position of each execution step, the target node corresponding to each execution step includes:
under the condition that the running position corresponding to the execution step is detected to be set as the independent task manager, the target node corresponding to the execution step is an independent task manager node;
And under the condition that the running position corresponding to the execution step is detected to be set as a Yarn cluster, the target node corresponding to the execution step is any node in the Yarn cluster.
Optionally, the Yarn includes a node state tracker and an execution step distributor, where the node state tracker is configured to maintain node information corresponding to an independent task manager, and the execution step distributor is configured to schedule the independent task manager according to the node information corresponding to the independent task manager.
Optionally, after the step of determining the running position of each of the execution steps according to the type of each of the execution steps, the method includes:
the node state tracker receives heartbeats sent by the independent task manager;
and the node state tracker determines whether the independent task manager is in a fault state according to the heartbeat, and sends node change information to the execution step distributor.
Optionally, after the step that the node state tracker determines whether the independent task manager is in a fault state according to the heartbeat and sends node change information to the execution step dispatcher, the method includes:
And the node state tracker sends a node change notification to a work manager corresponding to the current job under the condition that the node state tracker detects the change of the independent task manager node, so that the work manager changes the independent task manager node corresponding to the current job.
Optionally, the determining the operation position of each execution step according to the type of each execution step includes:
under the condition that the type of the execution step is detected to be a binding mark execution step, the execution step distributor acquires independent task manager node information corresponding to the current job from the node state tracker, and generates the running position of the binding mark execution step according to the matching of the label corresponding to the binding mark execution step in the independent task manager node information.
Optionally, after the step of determining the target node corresponding to each execution step according to the resource consumption and the running position of each execution step and starting the execution step, the method includes:
if the target node is detected to be in a fault state, scheduling the execution step corresponding to the target node to a second target node with the same binding mark if the target node is an independent task manager node, or scheduling the execution step corresponding to the target node to any node in the Yarn cluster based on a preset node scheduling strategy;
And if the target node is a Yarn node, eliminating the target node from the Yarn cluster, and scheduling the execution step corresponding to the target node to other nodes in the Yarn cluster.
Optionally, after the step of scheduling the executing step corresponding to the target node to any node in the Yarn cluster based on the preset node scheduling policy, the method includes:
and stopping executing the executing step corresponding to the target node, and sending the fault information corresponding to the target node to a user so that the user maintains the target node.
Optionally, after the step of determining the target node corresponding to each execution step according to the resource consumption and the running position of each execution step and starting the execution step, the method includes:
receiving an independent task manager node capacity expansion instruction sent by a user;
creating a target independent task manager node according to the independent task manager node capacity expansion instruction;
acquiring the average value of the number of execution steps corresponding to all the independent task manager nodes;
and scheduling the target independent task manager node according to the average value of the number of the execution steps until the number of the execution steps corresponding to the target independent task manager node reaches the average value of the number of the execution steps.
Optionally, after the step of determining the target node corresponding to each execution step according to the resource consumption and the running position of each execution step and starting the execution step, the method includes:
receiving an upgrade instruction sent by a user and scheduled by an execution step, wherein the upgrade instruction comprises a target execution step;
and dispatching any node in the Yarn cluster where the target execution step is located to an independent task manager node according to the upgrading instruction and a preset node dispatching strategy.
Optionally, after the step of determining the target node corresponding to each execution step according to the resource consumption and the running position of each execution step and starting the execution step, the method includes:
receiving a degradation instruction sent by a user and scheduled by an execution step, wherein the degradation instruction comprises a target execution step;
and dispatching the independent task manager node where the target executing step is located to any node in the Yarn cluster according to the degradation instruction and a preset node dispatching strategy.
In yet another aspect of the present application, there is also provided a job scheduling system including an independent task manager node, a Yarn cluster node, and a Yarn;
The independent task manager node is in communication connection with the Yarn cluster node;
the independent task manager node comprises a preset independent task manager mark, and the independent task manager node is used for executing the binding mark executing step;
the Yarn cluster node is used for executing an unbound label executing step;
the Yarn is used for receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps; determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
Optionally, the Yarn includes a node state tracker and an execution step allocator;
the node state tracker is used for maintaining node information corresponding to the independent task manager;
The execution step distributor is used for scheduling the independent task manager according to the node information corresponding to the independent task manager.
In still another aspect of the present application, there is also provided a job scheduling apparatus applied to a server management platform, the apparatus including:
the receiving module is used for receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
the generation module is used for generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
the first determining module is used for determining the running position of each executing step according to the type of each executing step, wherein the type of each executing step is generated according to the independent task manager mark;
the acquisition module is used for acquiring the resource consumption corresponding to each execution step;
and the second determining module is used for determining a target node corresponding to each executing step according to the resource consumption and the running position of each executing step, and starting the executing step.
In yet another aspect of the present application, there is also provided a communication device including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;
A memory for storing a computer program;
and the processor is used for realizing any one of the job scheduling methods when executing the programs stored in the memory.
In yet another aspect of the present application, there is also provided a computer readable storage medium having instructions stored therein which, when executed on a computer, cause the computer to perform any of the job scheduling methods described above.
In yet another aspect of the application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the job scheduling methods described above.
The job scheduling method provided by the embodiment of the application is characterized by receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps; determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step. The embodiment of the application combines the advantages of the Yarn cluster and the independent task manager node deployment form, determines the running position of each execution step according to the type of each execution step, namely, reserves the flexibility and the resource sharing characteristic of the Yarn cluster scheduling, realizes that the resource sensitive calculation step can use sufficient resources of the independent task manager node, realizes the heterogeneous deployment capability of the Flink job cluster, ensures that the Flink job can fully utilize the resources of the heterogeneous cluster, and maximizes the utilization of the resources by scheduling the step depending on special resources to the matched node for execution, thereby reducing the risk that the flow calculation process possibly encounters performance bottleneck.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart showing steps of a job scheduling method according to an embodiment of the present application;
FIG. 2 shows a second flowchart of steps of a job scheduling method according to an embodiment of the present application;
FIG. 3 shows a third flowchart of steps of a job scheduling method provided by an embodiment of the present application;
FIG. 4 shows a fourth flowchart of steps of a job scheduling method according to an embodiment of the present application;
FIG. 5 shows a fifth flowchart of steps of a job scheduling method according to an embodiment of the present application;
FIG. 6 shows a sixth flowchart of steps of a job scheduling method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a job scheduling system according to an embodiment of the present application;
FIG. 8 is a block diagram of an apparatus for job scheduling according to an embodiment of the present application;
fig. 9 is a block diagram of a communication device according to an embodiment of the present application;
FIG. 10 is a schematic diagram of job scheduling according to an embodiment of the present application;
fig. 11 shows a job scheduling failure processing flowchart provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings. Those of ordinary skill in the art will understand that in the various embodiments of the present application, numerous technical details are set forth in order to provide a better understanding of the present application. However, the claimed application may be practiced without these specific details and with various changes and modifications based on the following embodiments. The division into the following embodiments is for convenience of description and should not be construed as limiting the specific implementation of the present application; the embodiments can be combined and cross-referenced with each other where there is no contradiction.
Referring to fig. 1, a first step flowchart of a job scheduling method provided by an embodiment of the present application is shown, where the method may include:
Step 101, receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
It should be noted that, in the embodiment of the present application, the whole job scheduling only requires binding the mark of the scheduling node at the places that need special scheduling when the job is programmed, so on the basis of realizing the function the modification to the Flink programming API is small.
Therefore, when the job program is written, for each execution step to be scheduled to an independent TaskManager node, the TaskManager mark corresponding to that job stage is designated. That is, the user sends the pre-written job program to Yarn, and the pre-written job program includes the independent TaskManager mark preset by the user, so that the binding of the scheduling position in the job planning stage is completed in step 101.
It should be noted that, in the embodiment of the present application, the flag of the independent task manager is preset, and in the job program set by the user, the flag of the independent task manager may be set for the execution steps possibly included in the job program, so as to serve as a basis for dividing the types of the execution steps according to the flag.
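As an illustration only, a minimal sketch of what binding such a mark might look like in a job program is given below. The bindTaskManagerLabel helper, the label name "labelA" and the way the mark is recorded are assumptions made for this example; they are not part of the actual Flink API and only illustrate attaching a preset independent TaskManager mark to the execution steps that need special scheduling.

```java
// Hypothetical sketch (not the real Flink API): bind a preset independent
// TaskManager mark to the execution steps that need special scheduling.
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public final class LabeledJobExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Step 1: resource-sensitive parsing, bound to the preset mark "labelA".
        SingleOutputStreamOperator<String> parsed = bindTaskManagerLabel(
                env.socketTextStream("localhost", 9999).map(String::trim), "labelA");

        // Steps 2 and 3: no mark bound, so they follow normal Yarn scheduling.
        parsed.filter(s -> !s.isEmpty())
              .print();

        env.execute("labeled-job");
    }

    // Hypothetical helper: records the mark so the planner can type this step as a
    // bound mark execution step when the execution plan is generated.
    private static <T> SingleOutputStreamOperator<T> bindTaskManagerLabel(
            SingleOutputStreamOperator<T> op, String label) {
        return op.name("tm-label:" + label);
    }
}
```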
Step 102, generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
it should be noted that, in the embodiment of the present application, in the stage of generating an execution plan when the job is submitted, the execution plan is generated according to the preset job program, and each execution step in the preset job program may be divided into two types. One type is the execution step with no TaskManager mark bound, which directly participates in the scheduling of the Yarn cluster and obeys the scheduling rules of the scheduler configured inside the traditional Yarn cluster. The other type is the execution step with a TaskManager mark bound, which does not participate in the scheduling of the Yarn cluster but is fixedly scheduled to the node where the independent TaskManager matching the mark is located.
It should be noted that, in the present application, an independent task manager flag is preset, that is, an operation position of each execution step is determined for each execution step according to whether an independent task manager flag is bound, where an unbound flag execution step represents that an independent task manager flag is not bound to the execution step, and a bound flag execution step represents that an independent task manager flag is bound to the execution step.
Furthermore, in order to realize an improved scheduling mode, the application introduces two modules of a node state tracker and an execution step distributor in the Yarn cluster, and particularly, reference is made to the following description.
Step 103, determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the mark of the independent task manager;
it should be noted that, in the present application, the execution plan generated in step 102 includes a plurality of execution steps, each of which may be allocated to a different node for processing, and in the present application, the execution steps are divided into two types, specifically, the types of the execution steps include: an unbound label executing step and a bound label executing step.
The running position of an execution step can be determined according to its type; specifically, a bound mark execution step is executed on the node corresponding to the independent task manager, and an unbound mark execution step can be executed in the Yarn cluster.
Further, the determining the operation position of each execution step according to the type of each execution step includes: setting the running position corresponding to the execution step as an independent task manager under the condition that the type of the execution step is detected to be the binding mark execution step; and setting the running position corresponding to the execution step as a Yarn cluster under the condition that the type of the execution step is detected as the unbound label execution step.
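For illustration, a minimal sketch of this run-position decision is given below; the StepType and RunPosition enums, the ExecutionStep class and the RunPositionResolver are names introduced for this sketch only and are not taken from the patent or from Flink/Yarn.

```java
// Illustrative sketch of determining the running position from the step type.
enum StepType { BOUND_MARK, UNBOUND_MARK }
enum RunPosition { INDEPENDENT_TASK_MANAGER, YARN_CLUSTER }

final class ExecutionStep {
    final String name;
    final StepType type;   // derived from whether an independent TaskManager mark is bound
    final String label;    // the bound mark, or null for unbound steps
    RunPosition runPosition;

    ExecutionStep(String name, StepType type, String label) {
        this.name = name;
        this.type = type;
        this.label = label;
    }
}

final class RunPositionResolver {
    // Bound-mark steps run on the matching independent TaskManager node;
    // unbound steps stay in the Yarn cluster and follow normal Yarn scheduling.
    void resolve(ExecutionStep step) {
        step.runPosition = (step.type == StepType.BOUND_MARK)
                ? RunPosition.INDEPENDENT_TASK_MANAGER
                : RunPosition.YARN_CLUSTER;
    }
}
```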
And 104, acquiring the resource consumption corresponding to each execution step.
Further, the resource consumption includes resource consumption corresponding to the unbound label executing step, and resource consumption corresponding to the bound label executing step.
It should be noted that resource consumption needs to be calculated for each execution step. Specifically, the job steps are divided into two kinds when calculating resource consumption. Execution steps scheduled to Yarn cluster nodes accept the resource limits configured by the Yarn scheduler: the resource upper limit (CPU core count, memory capacity) of each step must not exceed the limit of the Yarn queue, and the resources consumed by all execution steps running on Yarn, together with other jobs in the cluster, participate in the centralized management and unified allocation of the Yarn scheduler, so this part counts as shared resource consumption. The other kind of execution step is scheduled to an independent TaskManager node; its resource limit is subject to the resource configuration of that TaskManager, and the resources of the TaskManager are used exclusively by the Flink job and are not shared with other jobs, so this part counts as exclusive resource consumption.
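A small sketch of this split accounting is shown below, reusing the illustrative ExecutionStep and RunPosition types from the previous sketch; the class and field names are assumptions, and in practice the figures would come from the Yarn queue configuration and the TaskManager resource configuration.

```java
// Illustrative sketch: split resource accounting for the two kinds of steps.
final class ResourceAccounting {
    double sharedCpuCores;     // steps scheduled into the Yarn cluster (shared with other jobs)
    long   sharedMemoryMb;
    double exclusiveCpuCores;  // steps scheduled to independent TaskManager nodes (exclusive)
    long   exclusiveMemoryMb;

    void add(ExecutionStep step, double cpuCores, long memoryMb) {
        if (step.runPosition == RunPosition.YARN_CLUSTER) {
            // Must stay within the limits of the configured Yarn queue.
            sharedCpuCores += cpuCores;
            sharedMemoryMb += memoryMb;
        } else {
            // Limited only by the resources configured for the independent TaskManager.
            exclusiveCpuCores += cpuCores;
            exclusiveMemoryMb += memoryMb;
        }
    }
}
```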
And 105, determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
It should be noted that, in the embodiment of the present application, in the case that the resource consumption computed in step 104 can satisfy the running condition, the target node of the final deployment schedule is determined according to the bound mark execution steps and the unbound mark execution steps allocated in the preset job program, and all execution steps are started.
It should be noted that steps 101-105 implement the whole flow of Flink job scheduling and execution.
In addition, referring to fig. 10, which shows a job scheduling schematic provided by an embodiment of the present application, a specific example of one affinity scheduling process involves two job flows. The flow of job A is: in job A, step 1 binds label A, and steps 2 and 3 bind no label. The flow of job B is: in job B, step 1 binds label A, step 2 binds label B, and steps 3 and 4 bind no label. As shown in fig. 10, label A is preset on independent task manager node 1 and label B is preset on independent task manager node 2, so step 1 of job A and step 1 of job B are matched to independent task manager node 1, step 2 of job B is matched to independent task manager node 2, and the remaining steps are matched on demand to nodes in the Yarn cluster based on Yarn's resource allocation.
The job scheduling method provided by the embodiment of the application is characterized by receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps; determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step. The embodiment of the application combines the advantages of the Yarn cluster and the independent task manager node deployment form, determines the running position of each execution step according to the type of each execution step, namely, reserves the flexibility and the resource sharing characteristic of the Yarn cluster scheduling, realizes that the resource sensitive calculation step can use sufficient resources of the independent task manager node, realizes the heterogeneous deployment capability of the Flink job cluster, ensures that the Flink job can fully utilize the resources of the heterogeneous cluster, and maximizes the utilization of the resources by scheduling the step depending on special resources to the matched node for execution, thereby reducing the risk that the flow calculation process possibly encounters performance bottleneck.
Referring to fig. 2, a second step flowchart of a job scheduling method provided by an embodiment of the present application is shown, where the method may include:
step 201, receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
step 202, generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
step 203, determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
step 204, obtaining the resource consumption corresponding to each execution step;
it should be noted that, the steps 201 to 204 are discussed with reference to the foregoing, and are not repeated here.
Step 205, determining a target node corresponding to each execution step according to the operation position of each execution step when the resource consumption is detected to meet the preset operation condition.
It should be noted that, in the embodiment of the present application, resource consumption needs to be calculated for each execution step. Specifically, the job steps are divided into two kinds when calculating resource consumption. Execution steps scheduled to Yarn cluster nodes accept the resource limits configured by the Yarn scheduler: the resource upper limit (CPU core count, memory capacity) of each step must not exceed the limit of the Yarn queue, and the resources consumed by all execution steps running on Yarn, together with other jobs in the cluster, participate in the centralized management and unified allocation of the Yarn scheduler, so this part counts as shared resource consumption. The other kind of execution step is scheduled to an independent TaskManager node; its resource limit is subject to the resource configuration of that TaskManager, and the resources of the TaskManager are used exclusively by the Flink job and are not shared with other jobs, so this part counts as exclusive resource consumption.
Further, in step 205, the determining, according to the running position of each execution step, the target node corresponding to each execution step includes: under the condition that the running position corresponding to the execution step is detected to be set as the independent task manager, the target node corresponding to the execution step is an independent task manager node; and under the condition that the running position corresponding to the execution step is detected to be set as a Yarn cluster, the target node corresponding to the execution step is any node in the Yarn cluster.
It should be noted that, when the computed resource consumption can satisfy the running condition, the target node of the final deployment schedule is determined according to the bound mark execution steps and the unbound mark execution steps allocated in the preset job program, and all execution steps are started.
Specifically, the target node may be an independent task manager node, or may be any node in the Yarn cluster.
The job scheduling method provided by the embodiment of the application is characterized by receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps; determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step. The embodiment of the application combines the advantages of the Yarn cluster and the independent task manager node deployment form, determines the running position of each execution step according to the type of each execution step, namely, reserves the flexibility and the resource sharing characteristic of the Yarn cluster scheduling, realizes that the resource sensitive calculation step can use sufficient resources of the independent task manager node, realizes the heterogeneous deployment capability of the Flink job cluster, ensures that the Flink job can fully utilize the resources of the heterogeneous cluster, and maximizes the utilization of the resources by scheduling the step depending on special resources to the matched node for execution, thereby reducing the risk that the flow calculation process possibly encounters performance bottleneck.
Referring to fig. 3, a step flowchart three of a job scheduling method provided by an embodiment of the present application is shown, where the method may include:
step 301, receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
step 302, generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
step 303, determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
step 304, obtaining the resource consumption corresponding to each execution step;
and step 305, determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
The Yarn comprises a node state tracker and an execution step distributor, wherein the node state tracker is used for maintaining node information corresponding to an independent task manager, and the execution step distributor is used for scheduling the independent task manager according to the node information corresponding to the independent task manager.
Step 306, the node state tracker receives heartbeats sent by an independent task manager;
step 307, the node state tracker determines whether the independent task manager is in a fault state according to the heartbeat, and sends node change information to the execution step dispatcher.
And step 308, when the node state tracker detects the change of the independent task manager node, the execution step distributor sends a node change notification to the work manager corresponding to the current job so as to enable the work manager to change the independent task manager node corresponding to the current job.
It should be noted that, in the above steps 306-307, the Yarn in the present application introduces two modules: a node state tracker and an execution step allocator.
For the node state tracker, the node state tracker is used for maintaining information of all independent TaskManager nodes, and classifying all independent TaskManager nodes according to corresponding marks.
Specifically, the relationship between the labels and the nodes may be stored and categorized, for example, (TM is a TaskManager abbreviation): { "Label 1": [ "TM1", "TM2" ] }, { "Label 2": [ "TM3" ] }.
In addition, the node state tracker receives the heartbeat sent by each independent task manager and uses it to detect whether the node is alive; for example, if no heartbeat has been received from an independent task manager for a period longer than 30 times the preset heartbeat interval, the TaskManager node can be considered to have failed.
The node state tracker is also responsible for responding to changes in nodes, such as new additions and subtractions of nodes, and may immediately notify the execution step allocator when the node state tracker detects a node change.
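A condensed sketch of such a node state tracker is given below, covering the label-to-node map, the heartbeat-based liveness check and the change notification to the execution step allocator. The class names, the listener interface and the 30x heartbeat-timeout factor (taken from the example above) are illustrative assumptions, not a definitive implementation.

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the node state tracker described above.
final class NodeStateTracker {
    // Label -> independent TaskManager node ids, e.g. {"Label 1": ["TM1", "TM2"], "Label 2": ["TM3"]}
    private final Map<String, List<String>> labelToNodes = new ConcurrentHashMap<>();
    private final Map<String, Long> lastHeartbeatMillis = new ConcurrentHashMap<>();
    private final long heartbeatIntervalMillis;
    private final ExecutionStepAllocatorListener allocator;  // notified on node changes

    NodeStateTracker(long heartbeatIntervalMillis, ExecutionStepAllocatorListener allocator) {
        this.heartbeatIntervalMillis = heartbeatIntervalMillis;
        this.allocator = allocator;
    }

    void register(String label, String nodeId) {
        labelToNodes.computeIfAbsent(label, k -> new ArrayList<>()).add(nodeId);
        lastHeartbeatMillis.put(nodeId, System.currentTimeMillis());
        allocator.onNodeChange(nodeId, /*alive=*/ true);
    }

    void onHeartbeat(String nodeId) {
        lastHeartbeatMillis.put(nodeId, System.currentTimeMillis());
    }

    // Called periodically: a node that has missed heartbeats for longer than
    // 30 x the heartbeat interval (the factor used in the description) is treated as failed.
    void checkLiveness() {
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> e : lastHeartbeatMillis.entrySet()) {
            if (now - e.getValue() > 30 * heartbeatIntervalMillis) {
                labelToNodes.values().forEach(nodes -> nodes.remove(e.getKey()));
                allocator.onNodeChange(e.getKey(), /*alive=*/ false);
            }
        }
    }

    List<String> nodesForLabel(String label) {
        return labelToNodes.getOrDefault(label, Collections.emptyList());
    }
}

interface ExecutionStepAllocatorListener {
    void onNodeChange(String nodeId, boolean alive);
}
```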
The execution step allocator is responsible for performing the scheduling and responding to node changes. When scheduling is performed, the latest independent task manager node information is taken from the node state tracker, and the matching independent task manager node is queried according to the label bound in the execution step. When an execution step bound to an independent TaskManager mark is scheduled, the data output target of its upstream execution step and the data input source of its downstream execution step are set to that independent TaskManager node, while the unbound execution steps participate in Flink's original Yarn scheduling process. In addition, the execution step allocator saves the scheduled job information after scheduling is completed.
Specifically, the node state tracker and the execution step allocator may implement the following operations: if the node state tracker detects that a TaskManager changes, the execution step allocator notifies the JobManager of the job and changes the node on which the execution steps running on that TaskManager are located. These changes include the following cases: the target TaskManager node does not exist (no TaskManager node with a matching mark can be found); there are multiple target TaskManager nodes; and an independent TaskManager node fails while the job is running.
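A minimal sketch of such an execution step allocator is given below, reusing the illustrative NodeStateTracker, ExecutionStep and StepType types sketched earlier; the allocate method and its fallback behaviour are assumptions for illustration only.

```java
import java.util.List;

// Illustrative sketch of the execution step allocator: bound-mark steps are pinned
// to a matching independent TaskManager node, everything else is left to Yarn.
final class ExecutionStepAllocator implements ExecutionStepAllocatorListener {
    private final NodeStateTracker tracker;

    ExecutionStepAllocator(NodeStateTracker tracker) {
        this.tracker = tracker;
    }

    // Returns the chosen node id for a bound-mark step, or null to fall back to Yarn scheduling.
    String allocate(ExecutionStep step) {
        if (step.type != StepType.BOUND_MARK) {
            return null;  // participates in the original Flink-on-Yarn scheduling
        }
        List<String> candidates = tracker.nodesForLabel(step.label);
        if (candidates.isEmpty()) {
            return null;  // no matching node: degrade to Yarn or reject, per the mitigation policy
        }
        // Pick the first matching node; a fuller implementation would also rewire the
        // upstream step's data output target and the downstream step's input source here.
        return candidates.get(0);
    }

    @Override
    public void onNodeChange(String nodeId, boolean alive) {
        // Would notify the JobManager of affected jobs so the steps running on the
        // changed node can be moved, as described above.
    }
}
```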
The embodiment of the application combines the advantages of the Yarn cluster and the independent task manager node deployment form, determines the running position of each execution step according to the type of each execution step, namely, reserves the flexibility and the resource sharing characteristic of the Yarn cluster scheduling, realizes that the resource sensitive calculation step can use sufficient resources of the independent task manager node, realizes the heterogeneous deployment capability of the Flink job cluster, ensures that the Flink job can fully utilize the resources of the heterogeneous cluster, and maximizes the utilization of the resources by scheduling the step depending on special resources to the matched node for execution, thereby reducing the risk that the flow calculation process possibly encounters performance bottleneck.
Referring to fig. 4, a fourth step flowchart of a job scheduling method according to an embodiment of the present application is shown, where the method may include:
step 401, receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
step 402, generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
step 403, when the type of the execution step is detected to be a binding mark execution step, the execution step distributor obtains independent task manager node information corresponding to the current job from the node state tracker, matches the independent task manager node information according to a label corresponding to the binding mark execution step, and generates an operation position of the binding mark execution step, wherein the type of the execution step is generated according to the independent task manager label;
it should be noted that, in the embodiment of the present application, the execution step allocator is responsible for performing the scheduling and responding to node changes. When scheduling is performed, the latest independent task manager node information is taken from the node state tracker, and the matching independent task manager node is queried according to the label bound in the execution step. When an execution step bound to an independent TaskManager mark is scheduled, the data output target of its upstream execution step and the data input source of its downstream execution step are set to that independent TaskManager node, while the unbound execution steps participate in Flink's original Yarn scheduling process. In addition, the execution step allocator saves the scheduled job information after scheduling is completed.
Step 404, obtaining the resource consumption corresponding to each execution step;
and step 405, determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
Step 406, if the target node is detected to be in a fault state, scheduling the execution step corresponding to the target node to a second target node with the same binding flag if the target node is an independent task manager node, or scheduling the execution step corresponding to the target node to any node in the Yarn cluster based on a preset node scheduling policy;
and step 407, if the target node is a Yarn node, eliminating the target node from the Yarn cluster, and scheduling the execution step corresponding to the target node to other nodes in the Yarn cluster.
Further, after step 406, if the target node is an independent task manager node, the executing step corresponding to the target node may be stopped, and the fault information corresponding to the target node may be sent to a user, so that the user maintains the target node.
It should be noted that, in the embodiment of the present application, steps 406 to 407 handle the case where a node failure occurs while the execution steps are running and causes the job to fail. Specifically, referring to fig. 11, which shows a job scheduling failure processing flowchart provided by an embodiment of the present application, when a target node is detected to be in a failure state, the processing is divided into two cases.
One is a node failure in the Yarn cluster. In this case the failed node needs to be removed from the cluster, the job steps that ran on the failed node are then transferred to other healthy nodes according to the scheduling rules, and finally job execution is resumed from the last successful checkpoint.
The other is an independent TaskManager node failure, and specifically, is divided into two cases:
case one: there are multiple independent TaskManager nodes that match the labels, and there are other nodes that match the labels in addition to the failed node. In this case, the execution steps scheduled to the independent TaskManager are scheduled to other independent TaskManager nodes with the same label. The execution step allocator modifies the execution step data output destination upstream and the execution step output input source downstream of all jobs running on the independent TaskManager to be another node of the matching label.
Case two: the independent TaskManager node matching the tag is unique and cannot be replaced after the failure. Handling then depends on the scheduling mitigation strategy described earlier: under the node scheduling degradation policy, the job steps on the failed node are rescheduled in the Yarn cluster; under the refuse-to-execute policy, the whole job fails and stops running.
It should be noted that, in the embodiment of the present application, the preset node scheduling policy is a node scheduling mitigation policy, specifically, according to a user's requirement or a current job scenario requirement, a target node where an execution step of a required operation is located may be replaced, for example, degradation of node scheduling or upgrading of node scheduling may be performed.
In addition, when the target node on which an execution step of the required operation runs is replaced according to the preset node scheduling policy, if the original independent task manager node is replaced with a node in the Yarn cluster, any node in the Yarn cluster needs to be determined as the new target node according to a preset rule in the Yarn cluster, for example by allocating resources on demand, or the user designates any node in the Yarn cluster as the new target node.
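The failure handling described above might be sketched as follows; the MitigationPolicy enum, the FailureHandler class and the placeholder node id are assumptions introduced for this example and reuse the illustrative types from the earlier sketches.

```java
// Illustrative sketch of the failure handling flow described above.
enum MitigationPolicy { DOWNGRADE_TO_YARN, REJECT }

final class FailureHandler {
    private final NodeStateTracker tracker;
    private final MitigationPolicy policy;

    FailureHandler(NodeStateTracker tracker, MitigationPolicy policy) {
        this.tracker = tracker;
        this.policy = policy;
    }

    // Returns the new target for the step, or null if the job must stop.
    String handleNodeFailure(ExecutionStep step, String failedNode, boolean isIndependentTm) {
        if (!isIndependentTm) {
            // Yarn node failure: remove the node from the cluster and let Yarn
            // reschedule the step onto another healthy node, then resume from checkpoint.
            return "any-healthy-yarn-node";  // placeholder: chosen by Yarn scheduling
        }
        for (String candidate : tracker.nodesForLabel(step.label)) {
            if (!candidate.equals(failedNode)) {
                return candidate;  // another node with the same bound mark exists
            }
        }
        // The matching node was unique: fall back to the configured mitigation policy.
        return (policy == MitigationPolicy.DOWNGRADE_TO_YARN)
                ? "any-healthy-yarn-node"
                : null;  // reject: stop the job and report the fault to the user
    }
}
```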
The embodiment of the application combines the advantages of the Yarn cluster and the independent task manager node deployment form, determines the running position of each execution step according to the type of each execution step, namely, reserves the flexibility and the resource sharing characteristic of the Yarn cluster scheduling, realizes that the resource sensitive calculation step can use sufficient resources of the independent task manager node, realizes the heterogeneous deployment capability of the Flink job cluster, ensures that the Flink job can fully utilize the resources of the heterogeneous cluster, and maximizes the utilization of the resources by scheduling the steps depending on special resources to the matched nodes for execution, and reduces the risk that the flow calculation process may encounter performance bottlenecks.
In addition, by setting a load balancing fault recovery strategy aiming at the hybrid deployment and scheduling method, the independent TaskManager node has the functions of load balancing and fault backup, the resources of the independent TaskManager node are flexibly utilized, the risk of operation interruption when the independent TaskManager node breaks down is reduced, the independent TaskManager node has the functions of load balancing and fault backup, and the availability and fault tolerance of the cluster are improved.
Referring to fig. 5, a fifth step flowchart of a job scheduling method according to an embodiment of the present application is shown, where the method may include:
Step 501, receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
step 502, generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
step 503, determining an operation position of each execution step according to a type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
step 504, obtaining the resource consumption corresponding to each execution step;
and step 505, determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
Step 506, receiving an independent task manager node capacity expansion instruction sent by a user;
step 507, creating a target independent task manager node according to the independent task manager node capacity expansion instruction;
step 508, obtaining the average value of the number of execution steps corresponding to all the independent task manager nodes;
and step 509, scheduling the target independent task manager node according to the average value of the number of the execution steps until the number of the execution steps corresponding to the target independent task manager node reaches the average value of the number of the execution steps.
It should be noted that steps 506-509 deal with capacity expansion of the independent task manager nodes. For example, during operation the user finds that the job needs more memory and CPU resources to run better, so the independent task manager nodes may be expanded; that is, a target independent task manager node (the newly expanded node) is created. No task is running yet on the newly expanded node, while execution steps of some tasks may already be running on the pre-existing nodes, so the average number of execution steps running on the original independent task manager nodes needs to be calculated at this time, and the target independent task manager node is scheduled according to that average.
Specifically, the execution steps bound to the scheduling mark in newly submitted tasks (new jobs) are preferentially scheduled to the newly expanded node until the number of execution steps running on the newly expanded node reaches the previously calculated average, and then round-robin scheduling over the independent task managers is resumed.
It should be noted that balancing by the number of execution steps, rather than by the amount of resources occupied, aims to minimize the impact of a node fault on jobs. Because Flink is a stream computing engine, the particularity of a stream computing job is that if one computation step is interrupted, the whole stream computing process cannot work normally. If balancing by resource occupation were adopted, a large number of execution steps that each occupy few resources might be allocated to the same node while other nodes hold only a small number of resource-heavy execution steps; in that case, if the former node breaks down, the proportion of affected jobs is large even with a fault-handling process in place.
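A small sketch of this step-count balancing after expansion is given below; the ExpansionBalancer class and its fields are assumptions for illustration, not names from the patent.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch of balancing by execution-step count after scaling out.
final class ExpansionBalancer {
    // nodeId -> number of execution steps currently running on that node
    private final Map<String, Integer> stepCounts;
    private final String newNode;
    private final int targetAverage;

    ExpansionBalancer(Map<String, Integer> stepCounts, String newNode) {
        this.stepCounts = stepCounts;
        this.newNode = newNode;
        // Average number of execution steps across the existing independent TaskManager
        // nodes; steps are counted, not resources, to limit how many steps a single
        // node failure can interrupt.
        this.targetAverage = (int) Math.ceil(
                stepCounts.values().stream().mapToInt(Integer::intValue).average().orElse(0));
    }

    // Pick a node for a bound-mark step of a newly submitted job.
    String pickNode(List<String> candidatesWithMatchingLabel) {
        if (candidatesWithMatchingLabel.contains(newNode)
                && stepCounts.getOrDefault(newNode, 0) < targetAverage) {
            stepCounts.merge(newNode, 1, Integer::sum);
            return newNode;  // fill the new node first, up to the average
        }
        // Otherwise fall back to round-robin over the matching nodes (simplified here).
        String node = candidatesWithMatchingLabel.get(0);
        stepCounts.merge(node, 1, Integer::sum);
        return node;
    }
}
```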
In addition to expanding node capacity, node capacity can also be contracted. Specifically, contraction of independent TaskManager nodes can adopt the same handling strategy as an independent TaskManager node fault, namely degradation processing or stopping execution; however, for node contraction, when only one independent TaskManager node with a certain mark remains, the contraction process will either reschedule the execution steps to the Yarn cluster or feed back a job failure.
The embodiment of the application combines the advantages of the Yarn cluster and the independent task manager node deployment form, determines the running position of each execution step according to the type of each execution step, namely, reserves the flexibility and the resource sharing characteristic of the Yarn cluster scheduling, realizes that the resource sensitive calculation step can use sufficient resources of the independent task manager node, realizes the heterogeneous deployment capability of the Flink job cluster, ensures that the Flink job can fully utilize the resources of the heterogeneous cluster, and maximizes the utilization of the resources by scheduling the steps depending on special resources to the matched nodes for execution, and reduces the risk that the flow calculation process may encounter performance bottlenecks.
Referring to fig. 6, a sixth step flowchart of a job scheduling method provided by an embodiment of the present application is shown, where the method may include:
Step 601, receiving a preset job program sent by a user, wherein the preset job program comprises a preset independent task manager mark;
step 602, generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
step 603, determining an operation position of each execution step according to a type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
step 604, obtaining the resource consumption corresponding to each execution step;
step 605, determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
Step 606, receiving an upgrade instruction sent by a user and scheduled by an execution step, wherein the upgrade instruction comprises a target execution step;
step 607, according to the upgrade instruction and a preset node scheduling policy, scheduling any node in the yan cluster where the target executing step is located to an independent task manager node.
It should be noted that, in the embodiment of the present application, the scheduling location may be changed according to the upgrade instruction sent by the user, that is, whether the execution step runs in the Yarn cluster or on an independent TaskManager node. Moving from the Yarn cluster to an independent TaskManager node is an upgrade, and moving from an independent TaskManager node to the Yarn cluster is a downgrade. The upgrade and downgrade logic reuses the earlier fault recovery logic: the execution step to be upgraded is stopped, and then it is resumed on a node that satisfies the conditions according to the new scheduling rule.
Further, after step 605, a downgrade instruction for execution step scheduling sent by the user may also be received. This includes: receiving an execution step scheduling downgrade instruction sent by a user, wherein the downgrade instruction comprises a target execution step; and scheduling the target execution step from the independent task manager node where it is located to any node in the Yarn cluster according to the downgrade instruction and a preset node scheduling policy.
Similarly, for a downgrade the execution step to be downgraded is stopped first, and then restored to run on a node that satisfies the new scheduling rule.
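Both upgrade and downgrade reuse the same stop-then-reschedule path as fault recovery. A minimal sketch follows, assuming hypothetical stopStep and resumeStep hooks; these are not Flink APIs.

    // Hypothetical migration helper shared by upgrade (Yarn cluster -> independent TaskManager
    // node) and downgrade (the reverse).
    class StepMigrator {
        interface Runtime {
            void stopStep(String stepName);
            void resumeStep(String stepName, String targetNode);
        }

        private final Runtime runtime;
        StepMigrator(Runtime runtime) { this.runtime = runtime; }

        void migrate(String stepName, String targetNode) {
            runtime.stopStep(stepName);               // same path as fault recovery
            runtime.resumeStep(stepName, targetNode); // restart on a node matching the new rule
        }
    }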
The embodiment of the application combines the advantages of the Yarn cluster and of the independent task manager node deployment form by determining the running position of each execution step according to its type: the flexibility and resource-sharing characteristics of Yarn cluster scheduling are retained, while resource-sensitive computation steps can use the ample resources of the independent task manager nodes. This gives the Flink job cluster heterogeneous deployment capability, so that a Flink job can fully utilize the resources of a heterogeneous cluster; scheduling the steps that depend on special resources onto matching nodes maximizes resource utilization and reduces the risk of performance bottlenecks in the stream computation process.
In addition, a load-balancing fault recovery strategy as well as upgrade and downgrade scheduling are provided for the hybrid deployment and scheduling method, so that the resources of the independent TaskManager nodes are flexibly utilized while the risk of job interruption when an independent TaskManager node fails is reduced.
Referring to FIG. 7, a system schematic diagram of a job scheduling system provided by an embodiment of the present application is shown, the job scheduling system including an independent task manager node, a Yarn cluster node, and a Yarn;
the independent task manager node is in communication connection with the Yarn cluster node;
the independent task manager node comprises a preset independent task manager mark, and the independent task manager node is used for executing the binding mark executing step;
the Yarn cluster node is used for executing an unbound label executing step;
the Yarn is used for receiving a preset operation program sent by a user, wherein the preset operation program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps; determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
Further, the Yarn comprises a node state tracker and an execution step allocator; the node state tracker is used for maintaining node information corresponding to the independent task manager; the execution step distributor is used for scheduling the independent task manager according to the node information corresponding to the independent task manager.
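A rough Java sketch of the interplay between the node state tracker and the execution step allocator follows, under the assumption that node failure is detected from an overdue heartbeat; all names are illustrative and not actual Yarn or Flink components.

    import java.time.Duration;
    import java.time.Instant;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Consumer;

    // Hypothetical node state tracker: a node whose heartbeat is overdue is treated as failed
    // and the node change is reported to the execution step allocator.
    class NodeStateTracker {
        private final Map<String, Instant> lastHeartbeat = new ConcurrentHashMap<>();
        private final Duration timeout;
        private final Consumer<String> onNodeDown;    // e.g. executionStepAllocator::handleNodeDown

        NodeStateTracker(Duration timeout, Consumer<String> onNodeDown) {
            this.timeout = timeout;
            this.onNodeDown = onNodeDown;
        }

        void onHeartbeat(String nodeId) { lastHeartbeat.put(nodeId, Instant.now()); }

        void checkFailures() {                        // called periodically
            Instant deadline = Instant.now().minus(timeout);
            lastHeartbeat.forEach((nodeId, seenAt) -> {
                if (seenAt.isBefore(deadline)) {
                    lastHeartbeat.remove(nodeId);
                    onNodeDown.accept(nodeId);        // node change information for the allocator
                }
            });
        }
    }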
It should be noted that, in the embodiment of the present application, the job scheduling system adopts a hybrid deployment architecture that uses both independent TaskManagers and floating TaskManagers in Yarn. An independent TaskManager runs on a specific resource-intensive node and operates as a resident service, while a floating TaskManager accepts Yarn's scheduling and is started on demand when a job runs. Steps that must run on specific nodes are designated in the Flink job and are scheduled at runtime to the nodes where the independent TaskManagers are located; the other steps run on Yarn's floating TaskManagers. After the job finishes, the floating TaskManagers in Yarn automatically stop and return their resources to the cluster, whereas the independent-mode TaskManagers on the specific nodes keep running and accept scheduling from other jobs.
The job scheduling system in the embodiment of the application divides schedulable nodes into two types: nodes with a binding mark and unbound nodes. A node with a binding mark must indicate the label of the specific node it is bound to, and a label can likewise be attached to an execution step of a Flink job; a job step that specifies a label is scheduled onto a node whose binding mark matches that label, while an unbound step is insensitive to resources and accepts Yarn's unified scheduling. A Flink job can bind a particular operator (also called a job step) to a node label at the time of writing; at job scheduling time the two kinds of job steps are distinguished and scheduled separately, so independent resource allocation is achieved. The cluster resources available to an independent-mode TaskManager are configured exclusively for it, while a floating-mode TaskManager still has its resources managed by Yarn. This enables classified management and control of different types of node resources, prevents job steps with large resource requirements from creating performance bottlenecks or wasting the resources of other steps, and avoids interfering with the monitoring of abnormal resource consumption.
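Purely as an illustration, the binding of operators (job steps) to node labels could be declared as plain metadata carried alongside the job program, as in the sketch below. This is a hypothetical helper, not a Flink API; the actual mechanism for carrying the independent task manager mark is defined by the preset job program.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Optional;

    // Hypothetical job-side declaration of which operators (job steps) bind to which node label.
    // This is plain metadata read by the scheduler; it is not part of the Flink API.
    class JobLabelBindings {
        private final Map<String, String> labelByOperator = new LinkedHashMap<>();

        JobLabelBindings bind(String operatorName, String nodeLabel) {
            labelByOperator.put(operatorName, nodeLabel);
            return this;
        }

        Optional<String> labelOf(String operatorName) {
            return Optional.ofNullable(labelByOperator.get(operatorName));
        }
    }

For example, new JobLabelBindings().bind("feature-inference", "gpu") would mark one operator as a binding mark execution step bound to a hypothetical "gpu" label, while every operator without an entry remains an unbound label execution step and accepts Yarn's unified scheduling.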
In the hybrid architecture of the application, the resources of an independent task manager node are allocated exclusively to Flink jobs, not to the execution steps of one particular Flink job: other Flink jobs configured with the same mark can also be scheduled to run on the node, and several independent task manager nodes may carry the same mark. Therefore, when multiple steps bound to the same mark need to be scheduled, the execution step allocator distributes them in a polling (round-robin) manner across all independent task manager nodes matching the mark, so that the resources of these nodes are fully utilized and risk is shared when a node fails (the other nodes with a matching mark effectively serve as backup nodes).
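The polling allocation over same-labeled nodes can be sketched as a simple round-robin cursor; again the class is hypothetical and only illustrates the technique.

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical round-robin (polling) allocation over all independent TaskManager nodes
    // matching a label; peers with the same label also act as failover targets.
    class RoundRobinAllocator {
        private final AtomicInteger cursor = new AtomicInteger();

        String next(List<String> nodesWithMatchingLabel) {
            if (nodesWithMatchingLabel.isEmpty()) {
                throw new IllegalStateException("no node matches the requested label");
            }
            int i = Math.floorMod(cursor.getAndIncrement(), nodesWithMatchingLabel.size());
            return nodesWithMatchingLabel.get(i);
        }
    }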
Referring to fig. 8, a job scheduling apparatus provided by an embodiment of the present application is shown, where the apparatus may include:
a receiving module 801, configured to receive a preset job program sent by a user, where the preset job program includes a preset independent task manager flag;
a generating module 802, configured to generate an execution plan according to the preset job program, where the execution plan includes a plurality of execution steps;
a first determining module 803, configured to determine an operation position of each execution step according to a type of each execution step, where the type of the execution step is generated according to the independent task manager flag;
An obtaining module 804, configured to obtain resource consumption corresponding to each execution step;
and a second determining module 805, configured to determine a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and start the execution step.
The embodiment of the application combines the advantages of the Yarn cluster and of the independent task manager node deployment form by determining the running position of each execution step according to its type: the flexibility and resource-sharing characteristics of Yarn cluster scheduling are retained, while resource-sensitive computation steps can use the ample resources of the independent task manager nodes. This gives the Flink job cluster heterogeneous deployment capability, so that a Flink job can fully utilize the resources of a heterogeneous cluster; scheduling the steps that depend on special resources onto matching nodes maximizes resource utilization and reduces the risk of performance bottlenecks in the stream computation process.
The embodiment of the present application also provides a communication device, as shown in fig. 9, including a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 perform communication with each other through the communication bus 904,
A memory 903 for storing a computer program;
the processor 901, when executing the program stored in the memory 903, may implement the following steps:
receiving a preset operation program sent by a user, wherein the preset operation program comprises a preset independent task manager mark;
generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
acquiring the resource consumption corresponding to each execution step;
and determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step.
The communication bus mentioned for the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is drawn in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the terminal and other devices.
The memory may include random access memory (RAM) or non-volatile memory, for example at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present application, a computer-readable storage medium is also provided, having instructions stored therein which, when run on a computer, cause the computer to perform the job scheduling method of any of the above embodiments.
In yet another embodiment of the present application, a computer program product containing instructions is also provided which, when run on a computer, causes the computer to perform the job scheduling method of any of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), etc.
It is noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
In this specification, the embodiments are described in a related manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively briefly since they are substantially similar to the method embodiments; for relevant points, reference may be made to the corresponding parts of the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (18)

1. A job scheduling method, applied to Yarn, the method comprising:
receiving a preset operation program sent by a user, wherein the preset operation program comprises a preset independent task manager mark;
generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark;
acquiring the resource consumption corresponding to each execution step;
determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step;
the types of the execution steps include: an unbound label executing step and a bound label executing step, wherein the bound label executing step is executed on a node corresponding to the independent task manager;
The determining the operation position of each execution step according to the type of each execution step comprises:
setting an operation position corresponding to the execution step as the independent task manager under the condition that the type of the execution step is detected to be a binding mark execution step;
and setting the running position corresponding to the execution step as a Yarn cluster under the condition that the type of the execution step is detected to be an unbound label execution step.
2. The job scheduling method according to claim 1, wherein the resource consumption includes a resource consumption corresponding to the unbound label performing step, and a resource consumption corresponding to the bound label performing step.
3. The job scheduling method according to claim 2, wherein said determining a target node corresponding to each of the execution steps according to the resource consumption and the running position of each of the execution steps comprises:
and under the condition that the resource consumption is detected to meet the preset operation condition, determining the target node corresponding to each execution step according to the operation position of each execution step.
4. A job scheduling method according to claim 3, wherein said determining a target node corresponding to each of said execution steps according to the running position of each of said execution steps comprises:
Under the condition that the running position corresponding to the execution step is detected to be set as the independent task manager, the target node corresponding to the execution step is an independent task manager node;
and under the condition that the running position corresponding to the execution step is detected to be set as a Yarn cluster, the target node corresponding to the execution step is any node in the Yarn cluster.
5. The job scheduling method according to claim 1, wherein the Yarn includes a node state tracker for maintaining node information corresponding to an independent task manager, and an execution step allocator for scheduling the independent task manager according to the node information corresponding to the independent task manager.
6. The job scheduling method according to claim 5, wherein after the steps of determining the target node corresponding to each of the execution steps according to the resource consumption and the running position of each of the execution steps, and starting the execution steps, the method comprises:
the node state tracker receives heartbeats sent by the independent task manager;
And the node state tracker determines whether the independent task manager is in a fault state according to the heartbeat, and sends node change information to the execution step distributor.
7. The job scheduling method according to claim 6, wherein after the step of the node state tracker determining whether the independent task manager is in a failure state according to the heartbeat and transmitting node change information to the execution step allocator, the method comprises:
and the node state tracker sends a node change notification to a work manager corresponding to the current job under the condition that the node state tracker detects the node change of the independent task manager, so that the work manager changes the independent task manager node corresponding to the current job.
8. The job scheduling method according to claim 1, wherein the determining the running position of each of the execution steps according to the type of each of the execution steps comprises:
under the condition that the type of the execution step is detected to be a binding mark execution step, an execution step distributor obtains independent task manager node information corresponding to the current job from a node state tracker, and generates an operation position of the binding mark execution step according to the matching of the label corresponding to the binding mark execution step in the independent task manager node information.
9. The job scheduling method according to claim 1, wherein after the steps of determining the target node corresponding to each of the execution steps according to the resource consumption and the running position of each of the execution steps, and starting the execution steps, the method comprises:
if the target node is an independent task manager node, scheduling the execution step corresponding to the target node to a second target node with the same binding mark, or scheduling the execution step corresponding to the target node to any node in the Yarn cluster based on a preset node scheduling strategy;
and if the target node is a Yarn node, eliminating the target node from the Yarn cluster, and scheduling the execution step corresponding to the target node to other nodes in the Yarn cluster.
10. The job scheduling method according to claim 9, wherein after the step of scheduling the execution step corresponding to the target node to any node in the Yarn cluster based on the preset node scheduling policy, the method comprises:
And stopping executing the executing step corresponding to the target node, and sending the fault information corresponding to the target node to a user so that the user maintains the target node.
11. The job scheduling method according to claim 1, wherein after the steps of determining the target node corresponding to each of the execution steps according to the resource consumption and the running position of each of the execution steps, and starting the execution steps, the method comprises:
receiving an independent task manager node capacity expansion instruction sent by a user;
creating a target independent task manager node according to the independent task manager node capacity expansion instruction;
acquiring the average value of the number of execution steps corresponding to all the independent task manager nodes;
and scheduling the target independent task manager node according to the average value of the number of the execution steps until the number of the execution steps corresponding to the target independent task manager node reaches the average value of the number of the execution steps.
12. The job scheduling method according to claim 1, wherein after the steps of determining the target node corresponding to each of the execution steps according to the resource consumption and the running position of each of the execution steps, and starting the execution steps, the method comprises:
Receiving an upgrade instruction sent by a user and scheduled by an execution step, wherein the upgrade instruction comprises a target execution step;
and dispatching any node in the Yarn cluster where the target execution step is located to an independent task manager node according to the upgrading instruction and a preset node dispatching strategy.
13. The job scheduling method according to claim 1, wherein after the steps of determining the target node corresponding to each of the execution steps according to the resource consumption and the running position of each of the execution steps, and starting the execution steps, the method comprises:
receiving a degradation instruction sent by a user and scheduled by an execution step, wherein the degradation instruction comprises a target execution step;
and dispatching the independent task manager node where the target executing step is located to any node in the Yarn cluster according to the degradation instruction and a preset node dispatching strategy.
14. A job scheduling system comprising an independent task manager node, a Yarn cluster node, and a Yarn;
the independent task manager node is in communication connection with the Yarn cluster node;
The independent task manager node comprises a preset independent task manager mark, and the independent task manager node is used for executing the binding mark executing step;
the Yarn cluster node is used for executing an unbound label executing step;
the Yarn is used for receiving a preset operation program sent by a user, wherein the preset operation program comprises a preset independent task manager mark; generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps; determining the running position of each execution step according to the type of each execution step, wherein the type of each execution step is generated according to the independent task manager mark; acquiring the resource consumption corresponding to each execution step; determining a target node corresponding to each execution step according to the resource consumption and the running position of each execution step, and starting the execution step;
the types of the execution steps include: an unbound label executing step and a bound label executing step, wherein the bound label executing step is executed on a node corresponding to the independent task manager;
The determining the operation position of each execution step according to the type of each execution step comprises:
setting an operation position corresponding to the execution step as the independent task manager under the condition that the type of the execution step is detected to be a binding mark execution step;
and setting the running position corresponding to the execution step as a Yarn cluster under the condition that the type of the execution step is detected to be an unbound label execution step.
15. The job scheduling system of claim 14, wherein the Yarn comprises a node state tracker and an execution step allocator;
the node state tracker is used for maintaining node information corresponding to the independent task manager;
the execution step distributor is used for scheduling the independent task manager according to the node information corresponding to the independent task manager.
16. A job scheduling device, the device comprising:
the receiving module is used for receiving a preset operation program sent by a user, wherein the preset operation program comprises a preset independent task manager mark;
the generation module is used for generating an execution plan according to the preset job program, wherein the execution plan comprises a plurality of execution steps;
The first determining module is used for determining the running position of each executing step according to the type of each executing step, wherein the type of each executing step is generated according to the independent task manager mark;
the acquisition module is used for acquiring the resource consumption corresponding to each execution step;
the second determining module is used for determining a target node corresponding to each executing step according to the resource consumption and the running position of each executing step, and starting the executing step;
the types of the execution steps include: an unbound label executing step and a bound label executing step, wherein the bound label executing step is executed on a node corresponding to the independent task manager;
the first determining module is further configured to set an operation position corresponding to the execution step as the independent task manager when the type of the execution step is detected to be a binding mark execution step; and setting the running position corresponding to the execution step as a Yarn cluster under the condition that the type of the execution step is detected to be an unbound label execution step.
17. A communication device, comprising: a transceiver, a memory, a processor, and a program stored on the memory and executable on the processor;
The processor being configured to read a program in a memory to implement a job scheduling method according to any one of claims 1-13.
18. A readable storage medium storing a program, wherein the program when executed by a processor implements a job scheduling method according to any one of claims 1-13.