CN116643858A - Service priority-based pod rescheduling method, apparatus, device and medium

Info

Publication number: CN116643858A
Application number: CN202310321523.5A
Authority: CN
Inventors: 包红强
Original assignee: Unicloud Technology Co Ltd
Current assignee: Unicloud Technology Co Ltd
Priority date: 2023-03-29
Filing date: 2023-03-29
Publication date: 2023-08-25

Abstract

The application provides a service priority-based pod rescheduling method, apparatus, device and medium. The method comprises: obtaining an information data structure for the pods on each node, the information data structure comprising tps, cpu and memory information data; ranking the pods on each node by priority according to a ranking rule; judging whether an event change exists for a node to be scheduled, wherein the event comprises a node fault recovery event and a node capacity expansion event; in response to an event change of the node to be scheduled that is a node fault recovery event, triggering a first pod rescheduling strategy; and in response to an event change of the node to be scheduled that is a node capacity expansion event, triggering a second pod rescheduling strategy. The beneficial effects of the application are that the service pressure on the pods of already-scheduled nodes is effectively reduced, node resources are balanced, and node resources are used reasonably.

Description

Service priority-based pod rescheduling method, apparatus, device and medium

Technical Field

The application belongs to the technical field of cloud computing, and particularly relates to a service priority-based pod rescheduling method, apparatus, device and medium.

Background

In a native k8s cluster, when a node is added by capacity expansion, pods that are already running are not moved to the new node; only newly created pods are scheduled onto it, so the pressure on the existing nodes cannot be shared in time. Likewise, when a node fails, its pods are migrated to other nodes, and after the failed node recovers the pods are not migrated back to it, so the resource pressure on the other nodes becomes excessive.

Disclosure of Invention

In view of the above, the present application aims to provide a service priority-based pod rescheduling method, apparatus, device and medium, so as to solve the problem that node resource scheduling pressure rises and node pressure cannot be shared in time.

In order to achieve the above purpose, the technical scheme of the application is realized as follows:

In a first aspect, the present application provides a service priority-based pod rescheduling method, the method comprising:

obtaining an information data structure for the pods on each node, wherein the information data structure comprises tps, cpu and memory information data;

ranking the pods on each node by priority according to a ranking rule;

judging whether an event change exists for a node to be scheduled, wherein the event comprises a node fault recovery event and a node capacity expansion event;

in response to an event change of the node to be scheduled that is a node fault recovery event, triggering a first pod rescheduling strategy;

and in response to an event change of the node to be scheduled that is a node capacity expansion event, triggering a second pod rescheduling strategy.
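The event-driven branching in the steps above can be sketched as follows. This is only an illustration under stated assumptions, not the application's implementation: the enum `NodeEvent`, the function `select_strategy`, and the returned strategy labels are hypothetical names introduced here.

```python
from enum import Enum, auto
from typing import Optional


class NodeEvent(Enum):
    """The two event kinds named in the method (the enum itself is hypothetical)."""
    FAULT_RECOVERY = auto()
    CAPACITY_EXPANSION = auto()


def select_strategy(event: Optional[NodeEvent]) -> str:
    """Map a detected node event to the rescheduling strategy it triggers."""
    if event is NodeEvent.FAULT_RECOVERY:
        return "first_pod_rescheduling_strategy"
    if event is NodeEvent.CAPACITY_EXPANSION:
        return "second_pod_rescheduling_strategy"
    # No event change on the node to be scheduled: nothing is triggered.
    return "no_rescheduling"
```

Passing `None` models the case in which no event change is detected for the node to be scheduled.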

Further, ranking the pods on each node by priority according to the ranking rule comprises:

ordering the pods on each node by tps to obtain a first priority sequence of the pods on each node.

Further, the method further comprises:

ordering the nodes by node resource usage to obtain a second priority sequence.

Further, triggering the first pod rescheduling strategy in response to an event change of the node to be scheduled that is a node fault recovery event comprises:

when a node fault recovery event is determined, that is, the node to be scheduled has recovered from its fault, restarting the pods of the node to be scheduled in the order of the first priority sequence, and, once all of them have been restarted, scheduling them back onto the node to be scheduled.

Further, triggering the second pod rescheduling strategy in response to an event change of the node to be scheduled that is a node capacity expansion event comprises:

when a node capacity expansion event is determined, ordering the data of the pods on each existing node and on the capacity expansion node by node resource usage, updating the information data structure of the pods, and executing the pod rescheduling operation.

Further, the method further comprises:

in response to no event change of the node to be scheduled, updating the information data structure if the difference between the last recorded tps of a pod on the node to be scheduled and its currently obtained tps is greater than a preset change threshold, and otherwise leaving the information data structure unchanged.
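The threshold-gated update can be sketched as below. The function name `maybe_update_tps` and the dictionary layout are hypothetical. The use of an absolute difference, and the choice to record a pod that has no previous sample, are assumptions made for this sketch; the text only states that the change must exceed a preset threshold.

```python
from typing import Dict


def maybe_update_tps(stored: Dict[str, float], pod: str,
                     current_tps: float, threshold: float) -> bool:
    """Overwrite the stored tps only when it moved by more than `threshold`.

    Returns True if the record was updated, False if it was left unchanged.
    """
    last = stored.get(pod)
    # Assumption: a pod with no previous sample is always recorded.
    if last is None or abs(current_tps - last) > threshold:
        stored[pod] = current_tps
        return True
    return False
```

Skipping small fluctuations in this way avoids re-sorting the priority sequences on every monitoring sample.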

In a second aspect, the present application provides a pod rescheduling apparatus based on service priority, the apparatus comprising:

an acquisition module, configured to obtain an information data structure for the pods on each node, wherein the information data structure comprises tps, requested cpu and memory information data;

a sorting module, configured to rank the pods on each node by priority according to a ranking rule;

a judging module, configured to judge whether an event change exists for a node to be scheduled, wherein the event comprises a node fault recovery event and a node capacity expansion event;

a first response module, configured to trigger a first pod rescheduling strategy in response to an event change of the node to be scheduled that is a node fault recovery event;

and a second response module, configured to trigger a second pod rescheduling strategy in response to an event change of the node to be scheduled that is a node capacity expansion event.

In a third aspect, the present application provides an electronic device, including a processor and a memory communicatively connected to the processor and configured to store instructions executable by the processor, where the processor is configured to perform a service priority-based pod rescheduling method as described above.

In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements a service priority based pod rescheduling method as described above.

Compared with the prior art, the service priority-based pod rescheduling method, apparatus, device and medium have the following beneficial effects:

for node fault recovery, the first pod rescheduling strategy reschedules the node's previous pods back onto it, so that the pressure on the other nodes does not increase and the resources of the nodes are balanced; for a newly added capacity expansion node, the second pod rescheduling strategy lets the new node share the service pressure of the pods on the existing nodes, so that the resources of the new node are not wasted.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:

FIG. 1 is a schematic flow chart of a service priority-based pod rescheduling method according to an embodiment of the present application;

FIG. 2 is a diagram of a node initialization data structure according to an embodiment of the present application;

FIG. 3 is a diagram of a pod data structure after node expansion according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a service priority-based pod rescheduling apparatus according to an embodiment of the present application.

Reference numerals illustrate:

11-an acquisition module; 12-a sorting module; 13-a judging module; 14-a first response module; 15-a second response module.

Detailed Description

It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.

The application will be described in detail below with reference to the drawings in connection with embodiments.

Referring to FIG. 1, an embodiment of the present application provides a service priority-based pod rescheduling method, where the method includes:

S101, obtaining an information data structure for the pods on each node, wherein the information data structure comprises tps, requested cpu and memory information data.

Specifically, a system module runs and periodically obtains the tps of the pods on each node from the service monitoring system, recording each pod's tps together with its requested cpu and memory.
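One possible shape for the recorded information is sketched below. The class names `PodInfo` and `NodeInfo`, the field names, the units, and the helper `record_sample` are all assumptions introduced for illustration; the application does not specify the layout of its information data structure, and tps is taken here to mean transactions per second.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PodInfo:
    name: str
    tps: float          # throughput reported by the monitoring system
    cpu_request: float  # requested CPU, in cores (unit is an assumption)
    mem_request: float  # requested memory, in MiB (unit is an assumption)


@dataclass
class NodeInfo:
    name: str
    cpu_capacity: float
    mem_capacity: float
    pods: List[PodInfo] = field(default_factory=list)


def record_sample(node: NodeInfo, sample: PodInfo) -> None:
    """Insert the pod's record, or overwrite it if the pod is already known."""
    for i, pod in enumerate(node.pods):
        if pod.name == sample.name:
            node.pods[i] = sample
            return
    node.pods.append(sample)
```

A periodic collector would call `record_sample` once per pod per monitoring interval, so that each node always holds the latest sample for each of its pods.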

S102, ranking the pods on each node by priority according to the ranking rule.

In some embodiments, the pods on each node are ordered by tps to obtain a first priority sequence of the pods on each node.

Or,

the nodes are ordered by the resource usage of each node to obtain a second priority sequence.

It should be noted that, for the first priority sequence, this embodiment orders the pods on each node by tps using a doubly linked list; for the second priority sequence, this embodiment orders the nodes by node resource usage, that is, by cpu or memory.
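The two orderings can be sketched as follows. This is a simplified illustration: Python's built-in `sorted` stands in for the doubly linked list mentioned above, descending order is assumed from the worked example later in the description, and the function names are hypothetical. Usage is computed from CPU requests only.

```python
from typing import Dict, List, Tuple


def first_priority_sequence(pod_tps: Dict[str, float]) -> List[str]:
    """Pod names on one node, ordered by tps, highest first."""
    return sorted(pod_tps, key=lambda name: pod_tps[name], reverse=True)


def node_usage(pod_cpu_requests: List[float], node_cpu: float) -> float:
    """Sum of the pods' requested CPU divided by the node's CPU."""
    return sum(pod_cpu_requests) / node_cpu


def second_priority_sequence(
        nodes: Dict[str, Tuple[List[float], float]]) -> List[str]:
    """Node names ordered by resource usage, highest first.

    `nodes` maps a node name to (pod CPU requests, node CPU).
    """
    return sorted(nodes, key=lambda n: node_usage(*nodes[n]), reverse=True)
```

For example, `second_priority_sequence({"A": ([2, 2], 8), "B": ([3], 4)})` compares a usage of 0.5 for node A with 0.75 for node B and therefore places B first.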

S103, judging whether an event change exists for the node to be scheduled, wherein the event comprises a node fault recovery event and a node capacity expansion event.

S104, in response to an event change of the node to be scheduled, triggering a first pod rescheduling strategy if the event is a node fault recovery event.

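In some embodiments, triggering the first pod rescheduling strategy in response to an event change of the node to be scheduled that is a node fault recovery event includes: when a node fault recovery event is determined, that is, the node to be scheduled has recovered from its fault, restarting the pods of the node to be scheduled in the order of the first priority sequence, and, once all of them have been restarted, scheduling them back onto the node to be scheduled.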

S105, in response to an event change of the node to be scheduled, triggering a second pod rescheduling strategy if the event is a node capacity expansion event.

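In some embodiments, triggering the second pod rescheduling strategy in response to an event change of the node to be scheduled that is a node capacity expansion event includes: when a node capacity expansion event is determined, ordering the data of the pods on each existing node and on the capacity expansion node by node resource usage, updating the information data structure of the pods, and executing the pod rescheduling operation.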

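In some embodiments, the method further includes: in response to no event change of the node to be scheduled, updating the information data structure if the difference between the last recorded tps of a pod on the node to be scheduled and its currently obtained tps is greater than a preset change threshold, and otherwise leaving the information data structure unchanged.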

For node fault recovery, this embodiment executes the first pod rescheduling strategy to reschedule the node's previous pods back onto it, so that the pressure on the other nodes does not increase and the resources of the nodes are balanced; for a newly added capacity expansion node, this embodiment executes the second pod rescheduling strategy, so that the new node shares the service pressure of the pods on the existing nodes and its resources are not wasted.

A specific embodiment of the scheme is as follows:

As shown in FIG. 2, the pod information data structure of each node is obtained from the monitoring system, and the nodes are ordered by CPU usage, node A > node B > node C, where the usage of each node is computed as:

Node A: (A_pod1(cpu) + A_pod2(cpu) + A_pod3(cpu)) / A(cpu)

Node B: (B_pod1(cpu) + B_pod2(cpu)) / B(cpu)

Node C: (C_pod1(cpu) + C_pod2(cpu)) / C(cpu)

The pods on each node are ordered by TPS:

Node A: A_pod1(tps) > A_pod2(tps) > A_pod3(tps)

Node B: B_pod1(tps) > B_pod2(tps)

Node C: C_pod1(tps) > C_pod2(tps)

After node B recovers from a fault, B_pod1 and B_pod2 are restarted and rescheduled back to node B.
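The recovery step for node B can be sketched as below. The function `recover_node` and its `restart` callback are hypothetical: the callback stands in for whatever cluster operation actually restarts a pod and binds it to a node, and is not a real Kubernetes API call. Restarting the highest-TPS pod first is an assumed reading of "according to the first priority sequence".

```python
from typing import Callable, Dict, List


def recover_node(node: str, displaced_tps: Dict[str, float],
                 restart: Callable[[str, str], None]) -> List[str]:
    """Restart the pods that belonged to `node`, highest tps first.

    `displaced_tps` maps each pod that was migrated away to its tps.
    Returns the order in which the pods were restarted.
    """
    order = sorted(displaced_tps, key=lambda p: displaced_tps[p], reverse=True)
    for pod in order:
        # Hypothetical hook: restart `pod` and schedule it back onto `node`.
        restart(pod, node)
    return order
```

With `{"B_pod1": 50.0, "B_pod2": 20.0}` as input, B_pod1 is restarted on node B before B_pod2, matching the TPS ordering listed for node B.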

As shown in FIG. 3, node D is added by capacity expansion. The rescheduling system detects the node capacity expansion event, selects node A, which has the highest node resource usage, and selects A_pod3, the pod with the lowest TPS on node A, for rescheduling to node D.

The node resource usage is then recalculated:

Node A: (A_pod1(cpu) + A_pod2(cpu)) / A(cpu)

Node B: (B_pod1(cpu) + B_pod2(cpu)) / B(cpu)

Node C: (C_pod1(cpu) + C_pod2(cpu)) / C(cpu)

Node D: A_pod3(cpu) / D(cpu)

The nodes are ordered by resource usage: node B > node A > node C > node D. Because the usage of node D is lower than that of node C, the least-used existing node, rescheduling continues: B_pod2, the pod with the lowest TPS on node B, the node with the highest usage, is moved to node D, and the node resource usage is recalculated:

Node A: (A_pod1(cpu) + A_pod2(cpu)) / A(cpu)

Node B: B_pod1(cpu) / B(cpu)

Node C: (C_pod1(cpu) + C_pod2(cpu)) / C(cpu)

Node D: (A_pod3(cpu) + B_pod2(cpu)) / D(cpu)

The nodes are ordered by resource usage again: node B > node A > node D > node C. Because the usage of node D is now higher than that of node C, the least-used existing node, rescheduling stops.
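The capacity expansion walk-through can be sketched as a loop. This is a minimal sketch under stated assumptions, not the application's implementation: the names `usage` and `rebalance` and the tuple layout are hypothetical, usage is computed from CPU requests only, and the loop stops when the new node's usage is equal to or higher than the lowest usage among the existing nodes (the text does not say how a tie is handled).

```python
from typing import Dict, List, Tuple

Pod = Tuple[str, float, float]  # (name, requested cpu, tps)


def usage(pods: List[Pod], capacity: float) -> float:
    """Sum of requested CPU on a node divided by the node's CPU."""
    return sum(cpu for _, cpu, _ in pods) / capacity


def rebalance(nodes: Dict[str, List[Pod]], capacity: Dict[str, float],
              new_node: str) -> List[Tuple[str, str]]:
    """Move pods to `new_node` until it is no longer the least-used node.

    Each round takes the lowest-tps pod from the most-used existing node.
    Mutates `nodes` and returns the moves as (pod name, source node).
    """
    moves: List[Tuple[str, str]] = []
    existing = [n for n in nodes if n != new_node]
    while True:
        use = {n: usage(nodes[n], capacity[n]) for n in nodes}
        # Stop once the new node has caught up with the least-used node.
        if use[new_node] >= min(use[n] for n in existing):
            break
        busiest = max(existing, key=lambda n: use[n])
        pod = min(nodes[busiest], key=lambda p: p[2])  # lowest tps
        nodes[busiest].remove(pod)
        nodes[new_node].append(pod)
        moves.append((pod[0], busiest))
    return moves
```

Moving the lowest-TPS pod first keeps the restart cost on the busiest services low, which appears to be the motivation for combining the two priority sequences.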

As shown in fig. 4, based on the same inventive concept, an embodiment of the present application further provides a pod rescheduling apparatus based on service priority, where the apparatus includes:

an acquisition module 11, configured to obtain an information data structure for the pods on each node, where the information data structure includes tps, requested cpu and memory information data;

a sorting module 12, configured to rank the pods on each node by priority according to a ranking rule;

a judging module 13, configured to judge whether an event change exists for the node to be scheduled, where the event includes a node fault recovery event and a node capacity expansion event;

a first response module 14, configured to respond to an event change of the node to be scheduled and trigger a first pod rescheduling strategy if the event is a node fault recovery event;

a second response module 15, configured to respond to an event change of the node to be scheduled and trigger a second pod rescheduling strategy if the event is a node capacity expansion event.

Based on the same inventive concept, the embodiment of the application also provides an electronic device, which comprises a processor and a memory which is in communication connection with the processor and is used for storing executable instructions of the processor, wherein the processor is used for executing the pod rescheduling method based on the service priority.

The electronic device of the foregoing embodiment is configured to implement the service priority-based pod rescheduling method according to any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not repeated here.

Based on the same inventive concept, the embodiment of the application also provides a computer readable storage medium, which stores a computer program, wherein the computer program realizes the pod rescheduling method based on the service priority when being executed by a processor.

The storage medium of the foregoing embodiment stores a computer program for causing a computer to execute the service priority-based pod rescheduling method according to any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not repeated here.

Those of ordinary skill in the art will appreciate that the elements and method steps of each example described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the elements and steps of each example have been described generally in terms of functionality in the foregoing description to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the several embodiments provided in the present application, it should be understood that the disclosed methods and systems may be implemented in other ways. For example, the above-described division of units is merely a logical function division, and there may be another division manner when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted or not performed. The units may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present application.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application, and are intended to be included within the scope of the appended claims and description.

The foregoing description of the preferred embodiments of the application is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the application.

Claims

1. A service priority-based pod rescheduling method, the method comprising:

2. The service priority-based pod rescheduling method of claim 1, wherein ranking the pods on each node by priority according to the ranking rule comprises:

3. The service priority-based pod rescheduling method of claim 2, further comprising:

4. The service priority-based pod rescheduling method of claim 3, wherein triggering the first pod rescheduling strategy in response to an event change of the node to be scheduled that is a node fault recovery event comprises:

5. The service priority-based pod rescheduling method of claim 4, wherein triggering the second pod rescheduling strategy in response to an event change of the node to be scheduled that is a node capacity expansion event comprises:

6. The service priority-based pod rescheduling method of claim 1, further comprising:

7. A service priority-based pod rescheduling apparatus, the apparatus comprising:

8. An electronic device comprising a processor and a memory communicatively coupled to the processor for storing processor-executable instructions, characterized in that: the processor is configured to perform the service priority-based pod rescheduling method according to any one of claims 1-6.

9. A computer-readable storage medium storing a computer program, characterized in that: the computer program, when executed by a processor, implements the service priority-based pod rescheduling method according to any one of claims 1-6.