WO2024021486A1

WO2024021486A1 - Load balancing method and system, and electronic device and storage medium

Info

Publication number: WO2024021486A1
Application number: PCT/CN2022/141797
Authority: WO
Inventors: 邹晟; 张翼; 陈玉鹏; 侯飞
Original assignee: 天翼云科技有限公司
Priority date: 2022-07-29
Filing date: 2022-12-26
Publication date: 2024-02-01
Also published as: CN115499376B; CN115499376A

Abstract

Disclosed in the present invention are a load balancing method and system, and an electronic device and a storage medium. The method comprises: a central controller acquiring a gradient parameter of a local model sent by a switch, optimizing a global model according to the gradient parameter, and sending the global model to the switch so as to determine the usage rate of resources of the switch, wherein the resources comprise a processor, a memory and a network bandwidth; acquiring a sub-pheromone concentration, which is determined by the switch on the basis of the usage rate of the resources and an ant colony algorithm, and determining the number of task flows on the switch on the basis of the sub-pheromone concentration; acquiring a real-time health coefficient of the switch, and combining the real-time health coefficient with an initial weight to obtain a real-time health matrix, wherein the initial weight is calculated on the basis of the number of task flows; and comparing the real-time health matrix with a preset health matrix, and adjusting the load of the switch according to a comparison result. The method realizes load balancing of a switch and reduces the service response time, thereby improving the usage experience of users.

Description

A load balancing method, system, electronic device and storage medium

Technical field

The invention relates to the field of network security technology, and specifically to a load balancing method, system, electronic equipment and storage medium.

Background technique

With the rapid development of cloud computing, big data, and artificial intelligence, the amount of data in application services has increased exponentially. The traditional back-end network access layer is limited by the bandwidth bottleneck of the access layer entrance and the high cost of network hardware. It is no longer possible to cope with the massive amount of data that comes with it. Emerging network technologies are constantly emerging. Software-defined networking (SDN) decouples the control layer from the data forwarding layer. The control layer is responsible for controlling the global network, and the data forwarding layer is responsible for completing data forwarding according to the flow table issued by the control layer, which greatly improves the flexibility of network deployment and management. nature, and achieves centralized management and control of data traffic. At the same time, with the vigorous development of software switches, the method of deploying software switches on business nodes to reduce the number of network hops and thereby reduce network latency is becoming more and more popular. In this regard, when faced with massive data, more and more network access layers of application services use software routing to directly use the logical processing backend of the application service as the next hop of the hardware switch to achieve scalability. A powerful, cost-effective multi-path solution, thereby horizontally expanding the bandwidth of the network access layer entrance and solving the problem of insufficient bandwidth.

technical problem

Existing load balancing methods between multi-paths cannot sense path congestion status and link failures, which can easily cause hash conflicts among multiple data flows on the path, leading to link congestion and application performance degradation. Under normal traffic conditions, due to the small amount of short-stream data and fast processing speed, the waiting time is almost negligible. However, massive short-stream scenarios can easily cause network congestion.

Technical solutions

In view of this, embodiments of the present invention provide a load balancing method, system, electronic device, and storage medium to balance the load on the switch, reduce response time, and improve user experience.

According to a first aspect, an embodiment of the present invention provides a load balancing method applied to a central controller. The method includes:

Obtain the gradient parameters of the local model sent by the switch, optimize the global model according to the gradient parameters, and send the global model to the switch to determine the usage rate of the switch's resources, the resources including processor, memory and network bandwidth;

Obtain the sub-pheromone concentration determined by the switch based on the usage rate of the resource and the ant colony algorithm, and determine the number of task flows on the switch based on the sub-pheromone concentration;

Obtain the real-time health coefficient of the switch, and combine the real-time health coefficient with the initial weight to obtain a real-time health matrix. The initial weight is calculated based on the number of task flows;

The real-time health matrix is compared with a preset health matrix, and the load of the switch is adjusted according to the comparison result.

The load balancing method provided in this embodiment obtains the gradient parameters of the local model calculated locally by the switch, optimizes the global model based on the gradient parameters, and controls the switch to optimize the local model based on the global model, thereby calculating the resource usage corresponding to the switch. Based on the ant colony algorithm and the obtained resource utilization of the switch, the optimal allocation is determined, that is, the number of task flows on the switch, the weight of the switch is determined based on the number of task flows, the health coefficient is set, and a real-time health matrix is obtained based on the weight of the switch. By comparing the real-time health matrix and the preset health matrix, the load of the switch is adjusted, thereby achieving load balancing of the switch, reducing service response time, and improving user experience.

In conjunction with the first aspect, in one implementation, determining the number of task flows on the switch based on the sub-pheromone concentration includes:

Cyclically obtain the sub-pheromone concentration determined by the switch based on the usage rate of the resource and the ant colony algorithm;

Globally update the sub-pheromone concentration to obtain the pheromone concentration, and optimize the path;

The number of task flows on the switch is determined based on the optimized path.

In conjunction with the first aspect, in one embodiment, the pheromone concentration is calculated using the following formula:

τ _ij (t+1)=(1-ρ)τ _ij (t)+Δτ _ij (t)

Among them, ρ represents the degree of pheromone volatilization, Δτ _ij (t) represents the total amount of pheromone released by the ant colony on the path, τ _ij (t+1) represents the pheromone on the path of switch i and switch j at time t+1 concentration.

With reference to the first aspect, in one implementation, obtaining the real-time health coefficient of the switch and combining the real-time health coefficient with the initial weight to obtain a real-time health matrix includes:

Detect the path health of the switch, and determine the real-time health coefficient of the switch based on the path health of the switch;

Multiply the initial weight corresponding to the switch and the real-time health coefficient to determine the real-time health matrix.

According to a second aspect, an embodiment of the present invention provides a load balancing method, applied to a switch, and the method includes:

Obtain data flow characteristic information, determine gradient parameters based on the data flow characteristic information, and send the gradient parameters to the central controller;

Obtain the global model sent by the central controller, optimize the local model based on the global model, and calculate the resource usage based on the local model and the data flow feature information. The global model is the central controller based on the Obtained by optimizing the gradient parameters mentioned above;

The sub-pheromone concentration is determined based on the resource usage and the ant colony algorithm, and the sub-pheromone concentration is sent to the central controller to adjust the load.

Combined with the second aspect, in one implementation, the determination of the sub-pheromone concentration based on the usage rate of the resource and the ant colony algorithm includes:

Allocate the search tasks of ants based on the ant colony algorithm, and determine the heuristic factor based on the usage rate of the resources;

Calculate the probability that the ant moves to other switches according to the heuristic factor;

When the ants move to other switches, the sub-pheromone concentration is calculated based on the new pheromones on the path and the ant circle model. Combined with the second aspect, in one implementation, the probability of the ants moving to other switches is calculated using the following formula:

in,

represents the probability that ant k will visit switch j at the next moment, α represents the sensitivity of ants to pheromone, β represents the sensitivity of ant colony to pheromone,

represents the pheromone concentration on the path of switch i and switch j at time t,

η _ij represents the heuristic factor. The heuristic factor is used to describe the degree of attraction of switch j to ants on switch i. It can be expressed as η _ij =1/d _ij , d _ij represents the distance between switch j and switch i, and allowed _k means that it has not been visited yet. A collection of switches.

According to a third aspect, an embodiment of the present invention provides a load balancing system, including:

A central controller, the central controller is configured to execute the load balancing method of the first aspect or any implementation of the first aspect;

At least one switch, the switch is connected to the central controller, and the switch is used to perform the load balancing method of the second aspect or any one of the implementation modes of the second aspect.

According to a fourth aspect, an embodiment of the present invention provides an electronic device, including: a memory and a processor, the memory and the processor are communicatively connected to each other, the memory stores computer instructions, and the processor By executing the computer instructions, the load balancing method described in the first aspect, any implementation manner of the first aspect, the second aspect, or any implementation manner of the second aspect is executed.

According to a fifth aspect, embodiments of the present invention provide a computer-readable storage medium that stores computer instructions, and the computer instructions are used to cause the computer to execute the first aspect and any of the first aspects. The load balancing method described in one implementation, the second aspect, or any implementation of the second aspect.

Description of drawings

In order to more clearly explain the specific embodiments of the present invention or the technical solutions in the prior art, the accompanying drawings that need to be used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings in the following description The drawings illustrate some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without exerting any creative effort.

Figure 1 is a flow chart of a load balancing method according to an embodiment of the present invention;

Figure 2 is a schematic diagram of a GRU timing performance prediction method according to an embodiment of the present invention;

Figure 3 is a flow chart of a load balancing method according to an embodiment of the present invention;

Figure 4 is a schematic diagram of weighted equal cost multipath according to an embodiment of the present invention;

Figure 5 is a schematic diagram of a load balancing system according to an embodiment of the present invention;

Figure 6 is a schematic diagram of a load balancing system according to an embodiment of the present invention;

Figure 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.

Embodiments of the invention

In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments These are some embodiments of the present invention, rather than all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative efforts fall within the scope of protection of the present invention.

According to an embodiment of the present invention, a load balancing method is provided. It should be noted that the steps shown in the flow chart of the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions, and although the steps in the flow chart A logical order is shown, but in some cases the steps shown or described may be performed in a different order than herein.

This embodiment provides a load balancing method. Figure 1 is a flow chart of a load balancing method according to an embodiment of the present invention. The method is applied to the central controller. As shown in Figure 1, the process includes the following steps:

S11: Obtain the gradient parameters of the local model sent by the switch, optimize the global model according to the gradient parameters, and send the global model to the switch to determine the resource usage of the switch.

The switch can be a software switch, and the resources of the switch can include CPU (processor), memory and network bandwidth. The software switch performs normalization and other preprocessing on the locally monitored data flow characteristic information. The data flow characteristic information can include the average value of the flow packet size from src to dst, and the average value of the flow packet size from dst to src. , minimum packet value, maximum packet value, average packet value, packet transmission time, handshake time (TCP), etc. The switch uses the data flow characteristic information as a data set. The data set can be divided according to a preset ratio, such as a 7:3 ratio, as a training data set and a test data set respectively. At the same time, the consistency of the data distribution is maintained as much as possible to avoid Additional bias is introduced due to data partitioning, which affects the final results.

The central controller sends the joint GRU (Gated Recurrent Unit) task and initial parameters to the switch, and the initial parameters can be set to 1. The GRU neural network is a variant of LSTM (Long short-term memory). GRU maintains the effect of LSTM while making the structure simpler, including update gates and reset gates. The update gate controls the extent to which state information from the previous moment is brought into the current state. The larger the value, the more state information from the previous moment is brought into the current state. The reset gate controls the extent to which status information at the previous moment is ignored. The smaller the value, the more it is ignored. The switch normalizes the divided training data set. After the training data set is normalized, there are larger values. In order to prevent the gradient from disappearing, the activation function σ uses a Rectified Linear Unit (ReLU). At the same time, using ReLU will cause some neurons to be 0, causing the sparsity of the network, and reducing the interdependence between parameters, effectively alleviating the over-fitting problem. The ReLU function expression is as follows:

ReLU=max(0,x)

In order to make full use of the computing performance of the software switch, relieve the performance pressure on the central controller, and reduce the additional bandwidth overhead caused by avoidable data transmission. For timing prediction of CPU, memory and network bandwidth, GRU is combined with federated learning, as shown in Figure 2, the GRU timing performance prediction method based on federated learning.

The switch obtains the joint GRU task sent by the central controller. After starting the joint GRU task and initializing the system parameters, it performs calculations locally based on local data such as CPU, memory, network bandwidth, and data flow characteristic information. After the calculation is completed, the obtained gradient parameters are sent. to the central controller. The load balancing system may include one or more switches, and the central controller receives gradient parameters sent by at least one switch. In federated learning, each local gradient parameter is obtained through distributed training, and then the global model is optimized based on each local gradient parameter. After receiving the gradient parameters of the switch, the central controller performs an aggregation operation on these gradient parameters, focusing on efficiency, performance and other factors during the aggregation process. For example, because of the heterogeneous nature of the system, the central controller may sometimes not wait for data upload from all switches, but select a suitable subset of switches as collection targets. After the central controller aggregates and optimizes the global model based on the obtained gradient parameters, it sends the optimized global model to the switches participating in the GRU task. The switch updates the local model based on the received global model and evaluates the performance of the local model. If the performance reaches the preset condition, that is, when the performance is good enough, the training stops and the joint modeling ends; if the performance is insufficient, the switch calculates the gradient again locally. parameters and sent to the central controller until the final local model performance reaches the preset condition. The central controller saves the trained global model. It can calculate the initial parameters through the global model and send the initial parameters to the switch. The switch calculates the usage of each resource corresponding to the switch based on the initial parameters and the trained local model, such as CPU. , memory, network bandwidth, etc. usage.

S12: Obtain the resource usage of the switch and the sub-pheromone concentration determined by the ant colony algorithm, and determine the number of task flows on the switch based on the sub-pheromone concentration.

The central controller obtains the resource usage calculated by the switches and uses the ant colony algorithm to determine the number of task flows on each switch. Ant colony algorithm is an artificial intelligence optimization algorithm that simulates the behavior of ants searching for food and returning to the nest in nature. It finds the optimal path through the cooperation among individual ant colonies. Ant colony algorithm is a heuristic global optimization algorithm in evolutionary algorithms. It has the characteristics of distributed computing, positive information feedback and heuristic search. Its basic idea is: use the walking path of ants to represent the feasible solution to the problem to be optimized. The entire All paths of the ant colony constitute the solution space of the problem to be optimized. Ants with shorter paths release more pheromones. As time goes by, the accumulated pheromone concentration on the shorter path gradually increases, and the number of ants choosing this path also increases.

The convergence speed of the traditional ant colony algorithm is relatively slow and the randomness is large. Therefore, the efficiency of seeking optimal solutions is relatively low. When solving optimization problems, it is easy to fall into local optima, thus missing the global optimal solution. In a system composed of a central controller and a switch, since there may be multiple switches in the system, and each switch is heterogeneous, there may be differences in computing power, network bandwidth, etc., so the system may be in a process of dynamic allocation. . The load borne by each switch at each moment is greatly different. If the performance of some switches is poor and the performance of another part is good, then a large number of task flows will be easily concentrated on the switches with better performance, and the switches with poor performance will be executed. May be idle. Therefore, it is necessary to share the pressure of the central controller based on the performance advantages of the switch to achieve load balancing.

In addition to the central controller and software switches, the system can also include hardware switches. Suppose there are n software switches in the system. A total of m ants are allocated, m = 2n. According to each switch, 2 ants perform the search task. Each ant's The initial point is the hardware switch.

After obtaining the resource usage of the switch, initialize the relevant parameters of the ant colony algorithm. The initialization includes setting the upper limit of the number of iterations of the algorithm and the initial pheromone concentration. The pheromone concentration can represent the efficiency of completing the task. The used rates of CPU, memory and network bandwidth are recorded as U _cpu , U _mem and U _net respectively. In order to comprehensively consider the performance of CPU, memory and network bandwidth, the distance is adjusted to the load capacity:

in:

and

Respectively represent the weight of CPU, memory, and network bandwidth in the loaded capacity. Φ is used to improve the heuristic factor η in the traditional ant colony algorithm, Φ=1/η, that is, η _ij =1/Φ _ij .

The smaller η _ij is, it means that the Φ _ij value of the switch selected by the task flow is too large, that is, the load is already too large. Selecting this switch will make the load of the entire system more unbalanced. On the contrary, if the obtained η _ij value is larger, the corresponding The value of Φ _ij is too small, that is, the load is currently too small. Selecting this switch to perform tasks will promote load balancing of the entire system. Therefore, this improvement can prompt task flows to be executed on some relatively idle switches. After multiple iterations of the algorithm, the improved algorithm can finally achieve overall load balancing.

Since the degree of pheromone volatilization ρ has a greater impact on the search performance of the algorithm, the larger ρ is, the worse the global search ability is, and the smaller ρ is, the worse the local search ability is and the slower the convergence speed is. Therefore, the value of ρ is adjusted in the following adaptive manner:

In addition, the pheromone update method is improved and the elite ant system is used. When ant k completes a path search, the global update method still uses standard ant colony optimization, while the local update method is adjusted.

The completion time of the target task O _i assigned to the software switch S _j is C _ij , then C _ij should be the transmission time T _ij from the hardware switch to S _j plus the actual execution time E _ij of O _i on S _j plus The delay time W _ij from transmission to execution,

Right now:

C _ij =T _ij +E _ij +W _ij

The data volume of task O _i is recorded as F _i , P _j represents the performance of software switch S _j , and N _j represents the network bandwidth of software switch S _j , then:

E _ij =F _i /P _j ,

T _ij =F _i /N _j

Since the software switch executes tasks concurrently, the time it takes for the system to complete all tasks is the maximum value C _max among all C _ij :

C _max =max(C _ij )

Since the overall goal of this optimization is to minimize the completion time of the task flow, the goal is to minimize, that is:

min C _max .

At this time, the ant circle model is:

Among them, C _k represents the total completion time of ant k’s search path, Q represents the total amount of pheromone left on the path after completing a search,

Indicates the total amount of pheromone released by the k-th ant on the path.

The optimal path found is recorded as Γ ^bs . When updating the local pheromone for this path, artificial release of additional pheromone is added to enhance the positive feedback effect. At this time, the local update formula is:

Among them, Δτ _ij (t) represents the total amount of pheromone released by the ant colony on the path,

represents the total amount of pheromone released by the k-th ant on the path, e is the influence weight factor of Γ ^bs ,

Represents each additional pheromone added on the path, the formula is as follows:

Among them, C _bs represents the completion time of the known optimal path Γ ^bs .

Each switch is a node, and the probability of each ant moving to a node can be calculated based on the obtained transfer function, as follows:

in,

η _ij represents the heuristic factor. The heuristic factor is used to describe the degree of attraction of switch j to ants on switch i. It can be expressed as η _ij =1/d _ij , d _ij represents the distance between switch j and switch i, and allowed _k means that it has not been visited yet. The set of switches, the ants move to the corresponding switch node according to the calculated probability.

When an ant moves to a new switch node, it updates the pheromones along its path and makes corresponding modifications to the tabu table. The sub-information concentration is obtained according to the local update formula of the pheromone. The formula is as follows:

Among them, Δτ _ij (t) represents the total amount of pheromone released by the ant colony on the path, that is, the sub-pheromone concentration,

Represents each additional pheromone added on the path.

The central controller summarizes the sub-pheromone concentrations determined by each software switch, evaluates all feasible paths according to the objective function min C _max , and selects the current optimal path Γ ^bs , and evaluates the pheromones on all paths based on the determined optimal path. Make a global update. In order to determine the optimal path, that is, the optimal solution to the allocation problem, the number of iterations can be continuously increased until the number of iterations reaches the preset upper limit of the number of iterations, and the number of task flows on each switch under the optimal path is obtained.

S13, obtain the real-time health coefficient of the switch, and combine the real-time health coefficient with the initial weight to obtain a real-time health matrix.

Among them, the initial weight is calculated based on the number of task flows. When there are multiple switches, each switch is divided by the greatest common divisor of the number of task flows between other switches in turn to obtain the initial weight. The initial weight is the weighted equivalent Initial value of multipath weight configuration. The obtained initial weight of each software switch can be sent to the hardware switch. The hardware switch can include a path distribution module, which can be used to determine the link between each software switch and the central controller.

Set the health coefficient of the switch to ζ. The health coefficient mainly represents the health of each link. When ζ=1, it means that the link health detection is normal. When ζ=0, it means that the link health detection is abnormal. The central controller can monitor the real-time resource usage and link health of the software switch through the monitoring module. The link health can be used as the basis for the hardware switch to dynamically adaptively adjust multi-path selection. The central controller monitors the switch and obtains the real-time health coefficient. The real-time health coefficient can be multiplied by the initial weight to obtain the real-time health matrix.

S14: Compare the real-time health matrix with the preset health matrix, and adjust the load of the switch according to the comparison result.

The preset health matrix is obtained by setting ζ _i = 1. The preset health matrix is (1, 1, 1,..., 1). The real-time health matrix is compared with the preset health matrix. When there is a difference in the values, the hardware is controlled. The switch adjusts the health coefficient of the corresponding link and adjusts the load of the software switch. For example, it can exclude the software switch corresponding to the link with abnormal health status, reduce or remove its load, and maximize the balance of business processing on the software switch. If a link fails, this method can also be used to interrupt data transmission on the link to avoid the risk of data loss and determine the optimal weight configuration of weighted equal-cost multipath.

In one implementation, corresponding to S12 in Figure 1, the following steps may also be included:

(1) Cyclically obtain the resource usage of the switch and the sub-pheromone concentration determined by the ant colony algorithm.

Assume that there are n software switches in the system, and a total of m ants are allocated, m=2n. Each switch performs the search task with 2 ants, and the probability of each ant moving to the next switch node is calculated. The ants move to the corresponding node based on the probability. node, and update the sub-pheromone concentration. In order to determine the optimal path, the number of iterations needs to be continuously increased.

(2) Globally update the sub-pheromone concentration to obtain the pheromone concentration and optimize the path.

Obtain the sub-pheromone concentration, perform a global update based on the sub-information concentration, increase the number of iterations and repeat the calculation, and continuously optimize the path until the number of iterations reaches the preset upper limit of the number of iterations to determine the optimal path.

Pheromone concentration is calculated using the following formula:

τ _ij (t+1)=(1-ρ)τ _ij (t)+Δτ _ij (t)

Among them, ρ represents the degree of pheromone volatilization, Δτ _ij (t) represents the sub-pheromone concentration, and τ _ij (t+1) represents the pheromone concentration on the path of switch i and switch j at time t+1.

ρ is adjusted in the following adaptive manner:

(3) Determine the number of task flows on the switch based on the optimized path.

Get the number of task flows on the switch under the optimal path.

In one implementation, corresponding to S14 in Figure 1, the following steps may also be included:

(1) Detect the path health of the switch and determine the real-time health coefficient of the switch based on the path health of the switch.

The central controller can monitor the link health status of the software switch in real time and determine the real-time health coefficient based on the set health detection coefficient. The set health monitoring coefficient is ζ, as follows:

According to the real-time link health status, the real-time health coefficient of each switch is obtained, which can be written as {ζ ₁ , ζ ₂ , ζ ₃ ,..., ζ _n }.

(2) Multiply the initial weight corresponding to the switch and the real-time health coefficient to determine the real-time health matrix.

The initial weight of the switch is calculated by the number of task flows obtained. When there are multiple switches, each switch is divided by the greatest common divisor of the number of task flows between other switches in turn to obtain the initial weight. The initial weight can be written as {w′ ₁ , w′ ₂ , w′ ₃ , ..., w′ _n }. The real-time health matrix obtained by multiplying the initial weight and the implemented health coefficient is [ζ ₁ w′ ₁ , ζ ₂ w′ ₂ , ζ ₃ w′ ₃ ,..., ζ _n w′ _n }. Figure 4 shows the weighted equivalent Multipath diagram.

This embodiment provides a load balancing method. Figure 3 is a flow chart of a load balancing method according to an embodiment of the present invention. The method is applied to a switch. As shown in Figure 3, the process includes the following steps:

S21, obtain the data flow characteristic information, determine the gradient parameters based on the data flow characteristic information, and send the gradient parameters to the central controller.

The switch may be a software switch, and the data flow characteristic information may include the average flow packet size from src to dst, the average flow packet size from dst to src, the minimum packet value, the maximum packet value, and the average packet size, Packet transmission time, handshake time (TCP), etc. The switch uses the data flow characteristic information as a data set. The data set can be divided according to a preset ratio, such as a 7:3 ratio, as a training data set and a test data set respectively. At the same time, the consistency of the data distribution is maintained as much as possible to avoid Additional bias is introduced due to data partitioning, which affects the final results.

ReLU=max(0,x)

In order to make full use of the computing performance of the software switch, relieve the performance pressure on the central controller, and reduce the additional bandwidth overhead caused by avoidable data transmission. For timing prediction of CPU, memory and network bandwidth, GRU is combined with federated learning, as shown in Figure 2.

The switch obtains the joint GRU task sent by the central controller. After starting the joint GRU task and initializing the system parameters, it performs calculations locally based on local data such as CPU, memory, network bandwidth, and data flow characteristic information. After the calculation is completed, the obtained gradient parameters are sent. to the central controller.

S22: Obtain the global model sent by the central controller, optimize the local model based on the global model, and calculate the resource usage based on the local model and data flow characteristic information.

The global model is optimized by the central controller based on gradient parameters, and the central controller receives gradient parameters sent by at least one switch. In federated learning, each local gradient parameter is obtained through distributed training, and then the global model is optimized based on each local gradient parameter. After receiving the gradient parameters of the switch, the central controller performs an aggregation operation on these gradient parameters, focusing on efficiency, performance and other factors during the aggregation process. For example, because of the heterogeneous nature of the system, the central controller may sometimes not wait for data upload from all switches, but select a suitable subset of switches as collection targets. After the central controller aggregates and optimizes the global model based on the obtained gradient parameters, it sends the optimized global model to the switches participating in the GRU task. The switch updates the local model based on the received global model and evaluates the performance of the local model. If the performance reaches the preset condition, that is, when the performance is good enough, the training stops and the joint modeling ends; if the performance is insufficient, the switch calculates the gradient again locally. parameters and sent to the central controller until the final local model performance reaches the preset condition. The central controller saves the trained global model. It can calculate the initial parameters through the global model and send the initial parameters to the switch. The switch calculates the usage of each resource corresponding to the switch based on the initial parameters and the trained local model, such as CPU. , memory, network bandwidth, etc. usage.

S23, determine the sub-pheromone concentration based on resource usage and ant colony algorithm, and send the sub-pheromone concentration to the central controller to adjust the load.

in,

and

The completion time of the target task O _i assigned to the software switch S _j is C _ij , then C _ij should be the transmission time T _ij from the hardware switch to S _j plus the actual execution time W _ij of O _i on S _j plus The delay time W _ij from transmission to execution,

Right now:

C _ij =T _ij +E _ij +W _ij

E _ij =F _i /P _j ,

T _ij =F _i /N _j

C _max =max(C _ij )

min C _max .

At this time, the ant circle model is:

Indicates the total amount of pheromone released by the k-th ant on the path.

in,

When an ant moves to a new switch node, it updates the pheromone of its path and makes corresponding modifications to the tabu table. The sub-information concentration Δτ _ij (t) is obtained according to the local update formula of the pheromone.

The preset health matrix is obtained by setting ζ _i = 1. The preset health matrix is (1, 1, 1,..., 1). The real-time health matrix is compared with the preset health matrix. When there is a difference in the values, the hardware is controlled. The switch adjusts the health coefficient of the corresponding link and adjusts the load of the software switch. For example, it can exclude the software switch corresponding to the link with abnormal health status, reduce or remove its load, and maximize the balance of business processing on the software switch. If a link fails, this method can also be used to interrupt data transmission on the link to avoid the risk of data loss.

In one implementation, determining the sub-pheromone concentration based on resource usage and ant colony algorithm includes the following steps:

(1) Allocate search tasks of ants based on ant colony algorithm, and determine heuristic factors based on resource usage. In addition to the central controller and software switches, the system can also include hardware switches. Suppose there are n software switches in the system. A total of m ants are allocated, m = 2n. According to each switch, 2 ants perform the search task. Each ant's The initial point is the hardware switch.

in:

and

(2) Calculate the probability of ants moving to other switches based on the heuristic factor.

In one implementation, the probability of ants moving to other switches is calculated using the following formula:

in,

(3) When ants move to other switches, the sub-pheromone concentration is calculated based on the new pheromone on the path and the ant circle model.

Right now:

C _ij =T _ij +E _ij +W _ij

E _ij =F _i /P _j ,

T _ij =F _i /N _j

C _max =max(C _ij )

min C _max .

At this time, the ant circle model is:

Indicates the total amount of pheromone released by the k-th ant on the path.

New pheromones

The formula is as follows:

The optimal path found is recorded as Γ ^bs . When updating the local pheromone for this path, artificial release of additional pheromone is added.

to enhance the positive feedback effect. At this time, the local update formula is:

Indicates new pheromones on the path.

In this application, a GRU timing performance prediction method based on federated learning is designed, which reduces the increase in additional delay caused by real-time training and meets the delay-sensitive requirements of short flows. At the same time, it takes full advantage of the high performance of software switches and uses federated learning to relieve the pressure on the central controller and reduce avoidable data transmission. The load status of the link is equivalently known through prediction. The distributed weighted equal-cost multi-path routing method based on the optimized ant colony algorithm enables load balancing to take into account the heterogeneity of devices on software switches, adopts a more reasonable load evaluation method, improves the convergence of the algorithm, and speeds up the results. solution speed. Through real-time link health detection, data transmission can be interrupted as soon as a link fails, thus avoiding the risk of data loss. In the scenario of massive short-flow data, we make full use of the excellent computing performance of switch nodes and combine the differences in CPU, memory and network between switch nodes to predict the optimal weight configuration of each path and perform the corresponding weighted equivalent multi-step calculation. Path selection. It shortens the response time of the overall service and improves the user experience.

This embodiment also provides a load balancing system, which is used to implement the above embodiments and implementation modes. What has been explained will not be described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the systems described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.

This embodiment provides a load balancing system, as shown in Figure 5, including:

Central controller, the central controller is used to execute the load balancing method;

At least one switch is connected to the central controller for performing a load balancing method.

In one implementation, the system includes a software switch, a hardware switch and a central controller. The system is shown in Figure 6 . The hardware switch includes a path distribution module that can perform multi-path selection for short flows based on weight configuration. At the same time, multi-path selection is dynamically and adaptively adjusted based on the link health detection results sent by the central controller after real-time monitoring.

The central controller includes a monitoring module, a performance prediction module and a path training module. The monitoring module is used to monitor and detect the real-time network utilization and link health of each software switch node, which serves as the basis for the dynamic adaptive adjustment of multi-path selection by the hardware switch. The performance prediction module can cooperate with the software switch to federally train a prediction model for CPU, memory and network bandwidth usage. The path training module can cooperate with the software switch to predict the optimal weight configuration of weighted equal-cost multi-paths in a distributed manner.

The software switch includes a monitoring module, a performance prediction module and a path training module. The monitoring module is used to monitor and record the usage of local resources of the software switch. Local resources include CPU, memory, network bandwidth, etc. At the same time, the local network utilization and link health status sent by the central controller can also be recorded as a data source for the performance prediction module. The performance prediction module can coordinate the central controller to calculate resource usage based on federated learning. The path training module can coordinate the distributed calculation of the assigned path search value results by the central controller.

The load balancing system in this embodiment is presented in the form of a functional unit, where the unit refers to an ASIC circuit, a processor and memory that executes one or more software or fixed programs, and/or other devices that can provide the above functions. .

Further functional descriptions of each of the above modules are the same as those in the above corresponding embodiments, and will not be described again here.

An embodiment of the present invention also provides an electronic device having the load balancing system shown in Figure 5 above.

Please refer to Figure 7. Figure 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in Figure 7, the electronic device may include: at least one processor 601, such as a CPU (Central Processing Unit). ), at least one communication interface 603, memory 604, and at least one communication bus 602. Among them, the communication bus 602 is used to realize connection communication between these components. Among them, the communication interface 603 may include a display screen (Display) and a keyboard (Keyboard), and the optional communication interface 603 may also include a standard wired interface and a wireless interface. The memory 604 can be a high-speed RAM memory (Random Access Memory, volatile random access memory), or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 604 may optionally be at least one storage device located remotely from the aforementioned processor 601. The processor 601 can be combined with the system described in FIG. 5 , the memory 604 stores an application program, and the processor 601 calls the program code stored in the memory 604 to execute any of the above method steps.

The communication bus 602 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The communication bus 602 can be divided into an address bus, a data bus, a control bus, etc. For ease of presentation, only one thick line is used in Figure 7, but it does not mean that there is only one bus or one type of bus.

Among them, the memory 604 may include volatile memory (English: volatile memory), such as random access memory (English: random-access memory, abbreviation: RAM); the memory may also include non-volatile memory (English: non-volatile memory), such as flash memory (English: flash memory), hard disk (English: hard disk drive, abbreviation: HDD) or solid-state drive (English: solid-state drive, abbreviation: SSD); the memory 604 can also include the above types memory combination.

Among them, the processor 601 can be a central processing unit (English: central processing unit, abbreviation: CPU), a network processor (English: network processor, abbreviation: NP) or a combination of CPU and NP.

The processor 601 may further include a hardware chip. The above-mentioned hardware chip can be an application-specific integrated circuit (English: application-specific integrated circuit, abbreviation: ASIC), a programmable logic device (English: programmable logic device, abbreviation: PLD) or a combination thereof. The above-mentioned PLD can be a complex programmable logic device (English: complex programmable logic device, abbreviation: CPLD), a field-programmable logic gate array (English: field-programmable gate array, abbreviation: FPGA), a general array logic (English: generic array logic, abbreviation: GAL) or any combination thereof.

Optionally, memory 604 is also used to store program instructions. The processor 601 can call program instructions to implement the load balancing method shown in the embodiments of this application.

Embodiments of the present invention also provide a non-transitory computer storage medium. The computer storage medium stores computer-executable instructions. The computer-executable instructions can execute the load balancing method in any of the above method embodiments. Wherein, the storage medium can be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (Flash Memory), a hard disk (Hard disk). Disk Drive (abbreviation: HDD) or solid-state drive (Solid-State Drive, SSD), etc.; the storage medium may also include a combination of the above types of memories.

Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the appended rights. within the scope of the requirements.

Claims

A load balancing method, characterized in that it is applied to a central controller, and the method includes:

Obtain the gradient parameters of the local model sent by the switch, optimize the global model according to the gradient parameters, and send the global model to the switch to determine the usage rate of the switch's resources, the resources including processor, memory and network bandwidth;

Obtain the sub-pheromone concentration determined by the switch based on the usage rate of the resource and the ant colony algorithm, and determine the number of task flows on the switch based on the sub-pheromone concentration;

Obtain the real-time health coefficient of the switch, and combine the real-time health coefficient with the initial weight to obtain a real-time health matrix. The initial weight is calculated based on the number of task flows;

The real-time health matrix is compared with a preset health matrix, and the load of the switch is adjusted according to the comparison result.
The method of claim 1, wherein determining the number of task flows on the switch based on the sub-pheromone concentration includes:

Cyclically obtain the sub-pheromone concentration determined by the switch based on the usage rate of the resource and the ant colony algorithm;

Globally update the sub-pheromone concentration to obtain the pheromone concentration, and optimize the path;

The number of task flows on the switch is determined based on the optimized path.
The method according to claim 2, characterized in that the pheromone concentration is calculated using the following formula:

τ ij (t+1)=(1-ρ)τ ij (t)+Δτ ij (t)

Among them, ρ represents the degree of pheromone volatilization, Δτ ij (t) represents the total amount of pheromone released by the ant colony on the path, τ ij (t+1) represents the pheromone on the path of switch i and switch j at time t+1. Pheromone concentration.
The method according to claim 1, characterized in that obtaining the real-time health coefficient of the switch and combining the real-time health coefficient with the initial weight to obtain the real-time health matrix includes:

Detect the path health of the switch, and determine the real-time health coefficient of the switch based on the path health of the switch;

Multiply the initial weight corresponding to the switch and the real-time health coefficient to determine the real-time health matrix.
A load balancing method, characterized in that it is applied to switches, and the method includes:

Obtain data flow characteristic information, determine gradient parameters based on the data flow characteristic information, and send the gradient parameters to the central controller;

Obtain the global model sent by the central controller, optimize the local model based on the global model, and calculate the resource usage based on the local model and the data flow feature information. The global model is the central controller based on the Obtained by optimizing the gradient parameters mentioned above;

The sub-pheromone concentration is determined based on the resource usage and the ant colony algorithm, and the sub-pheromone concentration is sent to the central controller to adjust the load.
The method according to claim 5, characterized in that determining the sub-pheromone concentration based on the usage rate of the resource and the ant colony algorithm includes:

Allocate the search tasks of ants based on the ant colony algorithm, and determine the heuristic factor based on the usage rate of the resources;

Calculate the probability that the ant moves to other switches according to the heuristic factor;

When the ants move to other switches, the sub-pheromone concentration is calculated based on the new pheromones on the path and the ant circle model.
The method according to claim 6, characterized in that the probability of the ant moving to other switches is calculated using the following formula:

in,
represents the probability that ant k will visit switch j at the next moment, α represents the sensitivity of ants to pheromone, β represents the sensitivity of ant colony to pheromone,
represents the pheromone concentration on the path of switch i and switch j at time t,
η ij represents the heuristic factor. The heuristic factor is used to describe the degree of attraction of switch j to ants on switch i. It can be expressed as η ij =1/d ij , d ij represents the distance between switch j and switch i, and allowed k means that it has not been visited yet. A collection of switches.
A load balancing system, characterized by including:

A central controller, the central controller is used to execute the load balancing method according to any one of claims 1-4;

At least one switch, the switch is connected to the central controller, and the switch is used to execute the load balancing method according to any one of claims 5-7.
An electronic device, characterized by including:

A memory and a processor. The memory and the processor are communicatively connected to each other. The memory stores computer instructions. The processor executes the computer instructions to execute any one of claims 1-7. The load balancing method described.
A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions, and the computer instructions are used to cause the computer to execute the load balancing method described in any one of claims 1-7.