CN112565399B

CN112565399B - Adaptive traffic load balancing method for online learning

Info

Publication number: CN112565399B
Application number: CN202011394360.6A
Authority: CN
Inventors: 张兴; 徐世界
Original assignee: Tianyi Electronic Commerce Co Ltd
Current assignee: Tianyi Electronic Commerce Co Ltd
Priority date: 2020-12-02
Filing date: 2020-12-02
Publication date: 2022-12-09
Anticipated expiration: 2040-12-02
Also published as: CN112565399A

Abstract

The invention discloses an online-learning adaptive traffic load balancing method comprising two stages: model training and online prediction. The model produced by training is used for online prediction: the online prediction service predicts the label (that is, the strength of the service capability) of each back-end application instance from feature data collected online, and the load balancer uses the prediction result as its load balancing weight, thereby achieving online-learning adaptive load balancing. By combining multi-dimensional metrics that reflect the real state of the back-end instances with online learning, the invention achieves adaptive traffic load balancing, and traffic can be distributed to healthy application nodes more quickly and accurately.

Description

Adaptive traffic load balancing method for online learning

Technical Field

The invention relates to the field of emerging information technology, and in particular to an online-learning adaptive traffic load balancing method.

Background

Load balancing means distributing load (work tasks) evenly across multiple operating units, such as FTP servers, Web servers, enterprise core application servers, and other primary task servers, so that they complete the work cooperatively, as shown in FIG. 1. Common load balancing algorithms include RandomLoadBalance (random), RoundRobinLoadBalance (weighted round robin), LeastActiveLoadBalance (least active calls), and ConsistentHashLoadBalance (consistent hashing). The conventional approach is to pre-configure one designated algorithm as the load balancing strategy, but such a strategy cannot be adjusted dynamically according to the real state of the back-end application instances. As shown in FIG. 2, when a resource-shortage alarm occurs, conventional load balancing cannot make a corresponding decision in response to the change.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides an online learning adaptive traffic load balancing method.

In order to solve the technical problems, the invention provides the following technical scheme:

The invention discloses an online-learning adaptive traffic load balancing method comprising two stages: model training and online prediction. The model produced by training is used for online prediction: the online prediction service predicts the label of each back-end application instance from feature data collected online, and the load balancer uses the prediction result as its load balancing weight, thereby achieving online-learning adaptive load balancing. The method is as follows:

The specific steps of model training are as follows:

S1. Collect model training features.

By adjusting the number of concurrent requests initiated by the client and the resource state of the back-end application instance server (for example, adjusting resources such as CPU and memory), real-world conditions are simulated, and the following 4 dimensions of data are obtained as model training features:

1) Resource utilization data of the back-end application instance within time window T, including CPU, memory, and disk;

2) The distribution of request return status codes within T;

3) The average request response time within T;

4) The number of health checks passed by the back-end application instance within T;
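The four feature dimensions listed above can be flattened into one numeric row per instance and per window. The following is a minimal sketch only: the `WindowSample` fields, the grouping of status codes into classes (2xx to 5xx), and the use of a pass rate for health checks are all assumptions made for illustration, not details given in the source.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WindowSample:
    """Hypothetical raw observations for one back-end instance over one window T."""
    cpu_util: float             # mean CPU utilization, 0..1
    mem_util: float             # mean memory utilization, 0..1
    disk_util: float            # mean disk utilization, 0..1
    status_codes: List[int]     # one HTTP status code per request
    response_ms: List[float]    # one response time (ms) per request
    health_checks_passed: int
    health_checks_total: int


def status_class_distribution(codes: List[int]) -> Dict[str, float]:
    """Share of requests per status class (assumed grouping: 2xx, 3xx, 4xx, 5xx)."""
    classes = ["2xx", "3xx", "4xx", "5xx"]
    counts = Counter(f"{code // 100}xx" for code in codes)
    total = len(codes) or 1  # avoid division by zero on an empty window
    return {c: counts.get(c, 0) / total for c in classes}


def build_feature_row(s: WindowSample) -> List[float]:
    """Flatten the four feature dimensions into one numeric row."""
    dist = status_class_distribution(s.status_codes)
    avg_ms = sum(s.response_ms) / len(s.response_ms) if s.response_ms else 0.0
    health_rate = (s.health_checks_passed / s.health_checks_total
                   if s.health_checks_total else 0.0)
    return [s.cpu_util, s.mem_util, s.disk_util,
            dist["2xx"], dist["3xx"], dist["4xx"], dist["5xx"],
            avg_ms, health_rate]
```

One such row per instance and window, paired with the label defined in S2, would form one training sample.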

S2. Label definition:

The execution result of a single request falls into three cases:

1) Whether the request executed successfully;

2) The request returned successfully, but the response timed out;

3) The request returned successfully and the response time was normal;

In summary, three values are computed for a single instance: the request abnormal rate (E) within T, the timeout rate (L) within T, and the average response time of normal requests (A, in ms) within T. The label is defined as label = w₁*E + w₂*L + w₃*A, where w₁, w₂, and w₃ are weights;
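As a worked illustration of the label formula, the sketch below evaluates label = w₁*E + w₂*L + w₃*A for two instances. The default weight values are hypothetical; the source does not give values for w₁, w₂, or w₃. Because A is in milliseconds while E and L are rates in [0, 1], the assumed w₃ is small so that no single term dominates.

```python
def compute_label(error_rate: float, timeout_rate: float, avg_normal_ms: float,
                  w1: float = 1.0, w2: float = 0.5, w3: float = 0.001) -> float:
    """label = w1*E + w2*L + w3*A; a lower label means a healthier instance.

    The default weights are illustrative assumptions, not values from the source.
    """
    return w1 * error_rate + w2 * timeout_rate + w3 * avg_normal_ms


# A healthy instance: 1% abnormal, 2% timeouts, 80 ms average normal response.
healthy = compute_label(0.01, 0.02, 80.0)     # 0.01 + 0.01 + 0.08
# A degraded instance: 20% abnormal, 30% timeouts, 400 ms average normal response.
degraded = compute_label(0.20, 0.30, 400.0)   # 0.20 + 0.15 + 0.40
```

Under these assumed weights the healthy instance receives the smaller label, which matches the rule in S5 that a lower abnormal rate, lower timeout rate, and shorter response time indicate a better service state.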

S3. Train the original model offline: using the model training features and sample labels established in S1 and S2, train a regression model with LightGBM. LightGBM (Light Gradient Boosting Machine) is a framework that implements the GBDT algorithm; it trains weak classifiers iteratively to obtain an optimal model;
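To make the idea of iterative training with weak learners concrete, here is a deliberately tiny gradient boosting regressor built from one-split decision stumps in pure Python. It is not LightGBM and omits everything that makes LightGBM efficient (histogram binning, leaf-wise growth, regularization); it only sketches the GBDT principle that each round fits a weak learner to the current residuals. The toy feature rows, labels, round count, and learning rate are all assumptions.

```python
from typing import List, Tuple

# (feature index, threshold, value if row[j] <= threshold, value otherwise)
Stump = Tuple[int, float, float, float]


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Find the single split that minimizes squared error on the residuals."""
    mean_all = sum(residuals) / len(residuals)
    # Fallback: no split, every row takes the left branch.
    best: Stump = (0, float("inf"), mean_all, mean_all)
    best_err = sum((r - mean_all) ** 2 for r in residuals)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if err < best_err:
                best_err, best = err, (j, threshold, lm, rm)
    return best


def stump_value(stump: Stump, row: List[float]) -> float:
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_boosted_stumps(X, y, n_rounds: int = 100, learning_rate: float = 0.3):
    """Each round fits a stump to the residuals (negative gradient of squared loss)."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_value(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row: List[float]) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, row) for s in stumps)


# Toy training set (assumed): rows are [cpu_util, avg_response_ms], targets are labels.
X_train = [[0.2, 50.0], [0.3, 60.0], [0.8, 300.0], [0.9, 400.0]]
y_train = [0.10, 0.12, 0.60, 0.80]
model = fit_boosted_stumps(X_train, y_train)
```

In practice the same features and labels would be passed to LightGBM's own regression interface rather than to a hand-written booster like this one.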

The specific steps of online prediction are as follows:

S4. The online model service obtains the model features;

S5. Using these features, predict the strength of the service processing capability over the next time window T. The server state is evaluated comprehensively from the request abnormal rate, the timeout rate, and the average response time of normal requests within T: the lower the abnormal rate, the lower the timeout rate, and the shorter the average response time, the better the service state and the higher the weight that is set;

S6. The online prediction service pushes the result to the load balancer, and the load balancer sets the weights and distributes traffic according to the pushed result.
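The source states only that a better predicted state leads to a higher weight; it does not specify how a predicted label is converted into a weight. One possible mapping, shown purely as an assumption, is to make each weight inversely proportional to the predicted label and scale the result to an integer range. The function name, `max_weight`, and `eps` are hypothetical.

```python
from typing import Dict


def labels_to_weights(predicted: Dict[str, float], max_weight: int = 100,
                      eps: float = 1e-6) -> Dict[str, int]:
    """Map predicted labels (lower is better) to integer weights (higher gets more traffic).

    Inverse-proportional scaling is an assumed mapping, not one given in the source.
    """
    if not predicted:
        return {}
    # Clamp negative predictions to 0 and add eps so a label of 0 does not divide by zero.
    inverse = {name: 1.0 / (max(label, 0.0) + eps)
               for name, label in predicted.items()}
    top = max(inverse.values())
    # The best instance gets max_weight; every instance keeps a weight of at least 1.
    return {name: max(1, round(max_weight * inv / top))
            for name, inv in inverse.items()}


weights = labels_to_weights({"instance-a": 0.1, "instance-b": 0.2, "instance-c": 0.8})
```

Keeping a minimum weight of 1 is a design choice in this sketch: a degraded instance still receives a small share of traffic, so later windows contain fresh observations from which its recovery can be detected.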

Compared with the prior art, the invention has the following beneficial effects:

By combining multi-dimensional metrics that reflect the real state of the back-end instances with online learning, the invention achieves adaptive traffic load balancing, and traffic can be distributed to healthy application nodes more quickly and accurately.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is a schematic diagram of a load balancing principle;

FIG. 2 is a diagram of a common load balancing algorithm;

FIG. 3 is an overall block diagram of the present invention;

FIG. 4 is a diagram of steps for collecting model training features according to the present invention;

FIG. 5 is a schematic diagram of the online prediction principle of the present invention.

Detailed Description

The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.

Example 1

As shown in FIG. 3, the present invention provides an online-learning adaptive traffic load balancing method comprising two stages: model training and online prediction. The model produced by training is used for online prediction: the online prediction service predicts the label (that is, the strength of the service capability) of each back-end application instance from feature data collected online, and the load balancer uses the prediction result as its load balancing weight, thereby achieving online-learning adaptive load balancing. The details are as follows:

The specific steps of model training are as follows:

S1. Collect model training features, as shown in FIG. 4.

By adjusting the number of concurrent requests initiated by the client and the resource state of the back-end application instance server (for example, adjusting resources such as CPU and memory), real-world conditions are simulated, and the following 4 dimensions of data are obtained as model training features:

1) Resource utilization data of the back-end application instance within time window T, including CPU, memory, disk, and the like;

2) The distribution of request return status codes within T;

3) The average request response time within T;

S2. Label definition:

The execution result of a single request falls into three cases:

1) Whether the request executed successfully (status code 200);

2) The request returned successfully, but the response timed out (the timeout threshold is configurable; the default is Q3 + 1.5 × (Q3 − Q1), computed over all request response times);

Note: Q3 is the 75th percentile and Q1 is the 25th percentile;

3) The request returned successfully and the response time was normal;

In summary, three values are computed for a single instance: the request abnormal rate (E) within T, the timeout rate (L) within T, and the average response time of normal requests (A, in ms) within T. The label is defined as label = w₁*E + w₂*L + w₃*A, where w₁, w₂, and w₃ are weights;
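The default timeout threshold and the three per-window statistics can be sketched as follows. The quartiles come from Python's standard `statistics.quantiles`; other quantile conventions would give slightly different thresholds. Treating any non-200 status as abnormal and using the total request count as the denominator for both E and L are assumptions, since the source does not state the denominators.

```python
import statistics
from typing import List, Tuple


def timeout_threshold(response_ms: List[float]) -> float:
    """Default threshold Q3 + 1.5 * (Q3 - Q1) over all response times.

    Needs at least two data points for statistics.quantiles to work.
    """
    q1, _, q3 = statistics.quantiles(response_ms, n=4)
    return q3 + 1.5 * (q3 - q1)


def window_stats(requests: List[Tuple[int, float]]) -> Tuple[float, float, float]:
    """Return (E, L, A) for one window of (status_code, response_ms) records."""
    total = len(requests)
    threshold = timeout_threshold([ms for _, ms in requests])
    ok_times = [ms for code, ms in requests if code == 200]
    normal = [ms for ms in ok_times if ms <= threshold]
    error_rate = (total - len(ok_times)) / total           # E: abnormal requests
    timeout_rate = (len(ok_times) - len(normal)) / total   # L: successful but slow
    avg_normal_ms = sum(normal) / len(normal) if normal else 0.0  # A
    return error_rate, timeout_rate, avg_normal_ms


# Toy window (assumed data): eight fast successes, one slow success, one server error.
window = [(200, 10.0), (200, 12.0), (200, 11.0), (200, 13.0), (200, 12.0),
          (200, 11.0), (200, 10.0), (200, 12.0), (200, 500.0), (500, 15.0)]
E, L, A = window_stats(window)
```

In this toy window the 500 ms response exceeds the threshold and counts toward L, the status-500 request counts toward E, and A is averaged over the remaining eight requests.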

S3. Train the original model offline: using the model training features and sample labels established in S1 and S2, train a regression model with LightGBM. LightGBM (Light Gradient Boosting Machine) is a framework that implements the GBDT algorithm; it trains weak classifiers (decision trees) iteratively to obtain an optimal model, and has advantages such as good training results and resistance to overfitting;

The specific steps of online prediction, shown in FIG. 5, are as follows:

S4. The online model service obtains the model features (the feature data for T is the service state data collected online within T, including CPU, memory, and disk utilization, the distribution of request return status codes, the average request response time, the number of health checks passed by the back-end application, and so on);

S5. Using these features, predict the strength of the service processing capability (that is, the label) over the next time window T. The server state is evaluated comprehensively from the request abnormal rate, the timeout rate, and the average response time of normal requests within T. The lower the abnormal rate, the lower the timeout rate, and the shorter the average response time, the better the service state and the higher the weight that is set;

S6. The online prediction service pushes the result to the load balancer, and the load balancer sets the weights and distributes traffic according to the pushed result.
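Collecting feature data "within T time" online implies keeping only the most recent T seconds of observations. A minimal sketch of such a rolling buffer is shown below; the class name, the use of seconds as the unit, and the choice of the mean as the aggregate are assumptions for illustration.

```python
from collections import deque
from typing import Deque, Tuple


class SlidingWindow:
    """Keeps (timestamp, value) samples from the most recent window of T seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._samples: Deque[Tuple[float, float]] = deque()

    def add(self, timestamp: float, value: float) -> None:
        """Record a sample; timestamps are assumed to arrive in increasing order."""
        self._samples.append((timestamp, value))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop samples that are at least window_seconds old.
        while self._samples and self._samples[0][0] <= now - self.window_seconds:
            self._samples.popleft()

    def mean(self, now: float) -> float:
        """Mean of the samples still inside the window at time `now`."""
        self._evict(now)
        if not self._samples:
            return 0.0
        return sum(value for _, value in self._samples) / len(self._samples)


# Usage: track response times (ms) over a 60-second window.
response_window = SlidingWindow(window_seconds=60)
response_window.add(0.0, 100.0)
response_window.add(30.0, 200.0)
response_window.add(70.0, 300.0)   # the sample from t=0 is now outside the window
```

One such buffer per metric and per instance would supply the values that the online model service assembles into a feature row.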
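Once weights have been pushed, the load balancer needs a weighted selection rule, which the source leaves open. The sketch below uses smooth weighted round robin, one well-known option, and lets the weights be replaced whenever the prediction service pushes a new result. The class and method names are hypothetical.

```python
from typing import Dict


class SmoothWeightedRoundRobin:
    """Weighted selection that interleaves picks instead of sending bursts to one node."""

    def __init__(self, weights: Dict[str, int]) -> None:
        self.weights = dict(weights)
        self.current = {name: 0 for name in weights}

    def update_weights(self, weights: Dict[str, int]) -> None:
        """Apply a new push from the prediction service, keeping state for known nodes."""
        self.weights = dict(weights)
        self.current = {name: self.current.get(name, 0) for name in weights}

    def pick(self) -> str:
        if not self.weights:
            raise ValueError("no back-end instances configured")
        total = sum(self.weights.values())
        # Raise every node's running score by its weight, pick the highest,
        # then charge the chosen node the total weight.
        for name, weight in self.weights.items():
            self.current[name] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= total
        return chosen


balancer = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
first_cycle = [balancer.pick() for _ in range(7)]

# A later push from the prediction service changes the distribution.
balancer.update_weights({"a": 1, "b": 1})
second_cycle = [balancer.pick() for _ in range(4)]
```

Over one full cycle each instance is picked in proportion to its weight, so updating the weights at the end of each window T shifts traffic toward the instances predicted to be healthier.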

Specifically, by combining multi-dimensional metrics that reflect the real state of the back-end instances with online learning, the method achieves adaptive traffic load balancing and can distribute traffic to healthy application nodes more quickly and accurately.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. An online-learning adaptive traffic load balancing method, characterized by comprising two stages: model training and online prediction, wherein the model produced by training is used for online prediction, the online prediction service predicts the label of each back-end application instance from feature data collected online, and the load balancer uses the prediction result as its load balancing weight, thereby achieving online-learning adaptive load balancing, the method being as follows:

The specific steps of model training are as follows:

S1. Collect model training features.

By adjusting the number of concurrent requests initiated by the client and the resource state of the back-end application instance server, real-world conditions are simulated, and the following 4 dimensions of data are obtained as model training features:

1) Resource utilization data of the back-end application instance within time window T, including CPU, memory, and disk;

2) The distribution of request return status codes within T;

3) The average request response time within T;

S2. Label definition:

The execution result of a single request falls into three cases:

1) Whether the request executed successfully;

2) The request returned successfully, but the response timed out;

3) The request returned successfully and the response time was normal;

In summary, three values are computed for a single back-end application instance: the request abnormal rate (E) within T, the timeout rate (L) within T, and the average response time of normal requests (A, in ms) within T. The label is defined as label = w₁*E + w₂*L + w₃*A, where w₁, w₂, and w₃ are weights;

S3. Train the original model offline: using the model training features and sample labels established in S1 and S2, train a regression model with LightGBM, wherein LightGBM (Light Gradient Boosting Machine) is a framework that implements the GBDT algorithm and trains weak classifiers iteratively to obtain an optimal model;

The specific steps of online prediction are as follows:

S4. The online model service obtains the model features;

S5. Using these features, predict the strength of the service processing capability over the next time window T, and evaluate the server state comprehensively from the request abnormal rate, the timeout rate, and the average response time of normal requests within T, wherein the lower the abnormal rate, the lower the timeout rate, and the shorter the average response time, the better the service state and the higher the weight that is set;