CN115617528A - Load balancing method, apparatus, electronic device, storage medium, and program product - Google Patents


Info

Publication number: CN115617528A
Authority: CN (China)
Prior art keywords: load, resources, resource, score, processing
Legal status: Pending
Application number: CN202211413848.8A
Other languages: Chinese (zh)
Inventors: 李晓龙 (Li Xiaolong), 李冰心 (Li Bingxin)
Current Assignee: Beijing Tendcloud Tianxia Technology Co ltd
Original Assignee: Beijing Tendcloud Tianxia Technology Co ltd
Application filed by Beijing Tendcloud Tianxia Technology Co ltd
Priority to CN202211413848.8A
Publication of CN115617528A

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 — Arrangements for program control, e.g. control units
    • G06F9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 — Multiprogramming arrangements
    • G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 — Techniques for rebalancing the load in a distributed system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present disclosure provides a computer-implemented load balancing method, apparatus, computer device, computer-readable storage medium, and computer program product. The method comprises the following steps: configuring a sliding time window for a server system comprising a plurality of load resources for processing the allocated client requests; after determining that the server system begins processing client requests, assigning the same load score to the plurality of load resources; for each load resource in the plurality of load resources, collecting statistics of the load resource on a plurality of performance indicators within a current time window; updating the load scores of the plurality of load resources based on the collected statistics; and determining a target load resource to be allocated for the client request at the current time within the current time window from the plurality of load resources based on the updated load scores.

Description

Load balancing method, apparatus, electronic device, storage medium, and program product
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a computer-implemented load balancing method, apparatus, computer device, computer-readable storage medium, and computer program product.
Background
A server system (e.g., a distributed system) with multiple load resources (e.g., processing nodes) has advantages such as easy expansion, strong parallel processing capability, and good stability. When a user request is received from a client, it can be distributed among the load resources for parallel processing according to an allocation rule.
An improper client request allocation rule may lead to a load imbalance situation such that some load resources are overloaded while other load resources are idle. In this case, the concurrent processing capacity, resource utilization, availability, flexibility, etc. of the server system would be adversely affected.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been acknowledged in any prior art, unless otherwise indicated.
Disclosure of Invention
It would be advantageous to provide a mechanism that alleviates, mitigates or even eliminates one or more of the above-mentioned problems.
According to an aspect of the present disclosure, there is provided a computer-implemented load balancing method, including: configuring a sliding time window for a server system comprising a plurality of load resources (e.g., processing nodes) for processing the allocated client requests; after determining that the server system starts processing client requests, assigning the same load score to the plurality of load resources, wherein the load score indicates a processing capacity margin of the load resources for processing client requests; for each load resource in the plurality of load resources, collecting statistics of the load resource over a plurality of performance indicators within a current time window, wherein the plurality of performance indicators are metrics that measure the burden of the load resource on processing different aspects of the client request; updating the load scores of the plurality of load resources based on the collected statistics; and determining a target load resource to be allocated for the client request at the current time within the current time window from the plurality of load resources based on the updated load scores.
According to another aspect of the present disclosure, there is provided a load balancing apparatus including: a first module configured to configure a sliding time window for a server system comprising a plurality of load resources for processing allocated client requests; a second module configured to assign a same load score to the plurality of load resources after determining that the server system starts processing client requests, wherein the load score indicates a processing capacity margin for a load resource to process client requests; a third module configured to collect statistics of a plurality of performance indicators of each load resource in the plurality of load resources within a current time window, wherein the plurality of performance indicators are metrics that measure burdens of the load resource on different aspects of processing client requests; a fourth module configured to update load scores of the plurality of load resources based on the collected statistics; and a fifth module configured to determine, from the plurality of load resources, a target load resource to be allocated for a client request at a current time within the current time window based on the updated load scores.
According to yet another aspect of the present disclosure, there is provided a computer apparatus including: at least one processor; and at least one memory having a computer program stored thereon, which, when executed by the at least one processor, causes the at least one processor to perform the above-described method.
According to yet another aspect of the present disclosure, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, causes the processor to perform the above-mentioned method.
According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, causes the processor to carry out the above-mentioned method.
According to one or more embodiments of the present disclosure, a load balancing method is provided in which statistics on multiple performance indicators of each load resource are dynamically collected within the current time window, the load score of each load resource is then updated according to the collected statistics, and finally client requests are allocated according to the updated load scores, so that a load resource with a lighter load has a greater probability of being allocated a client request, and a load resource with a heavier load has a smaller probability of being allocated a client request. Load balancing can thereby be achieved among the plurality of load resources, improving the concurrent processing capacity, resource utilization, availability, and flexibility of the server system.
These and other aspects of the disclosure will be apparent from and elucidated with reference to the embodiments described hereinafter.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the embodiments and, together with the description, serve to explain the exemplary implementations of the embodiments. The illustrated embodiments are for purposes of illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.
Further details, features and advantages of the disclosure are disclosed in the following description of exemplary embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram illustrating an example system in which various methods described herein may be implemented, according to an example embodiment;
FIG. 2 is a flow chart illustrating a method of load balancing in accordance with an illustrative embodiment;
FIG. 3 is a schematic diagram illustrating a method of configuring a sliding time window in accordance with an example embodiment;
FIG. 4 is a diagram illustrating a method of updating load scores for a plurality of load resources in accordance with an example embodiment;
FIG. 5 is a diagram illustrating a method of updating load scores for a plurality of load resources according to other example embodiments;
FIG. 6 is a flowchart illustrating a method of selecting a target load resource in accordance with an exemplary embodiment;
FIG. 7 is a diagram illustrating a method of selecting a target load resource in accordance with an example embodiment;
fig. 8 is a schematic block diagram illustrating a load balancing apparatus in accordance with some demonstrative embodiments;
FIG. 9 is a block diagram illustrating an exemplary computer device that can be applied to the exemplary embodiments.
Detailed Description
In the present disclosure, unless otherwise specified, the use of the terms "first", "second", etc. to describe various elements is not intended to limit the positional relationship, the timing relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, while in some cases they may refer to different instances based on the context of the description.
The terminology used in the description of the various described examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. As used herein, the term "plurality" means two or more, and the term "based on" should be interpreted as "based, at least in part, on". Further, the terms "and/or" and "… …" encompass any and all possible combinations of the listed items.
Exemplary embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram illustrating an example system 100 in which various methods described herein may be implemented, according to an example embodiment.
Referring to fig. 1, the system 100 includes a client device 110, a server 120, a network 130 communicatively coupling the client device 110 and the server 120, and a load balancer (not shown).
The system 100 may be a distributed processing system including a plurality of servers 120, and the plurality of servers 120 may also be a plurality of processing nodes or a plurality of load resources. The system 100 can include multiple client devices 110, and client requests issued at the multiple client devices 110 can be distributed via a load balancer to multiple load resources in the system 100 for parallel processing.
The client device 110 includes a display 114 and a client application (APP) 112 that can be displayed via the display 114. The client application 112 may be an application program that needs to be downloaded and installed before running, or an applet (lite app), i.e., a lightweight application. In the case where the client application 112 is an application program that needs to be downloaded and installed before running, the client application 112 may be pre-installed on the client device 110 and activated. In the case where the client application 112 is an applet, the user 102 can run the client application 112 directly on the client device 110, without installing it, by searching for the client application 112 in a host application (e.g., by the name of the client application 112) or by scanning a graphical code (e.g., a barcode or a two-dimensional code) of the client application 112. In some embodiments of the present disclosure, the client device 110 may be any type of mobile computer device, including a mobile computer, a mobile phone, a wearable computer device (e.g., a smart watch or a head-mounted device, including smart glasses), or other type of mobile device. In some embodiments of the present disclosure, the client device 110 may alternatively be a stationary computer device, such as a desktop computer, a server computer, or other type of stationary computer device.
The server 120 is typically a server deployed by an Internet Service Provider (ISP) or Internet Content Provider (ICP). Server 120 may represent a single server, a cluster of multiple servers, a distributed system, or a cloud server providing an underlying cloud service (such as cloud database, cloud computing, cloud storage, cloud communications). It will be understood that although the server 120 is shown in fig. 1 as communicating with only one client device 110, the server 120 may provide background services for multiple client devices simultaneously.
Examples of the network 130 include a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), and/or a combination of communication networks such as the Internet. The network 130 may be a wired or wireless network. In some embodiments of the present disclosure, data exchanged over the network 130 is processed using techniques and/or formats including Hypertext Markup Language (HTML), Extensible Markup Language (XML), and the like. In addition, all or some of the links may be encrypted using encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), Internet Protocol Security (IPsec), and so on. In some embodiments of the present disclosure, customized and/or dedicated data communication techniques may also be used in place of, or in addition to, the data communication techniques described above.
Fig. 2 is a flow chart illustrating a method 200 of load balancing according to an example embodiment. Method 200 may be performed by a load balancer in the example system 100 of fig. 1. The various steps of method 200 are described in detail below in conjunction with fig. 2.
Referring to fig. 2, at step 210, a sliding time window is configured for a server system that includes a plurality of load resources for processing allocated client requests.
Fig. 3 is a schematic diagram illustrating a method 300 of configuring a sliding time window in accordance with an example embodiment.
Referring to fig. 3, the time window 302 slides along the time axis as time passes. In some embodiments of the present disclosure, the time window 302 may be configured with a fixed duration, i.e., the length of the time window 302 may be fixed. In some embodiments of the present disclosure, the duration of the time window 302 may be less than or equal to 1 hour, so that statistics of the plurality of load resources on the plurality of performance indicators can be collected dynamically within the time window 302. An appropriate duration may be selected according to the actual application scenario; for example, a shorter duration may be used for an application scenario with higher dynamic requirements, and a longer duration for a scenario with lower dynamic requirements. Exemplary durations are 1 hour, 0.5 hour, 10 minutes, 1 minute, and so on. In some embodiments of the present disclosure, the time window 302 may be divided at a uniform time granularity 304, so that the time window 302 slides along the time axis by one unit of the time granularity 304 at a time. Referring to FIG. 3, each slide moves the time window 302 by one small grid on the time axis. An appropriate time granularity 304 may likewise be selected according to the actual application scenario, e.g., a finer time granularity 304 for application scenarios with higher dynamic requirements and a coarser time granularity 304 for scenarios with lower dynamic requirements. Exemplary time granularities 304 are one twentieth, one tenth, one fifth, etc. of the time window 302.
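The window configuration above can be sketched in a few lines. This is a hypothetical illustration (the class and parameter names are not from the disclosure): a fixed-duration window divided into equal-granularity buckets, sliding by one bucket at a time.

```python
from collections import deque

class SlidingTimeWindow:
    """Fixed-duration window split into equal-granularity buckets.

    The window advances by one granularity step per slide; the oldest
    bucket is evicted automatically. Values are illustrative, e.g. a
    10-minute window split into 20 buckets (the one-twentieth
    granularity example above).
    """

    def __init__(self, duration_s: float, num_buckets: int):
        self.duration_s = duration_s
        self.granularity_s = duration_s / num_buckets  # one "small grid"
        self.buckets = deque([{} for _ in range(num_buckets)],
                             maxlen=num_buckets)

    def slide(self) -> None:
        # Appending a fresh bucket evicts the oldest one (maxlen),
        # which moves the window forward by one granularity step.
        self.buckets.append({})
```

Statistics collected within the window would be written into the newest bucket, so that data older than the window duration drops out automatically as the window slides.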
Referring back to FIG. 2, at step 220, after determining that the server system begins processing client requests, the same load score is assigned to the plurality of load resources, wherein the load score indicates a processing capacity margin for the load resource to process client requests. In some embodiments of the present disclosure, the number of load resources in the server system is n (n greater than 1). After determining that the server system has received and started to process client requests, the same initial score may be configured for the n load resources, so that at the initial time t0 the load scores S1,t0, S2,t0, S3,t0, …, Sn,t0 of the n load resources S1 to Sn are all equal. The specific value of the load score is not limited and may be, for example, 10000, 1000, or 100.
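This initialization step amounts to a one-liner; the sketch below uses illustrative names, and 10000 is just one of the example values mentioned above.

```python
def init_load_scores(resource_ids, initial_score=10000):
    """Give every load resource the same initial load score at time t0.

    Any positive initial value works, since only relative scores
    matter for allocation later on.
    """
    return {rid: initial_score for rid in resource_ids}
```

For example, `init_load_scores(["S1", "S2", "S3"])` yields equal scores for all three resources.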
At step 230, for each load resource in the plurality of load resources, statistics of the load resource over a plurality of performance indicators are collected within a current time window, wherein the plurality of performance indicators are metrics that measure burdens of the load resource on different aspects of processing client requests.
In some embodiments of the present disclosure, the plurality of performance indicators may include any combination of: the number of client requests processed, the client request processing time, the number of load resource connections, the total network traffic, the inbound network traffic, the outbound network traffic, the number of failed client requests, and the routing time.
The following details how each performance indicator reflects the burden a load resource bears when processing client requests:
The smaller the number of client requests processed, the more limited the capacity of the load resource to process requests, and the more likely the load resource is to become a bottleneck for processing performance.
The longer the client request processing time, the heavier the burden the load resource bears in this respect when processing client requests. Note that client request processing time may refer to the processing time of a single client request.
The greater the number of load resource connections (e.g., TCP port connections), the heavier the burden the load resource bears in this respect when processing client requests.
The larger the inbound network traffic, the more data flows into the load resource, and the heavier the burden it bears in this respect when processing client requests.
The larger the outbound network traffic, the more data flows out of the load resource, and the heavier the burden it bears in this respect when processing client requests.
Total network traffic refers to the sum of the traffic flowing into and out of a load resource. The larger the total network traffic, the heavier the burden on the load resource when processing client requests.
The larger the number of failed client requests, the heavier the burden on the load resource when processing client requests.
The longer the routing time, the heavier the burden the load resource bears in this respect when processing client requests. Here, "routing time" refers to the time interval from when a client request is assigned to when it begins to be processed.
According to some embodiments of the present disclosure, based on the statistics of the multiple performance indicators, the multiple load resources can be comprehensively evaluated, and it is beneficial to subsequently and accurately select the load resource with lighter load to process a new client request.
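One possible way to hold these per-resource statistics during a window is a small record type. This is an illustrative sketch only: the field names are assumptions, and the disclosure does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field

@dataclass
class ResourceStats:
    """Statistics of one load resource on the performance indicators
    listed above, accumulated within the current time window."""
    requests_processed: int = 0
    processing_times_ms: list = field(default_factory=list)  # per-request, for max / TP95
    connections: int = 0
    bytes_in: int = 0          # inbound network traffic
    bytes_out: int = 0         # outbound network traffic
    failed_requests: int = 0
    routing_times_ms: list = field(default_factory=list)

    @property
    def bytes_total(self) -> int:
        # Total network traffic = inbound + outbound.
        return self.bytes_in + self.bytes_out
```

Keeping the raw per-request times (rather than only their sum) is what later allows maxima or percentiles such as TP95 to be computed.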
At step 240, load scores for a plurality of load resources are updated based on the collected statistics.
In some embodiments of the present disclosure, updating the load scores of the plurality of load resources based on the collected statistics may include: determining whether a trigger condition to update load scores of a plurality of load resources is satisfied; and updating the load scores of the plurality of load resources based on the collected statistics in response to determining that the trigger condition is satisfied.
According to some embodiments of the present disclosure, the load scores of a plurality of load resources are updated only when the trigger condition is satisfied, thereby avoiding too frequent load score updates, thus reducing the amount of computation of the update scores and saving computational resources.
In some embodiments of the present disclosure, the trigger condition for updating the load scores of the plurality of load resources may be that the number of client requests processed by the plurality of load resources reaches a client request processing count threshold, or that a preset proportion of the duration of the current time window has elapsed.
According to some embodiments of the present disclosure, any one of the above trigger conditions may be selected according to an actual application scenario, so that the trigger conditions may be configured from a request number dimension or a time dimension more flexibly.
Fig. 4 is a schematic diagram illustrating a method 400 of updating load scores of a plurality of load resources, according to an example embodiment. In the method 400, the trigger condition is that the number of client requests processed by the plurality of load resources reaches a client request processing count threshold. An appropriate threshold may be selected according to the actual application scenario; for example, a smaller threshold may be used for an application scenario with higher dynamic requirements, and a larger threshold for a scenario with lower dynamic requirements. Exemplary thresholds are 20, 100, 500, 1000, and so on. In some embodiments of the present disclosure, the threshold is 100. Referring to fig. 4, at time T1, the total number of client requests processed by the plurality of load resources reaches the threshold of 100; therefore, at time T1, the load scores of the plurality of load resources are updated according to the statistics on the plurality of performance indicators within the time window 402 collected up to time T1. The total number of processed client requests is then counted anew. At time T2, the total number of client requests processed by the plurality of load resources again reaches the threshold of 100, so the load scores are updated again at time T2 according to the statistics on the plurality of performance indicators within the time window 402 collected up to time T2.
According to some embodiments of the present disclosure, the method 400 can flexibly perform dynamic load score updates by setting a client request processing count threshold. Setting an appropriate threshold according to the actual application scenario avoids updating the load scores too frequently (for example, once per processed client request), which greatly reduces the amount of computation while still providing a degree of dynamism and real-time responsiveness.
Fig. 5 is a schematic diagram illustrating a method 500 of updating load scores of a plurality of load resources according to other exemplary embodiments. In the method 500, the trigger condition is that a preset proportion of the duration of the current time window has elapsed. A suitable preset proportion may be selected according to the actual application scenario; for example, a smaller preset proportion may be used for an application scenario with higher dynamic requirements, and a larger preset proportion for a scenario with lower dynamic requirements. Exemplary preset proportions are one fifth, one tenth, one twentieth, etc. of the duration of the current time window. In some embodiments of the present disclosure, the preset proportion is one fifth of the duration of the current time window. Referring to fig. 5, at time T1, one fifth of the duration of the current time window has elapsed; therefore, at time T1, the load scores of the plurality of load resources are updated based on the statistics on the plurality of performance indicators within the time window 502 collected up to time T1. The timing for the trigger condition then restarts. At time T2, another fifth of the duration of the current time window has elapsed, so at time T2 the load scores of the plurality of load resources are updated again based on the statistics on the plurality of performance indicators within the time window 502 collected up to time T2.
According to some embodiments of the present disclosure, the method 500 can flexibly perform dynamic load score updates by setting a preset proportion. Setting an appropriate preset proportion according to the actual application scenario avoids updating the load scores too frequently, greatly reducing the amount of computation while still providing a degree of dynamism and real-time responsiveness.
In some embodiments of the present disclosure, updating the load scores of the plurality of load resources based on the collected statistics may include: for each performance indicator of the plurality of performance indicators, determining, from the plurality of load resources, the load resource having the maximum or minimum statistical value on that performance indicator, the maximum or minimum statistical value indicating that, among the plurality of load resources, this load resource bears the heaviest processing burden on that indicator; and subtracting a preset score from the load score of that load resource.
In some embodiments of the present disclosure, the size of the preset score may be set according to actual needs. For example, the preset score may be one percent, five percent, one thousandth, etc. of the initial load score.
In some embodiments of the present disclosure, the load resource with the smallest number of client requests processed may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the longest client request processing time may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the largest number of load resource connections may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the largest inbound network traffic may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the largest outbound network traffic may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the largest total network traffic may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the largest number of failed client requests may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, the load resource with the longest routing time may be found among the plurality of load resources, and its load score is reduced by the preset score.
In some embodiments of the present disclosure, based on the collected client request processing times, the load resource with the largest TP95 client request processing time may be found among the plurality of load resources, and its load score is reduced by the preset score. The TP95 client request processing time is the time within which at least 95% of the client requests are completed.
In some embodiments of the present disclosure, based on the collected inbound network traffic, the load resource with the largest TP95 inbound network traffic may be found among the plurality of load resources, and its load score is reduced by the preset score. The TP95 inbound network traffic is the 95th-percentile value of the per-request inbound network traffic.
In some embodiments of the present disclosure, based on the collected outbound network traffic, the load resource with the largest TP95 outbound network traffic may be found among the plurality of load resources, and its load score is reduced by the preset score. The TP95 outbound network traffic is the 95th-percentile value of the per-request outbound network traffic.
In some embodiments of the present disclosure, after the preset score is subtracted from the load score of a load resource, it is determined whether the updated load score is less than 0; in response to determining that the updated load score is less than 0, the same score is added to the load scores of all of the plurality of load resources so that every updated load score is greater than 0. In this way, although the load scores of the plurality of load resources undergo multiple score updates (i.e., multiple decrements), they are never left negative.
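Putting the update rule together: for each indicator, penalize the resource bearing the heaviest burden, then shift all scores up if any went negative. The sketch below is a simplified illustration, not the disclosure's exact implementation: the indicator set, the flat stats layout, and the shift amount are assumptions.

```python
def update_load_scores(scores: dict, stats: dict, penalty: int) -> dict:
    """One score-update pass over a simplified set of indicators.

    The resource with the *fewest* requests processed, and the resource
    with the *largest* value on each other indicator, each lose
    `penalty` points, mirroring the max/min rules described above.
    """
    def worst(metric, use_min=False):
        pick = min if use_min else max
        return pick(stats, key=lambda rid: stats[rid][metric])

    scores[worst("requests", use_min=True)] -= penalty  # fewest requests processed
    for metric in ("processing_time", "connections", "bytes_in",
                   "bytes_out", "bytes_total", "failures", "routing_time"):
        scores[worst(metric)] -= penalty  # heaviest burden on this indicator

    # Never leave a score negative: if one dropped below 0, add the
    # same amount to every score (the exact shift is an assumption).
    lowest = min(scores.values())
    if lowest < 0:
        for rid in scores:
            scores[rid] += penalty - lowest
    return scores
```

Note that a single resource can be penalized several times in one pass if it is the worst on several indicators, which is exactly what makes a heavily loaded resource's score fall fastest.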
Referring back to fig. 2, at step 250, a target load resource to be allocated for the client request at the current time within the current time window is determined from the plurality of load resources based on the updated load scores.
In some embodiments of the present disclosure, determining, from the plurality of load resources, a target load resource to be allocated for the client request at the current time within the current time window based on the updated load scores may include: selecting a load resource from the plurality of load resources as the target load resource, wherein the probability of each load resource in the plurality of load resources being selected is positively correlated with the updated load score of that load resource.
According to the embodiment of the present disclosure, the client request at the current time has a greater probability of being allocated to the load resource with a higher load score (indicating a higher processing capability margin), while the load resource with a lower load score (indicating a lower processing capability margin) will be allocated to the client request at the current time with a smaller probability, so that load balancing can be better achieved among a plurality of load resources.
Referring to fig. 4, in some embodiments of the present disclosure, when the current time is T1, a target load resource to be allocated to a client request may be selected based on the load scores of the plurality of load resources updated at time T1; in some embodiments of the present disclosure, when the current time is T2, the target load resource to be allocated to the client request may be selected based on the updated load scores of the plurality of load resources at time T2.
Referring to fig. 5, in some embodiments of the present disclosure, when the current time is T1, a target load resource to be allocated to a client request may be selected based on the load scores of the plurality of load resources updated at time T1; in some embodiments of the present disclosure, when the current time is T2, the target load resource to be allocated to the client request may be selected based on the updated load scores of the plurality of load resources at time T2.
Fig. 6 is a flow diagram illustrating a method 600 of selecting a target load resource in accordance with an example embodiment.
Referring to fig. 6, at step 610, a value interval defined by a start point and an end point is established on a number axis. The value interval includes a plurality of consecutive sub-intervals that are in one-to-one correspondence with the plurality of load resources, and the lengths of the sub-intervals are respectively proportional to the load scores of the corresponding load resources.
Fig. 7 is a schematic diagram illustrating a method 600 of selecting a target load resource in accordance with an example embodiment. Step 610 is further described with reference to fig. 7. A value interval [a, b] is first established, a being the value at the start point of the interval and b the value at its end point. In some embodiments of the present disclosure, the number of load resources is n, so the value interval may be divided into n mutually consecutive sub-intervals, each sub-interval being left-open and right-closed (i.e., including its right end point but not its left end point). The sub-intervals correspond one-to-one to the plurality of load resources (S1 to Sn), and the length of each sub-interval is proportional to the load score of the corresponding load resource. In some embodiments of the present disclosure, the length of each sub-interval may be equal to the load score of the corresponding load resource (i.e., a 1:1 relationship between length and score); in this case, the length of the value interval is equal to the sum of the load scores of the plurality of load resources. In some embodiments of the present disclosure, the length of each sub-interval may bear another proportional relationship to the load score of the corresponding load resource.
Referring back to fig. 6, at step 620, a random number between the start point and the end point is generated. In some embodiments of the present disclosure, the random number may be generated for a client request to be allocated at a current time within a current time window. Based on the random number, the load resource to be allocated to the client request at the current time can be determined.
At step 630, the load resource corresponding to the subinterval where the random number is located is selected as the target load resource. Referring to fig. 7, the random number is in the sub-interval corresponding to the load resource S2, so the load resource S2 is selected as the target load resource to process the client request at the current time.
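Steps 610-630 can be sketched as a weighted random selection over cumulative sub-interval endpoints (a minimal illustration; the function name and the example scores are assumptions, not from the disclosure):

```python
import bisect
import random

def pick_resource(load_scores, rng=random):
    """Select a load resource with probability proportional to its load
    score, by laying the scores out as consecutive sub-intervals on a
    number axis from 0 to the sum of all scores (steps 610-630)."""
    names = list(load_scores)
    # Cumulative right endpoints: sub-interval i is (cum[i-1], cum[i]].
    cum = []
    total = 0.0
    for name in names:
        total += load_scores[name]
        cum.append(total)
    x = rng.uniform(0.0, total)          # step 620: random number in interval
    i = bisect.bisect_left(cum, x)       # step 630: sub-interval containing x
    return names[min(i, len(names) - 1)]

# Example: S2 has twice the score (processing margin) of S1, so it is
# selected about twice as often over many draws.
counts = {"S1": 0, "S2": 0}
rng = random.Random(0)
for _ in range(10_000):
    counts[pick_resource({"S1": 1.0, "S2": 2.0}, rng)] += 1
```

Using `bisect_left` on the cumulative endpoints matches the left-open, right-closed sub-intervals of fig. 7: a random number equal to a right endpoint selects that sub-interval's resource.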
According to the load balancing method provided by the embodiments of the present disclosure, statistics of each load resource on a plurality of performance indicators are dynamically collected within the current time window, the load score of each load resource is then updated according to the collected statistics, and finally the client request is allocated according to the updated load scores, so that a more lightly loaded resource has a higher probability of being allocated the client request and a more heavily loaded resource has a lower probability. Load balancing among the plurality of load resources can thereby be achieved, improving the concurrent processing capacity, resource utilization, availability, and flexibility of the server system.
While the various operations described above are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, nor that all illustrated operations be performed, to achieve desirable results. For example, step 220 may be performed prior to step 210, or concurrently with step 210.
Fig. 8 is a schematic block diagram illustrating a load balancing apparatus 800 according to an example embodiment. Referring to fig. 8, the load balancing apparatus 800 includes: a first module 810 configured to configure a sliding time window for a server system comprising a plurality of load resources for processing allocated client requests; a second module 820 configured to assign the same load score to the plurality of load resources after determining that the server system starts processing the client request, wherein the load score indicates a processing capacity margin of the load resources for processing the client request; a third module 830 configured to collect statistics of a plurality of performance indicators of each load resource in the plurality of load resources within a current time window, wherein the plurality of performance indicators are metrics that measure burdens of the load resource on different aspects of processing the client request; a fourth module 840 configured to update load scores of the plurality of load resources based on the collected statistics; and a fifth module 850 for determining a target load resource to be allocated for the client request at the current time within the current time window from the plurality of load resources based on the updated load scores.
According to the load balancing apparatus provided by the embodiments of the present disclosure, statistics of each load resource on a plurality of performance indicators are dynamically collected within the current time window, the load score of each load resource is then updated according to the collected statistics, and finally the client request is allocated according to the updated load scores, so that a more lightly loaded resource has a higher probability of being allocated the client request and a more heavily loaded resource has a lower probability. Load balancing among the plurality of load resources can thereby be achieved, improving the concurrent processing capacity, resource utilization, availability, and flexibility of the server system.
It should be understood that the various modules of the apparatus 800 shown in fig. 8 may correspond to the various steps in the method 200 described with reference to fig. 2. Thus, the operations, features, and advantages described above with respect to the method 200 are equally applicable to the apparatus 800 and the modules included therein. Certain operations, features, and advantages may not be described in detail herein for the sake of brevity.
Although specific functionality is discussed above with reference to particular modules, it should be noted that the functionality of the various modules discussed herein may be divided into multiple modules and/or at least some of the functionality of multiple modules may be combined into a single module. Performing an action by a particular module discussed herein includes the particular module itself performing the action, or alternatively the particular module invoking or otherwise accessing another component or module that performs the action (or performs the action in conjunction with the particular module). Thus, a particular module that performs an action can include the particular module that performs the action itself and/or another module that the particular module invokes or otherwise accesses that performs the action. For example, the third module 830 and the fourth module 840 described above may be combined into a single module in some embodiments. As used herein, the phrase "entity a initiates action B" may refer to entity a issuing instructions to perform action B, but entity a itself does not necessarily perform that action B.
It should also be appreciated that various techniques may be described herein in the general context of software, hardware elements, or program modules. The various modules described above with respect to fig. 8 may be implemented in hardware or in hardware in combination with software and/or firmware. For example, the modules may be implemented as computer program code/instructions configured to be executed in one or more processors and stored in a computer-readable storage medium. Alternatively, the modules may be implemented as hardware logic/circuitry. For example, in some embodiments, one or more of the first module 810, the second module 820, the third module 830, the fourth module 840, and the fifth module 850 may be implemented together in a System on Chip (SoC). The SoC may include an integrated circuit chip (which includes one or more components of a processor (e.g., a Central Processing Unit (CPU), microcontroller, microprocessor, Digital Signal Processor (DSP), etc.), memory, one or more communication interfaces, and/or other circuitry), and may optionally execute received program code and/or include embedded firmware to perform functions.
According to an aspect of the disclosure, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory. The processor is configured to execute the computer program to implement the steps of any of the method embodiments described above.
According to an aspect of the present disclosure, a non-transitory computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
According to an aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of any of the method embodiments described above.
FIG. 9 is a block diagram illustrating an exemplary computer device 900 to which the exemplary embodiments can be applied.
Illustrative examples of such computer devices, non-transitory computer-readable storage media, and computer program products are described below in connection with fig. 9.
Fig. 9 illustrates an example configuration of a computer device 900 that may be used to implement the methods described herein. For example, the server 120 and/or the client device 110 shown in fig. 1 may include an architecture similar to that of the computer device 900. The load balancing apparatus 800 described above may also be implemented, in whole or at least in part, by the computer device 900 or a similar device or system.
Computer device 900 may be a variety of different types of devices. Examples of computer device 900 include, but are not limited to: a desktop computer, a server computer, a notebook or netbook computer, a mobile device (e.g., a tablet, a cellular or other wireless telephone (e.g., a smartphone), a notepad computer, a mobile station), a wearable device (e.g., glasses, a watch), an entertainment device (e.g., an entertainment appliance, a set-top box communicatively coupled to a display device, a gaming console), a television or other display device, an automotive computer, and so forth.
The computer device 900 may include at least one processor 902, memory 904, communication interface(s) 906, display device 908, other input/output (I/O) devices 910, and one or more mass storage devices 912, which may be capable of communicating with each other, such as through a system bus 914 or other appropriate connection.
The processor 902 may be a single processing unit or multiple processing units, all of which may include single or multiple computing units or multiple cores. The processor 902 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitry, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor 902 may be configured to retrieve and execute computer-readable instructions stored in the memory 904, mass storage device 912, or other computer-readable medium, such as program code for an operating system 916, program code for an application program 918, program code for other programs 920, and so forth.
Memory 904 and mass storage device 912 are examples of computer-readable storage media for storing instructions that are executed by processor 902 to implement the various functions described above. By way of example, the memory 904 may generally include both volatile and nonvolatile memory (e.g., RAM, ROM, and the like). In addition, the mass storage device 912 may generally include a hard disk drive, solid state drive, removable media including external and removable drives, memory cards, flash memory, floppy disks, optical disks (e.g., CD, DVD), storage arrays, network attached storage, storage area networks, and the like. Memory 904 and mass storage device 912 may both be collectively referred to herein as memory or computer-readable storage medium, and may be non-transitory media capable of storing computer-readable, processor-executable program instructions as computer program code that may be executed by processor 902 as a particular machine configured to implement the operations and functions described in the examples herein.
A number of programs may be stored on the mass storage device 912. These programs include an operating system 916, one or more application programs 918, other programs 920, and program data 922, which can be loaded into memory 904 for execution. Examples of such applications or program modules may include, for instance, computer program logic (e.g., computer program code or instructions) for implementing the following components/functions: a first module 810, a second module 820, a third module 830, a fourth module 840, a fifth module 850, methods 200-600 (including any suitable steps therein), and/or further embodiments described herein.
Although illustrated in fig. 9 as being stored in memory 904 of computer device 900, modules 916, 918, 920, and 922, or portions thereof, may be implemented using any form of computer-readable media that is accessible by computer device 900. As used herein, "computer-readable media" includes at least two types of computer-readable media, namely computer-readable storage media and communication media.
Computer-readable storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer-readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computer device. In contrast, communication media may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism. Computer-readable storage media, as defined herein, does not include communication media.
One or more communication interfaces 906 are used to exchange data with other devices, such as over a network, direct connection, and so forth. Such communication interfaces may be one or more of the following: any type of network interface (e.g., a Network Interface Card (NIC)), a wired or wireless (such as IEEE 802.11 Wireless LAN (WLAN)) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and the like. Communication interface 906 may facilitate communications within a variety of networks and protocol types, including wired networks (e.g., LAN, cable, etc.) and wireless networks (e.g., WLAN, cellular, satellite, etc.), the Internet, and so forth. Communication interface 906 may also provide for communication with external storage devices (not shown), such as in a storage array, network attached storage, storage area network, or the like.
In some examples, a display device 908, such as a monitor, may be included for displaying information and images to a user. Other I/O devices 910 may be devices that receive various inputs from a user and provide various outputs to the user, and may include touch input devices, gesture input devices, cameras, keyboards, remote controls, mice, printers, audio input/output devices, and so forth.
The techniques described herein may be supported by these various configurations of computer device 900 and are not limited to specific examples of the techniques described herein. For example, the functionality may also be implemented in whole or in part on a "cloud" using a distributed system. The cloud includes and/or represents a platform for resources. The platform abstracts underlying functionality of hardware (e.g., servers) and software resources of the cloud. Resources may include applications and/or data that may be used when performing computing processes on servers remote from computer device 900. Resources may also include services provided over the internet and/or over a subscriber network such as a cellular or Wi-Fi network. The platform may abstract resources and functionality to connect the computer device 900 with other computer devices. Thus, implementations of the functionality described herein may be distributed throughout the cloud. For example, the functionality may be implemented in part on the computer device 900 and in part by a platform that abstracts the functionality of the cloud.
While the disclosure has been illustrated and described in detail in the drawings and the foregoing description, such illustration and description are to be considered illustrative and exemplary rather than restrictive; the present disclosure is not limited to the disclosed embodiments. Variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed subject matter, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps not listed, the indefinite article "a" or "an" does not exclude a plurality, the term "plurality" means two or more, and the term "based on" should be construed as "based at least in part on". The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (12)

1. A computer-implemented load balancing method, comprising:
configuring a sliding time window for a server system comprising a plurality of load resources for processing the allocated client requests;
after determining that the server system starts processing client requests, assigning the same load score to the plurality of load resources, wherein the load score indicates a processing capacity margin of the load resources for processing client requests;
for each load resource in the plurality of load resources, collecting statistics of the load resource over a plurality of performance indicators within a current time window, wherein the plurality of performance indicators are metrics that measure the burden of the load resource on processing different aspects of the client request;
updating the load scores of the plurality of load resources based on the collected statistics; and
determining, from the plurality of load resources, a target load resource to be allocated for a client request at a current time within the current time window based on the updated load scores.
2. The method of claim 1, wherein updating the load scores for the plurality of load resources based on the collected statistics comprises:
for each performance indicator of the plurality of performance indicators:
determining, from the plurality of load resources, a load resource having a maximum or minimum statistical value on the performance metric, the maximum or minimum statistical value indicating that, among the plurality of load resources, the load resource has a heaviest load processing burden on the performance metric; and
subtracting a preset score from the load score of the load resource.
3. The method of claim 2, further comprising:
determining whether the updated load score is less than 0;
in response to determining that the updated load score is less than 0, adding the same score to the load scores of the plurality of load resources such that the updated load score is greater than 0.
4. The method of any of claims 1-3, wherein determining, from the plurality of load resources, a target load resource to be allocated for a client request at a current time within the current time window based on the updated load scores comprises:
selecting a load resource from the plurality of load resources as the target load resource, wherein the probability of each load resource in the plurality of load resources being selected is positively correlated with the updated load score of that load resource.
5. The method of claim 4, wherein selecting a load resource from the plurality of load resources as the target load resource comprises:
establishing a value interval defined by a start point and an end point on a number axis, wherein the value interval comprises a plurality of consecutive sub-intervals that are in one-to-one correspondence with the plurality of load resources, and the lengths of the sub-intervals are respectively proportional to the load scores of the corresponding load resources;
generating a random number between the start point and the end point; and
selecting the load resource corresponding to the sub-interval in which the random number falls as the target load resource.
6. The method of any of claims 1-3, wherein the plurality of performance indicators comprises any combination of: a number of processed client requests, client request processing time, a number of load resource connections, total network traffic, inbound network traffic, outbound network traffic, a number of failed client requests, and routing time.
7. The method of any of claims 1-3, wherein updating the load scores for the plurality of load resources based on the collected statistics comprises:
determining whether a trigger condition to update the load scores of the plurality of load resources is satisfied; and
updating the load scores of the plurality of load resources based on the collected statistics in response to determining that the trigger condition is satisfied.
8. The method of claim 7, wherein the trigger condition comprises:
the number of client requests processed by the plurality of load resources reaching a threshold number of client requests; or
a preset proportion of the duration of the time window having elapsed.
9. A load balancing apparatus comprising:
a first module configured to configure a sliding time window for a server system including a plurality of load resources for processing the allocated client requests;
a second module configured to assign a same load score to the plurality of load resources after determining that the server system starts processing client requests, wherein the load score indicates a processing capacity margin for a load resource to process client requests;
a third module configured to collect statistics of a plurality of performance indicators of each load resource in the plurality of load resources within a current time window, wherein the plurality of performance indicators are metrics that measure burdens of the load resource on different aspects of processing client requests;
a fourth module configured to update load scores of the plurality of load resources based on the collected statistics; and
a fifth module configured to determine, from the plurality of load resources, a target load resource to be allocated for a client request at a current time within the current time window based on the updated load scores.
10. A computer device, comprising:
at least one processor; and
at least one memory having a computer program stored thereon,
wherein the computer program, when executed by the at least one processor, causes the at least one processor to perform the method of any one of claims 1-8.
11. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, causes the processor to carry out the method of any one of claims 1-8.
12. A computer program product comprising a computer program which, when executed by a processor, causes the processor to carry out the method of any one of claims 1 to 8.
CN202211413848.8A 2022-11-11 2022-11-11 Load balancing method, apparatus, electronic device, storage medium, and program product Pending CN115617528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211413848.8A CN115617528A (en) 2022-11-11 2022-11-11 Load balancing method, apparatus, electronic device, storage medium, and program product


Publications (1)

Publication Number Publication Date
CN115617528A true CN115617528A (en) 2023-01-17

Family

ID=84877886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211413848.8A Pending CN115617528A (en) 2022-11-11 2022-11-11 Load balancing method, apparatus, electronic device, storage medium, and program product

Country Status (1)

Country Link
CN (1) CN115617528A (en)

Similar Documents

Publication Publication Date Title
US11934868B2 (en) Systems and methods for scheduling tasks
US10389800B2 (en) Minimizing execution time of a compute workload based on adaptive complexity estimation
US10225145B2 (en) Method and device for updating client
US20200364608A1 (en) Communicating in a federated learning environment
CN109491801B (en) Micro-service access scheduling method, micro-service access scheduling device, medium and electronic equipment
US20170185454A1 (en) Method and Electronic Device for Determining Resource Consumption of Task
CN110958281B (en) Data transmission method and communication device based on Internet of things
WO2014194869A1 (en) Request processing method, device and system
CN112260961B (en) Network traffic scheduling method and device, electronic equipment and storage medium
JP2019523501A (en) Risk identification method, risk identification device, cloud risk identification device and system
CN109428926B (en) Method and device for scheduling task nodes
US9769022B2 (en) Timeout value adaptation
CN108112268B (en) Managing load balancers associated with auto-extension groups
CN110113176B (en) Information synchronization method and device for configuration server
JP6189545B2 (en) Network application parallel scheduling to reduce power consumption
CN111597041B (en) Calling method and device of distributed system, terminal equipment and server
JP2013182509A (en) Virtualization system, load distribution device, load distribution method and load distribution program
US20170155711A1 (en) Processing Requests
CN115617528A (en) Load balancing method, apparatus, electronic device, storage medium, and program product
EP3539278A1 (en) Method and system for affinity load balancing
EP3659033B1 (en) Connector leasing for long-running software operations
CN114035861A (en) Cluster configuration method and device, electronic equipment and computer readable medium
CN113055199A (en) Gateway access method and device and gateway equipment
US9674282B2 (en) Synchronizing SLM statuses of a plurality of appliances in a cluster
GB2528949A (en) Providing dynamic latency in an integration flow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination