CN112035254A - Load balancing method and device

Info

Publication number: CN112035254A
Application number: CN202010883903.4A
Authority: CN
Inventors: 曹福祥; 王玉龙
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2020-08-28
Filing date: 2020-08-28
Publication date: 2020-12-04

Abstract

The present disclosure relates to a load balancing method and device, the method comprising: receiving a service calling request; determining the overload condition of each instance according to the request information of each instance in a target time window in a service instance set of the micro-service, and determining a target instance in the service instance set according to the overload condition, wherein the request information comprises the number of received service requests and the number of successfully processed service requests; and sending the service calling request to the target instance. Therefore, the current overload condition of each instance can be determined through the number of the service requests received by each instance in the service instance set in the target time window and the number of the service requests successfully processed, so that the target instance which is not overloaded in the service instance set can be selected as the object of the service request to be sent, and the load balancing effect is further improved.

Description

Load balancing method and device

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a load balancing method and apparatus.

Background

In a microservice architecture, a microservice is composed of multiple instances (also called processes). When a remote procedure call (RPC), that is, a request, is initiated to a microservice, one of the instances is selected by some mechanism and the request is sent to it; this selection is generally called load balancing. Load balancing generally serves two purposes: first, it keeps the request traffic received by each instance relatively balanced, which safeguards overall performance; second, when an instance fails, it keeps requests from failing, which safeguards overall availability.

In the related art, load balancing is realized by recording the number of requests currently being processed on each instance, also called the concurrency number, and preferentially selecting the instance with the smallest current concurrency number. However, this approach does not consider the actual processing capacity of each instance. For example, an instance with low concurrency may have low actual processing capacity and may therefore already be overloaded, while an instance with high concurrency may still have spare service capacity.

Therefore, the balancing effect of the existing load balancing mode is poor.

Disclosure of Invention

The present disclosure provides a load balancing method and device to at least solve the problem of poor balancing effect in the related art. The technical scheme of the disclosure is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided a load balancing method, including:

receiving a service calling request;

determining the overload condition of each instance according to the request information of each instance in a target time window in a service instance set of the micro-service, and determining a target instance in the service instance set according to the overload condition, wherein the request information comprises the number of received service requests and the number of successfully processed service requests;

and sending the service calling request to the target instance.

According to a second aspect of the embodiments of the present disclosure, there is provided a load balancing apparatus, including:

a receiving module configured to perform receiving a service invocation request;

a determining module configured to determine the overload condition of each instance according to the request information of each instance in a service instance set of the micro-service within a target time window, and to determine a target instance in the service instance set according to the overload condition, wherein the request information comprises the number of received service requests and the number of successfully processed service requests;

a sending module configured to execute sending the service invocation request to the target instance.

According to a third aspect of the embodiments of the present disclosure, there is provided a load balancing apparatus, including:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the load balancing method of the first aspect.

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer program product comprising executable instructions that, when run on a computer, enable the computer to perform the load balancing method of the first aspect.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

receiving a service calling request; determining the overload condition of each instance according to the request information of each instance in a target time window in a service instance set of the micro-service, and determining a target instance in the service instance set according to the overload condition, wherein the request information comprises the number of received service requests and the number of successfully processed service requests; and sending the service calling request to the target instance. Therefore, the current overload condition of each instance can be determined through the number of the service requests received by each instance in the service instance set in the target time window and the number of the service requests successfully processed, so that the target instance which is not overloaded in the service instance set can be selected as the object of the service call request to be sent, and the load balancing effect is further improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a flow diagram illustrating a method of load balancing according to an example embodiment.

Fig. 2 is a block diagram illustrating a load balancing apparatus according to an example embodiment.

Fig. 3 is a block diagram illustrating another load balancing apparatus in accordance with an example embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Fig. 1 is a flow chart illustrating a method of load balancing according to an exemplary embodiment, as shown in fig. 1, including the steps of:

in step S11, a service invocation request is received;

in step S12, an overload condition of each instance is determined according to request information of each instance in a service instance set of the microservice within a target time window, and a target instance is determined in the service instance set according to the overload condition, wherein the request information includes the number of received service requests and the number of successfully processed service requests.

In step S13, the service invocation request is sent to the target instance.

In the micro-service architecture, one micro-service may be composed of a plurality of instances, and the set of these instances is the service instance set of the micro-service. In practical application, a client can initiate an RPC call to a server, that is, send a service call request. When the server receives the service call request, it can select a target instance from the service instance set according to the actual processing conditions of the instances in the current service instance set and send the service call request to the target instance, which then processes the received service call request.

Receiving the service invocation request may mean receiving a service invocation request sent by a client.

In the embodiment of the present disclosure, to ensure a load balancing effect, after a service invocation request is received, a target instance may be selected from the service instance set of the micro-service as the object that receives and processes the service invocation request, where the target instance may be one with a better service request processing capability in the service instance set, for example one that is not overloaded.

Specifically, the overload condition of each instance may be obtained from the request information of each instance in the service instance set within the target time window. The overload condition reveals the service processing condition and actual processing capability of each instance, so an instance with a better overload condition, for example one that is not overloaded, may be determined as the target instance. Here, the target time window may refer to the most recent time period, such as the latest 1 second, and the request information of an instance may refer to the number of service requests received and the number of service requests successfully processed by the instance. From the request information within the target time window, it is known how many service requests each instance recently received and how many of them it processed successfully, which indicates the current overload condition and actual processing capacity of each instance; an instance that is not yet overloaded and has stronger actual processing capacity can then be selected as the target instance. For example, if one instance received 5 service requests in the last 1 second and successfully processed 4 of them, the instance has a strong service processing capability and is not currently overloaded; if another instance received 6 service requests in the last 1 second and successfully processed 3 of them, it processed only half of its requests successfully, so its service processing capability is weak and it is close to an overloaded state. The former instance, with its stronger service processing capability, can therefore be selected as the target instance.

Upon determining the target instance, the current service invocation request may be sent to the target instance to be processed by the target instance. Therefore, each time the instance is selected, the instance is distributed based on the current actual processing capacity of each instance in the service instance set, so that the service call requests distributed on each instance can be ensured to be balanced, and a better load balancing effect can be ensured.
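Steps S11 to S13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names `InstanceStats`, `overload_degree`, and `dispatch` are hypothetical, the transport is represented by a caller-supplied `send` callable, the overload measure is the quotient form given elsewhere in this description, and the handling of zero counts is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class InstanceStats:
    """Request information of one instance within the target time window."""
    received: int = 0    # number of service requests received
    succeeded: int = 0   # number of service requests successfully processed


def overload_degree(stats: InstanceStats) -> float:
    """(received - succeeded) / succeeded; the zero cases are assumptions."""
    if stats.received == 0:
        return 0.0            # assumption: an idle instance is not overloaded
    if stats.succeeded == 0:
        return float("inf")   # assumption: nothing succeeded, treat as overloaded
    return (stats.received - stats.succeeded) / stats.succeeded


def dispatch(request: str,
             window_stats: Dict[str, InstanceStats],
             send: Callable[[str, str], bool]) -> str:
    """S11: `request` has been received. S12: pick a target. S13: send it."""
    # S12: determine the overload condition of each instance, pick the least overloaded.
    target = min(window_stats, key=lambda name: overload_degree(window_stats[name]))
    # S13: send the request and update the request information of the target.
    window_stats[target].received += 1
    if send(target, request):
        window_stats[target].succeeded += 1
    return target
```

With the two instances from the example above (5 received and 4 succeeded, versus 6 received and 3 succeeded), `dispatch` selects the first instance.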

Optionally, the target time window is the time window, among a plurality of time windows divided in advance, that matches the receiving time of the service invocation request.

That is, a time window size may be predefined, for example 1 second, and the historical time is divided into a plurality of time windows according to this size. The target time window is the one, among the pre-divided time windows, that matches the receiving time of the service invocation request, that is, the time window with the smallest difference from the receiving time. For example, if the service invocation request is received at 10:00 and the pre-divided time windows include 9:57-9:58, 9:58-9:59, and 9:59-10:00, the matching time window is 9:59-10:00. Thus, the request information of each instance within the latest time window (e.g., the last 1 second) may be recorded, and each time an instance is selected, the determination may be made based on the recorded request information of each instance within the latest time window.
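Matching a request to a pre-divided time window amounts to bucketing its receiving time by the window size. The sketch below is illustrative only: `WINDOW_SECONDS`, `target_window`, and `WindowRecorder` are hypothetical names, timestamps are assumed to be seconds since an arbitrary epoch, and "matching" is interpreted as the most recent completed window, which is an assumption. Old windows are not pruned here, for brevity.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

WINDOW_SECONDS = 60  # hypothetical window size; the text also mentions 1 second


def target_window(receive_time: float, size: float = WINDOW_SECONDS) -> Tuple[float, float]:
    """Return (start, end) of the most recent completed window at receive_time."""
    end = (receive_time // size) * size
    return end - size, end


class WindowRecorder:
    """Request information per window: window start -> instance -> [received, succeeded]."""

    def __init__(self, size: float = WINDOW_SECONDS) -> None:
        self.size = size
        self.counts: DefaultDict[float, Dict[str, List[int]]] = defaultdict(dict)

    def record(self, instance: str, when: float, success: bool) -> None:
        """Count one request sent to `instance` at time `when`."""
        start = (when // self.size) * self.size   # window that contains `when`
        entry = self.counts[start].setdefault(instance, [0, 0])
        entry[0] += 1
        if success:
            entry[1] += 1

    def request_info(self, instance: str, receive_time: float) -> Tuple[int, int]:
        """Return (received, succeeded) of `instance` in the target time window."""
        start, _ = target_window(receive_time, self.size)
        received, succeeded = self.counts.get(start, {}).get(instance, [0, 0])
        return received, succeeded
```

With 1-minute windows, a request received at 10:00 is matched to the window 9:59-10:00, as in the example above.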

Optionally, the step S12 includes:

determining the overload degree of each instance according to the request information of each instance in a target time window in a service instance set of the micro-service, and determining a target instance in the service instance set according to the overload degree, wherein the overload degree is equal to the quotient of a difference value divided by the number of successfully processed service requests, and the difference value is the difference value obtained by subtracting the number of successfully processed service requests from the number of received service requests.

In one embodiment, to determine the target instance more quickly and directly, the overload degree of each instance in the service instance set may be determined from the request information of each instance within the target time window, and the target instance may then be determined according to the overload degree of each instance. The overload degree depends on the number of successfully processed service requests and the number of remaining (not successfully processed) service requests: the more requests are successfully processed and the fewer remain, the smaller the overload degree. A formula can therefore be designed to calculate the overload degree of each instance quickly from the number of service requests it received and the number it successfully processed, for example:

overload degree = (number of received service requests - number of successfully processed service requests) / (number of successfully processed service requests)

That is, the overload degree is equal to the quotient of the difference, obtained by subtracting the number of successfully processed service requests from the number of received service requests, divided by the number of successfully processed service requests. Thus, once the number of service requests received and the number successfully processed by each instance within the latest time window are known, the overload degree of each instance can be calculated quickly with the above formula. A low overload degree indicates that the instance has stronger actual processing capacity, whereas a high overload degree indicates that the number of service requests handled by the instance far exceeds its processing capacity, i.e., it is overloaded. An instance with a low overload degree in the service instance set can therefore be selected as the target instance; specifically, the instance with the lowest overload degree, or an instance with a relatively low overload degree and fewer received service requests, can be selected as the target instance.
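The formula above can be checked with a short function. This is a sketch only: the function name `overload_degree` is hypothetical, and the treatment of the two zero cases (no requests received, or none successfully processed) is an assumption, since the text does not say how to handle a zero divisor.

```python
def overload_degree(received: int, succeeded: int) -> float:
    """Overload degree = (received - succeeded) / succeeded."""
    if received == 0:
        return 0.0           # assumption: no traffic means no overload
    if succeeded == 0:
        return float("inf")  # assumption: avoids division by zero
    return (received - succeeded) / succeeded


# The two instances from the earlier example in this description:
print(overload_degree(5, 4))  # 0.25
print(overload_degree(6, 3))  # 1.0
```

The instance with 5 received and 4 successful requests has the lower overload degree and would be preferred.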

In this way, by determining the target instance according to the overload degree of each instance in the service instance set, it can be ensured that no service request is sent to an instance that is already overloaded as long as the total service capacity exceeds the request traffic, so that no instance remains continuously overloaded. Moreover, the overload degree of an instance is measured dynamically and does not need to be configured in advance, which means that if the service capabilities of the instances differ (for example, because the hardware configurations of the servers differ), different load balancing parameters do not need to be configured for different instances.

Optionally, the overload degree of the target instance is the smallest among the overload degrees of all instances of the service instance set.

That is, after determining the overload degree of each instance in the service instance set, the instance with the smallest overload degree in the service instance set may be selected as a target instance, that is, the instance with stronger remaining service capability is selected, so that by selecting the instance with the lowest overload degree each time as a processing object of the current service request, it may be ensured that the instances in the service instance set are not overloaded as much as possible, and the load of each instance is balanced.

Optionally, the step of determining a target instance in the service instance set according to the overload degree includes:

determining an instance with the least number of service requests received in the target time window in the at least two instances as a target instance if at least two instances with the same and smallest overload degrees exist in the service instance set; or

In the case where there are at least two instances in the set of service instances that have the same and minimum overload degree, one instance is randomly selected from the at least two instances as a target instance.

In the case of selecting the target instance according to the overload degree, there may exist a plurality of instances with the same overload degree, and particularly, in the case of existing a plurality of instances with the same and the smallest overload degree, it may be difficult to determine the final target instance. Specifically, in one manner, the target instance may be determined according to the number of service requests received within the target time window by at least two instances with the same and minimum overload degrees, for example, the instance with the minimum number of service requests received within the target time window among the at least two instances is determined as the target instance, and in another manner, one instance may be randomly selected from the at least two instances with the same and minimum overload degrees as the target instance. Of course, the target instance may also be determined according to other indexes of at least two instances with the same overload degree and the smallest overload degree, for example, an instance with a higher number of requests successfully processed in the target time window of the at least two instances is taken as the target instance, and so on.

In this way, in the case that there are at least two instances in the set of service instances with the same and minimum overload degree, by taking the instance with the minimum number of service requests received within the target time window of the at least two instances as the target instance, the selection of the overloaded instance can be avoided as much as possible, while randomly selecting one instance from the at least two instances as the target instance, which can ensure a quicker and random selection of the target instance.
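The two tie-breaking options can be sketched as follows. The names `RequestInfo`, `overload_degree`, and `pick_target`, the `tie_break` parameter, and the handling of zero counts are all hypothetical choices made for illustration; the service instance set is assumed to be non-empty.

```python
import random
from typing import Dict, Optional, Tuple

# instance name -> (received, succeeded) within the target time window
RequestInfo = Dict[str, Tuple[int, int]]


def overload_degree(received: int, succeeded: int) -> float:
    """(received - succeeded) / succeeded; the zero cases are assumptions."""
    if received == 0:
        return 0.0
    if succeeded == 0:
        return float("inf")
    return (received - succeeded) / succeeded


def pick_target(info: RequestInfo,
                tie_break: str = "fewest_received",
                rng: Optional[random.Random] = None) -> str:
    """Pick the instance with the smallest overload degree, breaking ties."""
    degrees = {name: overload_degree(r, s) for name, (r, s) in info.items()}
    lowest = min(degrees.values())
    tied = [name for name, degree in degrees.items() if degree == lowest]
    if len(tied) == 1:
        return tied[0]
    if tie_break == "fewest_received":
        # Option 1: among the tied instances, fewest requests received in the window.
        return min(tied, key=lambda name: info[name][0])
    # Option 2: random choice among the tied instances.
    return (rng or random).choice(tied)
```

For instance, with `{"a": (8, 4), "b": (4, 2), "c": (6, 2)}`, instances `a` and `b` tie at an overload degree of 1.0, and the first option selects `b` because it received fewer requests.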

Optionally, the step of determining a target instance in the service instance set according to the overload degree includes:

calculating the score of each instance by combining the overload degree and a target parameter of each instance in the service instance set, and determining the instance with the highest score as a target instance, wherein the target parameter comprises at least one of concurrency and response time, the concurrency of the instance is the number of service requests currently being processed on the instance, and the response time of the instance is the time from the beginning of processing the service requests to the end of processing the service requests.

When determining a target instance in the service instance set according to the overload degree, the overload degree may be combined with a target parameter of each instance, where the target parameter may be a concurrency number, a response time, and the like. The concurrency number of an instance is the number of service requests currently being processed on the instance: a smaller concurrency number indicates that the instance is currently idle, and a larger one indicates that it is currently busy. The response time of an instance is the time from the start to the end of processing a service request: a shorter response time indicates that the instance processes service requests faster, and a longer one indicates that it processes them more slowly. Specifically, an instance with better indexes such as overload degree, concurrency number, and response time may be selected as the target instance. More specifically, a weighting algorithm may be used to calculate a score for each instance from its overload degree and target parameters, and the instance with the highest score is selected as the target instance, where an instance with a low overload degree, a small concurrency number, or a short response time receives a high score.

Therefore, the target instance is determined by combining the overload degree and the target parameters, such as concurrency and response time, of each instance in the service instance set, the actual processing capacity of each instance can be comprehensively considered in the selected instance, and the selection mechanism can be further ensured to obtain a better load balancing effect.
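One possible weighting algorithm is sketched below. The text does not specify the scoring formula, so everything here is an assumption made for illustration: the names `InstanceMetrics`, `score`, and `pick_by_score`, the weight values, the requirement that weights be positive, and the choice of `1 / (1 + weighted cost)` as the score.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class InstanceMetrics:
    overload: float        # overload degree in the target time window
    concurrency: int       # service requests currently being processed
    response_time: float   # seconds from start to end of processing


def score(m: InstanceMetrics,
          w_overload: float = 1.0,
          w_concurrency: float = 0.05,
          w_response: float = 0.5) -> float:
    """Hypothetical weighted score; lower metrics give a higher score."""
    cost = (w_overload * m.overload
            + w_concurrency * m.concurrency
            + w_response * m.response_time)
    return 1.0 / (1.0 + cost)


def pick_by_score(metrics: Dict[str, InstanceMetrics]) -> str:
    """Return the name of the instance with the highest score."""
    return max(metrics, key=lambda name: score(metrics[name]))
```

With these weights, two instances that share the same overload degree are separated by their concurrency and response time, which is the case the combined score is meant to handle.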

Optionally, the method further includes:

under the condition that the overload degree of each instance in the service instance set is greater than a preset threshold value, giving up sending the current service calling request; or

And under the condition that the overload degree of each instance in the service instance set is greater than a preset threshold value, after a preset time interval, determining a target instance and sending a service calling request to the target instance based on the overload degree of each instance in the service instance set.

In this embodiment, after determining the overload degree of each instance in the service instance set, it may be further determined whether the overload degree of each instance exceeds a preset threshold, where the preset threshold may be determined according to an overload standard of the instance, and if the overload degree of a certain instance is greater than the preset threshold, it indicates that the instance is currently overloaded, and the actual processing capability is weak. Therefore, under the condition that the overload degree of each instance in the service instance set is judged to be larger than the preset threshold value, each instance has overload of different degrees, so that the current service invoking request can be selected to be abandoned, or the preset time duration can be waited to reserve certain processing time for each instance, the overload degree of each instance is calculated after the preset time duration, the target instance is determined according to the overload degree, and the service invoking request is sent to the target instance.

Therefore, when the overload degree of each instance in the service instance set is greater than the preset threshold value, the current service call request is selected to be abandoned, or after a preset time interval, a target instance is determined and the service call request is sent to the target instance based on the overload degree of each instance in the service instance set, so that the overload degree of each instance can be avoided to be aggravated as much as possible, and the overload state of each instance in the service instance set can be relieved to a certain extent.
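The two options for the all-overloaded case can be sketched as follows. The function name `choose_or_back_off` and its parameters are hypothetical; `get_degrees` stands for any caller-supplied function that returns the current overload degree of each instance, and the instance set is assumed to be non-empty.

```python
import time
from typing import Callable, Dict, Optional


def choose_or_back_off(get_degrees: Callable[[], Dict[str, float]],
                       threshold: float,
                       retry_after: Optional[float] = None,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Return a target instance, or None when the request is abandoned."""
    degrees = get_degrees()
    if all(degree > threshold for degree in degrees.values()):
        if retry_after is None:
            return None               # option 1: give up the current request
        sleep(retry_after)            # option 2: wait for the preset interval
        degrees = get_degrees()       # then re-evaluate the overload degrees
    return min(degrees, key=degrees.get)
```

Passing `retry_after=None` corresponds to abandoning the request; passing a number of seconds corresponds to waiting for the preset interval and then selecting a target from the refreshed overload degrees.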

The scheme can be applied to a micro-service calling scene, namely when a client sends an RPC call to a micro-service, namely a service calling request, a target instance can be selected from the service instance set according to the number of service requests received by each instance in the target time window and the number of service requests successfully processed in the service instance set of the current micro-service, and the current service calling request is sent to the target instance to process the current service calling request through the target instance.

The load balancing method provided by the embodiment of the disclosure receives a service call request; determining the overload condition of each instance according to the request information of each instance in a target time window in a service instance set of the micro-service, and determining a target instance in the service instance set according to the overload condition, wherein the request information comprises the number of received service requests and the number of successfully processed service requests; and sending the service calling request to the target instance. Therefore, the current overload condition of each instance can be determined through the number of the service requests received by each instance in the service instance set in the target time window and the number of the service requests successfully processed, so that the target instance which is not overloaded in the service instance set can be selected as the object of the service call request to be sent, and the load balancing effect is further improved.

Fig. 2 is a block diagram illustrating a load balancing apparatus according to an example embodiment. Referring to fig. 2, the apparatus includes a receiving module 201, a determining module 202, and a sending module 203.

The receiving module 201 is configured to perform receiving a service invocation request;

the determining module 202 is configured to determine an overload condition of each instance according to request information of each instance in a service instance set of the micro-service within a target time window, and to determine a target instance in the service instance set according to the overload condition, wherein the request information includes the number of received service requests and the number of successfully processed service requests;

the sending module 203 is configured to perform sending the service invocation request to the target instance.

Optionally, the determining module 202 is configured to determine an overload degree of each instance according to request information of each instance in a service instance set of the micro-service within a target time window, and to determine a target instance in the service instance set according to the overload degree, where the overload degree is equal to a quotient of a difference value divided by the number of successfully processed service requests, and the difference value is obtained by subtracting the number of successfully processed service requests from the number of received service requests.

Optionally, the determining module 202 is configured to determine, as the target instance, an instance with the smallest number of service requests received within the target time window in a case that there are at least two instances with the same and smallest overload degrees in the service instance set; or

The determining module 202 is configured to perform, in case that there are at least two instances in the set of service instances with the same and minimum overload, randomly selecting one instance from the at least two instances as a target instance.

Optionally, the determining module 202 is configured to perform calculating a score of each instance in the service instance set in combination with the overload degree and a target parameter of each instance, and determine an instance with the highest score as a target instance, where the target parameter includes at least one of a concurrency number and a response time, the concurrency number of the instance is the number of service requests currently being processed on the instance, and the response time of the instance is the time from the beginning of processing the service requests to the end of processing the service requests by the instance.

Optionally, the target time window is a time window matched with the receiving time of the service invocation request in a plurality of time windows divided in advance.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 3 is a block diagram illustrating a load balancing apparatus 300 according to an exemplary embodiment, and referring to fig. 3, the load balancing apparatus 300 includes: a processor 301, a memory 302, and a bus interface 303.

A processor 301, configured to read the program in the memory 302, and execute the following processes:

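receiving a service calling request;

determining the overload condition of each instance according to the request information of each instance in a service instance set of the micro-service within a target time window, and determining a target instance in the service instance set according to the overload condition, wherein the request information comprises the number of received service requests and the number of successfully processed service requests;

and sending the service calling request to the target instance.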

In fig. 3, the bus architecture may include any number of interconnected buses and bridges, with one or more processors represented by processor 301 and various circuits of memory represented by memory 302 being linked together. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. Bus interface 303 provides an interface.

The processor 301 is responsible for managing the bus architecture and general processing, and the memory 302 may store data used by the processor 301 in performing operations.

Optionally, the processor 301 is further configured to:

The load balancing apparatus 300 can implement each process implemented by the load balancing apparatus in the foregoing embodiments, and is not described here again to avoid repetition.

In an exemplary embodiment, a storage medium comprising instructions, such as a memory 302 comprising instructions, executable by a processor 301 of a load balancing apparatus 300 to perform the above method is also provided. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method of load balancing, comprising:

receiving a service calling request;

and sending the service calling request to the target instance.

2. The method of claim 1, wherein the step of determining an overload condition for each instance based on the request information for each instance within a target time window in the service instance set of microservices, and determining a target instance in the service instance set based on the overload condition comprises:

3. The method of claim 2, wherein the overload level of the target instance is the smallest among the overload levels of all instances of the set of service instances.

4. The method of claim 2, wherein the step of determining the target instance in the service instance set according to the overload level comprises:

5. The method of claim 2, wherein the step of determining the target instance in the service instance set according to the overload level comprises:

6. The method of claim 1, wherein the target time window is a time window matching the time of receipt of the service invocation request in a plurality of pre-divided time windows.

7. A load balancing apparatus, comprising:

8. The load balancing apparatus of claim 7, wherein the determining module is configured to determine an overload level for each instance according to the request information for each instance in a set of service instances of a microservice within a target time window, and to determine a target instance in the set of service instances according to the overload level, wherein the overload level is equal to a quotient of a difference value divided by the number of successfully processed service requests, the difference value being the number of received service requests minus the number of successfully processed service requests.

9. A load balancing apparatus, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the load balancing method of any one of claims 1 to 6.

10. A storage medium having instructions which, when executed by a processor of a load balancing apparatus, enable the load balancing apparatus to perform a load balancing method as claimed in any one of claims 1 to 6.