CN114205361B - Load balancing method and server - Google Patents

Publication number
CN114205361B
Authority
CN
China
Prior art keywords
service node, service, weight, node, request
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111496270.2A
Other languages
Chinese (zh)
Other versions
CN114205361A
Inventor
王清华
郭伟
矫恒浩
Current Assignee
Juhaokan Technology Co Ltd
Original Assignee
Juhaokan Technology Co Ltd
Application filed by Juhaokan Technology Co Ltd
Priority to CN202111496270.2A
Publication of CN114205361A
Application granted
Publication of CN114205361B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/01: Protocols
    • H04L 67/10: Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004: Server selection for load balancing
    • H04L 67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload

Abstract

The application discloses a load balancing method and a server, addressing the service instability caused when the volume of requests distributed to a service node exceeds its load capacity. The load balancing method comprises: receiving a service request; determining the weight corresponding to each service node, where the service nodes include a first service node and a second service node; the first service node is a node whose local cache amount is below a first threshold, and the second service node is a node whose local cache amount is above a second threshold; the first threshold is not greater than the second threshold; the weight corresponding to the first service node is a first weight and the weight corresponding to the second service node is a second weight, with the first weight smaller than the second weight; and distributing the service request to the service nodes according to these weights, so that each service node acquires data according to the requests distributed to it and stores the data in its local cache. The probability that the first service node obtains a service request is therefore lower than the probability that the second service node does.

Description

Load balancing method and server
Technical Field
The present application relates to the field of information communications, and in particular, to a load balancing method and a server.
Background
Load balancing is a technique for distributing load. Its main purpose is to distribute a large number of externally submitted requests across service nodes according to some algorithm. Compared with a single service node completing all requests, having multiple service nodes work simultaneously shortens response time and improves throughput.
Existing load balancing schemes support distributing requests to nodes according to weights, using a fixed weight assignment. Under fixed weights, a service node can easily be assigned a request volume beyond its load capacity, resulting in unstable service.
Disclosure of Invention
The application provides a load balancing method and a server that solve the problem of unstable service caused by a service node being assigned a request volume beyond its load capacity.
In a first aspect, the present application provides a load balancing method comprising: receiving a service request; determining the weight corresponding to each service node, where the service nodes include a first service node and a second service node; the first service node is a node whose local cache amount is below a first threshold, and the second service node is a node whose local cache amount is above a second threshold; the first threshold is not greater than the second threshold; the weight corresponding to the first service node is the first weight and the weight corresponding to the second service node is the second weight, with the first weight smaller than the second weight; and distributing the service request to the service nodes according to the weights, so that each service node acquires data according to the requests distributed to it and stores the data in its local cache, the probability of the first service node obtaining a service request being lower than that of the second service node.
In this way, by monitoring the amount of data cached locally at each service node, the weight of a service node is determined from its cache amount: a service node with a small cache receives a smaller weight than a service node with a large cache. This prevents a service node that has just started from receiving an overload of requests.
In some embodiments, the service request distributed to the first service node is a first service request. After distributing service requests to the service nodes, the method further comprises: receiving a monitoring index returned by the first service node, where the monitoring index reflects the cache state of the first service node's local cache after it completes the first service request; and, when the monitoring index indicates that the cache amount of the first service node's local cache has increased, increasing the first weight, so that the next time service requests are distributed, requests are distributed to the first service node according to the increased weight.
In some embodiments, receiving the monitoring index returned by the first service node includes: the first service node sends a query request generated from the first service request to its local cache and/or a third-party system, the query request being used to query the data corresponding to the first service request; the first service node receives first data returned by the third-party system, the first data being the data obtained by the third-party system in response to the query request; the first service node stores the first data in its local cache and updates the monitoring index as the data is added, the monitoring index including the cache amount; and the first service node uploads the monitoring index to the load balancing system.
In some embodiments, determining the weights corresponding to the service nodes includes computing the weight W as W = W0 × F, where W0 is the initial weight and F is an influence factor taking a value between 0 and 1. If the monitoring index M is smaller than the first threshold M1, the influence factor F takes the initial influence value F0; if M is greater than or equal to the first threshold M1 and less than or equal to the second threshold M2, the influence factor is F = F0 + (M − M1) / (M2 − M1) × (1 − F0); if M is greater than the second threshold M2, the influence factor F is 1.
In some embodiments, the first weight W1 is given by: W1 = W0 × F0.
In some embodiments, the first service node comprises a service node recovered from a failure or a newly created node.
In some embodiments, the method further comprises: receiving a monitoring index returned by the second service node, where the monitoring index reflects the cache state of the second service node's local cache after it completes the second service request; and updating the second weight according to the monitoring index, so that the next time service requests are distributed, requests are distributed to the second service node according to the updated second weight.
In some embodiments, when the monitoring index is negatively correlated with the influence factor, the final influence factor Ff is given by: Ff = 1 − F + F0.
In some embodiments, when the monitoring index is negatively correlated with the influence factor, the final influence factor Ff may instead be given by: Ff = 1 − (M2 − M) + F0.
In a second aspect, the present application provides a server configured to receive a service request. The server comprises a load balancing module, at least one service node, and local storage, the service nodes including a first service node. The load balancing module is configured to: receive a service request; determine the weights corresponding to the service nodes; and distribute service requests to the service nodes according to those weights, the probability of the first service node obtaining a service request being lower than that of the second service node, and the service request distributed to the first service node being a first service request. Each service node is configured to acquire data according to the service requests distributed to it and store the data in its local cache. The first service node is a node whose local cache amount is below a first threshold; its weight is the first weight, and it is configured to query data from its local cache or a third-party system according to the first service request. The second service node is a node whose local cache amount is above a second threshold; its weight is the second weight, and it is configured to query data from its local cache according to the second service request. The first threshold is not greater than the second threshold, and the second weight is greater than the first weight.
According to this embodiment, the load balancing module in the device provided by the application monitors the service state, determines the monitoring index from that state, determines the influence factor from the monitoring index, and updates the weight accordingly.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 illustrates a scenario diagram of a load balancing method, according to some embodiments;
fig. 2 illustrates an interactive schematic diagram of a load balancing method according to some embodiments.
Detailed Description
To make the objects and embodiments of the present application clearer, exemplary embodiments of the application are described in detail below with reference to the accompanying drawings. It will be apparent that the described exemplary embodiments are only some, and not all, of the embodiments of the present application.
It should be noted that the brief description of terminology in the present application is only intended to facilitate understanding of the embodiments described below, not to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.
The terms "first," "second," "third," and the like in the description, the claims, and the above figures are used to distinguish similar objects or entities and do not necessarily describe a particular order or sequence, unless otherwise indicated. It is to be understood that terms so used are interchangeable under appropriate circumstances.
The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements explicitly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
Load balancing distributes service requests across multiple operating units, such as Web servers, FTP servers, enterprise critical application servers, and other critical task servers, so that they jointly complete the work. As business grows, access volume and data traffic increase rapidly, and with them the processing load and computational intensity at the network core, until a single server device cannot cope at all. Discarding existing equipment in favor of a large-scale hardware upgrade wastes existing resources, and the next increase in traffic forces another costly upgrade; even equipment with excellent performance may not meet the demands of current traffic growth. Load balancing is built on top of the existing network structure and provides an inexpensive, effective, and transparent way to extend the bandwidth of network devices and servers, increase throughput, enhance network data processing capability, and improve the flexibility and availability of the network. Load balancing has two aspects: first, a large amount of concurrent access or data traffic is shared across multiple node devices, reducing the time users wait for responses; second, a single heavy workload is shared across multiple node devices for parallel processing, and after each node finishes, the results are aggregated and returned to the user, greatly improving the processing capacity of the system.
The load balancing method disclosed by the application is a software load balancing solution: load balancing is realized by installing additional software on the operating systems of one or more servers, with weights set for the specific environment. Local software load balancing does not require purchasing expensive high-performance servers; using only existing equipment resources, it effectively avoids the loss of data traffic caused by a server single point of failure, and is generally used to address excessive data traffic and network load. It also offers multiple balancing strategies to distribute data traffic reasonably and evenly to each server. If the servers need to be upgraded or expanded, there is no need to change the existing network structure or stop the current service; a load balancing module simply needs to be added to the service group.
In video applications, a single server typically cannot handle all services, so system load balancing must be considered. In practical application scenarios there are two typical cases. The first is an application with a large number of access cameras and many devices, where the devices also need to upload video for cloud storage without going through clients, as in the mobile vehicle-mounted and video surveillance fields. The second is an application with fewer devices but a large number of clients, common in industries such as live streaming and education. In either case, once the volume is large, a multi-video-server architecture must be considered and load balancing implemented. The present application realizes the balancing strategy through software load balancing.
Fig. 1 illustrates a scenario diagram of a load balancing method according to some embodiments. As shown in fig. 1, the present application is applicable to any device that provides service interfaces externally. The device comprises a load balancing module, at least one service node, and local storage. The device receives external service requests, and to improve service performance, multiple service nodes are generally deployed. In one implementation, the service nodes may include a first service node and a second service node; in another, a first, a second, and a third service node. The number of service nodes can be set according to the device's hardware resources and operating environment. The load balancing module distributes each service request to a service node according to the weights in the load balancing scheme, so as to spread the load pressure. In a specific embodiment, the load balancing module distributes service requests to the first and second service nodes, where a request distributed to the first service node is denoted a first service request and a request distributed to the second service node is denoted a second service request. Each service node has a corresponding local cache. In one implementation, after receiving a service request, a service node first queries its local cache; if the local cache holds no data corresponding to the request, the node queries a database or a downstream service.
In one scenario, the locally stored cache of a service node is built up as it accepts more service requests. When a service node has just started, it has not yet received any service request and has not yet built up a local cache; the local cache may be empty, so when the node is assigned a service request, the local cache holds no data corresponding to the request and the node must fetch the data from a database or third-party system. While waiting for that data, the node has worse response time and throughput than a node that can serve data from local storage. Therefore, when distributing service requests, a just-started service node should be treated differently from a node that has already carried many service requests: the just-started node should be given a little less, and the node that has already carried many requests a little more. This avoids the problems caused by a just-started node receiving an overload of requests: slow responses, piled-up requests, exhausted node memory, unstable service, request timeouts, and degraded overall service performance.
To solve the above problems, the application provides a load balancing method that monitors the locally stored cache amount of each service node and determines the node's weight from that cache amount. In this way, a service node with a small cache receives a smaller weight than a service node with a large cache, which prevents a just-started service node from receiving an overload of requests.
As shown in fig. 2, in some embodiments, the load balancing method provided by the present application includes: the load balancing module receives a service request; the load balancing module determines the weight corresponding to each service node, where the service nodes include a first service node and a second service node; the first service node is a node whose local cache amount is below a first threshold, and the second service node is a node whose local cache amount is above a second threshold; the first threshold is not greater than the second threshold; the weight corresponding to the first service node is the first weight and the weight corresponding to the second service node is the second weight, with the first weight smaller than the second weight so that the probability of the first service node obtaining a service request is lower than that of the second service node; and the load balancing module distributes service requests to the service nodes according to the weights, so that each service node acquires data according to the requests distributed to it and stores the data in its local cache. The first threshold can be understood as the level below which the service node is considered just started; the second threshold as the level above which the service node is considered fully warmed up and able to provide service normally.
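As a rough illustration (not the patent's own code), the two-threshold node classification described above can be sketched as follows; the function name is hypothetical and the default thresholds are the illustrative values used in the examples below:

```python
def classify_node(cache_amount, first_threshold=1000, second_threshold=10000):
    """Classify a service node by the amount of data in its local cache.

    A node below the first threshold is treated as just started ("first"),
    a node above the second threshold as fully warmed up ("second"), and a
    node in between as still warming, so its weight ramps up gradually.
    """
    if cache_amount < first_threshold:
        return "first"
    if cache_amount > second_threshold:
        return "second"
    return "warming"
```

For example, a newly started node with an empty cache classifies as "first", while a node with 11000 cached entries classifies as "second".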
In some embodiments, distributing service requests to the service nodes according to the weights includes distributing the requests according to the weight corresponding to each service node, such that the probability of the first service node obtaining a service request is lower than that of the second service node.
Illustratively, the first threshold is 1000 and the second threshold is 10000. The first service node is a newly started node, so its cache amount is 0, that is, its local cache is empty; since the cache amount is smaller than the first threshold 1000, the first weight corresponding to the first service node is 0.1. The second service node has a cache amount of 11000, which is greater than the second threshold 10000, so its second weight is 0.9. Note that in the load balancing scheme the initial weight of each service node may be assigned according to the hardware environment or other conditions; for example, the initial weight may be determined by load testing with the service node's local cache empty. Each weight value corresponds one-to-one with a service node. The weight is only a reference value, and different systems may define it differently: decimals greater than 0 and less than 1 may be used, as in the embodiment above, or values greater than 1 may be used, for example setting the initial weight to 10 or to 90; there are some differences in use. When decimals greater than 0 and less than 1 are used, the weights of all nodes may be made to sum to 1, in which case the probability of each node obtaining a request is the ratio of its weight value to 1 (i.e., its weight value).
In some embodiments, when the weights are defined as integers, distributing service requests to the service nodes according to the weights further comprises: dividing the service requests into first service requests and second service requests according to the ratio of the first weight to the sum of the weights of all service nodes and the ratio of the second weight to that sum, sending the first service requests to the first service node, and sending the second service requests to the second service node. It will be appreciated that when there are n service nodes, the ratio of each node's weight (denoted Wn) to the sum of the weights of the n service nodes (denoted ΣW) serves as the direct basis for distributing service requests.
Illustratively, the first threshold is 1000 and the second threshold is 10000. The first service node is a newly started node whose cache amount is 0, that is, its local cache is empty; since the cache amount is smaller than the first threshold 1000, the first weight corresponding to the first service node is 10. The second service node has a cache amount of 11000 and a corresponding second weight of 90. The first service node therefore obtains a share of 0.1 of the service requests, i.e., 10/(10+90), and the second service node obtains a share of 0.9, i.e., 90/(10+90).
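A minimal sketch of weight-proportional distribution, assuming a simple cumulative-sum draw (the function name is hypothetical; the patent does not prescribe a particular sampling method, only that each node's probability is its weight over the sum of weights):

```python
import random

def pick_node(weighted_nodes):
    """Pick a node with probability proportional to its weight.

    weighted_nodes: list of (node_name, weight) pairs. With weights 10 and
    90, the first node receives about 10/(10+90) = 0.1 of the requests.
    """
    total = sum(weight for _, weight in weighted_nodes)
    draw = random.uniform(0.0, total)
    cumulative = 0.0
    for name, weight in weighted_nodes:
        cumulative += weight
        if draw <= cumulative:
            return name
    return weighted_nodes[-1][0]  # guard against floating-point rounding
```

Over many draws with `[("first", 10), ("second", 90)]`, the first node's observed share converges to roughly 0.1, matching the example above.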
Further, by monitoring the locally stored cache amount of a just-started service node each time it completes a service request, the node's weight is gradually increased. In this way, a service node with a smaller cache has a smaller weight than a service node with a larger cache, and its weight grows as its local cache grows. This prevents a just-started service node from receiving an overload of requests.
In some embodiments, the service request distributed to the first service node is a first service request. After distributing service requests to the service nodes, the method further comprises: receiving a monitoring index returned by the first service node, where the monitoring index reflects the cache state of the first service node's local cache after it completes the first service request; and, when the monitoring index indicates that the cache amount of the first service node's local cache has increased, increasing the first weight, so that the next time service requests are distributed, requests are distributed to the first service node according to the increased weight.
Illustratively, the first threshold is 1000 and the second threshold is 10000. The first service node is a newly started node whose cache amount is 0, that is, its local cache is empty; since the cache amount is smaller than the first threshold 1000, the first weight corresponding to the first service node is 0.1, and the second service node, with a cache amount of 11000 (greater than the second threshold 10000), has a second weight of 0.9. When service request a is distributed, the first weight of the just-started first service node is 0.1. After the first service node requests data from the database or third-party system according to the first service request, it stores the data in its local cache; the cache amount of the first service node's local cache increases from 0 to 300, so the monitoring index is 300. Based on this monitoring index of 300, the load balancing module increases the weight of the first service node, raising the first weight from 0.1 to 0.2. When service request b is distributed, the first weight corresponding to the first service node is 0.2.
In some embodiments, when the weights are defined as integers, the first threshold is 1000. When service request a is distributed, the first weight corresponding to the just-started first service node is 10. After the first service node requests data from the database or third-party system according to the first service request, it stores the data in its local cache; the cache amount increases from 0 to 300, so the monitoring index is 300. Based on this monitoring index of 300, the load balancing module increases the weight of the first service node, raising the first weight from 10 to 20. When service request b is distributed, the first weight corresponding to the first service node is 20. If the second weight corresponding to the second service node is still 90, the first service node's share of service request b is 0.182, i.e., 20/(20+90), and the second service node's share is 0.818, i.e., 90/(20+90). If the second weight corresponding to the second service node is 100, the first service node's share is 0.167, i.e., 20/(20+100), and the second service node's share is 0.833, i.e., 100/(20+100).
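The request shares in the integer-weight example can be checked with a small helper (the function name is hypothetical):

```python
def request_shares(weights):
    """Each node's share of requests: its weight over the sum of all weights,
    rounded to three decimals as in the worked examples."""
    total = sum(weights)
    return [round(weight / total, 3) for weight in weights]
```

For instance, `request_shares([20, 90])` gives `[0.182, 0.818]` and `request_shares([20, 100])` gives `[0.167, 0.833]`, matching the shares computed above.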
In some embodiments, receiving the monitoring index returned by the first service node includes: the first service node sends a query request generated from the first service request to its local cache and/or a third-party system, the query request being used to query the data corresponding to the first service request; the first service node receives first data returned by the third-party system, the first data being the data obtained by the third-party system in response to the query request; the first service node stores the first data in its local cache and updates the monitoring index as the data is added, the monitoring index including the cache amount; and the first service node uploads the monitoring index to the load balancing system.
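The query-and-report flow described above resembles a cache-aside pattern; a minimal sketch under assumed interfaces (the dictionary cache, the `third_party_lookup` callable, and the `report_metric` callback are hypothetical stand-ins for the node's local cache, the third-party system, and the upload to the load balancing system):

```python
def handle_service_request(key, local_cache, third_party_lookup, report_metric):
    """Serve one request: try the local cache, fall back to the third-party
    system, store the fetched data locally, then report the monitoring index
    (here simply the number of cached entries) to the load balancer."""
    data = local_cache.get(key)
    if data is None:
        data = third_party_lookup(key)   # query database / downstream service
        local_cache[key] = data          # warm the local cache with the result
    report_metric(len(local_cache))      # monitoring index includes cache amount
    return data
```

On a cache miss the third-party system is queried and the cache grows; on a later hit for the same key, the data is served locally and the reported cache amount stays the same.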
Further, the present application provides a method for determining the weights of the service nodes in which an influence factor is introduced on top of gradually increasing the weights according to the locally stored cache amount. The influence factor changes as the monitoring index changes, so the adjusted weight better matches the service node's working state. Thus, when a service node's cache amount is no longer below the first threshold but still at or below the second threshold, the gradual increase further avoids a not-yet-warmed-up service node bearing an overload of requests.
In some embodiments, determining the weights corresponding to the service nodes includes computing the weight W as W = W0 × F, where W0 is the initial weight and F is an influence factor taking a value between 0 and 1. If the monitoring index M is smaller than the first threshold M1, the influence factor F takes the initial influence value F0; if M is greater than or equal to the first threshold M1 and less than or equal to the second threshold M2, the influence factor is F = F0 + (M − M1) / (M2 − M1) × (1 − F0); if M is greater than the second threshold M2, the influence factor F is 1. The initial influence value F0 and the thresholds M1 and M2 are set so that the influence factor changes with the monitoring index according to the formula, reducing the impact of service warm-up on the nodes and allowing them to warm up stably and efficiently. The value of F0 can be tuned, and M1 and M2 determined, through load testing or real-environment testing.
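The piecewise weight computation can be sketched as follows. Note that the middle-range linear ramp is a reconstruction consistent with the numeric example later in the text (F = 0.378 at M = 3000 with M1 = 1000, M2 = 10000, F0 = 0.2), since the original equation image did not survive extraction; treat it as an interpretation rather than the patent's verbatim formula:

```python
def influence_factor(m, m1, m2, f0):
    """Influence factor F as a function of the monitoring index M.

    Below the first threshold M1 the node is still warming up and F stays at
    the initial value F0; above the second threshold M2 the cache is warm and
    F is 1; in between, F ramps linearly from F0 up to 1.
    """
    if m < m1:
        return f0
    if m > m2:
        return 1.0
    return f0 + (m - m1) / (m2 - m1) * (1.0 - f0)

def node_weight(w0, m, m1, m2, f0):
    """Weight W = W0 * F, with W0 the current initial weight."""
    return w0 * influence_factor(m, m1, m2, f0)
```

With M = 3000 this gives F ≈ 0.378, and with W0 = 0.72 a weight of about 0.272, matching the worked example.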
Illustratively, to ensure service performance: if no more than 20% of the requests of a normal node (a service node whose local cache amount is above the second threshold) reach the database and downstream services, this indicates that a node without a local cache can bear at least 20% of the traffic, so the initial influence value is set to 0.2.
In one implementation, the first threshold is 1000 and the second threshold is 10000. When service request a is distributed, the initial weight W0 of the first service node is 0.5 and the initial influence value F0 is 0.2. The first weight W1 of the just-started first service node is W1 = W0 × F0, i.e. 0.1 (0.5 × 0.2); the second weight corresponding to the second service node is 0.9. After the first service node requests data from the database or a third-party system according to the first service request, the data is stored in its local cache, so the cache amount of the local cache increases from 0 to 300 and the monitoring index is 300. Since 300 < 1000, the load balancing module increases the first weight: the base weight is now 0.5 + 0.1 = 0.6, and with the initial influence value F0 of 0.2 the first weight increases to 0.6 × 0.2 = 0.12. When service request b is distributed, the first weight corresponding to the first service node is 0.12 and the second weight corresponding to the second service node is 0.88. After the first service node again requests data from the database or third-party system according to the first service request and stores it in its local cache, the cache amount increases from 300 to 3000 and the monitoring index is 3000. Since 1000 < 3000 < 10000, the load balancing module substitutes the monitoring index into the formula: F = 0.2 + ((3000 − 1000) / (10000 − 1000)) × (1 − 0.2) ≈ 0.378. The base weight is now 0.6 + 0.12 = 0.72, and the first weight increases to 0.72 × 0.378 = 0.27216.
Similarly, as the cache amount of the local cache grows toward the second threshold, the weight of the first service node is gradually increased until warm-up is complete.
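The warm-up progression in the worked example above — the base weight is raised by the previous adjusted weight, then re-scaled by the current influence factor — can be replayed step by step. This is a sketch of one plausible reading of that example, with illustrative names:

```python
def warm_up_steps(w0, metrics, m1=1000, m2=10000, f0=0.2):
    """Replay the gradual weight increase of a warming-up node.

    `metrics` is the sequence of monitoring indexes (local cache
    amounts) observed after successive request batches. Returns the
    list of adjusted weights, starting with the just-started weight.
    """
    def factor(m):
        if m < m1:
            return f0
        if m <= m2:
            return f0 + (m - m1) / (m2 - m1) * (1 - f0)
        return 1.0

    w = w0 * f0              # weight of the just-started node
    history = [w]
    for m in metrics:
        w0 = w0 + w          # base weight grows by the previous weight
        w = w0 * factor(m)   # re-scale by the current influence factor
        history.append(w)
    return history
```

Replaying the example (W0 = 0.5, cache amount 0 → 300 → 3000) reproduces the sequence 0.1, 0.12, ≈0.272 from the text.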
In another implementation, when the weight is defined as an integer, the first threshold is 1000 and the second threshold is 10000. When service request a is distributed, the initial weight W0 of the first service node is 50 and the initial influence value F0 is 0.2. The first weight of the just-started first service node is W1 = W0 × F0, i.e. 10 (50 × 0.2). The second weight corresponding to the second service node is 90, so the first service node receives a share of 0.1 of the requests, i.e. 10/(10+90), and the second service node a share of 0.9, i.e. 90/(10+90). After the first service node requests data from the database or a third-party system according to the first service request and stores it in its local cache, the cache amount increases from 0 to 300 and the monitoring index is 300. Since 300 < 1000, the load balancing module increases the first weight: the base weight is now 50 + 10 = 60, and with F0 of 0.2 the first weight increases to 60 × 0.2 = 12. When service request b is distributed, the first weight corresponding to the first service node is 12; if the second weight corresponding to the second service node is still 90, the first service node's share of service request b is 12/(12+90) ≈ 0.118 and the second service node's share is 90/(12+90) ≈ 0.882.
After the first service node requests data from the database or third-party system according to the first service request and stores it in its local cache, the cache amount increases from 300 to 3000 and the monitoring index is 3000. Since 1000 < 3000 < 10000, the load balancing module substitutes the monitoring index into the formula: F = 0.2 + ((3000 − 1000) / (10000 − 1000)) × (1 − 0.2) ≈ 0.378. The base weight is now 60 + 12 = 72, and the first weight increases to 72 × 0.378 = 27.216. Similarly, as the cache amount of the local cache grows toward the second threshold, the weight of the first service node is gradually increased until warm-up is complete. Alternatively, if when service request b is distributed the first weight is 12 and the second weight corresponding to the second service node is 100, the first service node's share of service request b is 12/(12+100) ≈ 0.107 and the second service node's share is 100/(12+100) ≈ 0.893.
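The request shares quoted above (e.g. 12/(12+90) ≈ 0.118) are simply each node's integer weight normalized by the total; a minimal sketch:

```python
def duty_cycles(weights):
    """Share of incoming requests each node receives,
    proportional to its (integer) weight."""
    total = sum(weights)
    return [w / total for w in weights]
```

This reproduces the shares in both integer-weight scenarios: weights [10, 90] give 0.1/0.9, [12, 90] give ≈0.118/0.882, and [12, 100] give ≈0.107/0.893.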
In some embodiments, the first service node comprises a service node recovered from a failure or a newly created node. When the local cache expires for some reason, such as a network failure or garbage collection (GC), the value of the influence factor is reduced again and the node re-enters the warm-up state.
In some embodiments, when the monitoring index is negatively correlated with the influence factor, the final influence factor F_f is given by:

F_f = 1 − F + F0
In some embodiments, when the monitoring index is negatively correlated with the influence factor, the final influence factor F_f may also be given by:

F_f = 1 − (M2 − M) + F0
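The first negative-correlation rule, F_f = 1 − F + F0, can be written directly; a minimal sketch with illustrative names:

```python
def final_factor_neg(f, f0):
    """Final influence factor for a monitoring index that is
    negatively correlated with the factor: F_f = 1 - F + F0."""
    return 1 - f + f0
```

Note how the mapping reverses the raw factor, as negative correlation requires: with F0 = 0.2, a raw factor at the initial value 0.2 maps to a final factor of 1.0, while a raw factor of 1 maps back down to F0.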
in the above embodiment, the monitoring indicator may be added by the first service node to the response header of the first service request by the return buffer size, for example, the HTTP request may return the buffer size in the HTTP response header.
According to the above embodiments, the service state is monitored periodically, the monitoring index is determined from the service state, the influence factor is determined from the monitoring index, and the weight is then updated.
In some embodiments, the method further comprises: receiving a monitoring index returned by the second service node, wherein the monitoring index is the cache state of the local cache after the second service node completes the second service request; and updating the second weight according to the monitoring index, and distributing the service request to the second service node according to the updated second weight when distributing the service request next time. The method for updating the second weight may refer to the method for updating the first weight, and will not be described again.
In a second aspect, the present application provides a server configured to: receiving a service request; the server comprises: the system comprises a load balancing module, at least one service node and a local storage; wherein the service node comprises a first service node; the load balancing module is configured to: receiving a service request; determining weights corresponding to the service nodes; distributing service requests to all service nodes according to the weights; wherein, the probability of the first service node obtaining the service request is lower than that of the second service node; the service request distributed to the first service node is a first service request; the service node is configured to: acquiring data according to the distributed service request and storing the data in a local cache; the first service node is a node with a local cache amount lower than a first threshold, the weight corresponding to the first service node is a first weight, and the first service node is configured to: inquiring data from a local cache or a third party system according to the first service request; the second service node is a node with a cache amount of the local cache higher than a second threshold value, the weight corresponding to the second service node is a second weight, and the second service node is configured to: inquiring data from the local cache according to the first service request; wherein the first threshold is not greater than the second threshold; the second weight is greater than the first weight.
The application provides a server which monitors the cache amount of the local storage corresponding to each service node and determines the weight of the service node according to that cache amount. In this way, the weight of a service node with a small cache amount is smaller than that of a service node with a large cache amount, which solves the problem of a just-started service node receiving an overload of requests.
In some cases, the weight of the first service node gradually increases through the load balancing method described above until warm-up is complete. At this point, both the first service node and the second service node are in a normal working state. However, because the cached data in the local caches of different service nodes differs, and the service requests assigned to different service nodes differ, fixed weights cannot adapt well to the working states of the service nodes.
To adjust the weights reasonably according to monitoring indexes when a server contains multiple normally working service nodes, the application provides another load balancing method, comprising: receiving a service request; distributing the service request to the service nodes according to the weights in the load balancing scheme, wherein the service nodes include a first service node and the service request distributed to the first service node is denoted the first service request; receiving the monitoring index added when the first service node returns the data corresponding to the first service request, wherein the monitoring index includes the cache loading state of the data; and updating the weights according to the monitoring index, so that the next distribution of service requests uses the updated weights.
As shown in fig. 1, the server includes three service nodes: a first service node, a second service node, and a third service node. When the load balancing module distributes new service requests according to the weights in the load balancing scheme, the initial weights of the service nodes may be the same or different. Here, the initial weight of the first service node is 0.33, the initial weight of the second service node is 0.33, and the initial weight of the third service node is 0.34. The load balancing module splits the service requests according to the weights into a first service request, a second service request, and a third service request; the loads corresponding to the first and second service requests are the same, and the load corresponding to the third service request is the largest. The first service request is sent to the first service node, the second service request to the second service node, and the third service request to the third service node. The first service node first requests data from its local cache according to the first service request; if the local cache does not hold the data corresponding to the first service request, the first service node queries the database or a third-party system for it. If the second and third service nodes find the data corresponding to the second and third service requests in their local caches, they obtain the data directly and return it to the load balancing module.
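The split of incoming requests among the three nodes by weight can be made concrete with a deterministic largest-remainder apportionment; a sketch (the apportionment rule is an illustrative choice, not specified by the patent):

```python
def apportion(total, weights):
    """Split `total` requests across nodes in proportion to their
    weights, using largest-remainder rounding so the per-node
    counts always sum exactly to `total`."""
    quotas = [total * w / sum(weights) for w in weights]
    counts = [int(q) for q in quotas]
    remainder = total - sum(counts)
    # hand out leftover requests to the largest fractional parts
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - counts[i],
                   reverse=True)
    for i in order[:remainder]:
        counts[i] += 1
    return counts
```

With the weights 0.33 / 0.33 / 0.34 from fig. 1 and 100 incoming requests, the nodes receive 33, 33, and 34 requests respectively.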
It will be appreciated that the local cache may include only a portion of the data corresponding to the first service request, and that another portion of the data corresponding to the first service request may be queried by the database or a third party system.
In a certain scenario, since the local storage holds no data corresponding to the first service request, the first service node cannot obtain the data locally and can only fetch it from a database or a third-party system; the local storage does hold the data corresponding to the second service request, so the second service node can obtain it directly. While the first service node waits for the data transfer, the second service node operates normally, so the response time and throughput of the first service node are worse than those of the second service node. If the load balancing module now distributes new service requests according to the weights in the load balancing scheme, the first service node, which has not yet completed its previous data-feedback task, is assigned new requests at the original weight; this can overload the first service node, degrade its service performance, and even cause it to collapse.
For example, the initial weights of the first and second service nodes are both 0.33, and the loads corresponding to the first and second service requests are the same. However, because the first service node must query the database or third-party system to obtain the data corresponding to the first service request while the second service node requests it directly from its local cache, the response time and throughput of the first service node are worse under the same load. If a new service request is now allocated according to the initial weights, the first service node in its current working state clearly can no longer bear the request amount corresponding to the weight 0.33. Meanwhile, compared with the second service node, which still bears the request amount corresponding to weight 0.33, the first service node at this moment already exceeds the second service node's load, so such an allocation is clearly unreasonable. Similarly, the initial weight of the third service node is 0.34 and it requests data directly from its local cache; allocating new service requests by the initial weights could overload the first service node, degrade its service performance, and even cause it to collapse. Even if the second and third service nodes could in theory complete part of the service requests, tasks would still fail and losses would result. Therefore, the new weight of the first service node should be less than 0.33, and the new weights of the second and third service nodes should be adjusted as appropriate.
In particular, if a new service node, for example, a fourth service node, is added, the weight data corresponding to the second service node and the third service node, respectively, may be reduced.
It will be appreciated that the weights are only reference values and that different systems may define them differently, either as decimals or as integers as in the above embodiments; for example, the initial weight may be redefined as 100.
In order to solve the above problems, the present application provides a method for updating weight data. Fig. 2 illustrates an interaction diagram of a load balancing method according to some embodiments. As shown in fig. 2, in some embodiments, the load balancing module receives the monitoring index that the first service node adds when returning the data corresponding to the first service request, where the monitoring index includes the cache loading state of the data, and updates the weights according to the monitoring index, so that the next distribution of service requests uses the updated weights. In this way, the load balancing module can rearrange the weights of the respective nodes according to the monitoring index.
When the first service node queries the database or third-party system and obtains the data corresponding to the first service request, the database or third-party system returns the data together with its cache loading state; the first service node adds the monitoring index according to the cache loading state and returns the monitoring index and the data to the load balancing module. The load balancing module calculates a new weight from the monitoring index as the first service node loads more and more data from the database or third-party system.
In one implementation, the cache load state is added by the first service node to the response header of the first service request, e.g., an HTTP request may return the cache size in the HTTP response header.
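Returning the cache metric in an HTTP response header might look like the following. The header name `X-Cache-Size` is an assumption for illustration — the text only says the cache size is returned in the HTTP response header:

```python
def attach_cache_metric(headers, cache_size):
    """Add the local-cache size to an HTTP response's headers so the
    load balancer can read it as a monitoring index.
    'X-Cache-Size' is an illustrative header name, not from the patent."""
    headers = dict(headers)  # do not mutate the caller's dict
    headers["X-Cache-Size"] = str(cache_size)
    return headers


def read_cache_metric(headers):
    """Load-balancer side: parse the monitoring index, defaulting to 0
    when the header is absent (e.g. a node that adds no metric)."""
    return int(headers.get("X-Cache-Size", 0))
```

A real deployment would attach this in whatever HTTP framework the service node uses; the sketch only shows the header round-trip itself.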
In one scenario, there are multiple service nodes simultaneously, and the operating states of the service nodes are different. Therefore, each service node needs to add a monitoring index when data is returned, and the load balancing module distributes new weights to each service node after comprehensive calculation according to the monitoring indexes fed back by each service node. In some embodiments, the monitoring index may also be obtained by periodically monitoring the service state, and the monitoring dimension of the monitoring service state may be interface response time, cache loading state, interface call failure rate, and the like. In this way, the monitoring indexes can simultaneously comprise a plurality of indexes, namely, the response time of the interface corresponds to the first monitoring index, the buffer loading state corresponds to the second monitoring index, and the interface call failure rate corresponds to the third monitoring index. And the load balancing module calculates and obtains the weight according to the first monitoring index, the second monitoring index and the third monitoring index.
In a specific implementation, the monitoring indicator may also be a memory usage of the interface.
In order to further exploit the monitoring index, the application introduces the influence factor; an initial influence factor is preset according to the hardware environment and the actual working requirements and is recorded as the preset influence value. The final influence factor is generated from the monitoring indexes singly or in combination. A monitoring index and the influence factor may be positively or negatively correlated: the more the local cache has loaded in the cache loading state, the larger the influence factor, so the second monitoring index corresponding to the cache loading state is positively correlated with the influence factor; the larger the interface response time, the smaller the influence factor, so the first monitoring index corresponding to the interface response time is negatively correlated with the influence factor. When the monitored dimensions of the service state span interface response time, cache loading state, interface call failure rate, and so on, each monitoring dimension generates a corresponding influence factor, so that multiple influence factors jointly act on the weight calculation and the resulting load balancing effect is best.
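The text leaves open exactly how several per-dimension influence factors "jointly act" on the weight; one plausible combination rule (an assumption, not stated in the patent) is to multiply them:

```python
def combined_weight(w0, factors):
    """One plausible way to let several per-dimension influence
    factors act jointly on a weight: multiply them together.
    Each factor in [0, 1] can only keep or shrink the weight."""
    w = w0
    for f in factors:
        w *= f
    return w
```

For example, a response-time factor of 0.5 and a cache-load factor of 0.9 applied to an initial weight of 100 give a combined weight of 45. Other combination rules (e.g. taking the minimum factor) would be equally consistent with the text.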
In a specific implementation, the weight calculation formula is: weight = initial weight × influence factor. The influence factor is determined as follows: the monitoring index has a threshold range with a lower threshold, referred to as the third threshold, and an upper threshold, referred to as the fourth threshold. The influence factor is a number between 0 and 1. If the monitoring index is smaller than or equal to the third threshold, the influence factor is the preset influence value; if the monitoring index is greater than the third threshold and less than the fourth threshold, the influence factor is given by: influence factor = preset influence value + [(monitoring index − third threshold) / (fourth threshold − third threshold)] × (1 − preset influence value); if the monitoring index is greater than or equal to the fourth threshold, the influence factor is 1.
For example, take the number of cache loads as the monitoring index, set the initial weight to 100, the third threshold to 1000, the fourth threshold to 10000, and the preset influence value to 0.1. When the number of cache loads is below 1000, the influence factor is fixed at 0.1; when it exceeds 10000, the influence factor is fixed at 1; when it lies between 1000 and 10000, say at 5000, the influence factor = 0.1 + [(5000 − 1000) / (10000 − 1000)] × (1 − 0.1) = 0.5.
In a specific implementation, when the monitoring indicator is inversely related to the influence factor, the final influence factor=1-influence factor+preset influence value.
In a specific implementation, when the monitoring indicator and the influence factor are negatively correlated, the final influence factor=1- (fourth threshold-monitoring indicator) +preset influence value.
According to the above embodiments, the load balancing module in the device provided by the application periodically monitors the service state, determines the monitoring index from the service state, determines the influence factor from the monitoring index, and then updates the weight.
In some scenarios, when a new service request is distributed, the device may increase or decrease the number of service nodes that are involved in the distribution of the request. For example, the service node for fault recovery and the newly added service node are started as new service nodes and added to load balancing distribution.
In some implementations, a node that starts up and joins the load balancing allocation as a new service node is assigned the initial weight. It can be appreciated that when service requests are redistributed, the weight of the newly started service node is calculated as described in the foregoing embodiments, which is not repeated here.
It can be seen from the above embodiments that the load balancing module in the device provided by the present application monitors the service state periodically.
In some embodiments, the first threshold may also be equal to the second threshold. In that case, once the cache amount of the node reaches the threshold, requests can be allocated according to the second weight.
In a second aspect, the present application provides a server configured to: receiving a service request; the server comprises: the system comprises a load balancing module, at least one service node and a local storage; wherein the service node comprises a first service node; the load balancing module is configured to: distributing the service request to the service node according to the weight of the load balancing scheme; wherein the service request distributed to the first service node is denoted as a first service request; the first service node is configured to: inquiring data from a local cache or a third party system according to the first service request; wherein the data corresponds to the first service request; the load balancing module is further configured to: receiving a monitoring index added when the first service node returns data corresponding to the first service request; the monitoring index comprises a cache loading state of data; and updating the weight according to the monitoring index, and distributing the service request according to the updated weight when distributing the service request next time.
The service calling party sends a request instruction for calling the service, and the load balancing module responds to the request instruction and distributes the service request to the service node according to the weight in the load balancing scheme so as to share the load pressure; when the service node is allocated with a service request, the local cache does not have data corresponding to the service request, and the service node needs to acquire the data from a database or a third-party system. When the first service node inquires the database or the third party system and acquires the data corresponding to the first service request, the database or the third party system returns the data to the first service node, and simultaneously returns the buffer loading state of the data, and the first service node adds the monitoring index according to the buffer loading state and simultaneously returns the monitoring index and the data to the load balancing module. The load balancing module may rearrange weights corresponding to the respective nodes according to the monitoring index. And simultaneously, the load balancing module returns the data corresponding to the calling service to the service calling party.
The embodiment of the application also provides a chip which is connected with the memory or comprises the memory and is used for reading and executing the software program stored in the memory.
Embodiments of the present application also provide a computer program product comprising one or more computer program instructions. When the computer program instructions are loaded and executed by a computer, the processes or functions of the various embodiments of the present application described above are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer program product, when run on a computer, causes the computer to perform the methods provided by the embodiments of the application.
This embodiment also provides a computer-readable storage medium storing computer program instructions that, when executed, implement all the steps of the load balancing method of the above-described embodiments of the present application. The computer-readable storage medium includes a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), and the like.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments are not limited and may be implemented in whole or in part in the form of a computer program product.
It will also be appreciated by those of skill in the art that the various illustrative logical blocks and steps described herein may be implemented in electronic hardware, computer software, or combinations of both. Whether such functionality is implemented as hardware or software depends upon the particular application and the design requirements of the overall system. Those skilled in the art may implement the described functionality in varying ways for each particular application, but such implementation should not be understood as beyond the scope of the present application.
The various illustrative logical blocks and circuits described in this disclosure may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the general purpose processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the present application may be embodied directly in hardware, in a software element executed by a processor, or in a combination of the two. The software element may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. In one example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may reside in a UE. In the alternative, the processor and the storage medium may reside in different components of a UE.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the present application.
Furthermore, the terms first, second, third and the like in the description and in the claims and in the above drawings, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It will be apparent to those skilled in the art that the techniques of embodiments of the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present application may be embodied essentially or in parts contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method of the embodiments or some parts of the embodiments of the present application.
For the same or similar parts among the various embodiments in this specification, reference may be made to one another. In particular, since the network device/node or apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the descriptions in the method embodiments.
The above embodiments of the present application do not limit the scope of the present application.

Claims (9)

1. A method of load balancing, comprising:
receiving a service request;
determining, according to the formula W = W0 × F, the weight corresponding to each service node; wherein the service nodes comprise a first service node and a second service node; the first service node is a node whose local cache amount is lower than a first threshold; the second service node is a node whose local cache amount is higher than a second threshold; the first threshold is not greater than the second threshold; the weight corresponding to the first service node is a first weight, the weight corresponding to the second service node is a second weight, and the first weight is smaller than the second weight; W represents the weight, W0 represents the initial weight, and F represents an influence factor, which is a number between 0 and 1; if the monitoring index M is smaller than the first threshold M1, the influence factor F takes the initial influence value F0; if the monitoring index M is greater than or equal to the first threshold M1 and less than or equal to the second threshold M2, the influence factor F is determined by a corresponding formula; if the monitoring index M is greater than the second threshold M2, the influence factor F takes the value 1; the monitoring index is the cache state of the local cache of the first service node after the first service node completes a first service request;
and distributing the service request to each service node according to the weight, so that each service node obtains data according to the distributed service request and stores the data in its local cache, wherein the probability of the first service node receiving the service request is lower than the probability of the second service node receiving the service request.
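The weight rule of claim 1 can be sketched as follows. This is an illustrative reading only, not the patented implementation: all names and constants are assumptions, and since the claim's formula for F on [M1, M2] is not reproduced in the text, the linear interpolation between F0 and 1 used below is a guess that merely matches the stated boundary values.

```python
import random

# Hypothetical constants; the patent does not fix concrete values.
W0 = 100.0         # initial weight
F0 = 0.2           # initial influence value
M1, M2 = 0.3, 0.8  # first/second thresholds on the monitoring index

def influence_factor(m: float) -> float:
    """Piecewise influence factor F in [F0, 1] driven by monitoring index m.

    The middle segment assumes linear interpolation; the patent's actual
    formula for this segment is not reproduced in the text.
    """
    if m < M1:
        return F0
    if m > M2:
        return 1.0
    return F0 + (1.0 - F0) * (m - M1) / (M2 - M1)

def node_weight(m: float) -> float:
    """W = W0 x F, as stated in claims 1 and 9."""
    return W0 * influence_factor(m)

def pick_node(nodes: dict) -> str:
    """Weighted random pick: a warming node (low cache amount, low F)
    receives proportionally fewer requests than a warm node."""
    names = list(nodes)
    weights = [node_weight(nodes[n]) for n in names]
    return random.choices(names, weights=weights, k=1)[0]
```

With these assumed constants, a node reporting a cache index of 0.1 gets weight 20 while a node reporting 0.9 gets the full weight 100, so the warm node is five times as likely to receive the next request.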
2. The load balancing method according to claim 1, wherein the service request distributed to the first service node is a first service request; after the distributing of the service request to the service nodes, the method further comprises:
receiving a monitoring index returned by the first service node;
and when the monitoring index indicates that the cache amount of the local cache of the first service node has increased, increasing the first weight, and, at the next distribution, distributing the service request to the first service node according to the increased weight.
3. The load balancing method according to claim 2, wherein the receiving of the monitoring index returned by the first service node comprises:
the first service node sends a query request generated according to the first service request to a local cache and/or a third-party system, wherein the query request is used for querying data corresponding to the first service request;
the first service node receives first data returned by the third-party system, the first data being data obtained by the third-party system in response to the query request;
the first service node stores the first data in the local cache corresponding to the first service node and attaches a monitoring index to the data, the monitoring index comprising the cache amount;
and the first service node uploads the monitoring index to a load balancing system.
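The claim-3 flow can be sketched as follows: a warming node serves a request, falls back to a third-party system on a cache miss, stores the result locally, and reports its monitoring index (here simply the cache amount) to the load balancer. All class and method names are illustrative, not from the patent.

```python
class Balancer:
    """Collects monitoring indexes uploaded by service nodes."""
    def __init__(self):
        self.indexes = {}

    def report(self, node_id, cache_amount):
        self.indexes[node_id] = cache_amount

class FirstServiceNode:
    def __init__(self, third_party, balancer):
        self.cache = {}
        self.third_party = third_party  # callable: key -> data
        self.balancer = balancer

    def handle(self, key):
        data = self.cache.get(key)
        if data is None:                 # cache miss: query the third-party system
            data = self.third_party(key)
            self.cache[key] = data       # store the first data in the local cache
        # upload the monitoring index (cache amount) to the load balancing system
        self.balancer.report(id(self), len(self.cache))
        return data
```

A second request for the same key is then served from the local cache, and the reported cache amount stops growing, which is the signal the balancer uses in claim 2 to raise the node's weight.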
4. The load balancing method according to claim 1, wherein the formula for the first weight W1 is:
W1 = W0 × F0.
5. The load balancing method according to claim 1, wherein the first service node comprises a failure-recovered service node or a newly added node.
6. The load balancing method of claim 1, further comprising:
receiving a monitoring index returned by the second service node, wherein the monitoring index is the cache state of the local cache after the second service node completes a second service request;
and updating the second weight according to the monitoring index, and, at the next distribution, distributing the service request to the second service node according to the updated second weight.
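Claims 2 and 6 only require that a node's weight be recomputed from its freshly reported monitoring index before the next distribution round, so that a larger cache amount yields a larger weight. A minimal stateful sketch, reusing the assumed linear interpolation for F (the patent's exact middle-segment formula is not reproduced in the text, and all names and constants are hypothetical):

```python
class WeightTable:
    """One weight per node, recomputed as W = W0 x F on each report."""
    def __init__(self, w0=100.0, f0=0.2, m1=0.3, m2=0.8):
        self.w0, self.f0, self.m1, self.m2 = w0, f0, m1, m2
        self.weights = {}

    def on_report(self, node, m):
        # Piecewise influence factor; linear middle segment is an assumption.
        if m < self.m1:
            f = self.f0
        elif m > self.m2:
            f = 1.0
        else:
            f = self.f0 + (1.0 - self.f0) * (m - self.m1) / (self.m2 - self.m1)
        self.weights[node] = self.w0 * f
```

As the node's cache warms and its reported index rises past M2, its weight converges to the initial weight W0 and it receives a full share of traffic.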
7. The load balancing method according to claim 1, further comprising: when the monitoring index is negatively correlated with the influence factor, the formula for the final influence factor Ff is:
Ff = 1 - F + F0.
8. The load balancing method according to claim 1, further comprising: when the monitoring index is negatively correlated with the influence factor, the formula for the final influence factor Ff is:
Ff = 1 - (M2 - M) + F0.
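For metrics where a larger monitoring index means less spare capacity (load rather than cache warmth), claims 7 and 8 flip the direction of the factor. A minimal sketch of the claim-7 form; the F0 value in the usage is chosen arbitrarily for illustration:

```python
def final_factor(f: float, f0: float) -> float:
    """Ff = 1 - F + F0 (claim 7): inverts F so that a larger raw factor
    yields a smaller final factor, while the F0 offset keeps Ff positive."""
    return 1.0 - f + f0
```

With F ranging over [F0, 1], Ff also ranges over [F0, 1], so the inverted factor can be substituted directly into W = W0 × Ff without rescaling.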
9. A server, wherein the server is configured to receive a service request;
the server comprises: a load balancing module, at least one service node, and a local storage; the service nodes comprise a first service node and a second service node;
the load balancing module is configured to: receive a service request; determine, according to the formula W = W0 × F, the weight corresponding to each service node; and distribute the service request to each service node according to the weight; wherein the probability of the first service node receiving the service request is lower than that of the second service node; the service request distributed to the first service node is a first service request; W represents the weight, W0 represents the initial weight, and F represents an influence factor, which is a number between 0 and 1; if the monitoring index M is smaller than the first threshold M1, the influence factor F takes the initial influence value F0; if the monitoring index M is greater than or equal to the first threshold M1 and less than or equal to the second threshold M2, the influence factor F is determined by a corresponding formula; if the monitoring index M is greater than the second threshold M2, the influence factor F takes the value 1; the monitoring index is the cache state of the local cache of the first service node after the first service node completes the first service request;
the service node is configured to: obtain data according to the distributed service request and store the data in a local cache;
the first service node is a node whose local cache amount is lower than a first threshold, the weight corresponding to the first service node is a first weight, and the first service node is configured to: query data from the local cache or a third-party system according to the first service request;
the second service node is a node whose local cache amount is higher than a second threshold, the weight corresponding to the second service node is a second weight, and the second service node is configured to: query data from the local cache according to the first service request; wherein the first threshold is not greater than the second threshold, and the second weight is greater than the first weight.
CN202111496270.2A 2021-12-08 2021-12-08 Load balancing method and server Active CN114205361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111496270.2A CN114205361B (en) 2021-12-08 2021-12-08 Load balancing method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111496270.2A CN114205361B (en) 2021-12-08 2021-12-08 Load balancing method and server

Publications (2)

Publication Number Publication Date
CN114205361A CN114205361A (en) 2022-03-18
CN114205361B true CN114205361B (en) 2023-10-27

Family

ID=80651493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111496270.2A Active CN114205361B (en) 2021-12-08 2021-12-08 Load balancing method and server

Country Status (1)

Country Link
CN (1) CN114205361B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115344031B (en) * 2022-10-19 2023-02-17 北京理工大学深圳汽车研究院(电动车辆国家工程实验室深圳研究院) Automobile area architecture system and automobile

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101984632A (en) * 2010-11-15 2011-03-09 中兴通讯股份有限公司 Load distributing method, device and server in distributed cache system
WO2017095718A1 (en) * 2015-12-04 2017-06-08 Microsoft Technology Licensing, Llc State-aware load balancing
CN108667882A (en) * 2017-04-01 2018-10-16 北京京东尚科信息技术有限公司 Load-balancing method, device and electronic equipment based on changeable weight adjustment
CN108762924A (en) * 2018-05-28 2018-11-06 郑州云海信息技术有限公司 A kind of method, apparatus and computer readable storage medium of load balancing
CN108965381A (en) * 2018-05-31 2018-12-07 康键信息技术(深圳)有限公司 Implementation of load balancing, device, computer equipment and medium based on Nginx
CN109831524A (en) * 2019-03-11 2019-05-31 平安科技(深圳)有限公司 A kind of load balance process method and device
CN110798517A (en) * 2019-10-22 2020-02-14 雅马哈发动机(厦门)信息系统有限公司 Decentralized cluster load balancing method and system, mobile terminal and storage medium
CN112217894A (en) * 2020-10-12 2021-01-12 浙江大学 Load balancing system based on dynamic weight
US11025710B1 (en) * 2020-10-26 2021-06-01 Verizon Digital Media Services Inc. Systems and methods for dynamic load balancing based on server utilization and content popularity
CN112929408A (en) * 2021-01-19 2021-06-08 郑州阿帕斯数云信息科技有限公司 Dynamic load balancing method and device
CN113141317A (en) * 2021-03-05 2021-07-20 西安电子科技大学 Streaming media server load balancing method, system, computer equipment and terminal
CN113742066A (en) * 2021-08-09 2021-12-03 联通沃悦读科技文化有限公司 Load balancing system and method for server cluster

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9871855B2 (en) * 2014-09-19 2018-01-16 Facebook, Inc. Balancing load across cache servers in a distributed data store
US10951691B2 (en) * 2019-03-05 2021-03-16 Cisco Technology, Inc. Load balancing in a distributed system

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101984632A (en) * 2010-11-15 2011-03-09 中兴通讯股份有限公司 Load distributing method, device and server in distributed cache system
WO2017095718A1 (en) * 2015-12-04 2017-06-08 Microsoft Technology Licensing, Llc State-aware load balancing
CN108667882A (en) * 2017-04-01 2018-10-16 北京京东尚科信息技术有限公司 Load-balancing method, device and electronic equipment based on changeable weight adjustment
CN108762924A (en) * 2018-05-28 2018-11-06 郑州云海信息技术有限公司 A kind of method, apparatus and computer readable storage medium of load balancing
CN108965381A (en) * 2018-05-31 2018-12-07 康键信息技术(深圳)有限公司 Implementation of load balancing, device, computer equipment and medium based on Nginx
CN109831524A (en) * 2019-03-11 2019-05-31 平安科技(深圳)有限公司 A kind of load balance process method and device
CN110798517A (en) * 2019-10-22 2020-02-14 雅马哈发动机(厦门)信息系统有限公司 Decentralized cluster load balancing method and system, mobile terminal and storage medium
CN112217894A (en) * 2020-10-12 2021-01-12 浙江大学 Load balancing system based on dynamic weight
US11025710B1 (en) * 2020-10-26 2021-06-01 Verizon Digital Media Services Inc. Systems and methods for dynamic load balancing based on server utilization and content popularity
CN112929408A (en) * 2021-01-19 2021-06-08 郑州阿帕斯数云信息科技有限公司 Dynamic load balancing method and device
CN113141317A (en) * 2021-03-05 2021-07-20 西安电子科技大学 Streaming media server load balancing method, system, computer equipment and terminal
CN113742066A (en) * 2021-08-09 2021-12-03 联通沃悦读科技文化有限公司 Load balancing system and method for server cluster

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a Cluster-Based Dynamic Load Balancing Algorithm; Wu Junpeng; Liu Xiaodong; Electronic Design Engineering; Vol. 29, No. 15; pp. 75-78 *

Also Published As

Publication number Publication date
CN114205361A (en) 2022-03-18

Similar Documents

Publication Publication Date Title
CN109218355B (en) Load balancing engine, client, distributed computing system and load balancing method
CN101764835B (en) Task allocation method and device based on MapReduce programming framework
CN106817432B (en) Method, system and equipment for elastically stretching virtual resources in cloud computing environment
CN110365748A (en) Treating method and apparatus, storage medium and the electronic device of business datum
CN109933431B (en) Intelligent client load balancing method and system
CN114205361B (en) Load balancing method and server
US11201824B2 (en) Method, electronic device and computer program product of load balancing for resource usage management
CN112799839A (en) Request processing method and device, computer readable storage medium and electronic equipment
US10216593B2 (en) Distributed processing system for use in application migration
CN110515728B (en) Server scheduling method and device, electronic equipment and machine-readable storage medium
CN112202829A (en) Social robot scheduling system and scheduling method based on micro-service
EP3672203A1 (en) Distribution method for distributed data computing, device, server and storage medium
CN114546493A (en) Core sharing method and device, processing core, electronic device and medium
CN115794396A (en) Resource allocation method, system and electronic equipment
CN115878309A (en) Resource allocation method, device, processing core, equipment and computer readable medium
CN114296869A (en) Server node service method and device based on TCP long connection
CN113268329A (en) Request scheduling method, device and storage medium
CN113190347A (en) Edge cloud system and task management method
JP2001216174A (en) Application substituting method and storage medium in which application substitution program is stored
CN111382139A (en) Parallel access method for same account in database
KR101595948B1 (en) Load Balancing Method of P2P Networks by Load Threshold Adjustment and Device Implementing the Same
CN110968419A (en) Data receiving method and device
CN113220491B (en) Remote call self-adaptive load balancing method, device and system and computer equipment
CN109445934B (en) Query request distribution method and system
CN111580925B (en) Method and device for stretching application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant