CN114390089A - API gateway load balancing method and API gateway - Google Patents
API gateway load balancing method and API gateway
- Publication number
- CN114390089A (application CN202111480192.7A)
- Authority
- CN
- China
- Prior art keywords
- api
- service instance
- performance area
- load balancing
- api gateway
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/66—Arrangements for connecting between networks having differing types of switching systems, e.g. gateways
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to an API gateway load balancing method and an API gateway. The method comprises the following steps: dividing service instance APIs into a high-performance area and a low-performance area according to their response times, wherein the response times of the high-performance area are shorter than those of the low-performance area; the service instance APIs of the high-performance area are in a normal calling state, and those of the low-performance area are in a call-limiting state, the call-limiting state meaning that a service instance API is called again only after a preset time. By dividing service instance APIs into a high-performance area and a low-performance area, the invention reduces the drag that slow-responding service instance APIs place on the whole, thereby improving the service experience delivered by the normally responding service instance APIs.
Description
Technical Field
The invention relates to the field of API gateways, in particular to an API gateway load balancing method and an API gateway.
Background
The development of cloud technology has changed the traditional mode of enterprise application software development: large siloed ("chimney-style") enterprise applications are migrating to microservice-based architectures. A system built on a microservice architecture is composed of many service units that cooperate with one another to deliver the final value, and load balancing among these services is particularly important for improving the concurrency and robustness of the system.
Conventional load balancing algorithms include Round Robin (polling), Weighted Round Robin, Random, Source Hashing, Least Connections, Fastest Response (Fastest), etc. In practical applications these algorithms prove deficient:
Polling, random and source address hashing: they do not account for differing server configurations and large performance gaps between servers, so the service cluster easily suffers a "short board of the barrel" (weakest-link) effect: most service response times meet the requirement, but a fraction are so long that quality of service (QoS) cannot be satisfied.
Weighted Round Robin: the weights are static and hard to set appropriately, and no suitable weight can be assigned when service instances are added dynamically.
Least Connections: tracking the number of connections is difficult and consumes additional performance.
Fastest Response (Fastest): selecting the fastest-responding instance likewise consumes extra server performance.
Therefore, existing load balancing algorithms do a poor job of sidelining slow-responding service instance APIs, so a slow service instance API drags down the overall response speed and degrades the experience of most users.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an API gateway load balancing method and an API gateway.
The technical scheme adopted by the invention to solve this technical problem is as follows: an API gateway load balancing method is constructed, comprising the following steps:
dividing service instance APIs into a high-performance area and a low-performance area according to their response times, wherein the response times of the high-performance area are shorter than those of the low-performance area;
the service instance APIs of the high-performance area are in a normal calling state, and those of the low-performance area are in a call-limiting state, the call-limiting state meaning that a service instance API is called again only after a preset time.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
and the service instance API of the high-performance area is called according to a polling method.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
and after calling the service instance API of the high-performance area, judging whether the response time is less than threshold time, and if not, transferring the service instance API to the low-performance area.
Further, the API gateway load balancing method of the present invention further comprises the steps of: after a service instance API of the low-performance area is called, judging whether its response time is less than the threshold time;
if yes, transferring the service instance API to the high-performance area;
if not, restarting the timer and extending the preset time.
Further, in the API gateway load balancing method of the present invention, the threshold time is the average of the response times of all the service instance APIs in the high-performance area.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
and if the average value of the response time of all the service instance APIs in the high-performance area is greater than the first preset time, the API gateway increases the preset number of service instance APIs until the maximum value of the service instance APIs borne by the API gateway is reached.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
if the average response time of all service instance APIs in the high-performance area is greater than a second preset time and a first preset proportion of the service instance APIs have response times greater than a third preset time, performing a current-limiting operation, issuing overload alarm information and sending it to a management terminal;
the first preset time is less than the second preset time, and the second preset time is less than the third preset time.
Further, the API gateway load balancing method of the present invention further comprises the steps of: after the current-limiting operation is started, the API gateway reduces the queries per second at a preset rate, and stops the current-limiting operation once no more than a second preset proportion of the service instance APIs have response times greater than the third preset time; the first preset proportion is larger than the second preset proportion.
Further, the API gateway load balancing method of the present invention further comprises the steps of: if a new service instance API registration is received, placing the service instance API in the high-performance area;
and if an existing service instance API goes offline, deleting the service instance API from the high-performance area or the low-performance area.
In addition, the invention also provides an API gateway, which comprises a memory and a processor;
the memory has stored therein a computer program;
the processor performs the steps of the API gateway load balancing method as described above by calling the computer program stored in the memory.
The implementation of the API gateway load balancing method and the API gateway has the following beneficial effects: by dividing service instance APIs into a high-performance area and a low-performance area, the invention reduces the drag that slow-responding service instance APIs place on the whole, thereby improving the service experience delivered by the normally responding service instance APIs.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
fig. 1 is a flowchart of an API gateway load balancing method according to an embodiment of the present invention.
Detailed Description
For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
In a preferred embodiment, referring to fig. 1, the API gateway load balancing method of this embodiment is applied to an API gateway under a microservice architecture: a cloud server provides a plurality of service instances, each service instance corresponds to a service instance API, and the API gateway calls the service instance APIs according to certain rules to access the service instances. Specifically, the API gateway load balancing method includes the following steps:
and S1, dividing the service instance API into a high-performance area and a low-performance area according to the response time of the service instance API, wherein the response time of the high-performance area is smaller than that of the low-performance area. Specifically, in the prior art, the API gateway does not partition the service instance API, but directly calls the service instance API. In the actual use process, the response time of some service instance APIs is long, and the whole is tired, so that the average response time of all service instance APIs in the API gateway is long, and the whole access experience of a user is influenced. In order to avoid that the service instance API with a long response time hinders the overall service instance API access experience, in this embodiment, the response times of all the service instance APIs in the API gateway are obtained, the service instance API is divided into a high-performance area and a low-performance area according to the response time of the service instance API, and the response time of the high-performance area is shorter than the response time of the low-performance area. The proportion or number of service instance APIs in the high-performance zone and the low-performance zone may be set as desired, for example, the proportion of service instance APIs in the high-performance zone and the low-performance zone is 70% and 30%, respectively. It will be appreciated that the response time of the service instance API is dynamically changing, so the service instance API in the high-performance region and the low-performance region is also dynamically changing, and the service instance API may be moved from the high-performance region to the low-performance region, and may be moved from the low-performance region to the high-performance region.
And S2, the service instance APIs of the high-performance area are in a normal calling state, and those of the low-performance area are in a call-limiting state, the call-limiting state meaning that a service instance API is called again only after a preset time. Specifically, once the high-performance and low-performance areas are established, they follow different calling strategies. The normal calling state means a high-performance-area service instance API is called without restriction, for example by round robin over the high-performance area. The call-limiting state does not mean the API is never called; it means a low-performance-area service instance API is called only after the preset time has elapsed. Restricting calls to the low-performance area removes the influence of slow service instance APIs on the overall response time, so the high-performance area responds faster and user experience improves. The preset time of the low-performance area may be set as desired, for example 5 minutes, 10 minutes or 100 minutes.
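Steps S1 and S2 above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the class and function names are invented here, while the 70/30 split and the 5-minute preset time are the example values from this description.

```python
import time

class ServiceInstanceAPI:
    """Hypothetical per-API record kept by the gateway (assumed name)."""
    def __init__(self, name, response_time_ms):
        self.name = name
        self.response_time_ms = response_time_ms
        self.blocked_until = 0.0  # call-limiting state: earliest next-call time

def partition(apis, high_ratio=0.70):
    """S1: split APIs by response time; the fastest high_ratio (e.g. 70%)
    form the high-performance area, the rest the low-performance area."""
    ranked = sorted(apis, key=lambda a: a.response_time_ms)
    cut = max(1, round(len(ranked) * high_ratio))
    return ranked[:cut], ranked[cut:]

def limit_call(api, preset_seconds=300, now=None):
    """S2: put a low-area API in the call-limiting state -- it may only be
    called again after the preset time (5 minutes by default)."""
    now = time.time() if now is None else now
    api.blocked_until = now + preset_seconds

def callable_now(api, now=None):
    """True once the preset time has elapsed (high-area APIs always pass)."""
    now = time.time() if now is None else now
    return now >= api.blocked_until

apis = [ServiceInstanceAPI(f"api-{i}", rt) for i, rt in
        enumerate([120, 95, 300, 2400, 80, 150, 1800, 110, 130, 210])]
high, low = partition(apis)
print(len(high), len(low))  # 7 3
```

In this sketch the call-limiting state is just a timestamp (`blocked_until`); a real gateway would also need the dispatcher to skip blocked APIs, which the later sketches assume.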
According to this embodiment, service instance APIs are divided into a high-performance area and a low-performance area, and the calling rate of the low-performance area is reduced, which lessens the influence of slow service instance APIs on the whole and improves the service experience delivered by the normally responding service instance APIs.
In some embodiments of the API gateway load balancing method, the response time of a service instance API in the high-performance area changes dynamically: a previously fast API may later slow down, so membership in the two areas must also change dynamically, i.e. a service instance API may be transferred from the high-performance area to the low-performance area. After a high-performance-area service instance API is called, the gateway judges whether its response time is less than the threshold time. If it is, the API still responds quickly and remains in the high-performance area. If not, the API is now responding slowly, is no longer suitable for the high-performance area, and is transferred to the low-performance area. Alternatively, the threshold time is the average of the response times of all service instance APIs in the high-performance area; since each API's response time changes continuously, the threshold time is likewise dynamic. By monitoring response-time changes in the high-performance area in real time and transferring APIs that no longer meet the fast-response requirement to the low-performance area, this embodiment guarantees the response speed of the high-performance area and improves user experience.
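A hedged sketch of this demotion check, recomputing the threshold as the high-performance-area average response time on every call; all names here are assumptions, and areas are modelled simply as lists of API names:

```python
def threshold_time(times, high):
    """Threshold = average response time over all high-area APIs (dynamic)."""
    return sum(times[n] for n in high) / len(high)

def after_high_area_call(name, observed_ms, times, high, low):
    """After calling a high-area API: keep it if its latest response time
    beat the threshold, otherwise transfer it to the low-performance area."""
    t = threshold_time(times, high)
    times[name] = observed_ms          # record the newest observation
    if observed_ms >= t:               # "not less than" -> demote
        high.remove(name)
        low.append(name)
        return "demoted"
    return "kept"

times = {"a": 100.0, "b": 120.0, "c": 140.0}
high, low = ["a", "b", "c"], []
print(after_high_area_call("c", 500.0, times, high, low))  # demoted (500 >= avg 120)
print(after_high_area_call("a", 90.0, times, high, low))   # kept   (90 < avg 110)
```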
In some embodiments of the API gateway load balancing method, the response time of a service instance API in the low-performance area also changes dynamically: a previously slow API may later speed up, so a service instance API may be transferred from the low-performance area to the high-performance area. After a low-performance-area service instance API is called, the gateway judges whether its response time is less than the threshold time. If not, the API is still responding slowly; migrating it to the high-performance area would hurt the overall response speed, so it stays in the low-performance area, its timer is restarted and the preset time is extended. Alternatively, the preset time may be extended each time the timer restarts, for example 5 minutes initially, then 10 minutes after the first extension, 15 minutes after the second, 20 minutes after the third, and so on. If the response time is less than the threshold time, the API now responds quickly, is no longer suitable for the low-performance area, and is transferred to the high-performance area. As above, the threshold time may be the average response time of all high-performance-area service instance APIs, and is itself dynamic.
By monitoring response-time changes in the low-performance area in real time and transferring APIs that again meet the fast-response requirement to the high-performance area, this embodiment lets recovered service instance APIs resume fast response and improves user experience.
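The promotion check and the lengthening preset time can be sketched the same way. The 5 → 10 → 15-minute schedule follows the example in the text; every name, and the choice of a dict to hold per-API wait times, is an assumption.

```python
def after_low_area_call(name, observed_ms, times, high, low, wait_min):
    """After a (rate-limited) low-area call: promote the API if it now
    beats the high-area average; otherwise restart its timer with a
    preset time extended by 5 minutes each retry (5 -> 10 -> 15 ...).
    Returns the minutes to wait before the next call (0 = promoted)."""
    t = sum(times[n] for n in high) / len(high)   # dynamic threshold time
    times[name] = observed_ms
    if observed_ms < t:
        low.remove(name)
        high.append(name)
        wait_min.pop(name, None)                  # promoted: no more waiting
        return 0
    wait_min[name] = wait_min.get(name, 5) + 5    # restart timer, longer preset
    return wait_min[name]

times = {"a": 100.0, "b": 120.0, "x": 900.0}
high, low, wait_min = ["a", "b"], ["x"], {"x": 5}
print(after_low_area_call("x", 800.0, times, high, low, wait_min))  # 10
print(after_low_area_call("x", 90.0, times, high, low, wait_min))   # 0
```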
In the API gateway load balancing method of some embodiments, if the average response time of all service instance APIs in the high-performance area is greater than a first preset time (an alarm value), the API gateway adds a preset number of service instance APIs, up to the maximum number of service instance APIs it can bear. Specifically, if a surge in calling clients pushes the high-performance-area average above the first preset time, for example 1000 ms, the API gateway triggers its elastic scaling mechanism and adds a preset number of service instance APIs, for example 2, to raise the throughput of the service cluster and lower the average response time. The gateway then continues to monitor the average response time; if after a period it still exceeds the first preset time, the gateway adds another preset number of service instance APIs, repeating until the maximum it can bear is reached. By adding service instance APIs as the number of calling clients grows, this embodiment raises the throughput of the service cluster and improves user experience.
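A minimal sketch of this elastic-scaling trigger, using the 1000 ms alarm value and the step of 2 instances given as examples above; the 16-instance maximum and the function name are assumed placeholders.

```python
def maybe_scale_out(times, high, instance_count, *, first_preset_ms=1000.0,
                    step=2, max_instances=16):
    """If the high-area average response time exceeds the first preset
    time (the alarm value), add a preset number of instances, capped at
    the maximum the gateway can bear. Returns the new instance count."""
    avg = sum(times[n] for n in high) / len(high)
    if avg > first_preset_ms and instance_count < max_instances:
        return min(instance_count + step, max_instances)
    return instance_count

print(maybe_scale_out({"a": 1500.0, "b": 900.0}, ["a", "b"], 4))  # 6
```

Calling this periodically reproduces the "keep monitoring, keep adding until the maximum" loop described above.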
In the API gateway load balancing method of some embodiments, if the average response time of all service instance APIs in the high-performance area is greater than a second preset time and a first preset proportion of the service instance APIs have response times greater than a third preset time, the gateway performs a current-limiting operation, issues overload alarm information and sends it to a management terminal; the first preset time is less than the second preset time, and the second preset time is less than the third preset time. Specifically, if a surge in calling clients pushes the high-performance-area average above the second preset time, for example 2000 ms, while more than a first preset proportion of service instance APIs, for example 10%, have response times above a third preset time, for example 3000 ms, the current-limiting condition is met and the current-limiting operation must be performed. During current limiting, overload alarm information is issued and sent to a management terminal, for example by e-mail, short message or instant-messaging tool, to remind an administrator to take action. After the current-limiting operation starts, the API gateway reduces the queries per second at a preset rate, and stops current limiting once no more than a second preset proportion of service instance APIs, for example 1%, still have response times above the third preset time; the first preset proportion is larger than the second preset proportion.
By automatically taking current-limiting measures when the client count surges, this embodiment prevents the system from crashing under overload and improves its safety and stability.
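The start and stop conditions for current limiting can be expressed as two predicates. The 2000 ms, 3000 ms, 10% and 1% figures are the example values from the description; the function names are assumptions, and the actual QPS ramp-down is left out.

```python
def should_rate_limit(times, high, *, second_preset_ms=2000.0,
                      third_preset_ms=3000.0, first_proportion=0.10):
    """Overload check: high-area average above the second preset time AND
    more than the first preset proportion of APIs above the third preset
    time."""
    vals = [times[n] for n in high]
    avg = sum(vals) / len(vals)
    slow = sum(1 for v in vals if v > third_preset_ms) / len(vals)
    return avg > second_preset_ms and slow > first_proportion

def should_stop_limiting(times, high, *, third_preset_ms=3000.0,
                         second_proportion=0.01):
    """Stop throttling once no more than the (smaller) second preset
    proportion of APIs still exceed the third preset time."""
    vals = [times[n] for n in high]
    slow = sum(1 for v in vals if v > third_preset_ms) / len(vals)
    return slow <= second_proportion

# 2 of 10 APIs at 3500 ms (> 3000 ms, i.e. 20% > 10%), average 2460 ms > 2000 ms
times = {f"a{i}": 3500.0 if i < 2 else 2200.0 for i in range(10)}
print(should_rate_limit(times, list(times)))  # True
```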
In the API gateway load balancing method of some embodiments, if a new service instance API registration is received, the service instance API is placed in the high-performance area, and after it is subsequently called, the high-performance or low-performance area is re-chosen according to its response time. Alternatively, the service instance APIs in the high-performance area are sorted by response time from low to high, i.e. the front of the API queue responds fastest and the rear slowest, and calls to the high-performance area proceed in order from the front of the queue. Further, if an existing service instance API goes offline, it is deleted from whichever area holds it: a high-performance-area service instance API that receives an offline instruction is deleted from the high-performance area, and a low-performance-area service instance API that receives one is deleted from the low-performance area.
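A sketch of this registration and offline handling, under the same assumed list-based model: new APIs start in the high-performance area, offline APIs are removed from whichever area holds them.

```python
def register(name, high, times, initial_ms=0.0):
    """A newly registered service instance API starts in the high-
    performance area; its real area is re-chosen after later calls."""
    high.append(name)
    times[name] = initial_ms

def deregister(name, high, low, times):
    """Offline instruction: delete the API from the area that holds it."""
    if name in high:
        high.remove(name)
    elif name in low:
        low.remove(name)
    times.pop(name, None)

high, low, times = ["a"], ["b"], {"a": 100.0, "b": 900.0}
register("c", high, times)
deregister("b", high, low, times)
print(high, low)  # ['a', 'c'] []
```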
In a preferred embodiment, the API gateway of this embodiment includes a memory and a processor, where the memory stores a computer program, and the processor executes the steps of the API gateway load balancing method according to the above embodiment by calling the computer program stored in the memory. According to the embodiment, the service instance API is divided into the high-performance area and the low-performance area, so that the influence of the service instance API with longer response time on the whole is reduced, and the service experience of normally responding to the service instance API is improved.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and are intended to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the scope of the present invention. All equivalent changes and modifications made within the scope of the claims of the present invention should be covered by the claims of the present invention.
Claims (10)
1. An API gateway load balancing method is characterized by comprising the following steps:
dividing service instance APIs into a high-performance area and a low-performance area according to their response times, wherein the response times of the high-performance area are shorter than those of the low-performance area;
the service instance APIs of the high-performance area are in a normal calling state, and those of the low-performance area are in a call-limiting state, the call-limiting state meaning that a service instance API is called again only after a preset time.
2. The API gateway load balancing method of claim 1, further comprising the steps of:
and the service instance API of the high-performance area is called according to a polling method.
3. The API gateway load balancing method of claim 1, further comprising the steps of:
and after calling the service instance API of the high-performance area, judging whether the response time is less than threshold time, and if not, transferring the service instance API to the low-performance area.
4. The API gateway load balancing method of claim 1, further comprising the steps of: after a service instance API of the low-performance area is called, judging whether its response time is less than the threshold time;
if yes, transferring the service instance API to the high-performance area;
if not, restarting the timer and extending the preset time.
5. The API gateway load balancing method of claim 3 or 4, wherein the threshold time is an average of response times of all the service instance APIs of the high performance zone.
6. The API gateway load balancing method of claim 1, further comprising the steps of:
and if the average value of the response time of all the service instance APIs in the high-performance area is greater than the first preset time, the API gateway increases the preset number of service instance APIs until the maximum value of the service instance APIs borne by the API gateway is reached.
7. The API gateway load balancing method of claim 6, further comprising the steps of:
if the average response time of all service instance APIs in the high-performance area is greater than a second preset time and a first preset proportion of the service instance APIs have response times greater than a third preset time, performing a current-limiting operation, issuing overload alarm information and sending it to a management terminal;
the first preset time is less than the second preset time, and the second preset time is less than the third preset time.
8. The API gateway load balancing method of claim 7, further comprising the steps of: after the current-limiting operation is started, the API gateway reduces the queries per second at a preset rate, and stops the current-limiting operation once no more than a second preset proportion of the service instance APIs have response times greater than the third preset time; the first preset proportion is larger than the second preset proportion.
9. The API gateway load balancing method of claim 1, further comprising the steps of: if receiving a new service instance API registration, dividing the service instance API into the high-performance area;
and if the existing service instance API is offline, deleting the service instance API from the high-performance area or the low-performance area.
10. An API gateway comprising a memory and a processor;
the memory has stored therein a computer program;
the processor performs the steps of the API gateway load balancing method of any one of claims 1 to 9 by calling the computer program stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111480192.7A CN114390089A (en) | 2021-12-06 | 2021-12-06 | API gateway load balancing method and API gateway |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114390089A true CN114390089A (en) | 2022-04-22 |
Family
ID=81196752
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111480192.7A Pending CN114390089A (en) | 2021-12-06 | 2021-12-06 | API gateway load balancing method and API gateway |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114390089A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107888708A (en) * | 2017-12-25 | 2018-04-06 | 山大地纬软件股份有限公司 | A kind of load-balancing algorithm based on Docker container clusters |
CN108712464A (en) * | 2018-04-13 | 2018-10-26 | 中国科学院信息工程研究所 | A kind of implementation method towards cluster micro services High Availabitity |
CN111355814A (en) * | 2020-04-21 | 2020-06-30 | 上海润欣科技股份有限公司 | Load balancing method and device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||