CN114390089A - API gateway load balancing method and API gateway - Google Patents

Info

Publication number
CN114390089A
CN114390089A
Authority
CN
China
Prior art keywords
api
service instance
performance area
load balancing
api gateway
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111480192.7A
Other languages
Chinese (zh)
Inventor
胡梅贤
龙榜
饶学贵
李天国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Farben Information Technology Co ltd
Original Assignee
Shenzhen Farben Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Farben Information Technology Co ltd filed Critical Shenzhen Farben Information Technology Co ltd
Priority to CN202111480192.7A priority Critical patent/CN114390089A/en
Publication of CN114390089A publication Critical patent/CN114390089A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 12/00: Data switching networks
    • H04L 12/66: Arrangements for connecting between networks having differing types of switching systems, e.g. gateways

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to an API gateway load balancing method and an API gateway. The method comprises the following steps: dividing service instance APIs into a high-performance area and a low-performance area according to their response times, the response times of the high-performance area being shorter than those of the low-performance area; the service instance APIs of the high-performance area are in a normal calling state, while those of the low-performance area are in a call limiting state, in which a service instance API is called only after a preset time has elapsed. By dividing service instance APIs into a high-performance area and a low-performance area, the invention reduces the impact of slow-responding service instance APIs on the whole, thereby improving the service experience of the normally responding service instance APIs.

Description

API gateway load balancing method and API gateway
Technical Field
The invention relates to the field of API gateways, in particular to an API gateway load balancing method and an API gateway.
Background
The development of cloud technology has changed the traditional model of enterprise application software development: large, siloed ("chimney-style") enterprise applications are shifting toward microservice-based architectures. A system built on a microservice architecture is composed of many service units that cooperate to deliver the final value, and load balancing between the services is particularly important for improving the system's concurrency and robustness.
Conventional load balancing algorithms include Round Robin (polling), Weighted Round Robin, Random, Source Hashing, Least Connections, and Fastest Response. In practical applications these algorithms show the following shortcomings:
Round Robin, Random, and Source Hashing: they ignore the fact that server configurations differ and performance gaps between servers can be large, so the service cluster easily suffers a weakest-link ("short plank") effect: most responses meet the response-time requirement, but some take too long and fail the quality-of-service (QoS) requirement.
Weighted Round Robin: the weights are static and hard to set appropriately, and no suitable weight can be assigned when service instances are added dynamically.
Least Connections: tracking the connection count is difficult and consumes extra performance.
Fastest Response: selecting the fastest-responding instance consumes additional server performance.
Existing load balancing algorithms are therefore poor at isolating slow-responding service instance APIs, so a slow service instance API drags down the overall response speed and degrades the experience of most users.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an API gateway load balancing method and an API gateway.
The technical solution adopted by the invention to solve this problem is to construct an API gateway load balancing method comprising the following steps:
dividing service instance APIs into a high-performance area and a low-performance area according to the response time of each service instance API, wherein the response times of the high-performance area are shorter than those of the low-performance area;
the service instance APIs of the high-performance area are in a normal calling state, and the service instance APIs of the low-performance area are in a call limiting state, the call limiting state meaning that the service instance API is called only after a preset time has elapsed.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
and the service instance API of the high-performance area is called according to a polling method.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
and after calling the service instance API of the high-performance area, judging whether the response time is less than threshold time, and if not, transferring the service instance API to the low-performance area.
Further, the API gateway load balancing method of the present invention further comprises the steps of: after a service instance API of the low-performance area is called, judging whether its response time is less than the threshold time;
if so, transferring the service instance API to the high-performance area;
if not, restarting the timing and extending the preset time.
Further, in the API gateway load balancing method of the present invention, the threshold time is the average of the response times of all service instance APIs in the high-performance area.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
and if the average value of the response time of all the service instance APIs in the high-performance area is greater than the first preset time, the API gateway increases the preset number of service instance APIs until the maximum value of the service instance APIs borne by the API gateway is reached.
Further, the API gateway load balancing method of the present invention further comprises the steps of:
if the average response time of all service instance APIs in the high-performance area exceeds a second preset time and the response times of a first preset proportion of the service instance APIs exceed a third preset time, performing a current-limiting operation, issuing overload alarm information, and sending the overload alarm information to a management terminal;
the first preset time is less than the second preset time, and the second preset time is less than the third preset time.
Further, the API gateway load balancing method of the present invention further comprises the steps of: after the current-limiting operation is started, the API gateway reduces the queries per second at a preset rate, and the current-limiting operation is stopped once no more than a second preset proportion of the service instance APIs have response times exceeding the third preset time; the first preset proportion is greater than the second preset proportion.
Further, the API gateway load balancing method of the present invention further comprises the steps of: if a registration of a new service instance API is received, assigning the service instance API to the high-performance area;
and if an existing service instance API goes offline, deleting the service instance API from the high-performance area or the low-performance area.
In addition, the invention also provides an API gateway, which comprises a memory and a processor;
the memory has stored therein a computer program;
the processor performs the steps of the API gateway load balancing method as described above by calling the computer program stored in the memory.
Implementing the API gateway load balancing method and the API gateway of the invention has the following beneficial effect: by dividing service instance APIs into a high-performance area and a low-performance area, the invention reduces the impact of slow-responding service instance APIs on the whole, thereby improving the service experience of the normally responding service instance APIs.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
fig. 1 is a flowchart of an API gateway load balancing method according to an embodiment of the present invention.
Detailed Description
For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
In a preferred embodiment, referring to fig. 1, the API gateway load balancing method of this embodiment is applied to an API gateway under a micro-service architecture, a cloud server provides a plurality of service instances, each service instance corresponds to a service instance API, and the API gateway calls the service instance APIs according to a certain rule to access the service instances. Specifically, the API gateway load balancing method includes the following steps:
and S1, dividing the service instance API into a high-performance area and a low-performance area according to the response time of the service instance API, wherein the response time of the high-performance area is smaller than that of the low-performance area. Specifically, in the prior art, the API gateway does not partition the service instance API, but directly calls the service instance API. In the actual use process, the response time of some service instance APIs is long, and the whole is tired, so that the average response time of all service instance APIs in the API gateway is long, and the whole access experience of a user is influenced. In order to avoid that the service instance API with a long response time hinders the overall service instance API access experience, in this embodiment, the response times of all the service instance APIs in the API gateway are obtained, the service instance API is divided into a high-performance area and a low-performance area according to the response time of the service instance API, and the response time of the high-performance area is shorter than the response time of the low-performance area. The proportion or number of service instance APIs in the high-performance zone and the low-performance zone may be set as desired, for example, the proportion of service instance APIs in the high-performance zone and the low-performance zone is 70% and 30%, respectively. It will be appreciated that the response time of the service instance API is dynamically changing, so the service instance API in the high-performance region and the low-performance region is also dynamically changing, and the service instance API may be moved from the high-performance region to the low-performance region, and may be moved from the low-performance region to the high-performance region.
S2: the service instance APIs of the high-performance area are in a normal calling state, and the service instance APIs of the low-performance area are in a call limiting state, meaning they are called only after a preset time has elapsed. Specifically, once the two areas are established, they use different calling strategies. The normal calling state means that the service instance APIs of the high-performance area are called without restriction, for example in round-robin (polling) order. The call limiting state does not mean a service instance API is never called; rather, a service instance API of the low-performance area is called again only after the preset time. Once calls to the low-performance area are limited, the influence of slow service instance APIs on the overall response time is removed, so the service instance APIs of the high-performance area respond faster and the user experience improves. The preset time of the low-performance area may be set as desired, for example 5 minutes, 10 minutes, or 100 minutes.
By dividing service instance APIs into a high-performance area and a low-performance area and reducing the calling rate of the service instance APIs in the low-performance area, this embodiment reduces the impact of slow service instance APIs on the whole and thereby improves the service experience of the normally responding service instance APIs.
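As an illustration of step S1, the zoning rule can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the patented implementation: the function name `partition_zones` and the sample latencies are invented, while the 70/30 split is the example ratio from the text.

```python
def partition_zones(apis, high_ratio=0.7):
    """Split (name, response_time_ms) pairs into two areas: the fastest
    high_ratio fraction forms the high-performance area, the rest the
    low-performance area."""
    ranked = sorted(apis, key=lambda a: a[1])      # fastest first
    cut = max(1, int(len(ranked) * high_ratio))    # e.g. 70% high / 30% low
    return ranked[:cut], ranked[cut:]

# Five hypothetical service instance APIs with measured response times (ms)
apis = [("a", 120), ("b", 80), ("c", 900), ("d", 150), ("e", 2400)]
high, low = partition_zones(apis)
# high-performance area: b, a, d; low-performance area: c, e
```

Because the ranking is recomputed from fresh response-time samples, repeated calls to `partition_zones` naturally realize the dynamic membership described above.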
In some embodiments of the API gateway load balancing method, the response time of a service instance API in the high-performance area changes dynamically: a response time that was short may later lengthen, so membership in the two areas must be kept dynamic, i.e., a service instance API may be transferred from the high-performance area to the low-performance area. After a service instance API of the high-performance area is called, its response time is compared with the threshold time. If the response time is less than the threshold time, the service instance API is still responding quickly and remains in the high-performance area. Otherwise, it is responding slowly, is no longer suitable for the high-performance area, and is transferred to the low-performance area. Optionally, the threshold time is the average of the response times of all service instance APIs in the high-performance area; since each response time changes constantly, the threshold time is likewise dynamic. By monitoring the response-time changes of the high-performance area in real time and transferring service instance APIs that no longer meet the fast-response requirement to the low-performance area, this embodiment preserves the response speed of the high-performance area and improves the user experience.
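The demotion check just described can be sketched as follows. The dict-based zone bookkeeping, the function name `after_high_zone_call`, and the sample numbers are illustrative assumptions; the threshold is taken, as in the text, to be the mean response time of the high-performance area.

```python
from statistics import mean

def after_high_zone_call(high, low, name, observed_ms):
    """Record the latest response time of a high-performance-area API and
    demote it to the low-performance area when the time is not below the
    threshold (the mean response time of the high-performance area)."""
    high[name] = observed_ms
    threshold = mean(high.values())        # dynamic threshold
    if observed_ms >= threshold:
        low[name] = high.pop(name)         # transfer to low-performance area
    return threshold

high = {"a": 100, "b": 200, "c": 300}
low = {}
after_high_zone_call(high, low, "c", 900)  # c slowed: mean becomes 400, c is demoted
```

Note that the newest sample is folded into the mean before the comparison, so the threshold stays current even when only one API is being probed.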
In some embodiments of the API gateway load balancing method, the response time of a service instance API in the low-performance area also changes dynamically: a response time that was long may later shorten, so a service instance API may be transferred from the low-performance area back to the high-performance area. After a service instance API of the low-performance area is called, its response time is compared with the threshold time. If the response time is not less than the threshold time, the service instance API is still slow; migrating it to the high-performance area would still harm the overall response speed, so it is kept in the low-performance area, its timing is restarted, and its preset time is extended. Optionally, the preset time grows each time the timing is restarted, for example 5 minutes initially, 10 minutes after the first extension, 15 minutes after the second, 20 minutes after the third, and so on. If the response time is less than the threshold time, the service instance API is now fast enough, is no longer suitable for the low-performance area, and is transferred to the high-performance area. Optionally, the threshold time is the average of the response times of all service instance APIs in the high-performance area; since each response time changes constantly, the threshold time is likewise dynamic.
By monitoring the response-time changes of the low-performance area in real time and transferring service instance APIs that meet the fast-response requirement to the high-performance area, this embodiment lets recovered service instance APIs resume fast responses and improves the user experience.
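The promotion and back-off logic for the low-performance area can be sketched like this. The `entry` record, the function name, and the linear 5-minute step are assumptions chosen to match the 5/10/15/20-minute example above.

```python
def after_low_zone_call(entry, observed_ms, threshold_ms, step_min=5):
    """entry holds 'wait_min' (current preset time, minutes) and 'delays'
    (number of timer restarts). A fast call signals promotion to the
    high-performance area; a slow one restarts the timing and extends
    the preset time by step_min minutes (5 -> 10 -> 15 -> 20 ...)."""
    if observed_ms < threshold_ms:
        return "promote"                   # move to high-performance area
    entry["delays"] += 1                   # restart the timing
    entry["wait_min"] += step_min          # extend the preset time
    return "keep"

entry = {"wait_min": 5, "delays": 0}
after_low_zone_call(entry, 1500, 400)         # still slow: wait becomes 10 min
after_low_zone_call(entry, 1500, 400)         # still slow: wait becomes 15 min
state = after_low_zone_call(entry, 300, 400)  # fast again: promoted
```

The growing wait plays the same role as back-off in retry schemes: a persistently slow instance is probed less and less often, so it costs the gateway little.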
In some embodiments of the API gateway load balancing method, if the average response time of all service instance APIs in the high-performance area exceeds a first preset time, which serves as an alarm value, the API gateway adds a preset number of service instance APIs, up to the maximum number of service instance APIs the API gateway can bear. Specifically, if a surge in calling clients pushes the average response time of the high-performance area above the first preset time, for example 1000 ms, the API gateway triggers its elastic scaling mechanism and adds the preset number of service instance APIs, for example 2, to raise the throughput of the service cluster and lower the average response time. The API gateway then keeps monitoring the average response time; if after a period it still exceeds the first preset time, the API gateway adds another preset number of service instance APIs, repeating until the maximum number of service instance APIs it can bear is reached. By adding service instance APIs as the number of calling clients grows, this embodiment improves the throughput of the service cluster and the user experience.
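The elastic-scaling rule reads naturally as a small function. The 1000 ms first preset time and the step of 2 come from the embodiment; the cap of 10 instances and the name `scale_out` are assumed for illustration.

```python
def scale_out(avg_ms, instances, first_preset_ms=1000, step=2, max_instances=10):
    """Add `step` service instance APIs whenever the high-performance
    area's average response time exceeds the first preset time, but
    never beyond the gateway's maximum."""
    if avg_ms > first_preset_ms:
        return min(instances + step, max_instances)
    return instances

n = scale_out(1200, 4)   # overloaded: 4 -> 6 instances
n = scale_out(900, n)    # back under 1000 ms: stays at 6
```

Calling this on every monitoring tick reproduces the repeat-until-max behavior: each tick that still shows an average above the alarm value adds another step of instances.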
In some embodiments of the API gateway load balancing method, if the average response time of all service instance APIs in the high-performance area exceeds a second preset time and the response times of a first preset proportion of the service instance APIs exceed a third preset time, a current-limiting operation is performed and overload alarm information is issued and sent to a management terminal; the first preset time is less than the second preset time, and the second preset time is less than the third preset time. Specifically, if a surge in calling clients pushes the average response time of the high-performance area above the second preset time, for example 2000 ms, while the response times of the first preset proportion of service instance APIs, for example 10%, exceed the third preset time, for example 3000 ms, the current-limiting condition is met and the current-limiting operation must be performed. Overload alarm information is issued during the current-limiting operation and sent to the management terminal to prompt the administrator to take measures; it may be delivered by e-mail, SMS, instant messaging tools, and the like. Further, after the current-limiting operation starts, the API gateway reduces the queries per second at a preset rate, and the current-limiting operation stops once no more than a second preset proportion of the service instance APIs, for example 1%, have response times exceeding the third preset time; the first preset proportion is greater than the second preset proportion.
By automatically limiting the load when the number of clients surges, this embodiment prevents the system from collapsing under overload and improves its safety and stability.
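The start and stop conditions for current limiting can be sketched as two predicates over a window of recent response times. The 2000 ms, 3000 ms, 10%, and 1% figures are the example values above; the function names and the windowing are assumptions.

```python
def should_start_limiting(times_ms, second_preset_ms=2000,
                          third_preset_ms=3000, first_ratio=0.10):
    """Start current limiting when the average response time exceeds the
    second preset time AND at least first_ratio of the calls exceed the
    third preset time."""
    avg = sum(times_ms) / len(times_ms)
    slow = sum(1 for t in times_ms if t > third_preset_ms) / len(times_ms)
    return avg > second_preset_ms and slow >= first_ratio

def should_stop_limiting(times_ms, third_preset_ms=3000, second_ratio=0.01):
    """Stop current limiting once no more than second_ratio of the calls
    still exceed the third preset time."""
    slow = sum(1 for t in times_ms if t > third_preset_ms) / len(times_ms)
    return slow <= second_ratio

window = [2500] * 9 + [3500]   # average 2600 ms, 10% of calls above 3000 ms
```

Using a distinct, smaller stop ratio (1% vs. 10%) gives the controller hysteresis, so it does not flap between limiting and not limiting near the boundary.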
In some embodiments of the API gateway load balancing method, when a registration of a new service instance API is received, the service instance API is assigned to the high-performance area; after it is subsequently called, its area is re-selected according to its response time. Optionally, the service instance APIs in the high-performance area are sorted by response time from low to high, so that the front of the API queue has the shortest response time and the back the longest, and calls to the high-performance area proceed in order from the front of the queue. Further, if an existing service instance API in the API gateway goes offline, it is deleted from the high-performance area or the low-performance area: a service instance API in the high-performance area that receives an offline instruction is deleted from the high-performance area, and one in the low-performance area that receives an offline instruction is deleted from the low-performance area.
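Registration, offline removal, and the sorted high-performance queue can be sketched together. Storing unsampled newcomers with a sentinel key of -1 so they queue at the front is an assumption, since the text does not specify where a newly registered API sits in the queue before its first call.

```python
def register(high, name):
    """A newly registered service instance API starts in the
    high-performance area with no response-time sample yet."""
    high.setdefault(name, None)

def deregister(high, low, name):
    """An offline service instance API is removed from whichever
    area currently holds it."""
    high.pop(name, None)
    low.pop(name, None)

def call_order(high):
    """High-performance-area queue sorted by response time, fastest
    first; unsampled newcomers sort to the front (assumed placement)."""
    return sorted(high, key=lambda n: high[n] if high[n] is not None else -1.0)

high, low = {"a": 120, "b": 80}, {"c": 900}
register(high, "new")
deregister(high, low, "c")
order = call_order(high)   # ["new", "b", "a"]
```

Starting newcomers in the high-performance area is optimistic by design: a slow newcomer is demoted by the normal threshold check after its first calls, so no separate probation mechanism is needed.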
In a preferred embodiment, the API gateway of this embodiment includes a memory and a processor, where the memory stores a computer program, and the processor executes the steps of the API gateway load balancing method according to the above embodiment by calling the computer program stored in the memory. According to the embodiment, the service instance API is divided into the high-performance area and the low-performance area, so that the influence of the service instance API with longer response time on the whole is reduced, and the service experience of normally responding to the service instance API is improved.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and are intended to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the scope of the present invention. All equivalent changes and modifications made within the scope of the claims of the present invention should be covered by the claims of the present invention.

Claims (10)

1. An API gateway load balancing method is characterized by comprising the following steps:
dividing service instance APIs into a high-performance area and a low-performance area according to the response time of each service instance API, wherein the response times of the high-performance area are shorter than those of the low-performance area;
the service instance APIs of the high-performance area are in a normal calling state, and the service instance APIs of the low-performance area are in a call limiting state, the call limiting state meaning that the service instance API is called only after a preset time has elapsed.
2. The API gateway load balancing method of claim 1, further comprising the steps of:
and the service instance API of the high-performance area is called according to a polling method.
3. The API gateway load balancing method of claim 1, further comprising the steps of:
and after calling the service instance API of the high-performance area, judging whether the response time is less than threshold time, and if not, transferring the service instance API to the low-performance area.
4. The API gateway load balancing method of claim 1, further comprising the steps of: after a service instance API of the low-performance area is called, judging whether its response time is less than the threshold time;
if so, transferring the service instance API to the high-performance area;
if not, restarting the timing and extending the preset time.
5. The API gateway load balancing method of claim 3 or 4, wherein the threshold time is an average of response times of all the service instance APIs of the high performance zone.
6. The API gateway load balancing method of claim 1, further comprising the steps of:
and if the average value of the response time of all the service instance APIs in the high-performance area is greater than the first preset time, the API gateway increases the preset number of service instance APIs until the maximum value of the service instance APIs borne by the API gateway is reached.
7. The API gateway load balancing method of claim 6, further comprising the steps of:
if the average response time of all service instance APIs in the high-performance area exceeds a second preset time and the response times of a first preset proportion of the service instance APIs exceed a third preset time, performing a current-limiting operation, issuing overload alarm information, and sending the overload alarm information to a management terminal;
the first preset time is less than the second preset time, and the second preset time is less than the third preset time.
8. The API gateway load balancing method of claim 7, further comprising the steps of: after the current-limiting operation is started, the API gateway reduces the queries per second at a preset rate, and the current-limiting operation is stopped once no more than a second preset proportion of the service instance APIs have response times exceeding the third preset time; the first preset proportion is greater than the second preset proportion.
9. The API gateway load balancing method of claim 1, further comprising the steps of: if a registration of a new service instance API is received, assigning the service instance API to the high-performance area;
and if an existing service instance API goes offline, deleting the service instance API from the high-performance area or the low-performance area.
10. An API gateway comprising a memory and a processor;
the memory has stored therein a computer program;
the processor performs the steps of the API gateway load balancing method of any one of claims 1 to 9 by calling the computer program stored in the memory.
CN202111480192.7A 2021-12-06 2021-12-06 API gateway load balancing method and API gateway Pending CN114390089A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111480192.7A CN114390089A (en) 2021-12-06 2021-12-06 API gateway load balancing method and API gateway


Publications (1)

Publication Number Publication Date
CN114390089A true CN114390089A (en) 2022-04-22

Family

ID=81196752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111480192.7A Pending CN114390089A (en) 2021-12-06 2021-12-06 API gateway load balancing method and API gateway

Country Status (1)

Country Link
CN (1) CN114390089A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107888708A (en) * 2017-12-25 2018-04-06 山大地纬软件股份有限公司 A kind of load-balancing algorithm based on Docker container clusters
CN108712464A (en) * 2018-04-13 2018-10-26 中国科学院信息工程研究所 A kind of implementation method towards cluster micro services High Availabitity
CN111355814A (en) * 2020-04-21 2020-06-30 上海润欣科技股份有限公司 Load balancing method and device and storage medium


Similar Documents

Publication Publication Date Title
CN108306971B (en) Method and system for sending acquisition request of data resource
US20200136982A1 (en) Congestion avoidance in a network device
WO2023050901A1 (en) Load balancing method and apparatus, device, computer storage medium and program
CN110138756B (en) Current limiting method and system
CN109783227B (en) Task allocation method, device and system and computer readable storage medium
US10491535B2 (en) Adaptive data synchronization
WO2016011903A1 (en) Traffic control method and apparatus
CN112398945B (en) Service processing method and device based on backpressure
US20110138053A1 (en) Systems, Methods and Computer Readable Media for Reporting Availability Status of Resources Associated with a Network
EP3255849A1 (en) Multi-channel communications for sending push notifications to mobile devices
US9935861B2 (en) Method, system and apparatus for detecting instant message spam
JP7285699B2 (en) Program, method and terminal device
CN113285884B (en) Flow control method and system
CN108566344B (en) Message processing method and device
CN112383585A (en) Message processing system and method and electronic equipment
CN114448989B (en) Method, device, electronic equipment, storage medium and product for adjusting message distribution
CN111404839A (en) Message processing method and device
WO2024088079A1 (en) Request processing method and system
CN114390089A (en) API gateway load balancing method and API gateway
CN111930710A (en) Method for distributing big data content
US11902365B2 (en) Regulating enqueueing and dequeuing border gateway protocol (BGP) update messages
CN111490944A (en) Information processing method, device, equipment and machine-readable storage medium
CN108966160B (en) Short message processing method and device and computer readable storage medium
CN112380011A (en) Dynamic adjustment method and device for service capacity
CN108243112B (en) Chat group network flow control method and device, storage medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination