CN113268351A - Load balancing method and device for gateway service

Info

Publication number: CN113268351A
Application number: CN202110632651.2A
Authority: CN
Inventors: 赵宇
Original assignee: Beijing Kingsoft Cloud Network Technology Co Ltd
Current assignee: Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date: 2021-06-07
Filing date: 2021-06-07
Publication date: 2021-08-17

Abstract

The present disclosure relates to a method and an apparatus for implementing gateway services, and relates to the technical field of big data, wherein the method comprises: deploying a gateway service on container instances within a container cluster and on hosts within a cloud host cluster; and carrying out load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to respective balancing strategies of the two clusters. The embodiment of the application adopts a dual-cluster deployment mode comprising the container cluster and the cloud host cluster, so that the high availability and flexibility of the network are improved, the gateway service in the cluster can be dynamically balanced, the network data processing capacity is enhanced, and the throughput is increased.

Description

Load balancing method and device for gateway service

Technical Field

The present disclosure relates to the field of big data technologies, and in particular, to a load balancing method and apparatus for gateway services.

Background

In the related art, a gateway service running on a single machine has an upper limit on the request volume it can handle, and this limit becomes a bottleneck when customer service traffic suddenly increases. To raise system throughput, an additional service instance is typically deployed by hand and then manually added to the load balancer, which expands the gateway service horizontally. Manual operation, however, lags behind the change in traffic, so the stability of the service gateway is poor and its capacity for dynamic horizontal expansion is weak.

Disclosure of Invention

The embodiment of the application adopts a dual-cluster deployment mode comprising a container cluster and a cloud host cluster, so that the high availability and flexibility of a network are improved, the gateway services in the cluster can be dynamically balanced, the network data processing capacity is enhanced, and the throughput is increased. The technical scheme of the disclosure is as follows:

According to a first aspect of the embodiments of the present disclosure, a load balancing method for a gateway service is provided, including: deploying a gateway service on container instances within a container cluster and on hosts within a cloud host cluster; and performing load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to the respective balancing strategies of the two clusters.

According to an embodiment of the present disclosure, the load balancing method for the gateway service further includes: monitoring the state of the container instances in the container cluster, and performing load balancing on the gateway service deployed in the container cluster based on the monitored state information of the container instances.

According to an embodiment of the present disclosure, the load balancing method for the gateway service further includes: performing fault detection on the container cluster; and, in response to a failure of the container cluster, sending the service traffic that was to be sent to the container cluster to the cloud host cluster, where a gateway service in the cloud host cluster sends the service traffic to a server.

According to an embodiment of the present disclosure, the load balancing method for the gateway service further includes: acquiring state information of each host in the cloud host cluster; determining a target host from the hosts in the cloud host cluster based on the state information of the hosts; and sending the service traffic to the target host, where the gateway service on the target host sends the service traffic to a server.

According to an embodiment of the present disclosure, performing load balancing on the gateway service deployed in the container cluster based on the monitored state information of the container instances includes: determining, according to the state information, a target balancing mode of the container cluster and a target number to which the container instances need to be balanced, wherein the balancing modes of the container cluster comprise container instance capacity expansion and container instance capacity reduction, and the target balancing mode is one of the two; and adjusting the number of container instances within the container cluster to the target number in accordance with the target balancing mode.

According to an embodiment of the present disclosure, monitoring the state of the container instances in the container cluster includes: calling a container instance monitoring component deployed in the container cluster, where the container instance monitoring component monitors the state of the container instances in the container cluster so as to acquire the state information of the container instances.

According to an embodiment of the present disclosure, monitoring the state of the container instances in the container cluster further includes: in response to the data format of the container instance monitoring component being incompatible with the data format of the container cluster, performing data protocol conversion on the state information of the container instances sent by the container instance monitoring component, and generating target state information that matches the data format of the container cluster.

According to an embodiment of the present disclosure, determining the target balancing mode of the container cluster and the target number to which the container instances need to be balanced according to the state information includes: acquiring an average load index of the container cluster based on the state information of each container instance; in response to the average load index being greater than a preset index threshold, determining that the target balancing mode is container instance capacity expansion; or, in response to the average load index being less than or equal to the preset index threshold, determining that the target balancing mode is container instance capacity reduction; and determining the target number based on the difference between the average load index and the preset index threshold.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

the embodiment of the application adopts a dual-cluster deployment mode comprising the container cluster and the cloud host cluster, so that the high availability and flexibility of the network are improved, the gateway service in the cluster can be dynamically balanced, the network data processing capacity is enhanced, and the throughput is increased.

According to a second aspect of the embodiments of the present disclosure, there is provided a load balancing apparatus for a gateway service, including: the gateway deployment module is used for deploying gateway services on container instances in the container cluster and hosts in the cloud host cluster; and the load balancing module is used for carrying out load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to respective balancing strategies of the two clusters.

According to an embodiment of the present disclosure, the load balancing apparatus of the gateway service further includes: a monitoring module, used for monitoring the state of the container instances in the container cluster and performing load balancing on the gateway service deployed in the container cluster based on the monitored state information of the container instances.

According to an embodiment of the present disclosure, the load balancing apparatus of the gateway service further includes: a fault detection module, used for performing fault detection on the container cluster; and a service sending module, used for, in response to a failure of the container cluster, sending the service traffic that was to be sent to the container cluster to the cloud host cluster, where a gateway service in the cloud host cluster sends the service traffic to a server.

According to an embodiment of the present disclosure, the service sending module further includes: a state information acquisition unit, used for acquiring the state information of each host in the cloud host cluster; a target host determination unit, used for determining a target host from the hosts in the cloud host cluster based on the state information of the hosts; and a service traffic sending unit, used for sending the service traffic to the target host, where the gateway service on the target host sends the service traffic to a server.

According to an embodiment of the present disclosure, the monitoring module is further configured to: determine, according to the state information, a target balancing mode of the container cluster and a target number to which the container instances need to be balanced, wherein the balancing modes of the container cluster comprise container instance capacity expansion and container instance capacity reduction, and the target balancing mode is one of the two; and adjust the number of container instances within the container cluster to the target number in accordance with the target balancing mode.

According to an embodiment of the present disclosure, the monitoring module is further configured to: call a container instance monitoring component deployed in the container cluster, where the container instance monitoring component monitors the state of the container instances in the container cluster so as to acquire the state information of the container instances.

According to an embodiment of the present disclosure, the load balancing apparatus of the gateway service is further configured to: in response to the data format of the container instance monitoring component being incompatible with the data format of the container cluster, perform data protocol conversion on the state information of the container instances sent by the container instance monitoring component, and generate target state information that matches the data format of the container cluster.

According to an embodiment of the present disclosure, the monitoring module is further configured to: acquire an average load index of the container cluster based on the state information of each container instance; in response to the average load index being greater than a preset index threshold, determine that the target balancing mode is container instance capacity expansion; or, in response to the average load index being less than or equal to the preset index threshold, determine that the target balancing mode is container instance capacity reduction; and determine the target number based on the difference between the average load index and the preset index threshold.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the load balancing method for gateway services provided by the embodiment of the first aspect of the disclosure.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, where instructions of the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform a load balancing method for a gateway service as provided in the first aspect of the present disclosure.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program, wherein the computer program is configured to implement, when executed by a processor, the method for load balancing of gateway services as provided in the first aspect of the present disclosure.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a schematic diagram illustrating a load balancing method for a gateway service according to an example embodiment.

Fig. 2 is a schematic diagram illustrating another method of load balancing for gateway services according to an example embodiment.

Fig. 3 is a schematic diagram illustrating another method of load balancing for gateway services according to an example embodiment.

Fig. 4 is a schematic diagram illustrating another method of load balancing for gateway services according to an example embodiment.

Fig. 5 is a schematic diagram illustrating another method of load balancing for gateway services according to an example embodiment.

Fig. 6 is a schematic diagram illustrating another method of load balancing for gateway services according to an example embodiment.

Fig. 7 is a schematic diagram illustrating another method of load balancing for gateway services in accordance with an example embodiment.

Fig. 8 is an overall schematic diagram illustrating a load balancing method of a gateway service according to an example embodiment.

Fig. 9 is a schematic diagram illustrating a load balancing apparatus for gateway services according to an example embodiment.

Fig. 10 is a schematic diagram illustrating another load balancing apparatus for gateway services according to an example embodiment.

Fig. 11 is a schematic diagram illustrating another load balancing apparatus for gateway services according to an example embodiment.

Fig. 12 is a schematic diagram illustrating an electronic device according to an example embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Load balancing (Load Balance) means that a load is distributed evenly across a plurality of operation units, such as File Transfer Protocol (FTP) servers, World Wide Web (Web) servers, enterprise core application servers, and other main task servers, so that they cooperatively complete a work task.

Kubernetes, K8s for short, is an open-source system for managing containerized applications across multiple hosts in a cloud platform. The goal of K8s is to make deploying containerized applications simple and efficient, and K8s provides mechanisms for application deployment, planning, updating, and maintenance.

Fig. 1 is a flowchart illustrating a method for load balancing of a gateway service according to an exemplary embodiment, where as shown in fig. 1, the method for load balancing of a gateway service according to this embodiment includes the following steps:

S101, gateway services are deployed on container instances in the container cluster and on hosts in the cloud host cluster.

In order to enhance network data processing capacity, increase throughput, and improve the availability and flexibility of the network, the load balancing provided in the embodiment of the present application includes two deployment manners, as shown in Fig. 2. One manner is to deploy a gateway service on each host, with a plurality of hosts forming a cloud host cluster; the other is to deploy the gateway service on a container instance (Pod) of the container cluster. A Pod is the smallest basic unit created or deployed by Kubernetes; a Pod represents a process running on the cluster and may be regarded as a single encapsulated container.

Optionally, the container cluster may be a K8s cluster, where the K8s cluster includes a horizontal automatic scaling controller, the Horizontal Pod Autoscaler (HPA), and the Pods are balanced by the HPA. That is, by detecting various indicators (CPU occupancy, memory occupancy, and network request volume) of the containers running in the Pods, the number of Pod instances is dynamically increased or decreased, so that the HPA mechanism realizes horizontal expansion of the gateway service deployed in the containers.

It should be noted that both the container cluster and the cloud host cluster can be deployed across machine rooms and regions to achieve multi-site high availability. That is, the container cluster and the cloud host cluster are deployed in different machine rooms and linked through a dedicated network line, so that when one machine room fails, the other machine room is still available.

Optionally, the gateway services may include services that provide proxying and forwarding of network traffic.

It should be noted that, in the embodiment of the present application, load balancing refers to a capability provided by the container cluster or the cloud host cluster that senses whether a back-end service is alive. If the back-end service is sensed to be available, it is kept in a cache and service traffic is forwarded to it; if the back-end service is sensed to be unavailable, it is removed from the cache and service traffic is no longer forwarded to it.
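The liveness-aware cache described above can be sketched as follows. This is only a minimal illustration, not the disclosed implementation: the `BackendCache` class, the `probe` callback, and the round-robin choice among live back ends are all assumptions introduced here.

```python
from typing import Callable, List, Set


class BackendCache:
    """Keeps only the back ends that passed their most recent liveness probe."""

    def __init__(self, probe: Callable[[str], bool]) -> None:
        self._probe = probe            # hypothetical health check, e.g. an HTTP ping
        self._alive: Set[str] = set()  # the "cache" of available back ends
        self._next = 0

    def refresh(self, backends: List[str]) -> None:
        """Probe each back end: keep live ones cached, evict dead ones."""
        for backend in backends:
            if self._probe(backend):
                self._alive.add(backend)
            else:
                self._alive.discard(backend)

    def pick(self) -> str:
        """Return the next live back end (round-robin is an assumption)."""
        live = sorted(self._alive)
        if not live:
            raise RuntimeError("no live back end available")
        backend = live[self._next % len(live)]
        self._next += 1
        return backend
```

Once a back end fails its probe it disappears from the cache, so `pick` can no longer return it until a later `refresh` sees it alive again.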

S102, load balancing is carried out on the gateway services deployed in the container cluster and the cloud host cluster according to respective balancing strategies of the two clusters.

In the embodiment of the application, the balancing strategies of the gateway services deployed in the container cluster and in the cloud host cluster are determined separately, and load balancing is performed according to the respective balancing strategies of the two clusters. In some implementations, in order to keep track of the state information within each cluster and process service traffic in time, the container cluster and the cloud host cluster may be monitored, and load balancing of the gateway services deployed in the two clusters is performed based on the monitored state information and the respective balancing strategies of the two clusters.

In the container cluster, the state of the pods can be monitored to obtain the state information of the pods, and load balancing of the gateway service is performed on the container cluster according to the balancing strategy of the container cluster based on that state information. In some implementations, one of pod expansion and pod contraction may be performed based on the state of the pods. In the cloud host cluster, the state of each host can be monitored to obtain the state information of the hosts, and load balancing of the gateway service is performed on the cloud host cluster according to the balancing strategy of the cloud host cluster based on that state information. In some implementations, one of host expansion and host contraction may be performed based on the state of the hosts.

Optionally, common load balancing methods include reverse proxy load balancing, Domain Name System (DNS) load balancing, and Content Delivery Network (CDN) load balancing.

In the embodiment of the application, gateway services are deployed on container instances in a container cluster and hosts in a cloud host cluster; and carrying out load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to respective balancing strategies of the two clusters. The embodiment of the application adopts a dual-cluster deployment mode comprising the container cluster and the cloud host cluster, so that the high availability and flexibility of the network are improved, the gateway service in the cluster can be dynamically balanced, the network data processing capacity is enhanced, and the throughput is increased.

Fig. 3 is a flowchart illustrating a method for load balancing gateway services according to an exemplary embodiment, where as shown in fig. 3, on the basis of the foregoing embodiment, the present embodiment proposes a process for load balancing gateway services deployed in a container cluster, and includes the following steps:

S301, performing state monitoring on container instances in the container cluster, and performing load balancing on gateway services deployed in the container cluster based on the monitored state information of the container instances.

In order to keep track of the state information within the cluster and process service traffic in time, the container cluster and the cloud host cluster can be monitored. Optionally, a monitoring component may be deployed in the container cluster, and the state of the pods is monitored by the monitoring component so as to obtain the state information of the pods. Optionally, the monitoring may be periodic or real-time.

Optionally, the monitored state information of a pod may include the pod's average number of requests per second over one minute, its average CPU utilization over one minute, its memory utilization over one minute, and the like. When the monitored state information shows that the pods cannot meet the current demand for sending service traffic, dynamic load balancing is performed on the gateway service to dynamically increase or decrease the number of pod instances. When the current demand for sending service traffic exceeds the carrying capacity of the pods, the number of pods is increased; when the demand is lower than the carrying capacity of the pods, the number of pods is reduced.

Taking the average number of requests per second over one minute as an example, if a pod handles an average of 2000 requests per second over one minute and the current demand for sending service traffic is 3000 requests per second, the demand exceeds the carrying capacity of the pod, and the number of pods needs to be increased.

It should be noted that the determination of whether the current demand for sending service traffic exceeds the carrying capacity of the pods may be based on one, two, or all three kinds of the state information, so as to realize horizontal expansion of the gateway service deployed in the containers.
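The capacity check in the example above can be sketched as follows. The function names, the dictionary layout, and the rule that any single indicator over its limit counts as "exceeded" are illustrative assumptions; the disclosure does not fix how multiple indicators are combined.

```python
import math
from typing import Dict


def exceeds_capacity(observed: Dict[str, float], limits: Dict[str, float]) -> bool:
    """True if any indicator that has a configured limit is above that limit.

    `limits` may hold one, two, or three indicators, mirroring the text above.
    """
    return any(observed[name] > limit
               for name, limit in limits.items() if name in observed)


def pods_needed(demand_rps: float, per_pod_rps: float) -> int:
    """Smallest pod count whose combined request capacity covers the demand."""
    return max(1, math.ceil(demand_rps / per_pod_rps))


# Worked example from the text: one pod averages 2000 requests/second,
# while the current demand is 3000 requests/second.
demand = {"requests_per_second": 3000.0}
capacity = {"requests_per_second": 2000.0}
needs_more = exceeds_capacity(demand, capacity)
target = pods_needed(3000.0, 2000.0)
```

Here `needs_more` is true and `target` is 2, since a single pod covers only 2000 of the 3000 requests per second.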

Taking the container cluster as a K8s cluster as an example, as a possible implementation manner, the HPA has a monitoring function, and the K8s cluster can be monitored without other auxiliary components.

As another possible implementation, a monitoring component (Prometheus) supporting dynamic elastic scaling may be deployed in the K8s cluster, and the state information of the pods in the container cluster is collected through the Prometheus component. Prometheus is an open-source, complete monitoring solution with functions such as centralized rule calculation, unified analysis, and alerting. The Prometheus component may collect information such as load monitoring data of the current gateway service pods and provide it to the HPA.

The HPA performs dynamic load balancing on the gateway service based on the monitored state information of the pod and the set balancing strategy so as to realize the horizontal expansion of the gateway service in the container cluster through the HPA mechanism.

As one implementation, the method for monitoring the state of the container instances in the container cluster may also be applied to monitoring the state of the hosts in the cloud host cluster, so as to perform load balancing on the gateway service deployed in the cloud host cluster.

According to the embodiment of the application, the monitoring component is added to monitor the state of the container instance in the container cluster, so that the dynamic load balance of gateway services is realized, the network data processing capacity is enhanced, the throughput is increased, and the high availability and flexibility of the network are improved.

Fig. 4 is a flowchart illustrating a method for load balancing of a gateway service according to an exemplary embodiment, where as shown in fig. 4, the method for load balancing of a gateway service according to this embodiment further includes the following steps:

S401, fault detection is carried out on the container cluster.

In the embodiment of the present application, the container cluster or the cloud host cluster may sometimes be in an unavailable state, and a cluster that is unavailable is considered to have failed. To guard against such failures, fault detection needs to be performed on the container cluster or the cloud host cluster, so that when one cluster fails, service can be maintained by the cluster that has not failed. In implementation, one of the container cluster and the cloud host cluster may serve as a master cluster and the other as a slave cluster; when the master cluster fails, traffic may be switched to the slave cluster to maintain the continuity of the service.

In the following, the emergency handling process in a failure scenario is explained with the container cluster as the master cluster and the cloud host cluster as the slave cluster. Optionally, a failure of the container cluster or the cloud host cluster may include a network outage, machine downtime, and the like. Taking the K8s cluster as an example, whether the K8s cluster has failed may optionally be determined through Prometheus monitoring. For example, when Prometheus observes that the K8s cluster gives no feedback for a new service request, that an existing service request exceeds a preset feedback time, or that the feedback for a request indicates failure, it may be determined that the K8s cluster has failed.
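The three failure criteria above can be sketched as a single predicate. The `ProbeResult` record and its field names are hypothetical, introduced only for illustration; how the monitoring component actually gathers these observations is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one monitored service request (hypothetical record)."""
    responded: bool                    # did the cluster give any feedback?
    latency_s: Optional[float] = None  # time until feedback, if there was any
    ok: Optional[bool] = None          # whether the feedback reported success


def cluster_failed(result: ProbeResult, timeout_s: float) -> bool:
    """Apply the three criteria: no feedback, feedback too late, or failure feedback."""
    if not result.responded:
        return True
    if result.latency_s is not None and result.latency_s > timeout_s:
        return True
    return result.ok is False
```

In practice a single bad probe is a noisy signal, so a deployment would likely require several consecutive failures before switching clusters; that refinement is left out of this sketch.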

S402, in response to a failure of the container cluster, the service traffic that was to be sent to the container cluster is sent to the cloud host cluster, and the gateway service in the cloud host cluster sends the service traffic to the server.

In one possible implementation of the embodiment of the application, the container cluster is used as the master cluster and the cloud host cluster as the slave cluster. The container cluster processes the service traffic preferentially, and when the container cluster fails, processing is switched to the cloud host cluster.

In another possible implementation, the cloud host cluster is used as the master cluster and the container cluster as the slave cluster. The cloud host cluster processes the service traffic preferentially, and when the cloud host cluster fails, processing is switched to the container cluster.

Take the case in which the container cluster is the master cluster and the cloud host cluster is the slave cluster. Because both the container cluster and the cloud host cluster can perform load balancing, when Prometheus detects that the container cluster has failed while the cloud host cluster is working normally, the service traffic that was to be sent to the container cluster can be sent to the cloud host cluster instead, and the gateway service in the cloud host cluster sends the service traffic to the server, so that the gateway service remains available.
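The master/slave switch described above can be sketched as follows. The `DualClusterRouter` class, the `Gateway` callable type, and the `master_failed` callback are assumptions made for illustration; either cluster can be passed in as the master.

```python
from typing import Callable

# Hypothetical type: a gateway forwards a request to the server and returns the reply.
Gateway = Callable[[str], str]


class DualClusterRouter:
    """Sends traffic to the master cluster and falls back to the slave on failure."""

    def __init__(self, master: Gateway, slave: Gateway,
                 master_failed: Callable[[], bool]) -> None:
        self._master = master
        self._slave = slave
        self._master_failed = master_failed  # e.g. backed by the monitoring component

    def send(self, request: str) -> str:
        """Route one request according to the current health of the master cluster."""
        if self._master_failed():
            return self._slave(request)
        return self._master(request)
```

A usage sketch, with the container cluster as master:

```python
state = {"failed": False}
router = DualClusterRouter(
    master=lambda req: "container:" + req,
    slave=lambda req: "cloud-host:" + req,
    master_failed=lambda: state["failed"],
)
first = router.send("req-1")   # handled by the container cluster
state["failed"] = True
second = router.send("req-2")  # handled by the cloud host cluster
```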

It should be noted that the K8s cluster has its own defined format for data reporting and transmission, and Prometheus also has its own defined format for data reporting and transmission. The data formats of K8s and Prometheus are incompatible, so data cannot be reported and transmitted between them directly, and data protocol conversion between the two formats is required. In the embodiment of the application, an adapter for K8s is added so that custom monitoring data can be handled; the main function of the adapter is to convert the data reporting and transmission format of Prometheus into the data reporting and transmission format of K8s.
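The adapter's format conversion can be sketched as below. The input shape is loosely modeled on a Prometheus instant-query result and the output on an item of a Kubernetes custom-metrics list, but both are simplified assumptions for illustration and should be checked against the respective API references; the `namespace` and `pod` label names are likewise assumed.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def to_k8s_metric_items(prom_result: List[Dict[str, Any]],
                        metric_name: str) -> List[Dict[str, Any]]:
    """Convert Prometheus-style samples into K8s-style per-pod metric items."""
    items = []
    for sample in prom_result:
        labels = sample["metric"]            # assumed to carry namespace/pod labels
        timestamp, value = sample["value"]   # [unix seconds, value as string]
        items.append({
            "describedObject": {
                "kind": "Pod",
                "namespace": labels["namespace"],
                "name": labels["pod"],
            },
            "metricName": metric_name,
            "timestamp": datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat(),
            "value": value,
        })
    return items
```

Each input sample yields one output item, so a scaling controller reading the converted list sees one metric value per pod.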

In this embodiment, a dual cluster deployment manner is adopted, and when one of the clusters fails, the service flow can be transferred to the other cluster, so that the availability of the gateway service can be ensured, and the high availability and flexibility of the network are improved.

Fig. 5 is a flowchart illustrating a method for load balancing of gateway services according to an exemplary embodiment, and as shown in fig. 5, based on the above embodiment, the method further includes the following steps:

S501, state information of each host in the cloud host cluster is obtained.

When the gateway service in the cloud host cluster sends the service traffic to the server, the state information of each host in the cloud host cluster needs to be acquired. Optionally, the state information of the hosts may include the CPU occupancy, memory occupancy, or network request volume of each host, and the like.

S502, determining a target host from the hosts in the cloud host cluster based on the state information of the hosts.

A suitable host is determined as the target host according to the state information of the hosts in the cloud host cluster and a preset rule. Optionally, one host may be randomly selected from the cloud host cluster as the target host. Optionally, the target host may be determined based on one kind of the hosts' state information, on any two kinds, or on all three kinds. Optionally, one host may be selected as the target host based on the deployment locations of the hosts. For example, if the preset rule is to select a host with low CPU occupancy, the host with the lowest CPU occupancy in the cloud host cluster at that moment may be selected as the target host. As another example, if the cloud host cluster includes 8 hosts, and it is detected that 7 hosts are in an operating state and 1 host is in an idle state, the host in the idle state may be used as the target host.
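Selecting the target host by one or more indicators can be sketched as follows. The `HostState` record, its field names, and the "lowest value wins, later indicators break ties" rule are assumptions for illustration; the disclosure leaves the preset rule open.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HostState:
    """State information of one cloud host (hypothetical record)."""
    name: str
    cpu: float       # CPU occupancy, percent
    memory: float    # memory occupancy, percent
    requests: int    # network request volume


def pick_target_host(hosts: List[HostState],
                     indicators: Sequence[str] = ("cpu",)) -> HostState:
    """Pick the host with the lowest values; later indicators break ties."""
    if not hosts:
        raise ValueError("cloud host cluster has no hosts")
    return min(hosts, key=lambda h: tuple(getattr(h, name) for name in indicators))


hosts = [
    HostState("host-1", cpu=70, memory=40, requests=900),
    HostState("host-2", cpu=10, memory=80, requests=100),
    HostState("host-3", cpu=10, memory=30, requests=500),
]
by_cpu = pick_target_host(hosts)                                 # CPU only
by_cpu_then_memory = pick_target_host(hosts, ("cpu", "memory"))  # two indicators
```

With CPU alone, `host-2` and `host-3` tie and the first one listed is chosen; adding memory as a second indicator selects `host-3`.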

S503, the service traffic is sent to the target host, and the gateway service on the target host sends the service traffic to the server.

The service traffic that would otherwise be sent to the container cluster is sent to the determined target host, and the gateway service on the target host sends the service traffic to the server.

By determining a target host from the hosts in the cloud host cluster, the gateway service on the target host can send the service traffic to the server when the container cluster fails, so that load balancing of the gateway service remains available.
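The failover path of steps S501 to S503 might look roughly like the sketch below. The function name `dispatch`, the string labels it returns, and the use of CPU occupancy alone to pick the target host are illustrative assumptions; the disclosure does not prescribe a concrete interface.

```python
from typing import Mapping


def dispatch(request: str,
             container_cluster_failed: bool,
             host_cpu_usage: Mapping[str, float]) -> str:
    """Return a label naming the gateway that should forward the request.

    While the container cluster is healthy it keeps handling traffic.
    On failure, the traffic is redirected to the cloud host with the
    lowest CPU occupancy (assumed selection rule).
    """
    if not container_cluster_failed:
        return f"container-cluster handles {request}"
    if not host_cpu_usage:
        raise RuntimeError("no cloud host available for failover")
    target = min(host_cpu_usage, key=host_cpu_usage.get)
    return f"{target} handles {request}"


print(dispatch("req-1", False, {"host-a": 0.7, "host-b": 0.2}))
print(dispatch("req-1", True, {"host-a": 0.7, "host-b": 0.2}))
```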

Fig. 6 is a flowchart illustrating a method for load balancing of gateway services according to an exemplary embodiment. As shown in Fig. 6, load balancing the gateway services deployed in the container cluster based on the monitored state information of the container instances includes the following steps:

S601, determining, according to the state information, a target balancing mode of the container cluster and a target number to which the container instances need to be balanced, wherein the balancing modes of the container cluster comprise container instance capacity expansion and container instance capacity reduction, and the target balancing mode is one of the two.

Whether the pods need to be expanded or reduced can be determined from the state information of the pods in the container cluster monitored by Prometheus; pod capacity expansion and pod capacity reduction are the balancing modes of the container cluster. The balancing mode to which the container cluster needs to be adjusted is determined from the same state information and taken as the target balancing mode, and the number of pods to which the cluster is to be expanded or reduced is determined and taken as the target number. The target balancing mode is either pod capacity expansion or pod capacity reduction; the two cannot occur at the same time.

It should be noted that, in the conventional method, if a container is configured with 1 core and 4 GB of memory and its service pressure is high, changing the container configuration from 1 core and 4 GB to 4 cores and 16 GB requires the container to be destroyed and created again. In the embodiment of the present application, by contrast, pod capacity expansion and pod capacity reduction are horizontal. Taking pod capacity expansion as an example, 3 additional pods, each with 1 core and 4 GB of memory, may be created according to the load balancing, so that the total capacity carried by all pods after expansion is 4 times the original, thereby implementing pod capacity expansion.
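The arithmetic behind the horizontal expansion example can be checked with a few lines. The helper `total_capacity` is a hypothetical name introduced only for this illustration; the 1-core, 4 GB pod size comes from the example above.

```python
def total_capacity(pod_count: int,
                   cores_per_pod: int = 1,
                   memory_gb_per_pod: int = 4) -> tuple:
    """Total (cores, memory in GB) provided by identically sized pods."""
    return pod_count * cores_per_pod, pod_count * memory_gb_per_pod


before = total_capacity(1)       # the single original pod
after = total_capacity(1 + 3)    # original pod plus 3 newly created pods

print(before)  # (1, 4)
print(after)   # (4, 16)
```

The expanded cluster reaches the same 4-core, 16 GB total as the vertically resized container, without destroying the original pod.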

S602, according to the target balancing mode, the number of container instances in the container cluster is adjusted to the target number.

The pods in the container cluster are expanded or reduced as indicated by the determined target balancing mode, and the number of pods in the container cluster is adjusted to the target number. For example, if the container cluster currently contains 7 pods, the determined target balancing mode is pod capacity expansion, and the target number is 10, pod capacity expansion is performed on the container cluster and the number of pods is adjusted to 10, that is, 3 pods are added to the original 7.
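Step S602 can be illustrated with a small helper that turns the target balancing mode and target number into a signed pod delta. The names `EXPAND`, `SHRINK`, and `pods_to_change`, and the consistency checks, are assumptions for this sketch rather than part of the disclosure.

```python
EXPAND = "expand"   # container instance capacity expansion
SHRINK = "shrink"   # container instance capacity reduction


def pods_to_change(current: int, target: int, mode: str) -> int:
    """Return how many pods to add (positive) or remove (negative).

    Raises ValueError when the mode contradicts the direction of change,
    since expansion and reduction cannot occur at the same time.
    """
    if current < 0 or target < 0:
        raise ValueError("pod counts must be non-negative")
    if mode not in (EXPAND, SHRINK):
        raise ValueError(f"unknown balancing mode: {mode}")
    delta = target - current
    if mode == EXPAND and delta < 0:
        raise ValueError("expansion cannot lower the pod count")
    if mode == SHRINK and delta > 0:
        raise ValueError("reduction cannot raise the pod count")
    return delta


print(pods_to_change(current=7, target=10, mode=EXPAND))  # 3
```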

Determining the target balancing mode of the container cluster and the target number to which the pods need to be balanced according to the state information of the pods realizes dynamic load balancing of the gateway service, enhances network data processing capacity, increases throughput, and improves the high availability and flexibility of the network.

Fig. 7 is a flowchart illustrating a load balancing method for a gateway service according to an exemplary embodiment, where as shown in fig. 7, determining a target balancing mode of a container cluster and a target number of container instances to be balanced according to state information includes the following steps:

S701, acquiring an average load index of the container cluster based on the state information of each container instance.

The state information of each pod may include the per-minute average number of requests per second, the per-minute average CPU utilization, the per-minute memory utilization, and the like, and these items of state information are used as load indexes of the container cluster. It should be noted that, in this embodiment of the present application, whether the container cluster needs to be expanded or reduced depends not on the load of a single pod but on the average load of all pods in the container cluster; the average load index of the container cluster can be obtained from the state information of each pod. Taking the per-minute average number of requests per second as an example, if the container cluster contains 4 pods whose values are 200/s, 300/s, 150/s, and 350/s respectively, the average load index of the 4 pods is (200 + 300 + 150 + 350) / 4 = 250/s.
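The averaging in step S701 is a plain arithmetic mean over the per-pod values, as the following sketch shows. The function name `average_load_index` is hypothetical; the input values are those of the example above.

```python
from statistics import fmean
from typing import Iterable


def average_load_index(per_pod_values: Iterable[float]) -> float:
    """Arithmetic mean of one load index across all pods in the cluster."""
    values = list(per_pod_values)
    if not values:
        raise ValueError("the container cluster has no pods to average over")
    return fmean(values)


# Per-minute average requests per second of the 4 pods in the example.
print(average_load_index([200, 300, 150, 350]))  # 250.0
```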

S702, in response to the average load index being greater than a preset index threshold, determining that the target balancing mode is container instance capacity expansion; or, in response to the average load index being less than or equal to the preset index threshold, determining that the target balancing mode is container instance capacity reduction.

A preset index threshold for the load index of the container cluster is configured in advance in the HPA. The average load index of the container cluster is compared with the preset index threshold: if the average load index is greater than the preset index threshold, the target balancing mode of the container cluster is determined to be pod capacity expansion; if the average load index is less than or equal to the preset index threshold, the target balancing mode is determined to be pod capacity reduction. Taking the per-minute average number of requests per second as the index, if the preset index threshold of the container cluster is 2000/s and the average load index at a certain time is 2500/s, the average load index is greater than the preset index threshold, and the target balancing mode of the container cluster is determined to be pod capacity expansion.
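The comparison in step S702 reduces to a single threshold test. The sketch below follows the rule exactly as stated in the text; the function name and the string labels are assumptions made for illustration.

```python
def target_balancing_mode(average_load_index: float, threshold: float) -> str:
    """Return "expand" when the average load index exceeds the threshold,
    otherwise "shrink" (this includes the case where the two are equal)."""
    return "expand" if average_load_index > threshold else "shrink"


print(target_balancing_mode(2500, 2000))  # expand
print(target_balancing_mode(2000, 2000))  # shrink
```

Under the rule as written, an index exactly at the threshold selects capacity reduction; whether the pod count then actually changes depends on the target number determined in the next step.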

S703, determining the target number based on the difference between the average load index and the preset index threshold value.

The target number of pods in the container cluster is determined according to the determined target balancing mode and based on the difference between the average load index of the container cluster and the preset index threshold. If the difference is large, the number of pods to be added or removed is large; if the difference is small, the number of pods to be added or removed is small. Taking the per-minute average number of requests per second as the index, suppose the container cluster contains 4 pods, the preset index threshold is 2000/s, and the average load index at a certain time is 2500/s. The average load index is greater than the preset index threshold, so the target balancing mode of the container cluster is determined to be pod capacity expansion. To make the average load index of the container cluster no greater than the preset index threshold, the target number of pods may be determined to be 5: assuming the total load of 4 × 2500/s = 10000/s is spread evenly over the pods, 5 pods give an average of 2000/s.
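One way to turn the gap between the average load index and the threshold into a target number is the proportional rule sketched below. This is an assumption: the disclosure only states that the target number is based on the difference, and the formula here is the ceiling-of-ratio rule commonly used by horizontal autoscalers. The function name `target_pod_count` and the assumption that load redistributes evenly across pods are likewise illustrative.

```python
import math


def target_pod_count(current_pods: int,
                     average_load_index: float,
                     threshold: float) -> int:
    """Smallest pod count whose per-pod average load does not exceed
    the threshold, assuming the total load is spread evenly."""
    if current_pods < 1:
        raise ValueError("current_pods must be at least 1")
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    total_load = current_pods * average_load_index
    return max(1, math.ceil(total_load / threshold))


# Example from the text: 4 pods at 2500/s against a 2000/s threshold.
print(target_pod_count(4, 2500, 2000))  # 5
```

The same rule also covers capacity reduction: 4 pods averaging 900/s against a 2000/s threshold would yield a target number of 2.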

Determining the target number according to the difference between the average load index of the container cluster and the preset index threshold realizes dynamic load balancing of the gateway service, enhances network data processing capacity, increases throughput, and improves the high availability and flexibility of the network.

Fig. 8 is a general flowchart illustrating a method for load balancing of gateway services according to an exemplary embodiment, as shown in fig. 8, including the following steps:

S801, gateway services are deployed on container instances in the container cluster and on hosts in the cloud host cluster.

S802, calling a container instance monitoring control deployed in the container cluster, and monitoring the state of the container instance in the container cluster by the container instance monitoring control to acquire the state information of the container instance.

S803, in response to the data format of the container instance monitoring control being incompatible with the data format of the container cluster, performing data protocol conversion on the state information of the container instance sent by the container instance monitoring control, and generating target state information matched with the data format of the container cluster.

S804, based on the state information of each container instance, the average load index of the container cluster is obtained.

S805, judging whether the average load index is larger than a preset index threshold value.

S806, in response to the average load index being greater than the preset index threshold, determining that the target balancing mode is container instance capacity expansion.

S807, in response to the average load index being less than or equal to the preset index threshold, determining that the target balancing mode is container instance capacity reduction.

S808, determining the target number based on the difference between the average load index and the preset index threshold.

S809, adjusting the number of container instances in the container cluster to the target number according to the target balancing mode.

Steps S804 to S809 are specifically described above, and are not described herein again.

S810, performing fault detection on the container cluster.

S811, in response to a failure of the container cluster, acquiring the state information of each host in the cloud host cluster.

S812, determining a target host from the hosts in the cloud host cluster based on the state information of the hosts.

S813, sending the service traffic to the target host, the gateway service on the target host sending the service traffic to the server.

The steps S810 to S813 are specifically described above, and are not described again here.
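The data protocol conversion of step S803 can be pictured as a small adapter between two record layouts. Both layouts below are purely hypothetical: the disclosure does not specify the format emitted by the container instance monitoring control or the format expected by the container cluster, so the flat `pod`/`metric`/`value` samples and the nested per-pod dictionary are assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def convert_state_info(
    samples: Iterable[Mapping[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Convert flat monitoring samples (assumed source format) into a
    per-pod mapping of metric name to numeric value (assumed target format)."""
    converted: Dict[str, Dict[str, float]] = defaultdict(dict)
    for sample in samples:
        converted[sample["pod"]][sample["metric"]] = float(sample["value"])
    return dict(converted)


raw_samples = [
    {"pod": "gateway-1", "metric": "requests_per_second", "value": "200"},
    {"pod": "gateway-1", "metric": "cpu_usage", "value": "0.35"},
    {"pod": "gateway-2", "metric": "requests_per_second", "value": "300"},
]
target_state_info = convert_state_info(raw_samples)
print(target_state_info["gateway-1"]["requests_per_second"])  # 200.0
```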

Fig. 9 shows a load balancing apparatus for a gateway service according to an exemplary embodiment. As shown in Fig. 9, the load balancing apparatus 900 for a gateway service includes a gateway deployment module 91 and a load balancing module 92, wherein:

The gateway deployment module 91 is configured to deploy gateway services on container instances in the container cluster and on hosts in the cloud host cluster.

The load balancing module 92 is configured to perform load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to the respective balancing strategies of the two clusters.

Fig. 10 shows a load balancing apparatus for a gateway service according to an exemplary embodiment. As shown in Fig. 10, the load balancing apparatus 900 for a gateway service further includes a monitoring module 1001, a failure detection module 1002, and a traffic sending module 1003, wherein:

The monitoring module 1001 is configured to perform state monitoring on container instances in the container cluster, and to perform load balancing on the gateway services deployed in the container cluster based on the monitored state information of the container instances.

The failure detection module 1002 is configured to perform failure detection on the container cluster.

The traffic sending module 1003 is configured to send, in response to a failure of the container cluster, the service traffic that needs to be sent to the container cluster to the cloud host cluster, the gateway service in the cloud host cluster sending the service traffic to the server.

Fig. 11 shows a load balancing apparatus for a gateway service according to an exemplary embodiment. As shown in Fig. 11, the traffic sending module 1003 includes a state information obtaining unit 1101, a target host determining unit 1102, and a service traffic sending unit 1103, wherein:

The state information obtaining unit 1101 is configured to obtain state information of each host in the cloud host cluster.

The target host determining unit 1102 is configured to determine a target host from the hosts in the cloud host cluster based on the state information of the hosts.

The service traffic sending unit 1103 is configured to send the service traffic to the target host, the gateway service on the target host sending the service traffic to the server.

Further, the monitoring module 1001 is further configured to: determine, according to the state information, a target balancing mode of the container cluster and a target number to which the container instances need to be balanced, where the balancing modes of the container cluster include container instance capacity expansion and container instance capacity reduction, and the target balancing mode is one of the two; and adjust the number of container instances within the container cluster to the target number according to the target balancing mode.

Further, the monitoring module 1001 is further configured to call a container instance monitoring control deployed in the container cluster, the container instance monitoring control monitoring the state of the container instances in the container cluster to acquire the state information of the container instances.

Further, the load balancing apparatus for a gateway service is further configured to, in response to the data format of the container instance monitoring control being incompatible with the data format of the container cluster, perform data protocol conversion on the state information of the container instance sent by the container instance monitoring control, and generate target state information matched with the data format of the container cluster.

Further, the monitoring module 1001 is further configured to: acquire an average load index of the container cluster based on the state information of each container instance; in response to the average load index being greater than a preset index threshold, determine that the target balancing mode is container instance capacity expansion, or, in response to the average load index being less than or equal to the preset index threshold, determine that the target balancing mode is container instance capacity reduction; and determine the target number based on the difference between the average load index and the preset index threshold.

In the load balancing apparatus for a gateway service, the gateway service is deployed on container instances in the container cluster and on hosts in the cloud host cluster, and load balancing is performed on the gateway services deployed in the container cluster and the cloud host cluster according to the respective balancing strategies of the two clusters. By combining the container cluster and the cloud host cluster, the embodiment of the application achieves dynamic load balancing of the gateway service in both clusters, enhances network data processing capacity, and increases throughput. In addition, with the dual-cluster deployment, service traffic can be transferred to the other cluster when one cluster fails, which keeps the gateway service available and improves the high availability and flexibility of the network.

To implement the above embodiments, the present disclosure also provides an electronic device. As shown in Fig. 12, the electronic device 1200 includes: a processor 1201; and one or more memories 1202 for storing instructions executable by the processor 1201; wherein the processor 1201 is configured to execute the load balancing method for a gateway service of the above embodiments. The processor 1201 and the memory 1202 are connected by a communication bus.

To implement the above embodiments, the present disclosure also provides a computer-readable storage medium, where instructions in the computer-readable storage medium, when executed by the processor 1201 of the electronic device 1200, enable the electronic device 1200 to execute the load balancing method for a gateway service of the above embodiments. Optionally, the storage medium may be a non-transitory computer-readable storage medium, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

To implement the foregoing embodiments, the present disclosure also provides a computer program product, which includes a computer program that, when executed by a processor, implements the load balancing method for a gateway service according to the foregoing embodiments.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for load balancing of gateway services, comprising:

deploying a gateway service on container instances within a container cluster and on hosts within a cloud host cluster;

and carrying out load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to respective balancing strategies of the two clusters.

2. The method of claim 1, further comprising:

and monitoring the state of the container instances in the container cluster, and balancing the load of the gateway service deployed in the container cluster based on the monitored state information of the container instances.

3. The method of claim 1, further comprising:

performing fault detection on the container cluster;

and in response to a failure of the container cluster, sending the service traffic that needs to be sent to the container cluster to the cloud host cluster, the gateway service in the cloud host cluster sending the service traffic to a server.

4. The method of claim 3, further comprising:

acquiring state information of each host in the cloud host cluster;

determining a target host from the hosts in the cloud host cluster based on the state information of the hosts;

and sending the service traffic to the target host, the gateway service on the target host sending the service traffic to a server.

5. The method of claim 2, wherein the load balancing the gateway services deployed in the container cluster based on the monitored state information of the container instance comprises:

determining a target balancing mode of the container cluster and a target number to which container instances need to be balanced according to the state information, wherein the balancing modes of the container cluster comprise container instance capacity expansion and container instance capacity reduction, and the target balancing mode is one of the container instance capacity expansion and the container instance capacity reduction;

adjusting the number of container instances within the container cluster to the target number in accordance with the target balancing mode.

6. The method according to any of claims 1-5, wherein said monitoring the status of said container instances within said container cluster comprises:

and calling a container instance monitoring control deployed in the container cluster, and monitoring the state of the container instance in the container cluster by the container instance monitoring control so as to acquire the state information of the container instance.

7. The method of claim 6, further comprising:

and in response to the data format of the container instance monitoring control being incompatible with the data format of the container cluster, performing data protocol conversion on the state information of the container instance sent by the container instance monitoring control to generate target state information matched with the data format of the container cluster.

8. The method of claim 5, wherein determining a target balancing mode of the container cluster and a target number to which container instances need to be balanced according to the state information comprises:

acquiring an average load index of the container cluster based on the state information of each container instance;

in response to the average load index being greater than a preset index threshold, determining that the target balancing mode is the container instance capacity expansion; or, in response to the average load index being less than or equal to the preset index threshold, determining that the target balancing mode is the container instance capacity reduction;

determining the target number based on a difference between the average load index and the preset index threshold.

9. An apparatus for load balancing of gateway services, comprising:

the gateway deployment module is used for deploying gateway services on container instances in the container cluster and hosts in the cloud host cluster;

and the load balancing module is used for carrying out load balancing on the gateway services deployed in the container cluster and the cloud host cluster according to respective balancing strategies of the two clusters.

10. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.

11. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-8.

12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.