CN113946450A - Adaptive weighted polling load balancing system for K8S micro-service framework - Google Patents


Info

Publication number
CN113946450A
Authority
CN
China
Prior art keywords
service
downstream
instance
layer
load balancing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111293069.4A
Other languages
Chinese (zh)
Inventor
沃天宇
谢一凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202111293069.4A
Publication of CN113946450A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/451Execution arrangements for user interfaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/547Remote procedure calls [RPC]; Web services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/545Gui

Abstract

The invention realizes an adaptive weighted polling load balancing system for the K8S micro-service framework using methods from the field of neural-network parameter updating. The system comprises a three-layer structure, from bottom to top: a service load layer, a service control layer and a user interaction layer. The service load layer applies the weight information acquired from the service control layer and accesses downstream services with a weighted polling algorithm; the service control layer uses an algorithm strategy to calculate the weight proportions each group of micro-services should adopt when calling downstream services, and sends the proportions to the micro-service instances; the user interaction layer provides a web interaction interface. The invention realizes a load balancing system that, according to indexes such as the physical resource load and historical service-call response times in the micro-service cluster, calculates the optimal proportion of requests each instance of each group of services should receive in the current state, and solves the problems that traditional load balancing methods are empirical, hard to migrate, and dependent on architecture optimization.

Description

Adaptive weighted polling load balancing system for K8S micro-service framework
Technical Field
The invention relates to the technical field of neural-network parameter updating, and in particular to an adaptive weighted polling load balancing system for the K8S micro-service framework.
Background
Most early internet backend services were monolithic architectures deployed on a single machine. Such an architecture is limited by the bottleneck of single-machine performance and cannot cope with ever-increasing access traffic, and the program code becomes difficult to maintain as the amount of service code and the coupling between modules grow. The backend architecture therefore gradually developed toward distribution and decoupling, and the micro-service architecture is the answer arrived at through practical experience. A micro-service architecture decouples a single piece of software integrating multiple modules into modules that are deployed independently and communicate with each other through remote network calls. Each module is thus maintained independently, the capability of the distributed cluster is fully utilized, and the access-traffic bottleneck is greatly relieved. Kubernetes (hereinafter K8S) is the most widely used micro-service management framework today.
K8S is essentially a container orchestration framework. The Pod is the basic unit managed by K8S; a Pod can contain several containers, which usually cooperate to complete one unit of business, so a Pod can be understood as a basic service-unit instance. K8S further abstracts multiple identical Pod service instances into a Service resource, the so-called micro-service. When calling a downstream Service, the upstream service therefore does not need to know the details of the Service's specific instances: it simply addresses the K8S Service resource, and K8S is responsible for forwarding requests sent to the Service to specific Pod instances. This design embodies the advantages of the micro-service architecture: every service instance can be deployed in a distributed fashion with multiple replicas, and when the load on a service rises, capacity can be expanded rapidly by adding Pod replicas.
However, the micro-service architecture also has problems. Because each service in a micro-service architecture is deliberately basic and lightweight, a complete piece of business logic requires the cooperation of many micro-service program instances, which means that executing one request often generates a call chain or even a call network. This naturally increases the business response time, and response time directly affects user experience. Reducing the total response time of a service invocation is therefore a valuable research topic. The prior art works mainly at the container level: reasonable placement and scaling strategies are designed for the instances of a service, so that each instance of a group of services keeps as low a load as possible and the group can continuously provide high-quality service. In addition, some software starts from a traffic load-balancing strategy and provides micro-service operators with an interface for modifying weights, so that the access weight of low-quality container instances is reduced and upstream services send less traffic to overloaded instances.
Two feasible ideas thus exist for reducing service latency and improving service quality. The first starts from the service instances: for heavily loaded services, the number of instances is increased. Such schemes usually work well, but inevitably waste cluster resources. Moreover, cluster physical resources are costly; when they are limited, the effect of an instance-scaling strategy hits a bottleneck and may preempt the resources of other instances. This motivates the second class of ideas: with the number of service instances fixed, start from load balancing and adjust the traffic access weight proportions among a group of micro-service instances to further improve service quality. The related prior art following this second idea is briefly described below.
Several service management systems are applied to micro-service architectures, such as Istio, Traefik and Kong, which are popular in the industry. Since the micro-service architecture itself emerged from industry, how to make the architecture better serve business products is a frequent concern. Some of the above products are open source and some have commercial versions, but their functions focus on the concept of "service" in the micro-service architecture; for example, they all provide gateway authentication, gray release and so on. For the "traffic" management implied beneath "service" management, however, and especially for traffic load balancing, they provide only simple functions, such as a basic polling algorithm and a weighted polling algorithm with manually configured weights.
Beyond these mature products, there are also fundamental research works. Hao Zhou et al. measure the load of a micro-service by the average waiting time of requests in its queue and, according to carefully designed indexes such as service priority, preferentially process high-priority service requests while discarding low-priority ones, thereby guaranteeing the service quality of important services. Ding Z et al. focus on the service invocation link: they compute the deadline of each task from instance processing speed, network transmission speed and task concurrency to obtain the task's urgency, then use a list-scheduling-based algorithm to process the more urgent requests first, which achieved good results on a simulator.
Existing techniques adopting a weighted polling load balancing algorithm usually require the weights of service instances to be configured manually; such experience-based methods are not timely enough and bring extra workload to development and maintenance personnel. In addition, some techniques for improving service quality rely on modifying the conventional K8S micro-service architecture, which is not universal and complicates service maintenance and migration. Still other studies of load balancing algorithm strategies have remained in the laboratory and give no complete system applicable to actual production.
Disclosure of Invention
Therefore, the invention first provides an adaptive weighted polling load balancing system for the K8S micro-service framework, comprising a three-layer structure of a service load layer, a service control layer and a user interaction layer from bottom to top. The service load layer provides each service instance of the micro-services with weight information acquired from the service control layer and accesses downstream services using a weighted polling algorithm; the service control layer, according to the monitoring data of the service load layer collected by the monitoring module, uses an algorithm strategy to calculate the weight proportions each group of micro-services should adopt when calling downstream services, and sends the proportions to the micro-service instances; the user interaction layer provides a web interaction interface for monitoring the traffic, weight information and service quality of the micro-services in the cluster, improving the usability of the system.
The specific implementation of the service load layer is as follows. In the K8S framework, when a downstream Service is accessed through its Service IP, the IP address is intercepted on the physical machine where the upstream container is located and converted into an actual instance IP according to the Iptables routing policy table configured by K8S. The scheme has to complete two parts of work: the first part is to direct the traffic flowing out of the service to the service agent, which can be realized with the Iptables tool provided by Linux; the second part is for the service agent to send the traffic to a specific downstream instance, for which the cloud-native network proxy Envoy is selected.
The service control layer is composed of a monitoring module and a decision module.
The monitoring module monitors indicators in three dimensions. The first dimension is cluster resource monitoring: physical machine information, service information and the instance information contained in each service are acquired through the Api-server interface provided by K8S. The second dimension is physical resource information: once the first dimension is known, the physical machines and container instances in the cluster are enumerated, and the MetricsServer plug-in provided by the K8S community is used to collect the CPU load of the physical machines and instances. The third dimension is the service calling situation: by monitoring the flow direction and call times of service traffic, a directed graph of service call relations and the time consumed by each call are constructed; this is monitored by the Envoy network proxy deployed in the service container, while data collection and historical storage are performed by Prometheus.
The decision module calculates the weight proportions as follows. For an instance $A_i$ of an upstream service A, the weight with which it calls instance $B_j$ of a downstream service B is denoted $w_{ij}$ and takes integer values; for each upstream instance, the weights over all instances of the downstream service sum to 100:

$\sum_{j=1}^{|B|} w_{ij} = 100$

The response time of upstream instance $A_i$ calling downstream instance $B_j$, $t_{ij}$, is formed of two parts: the processing time of the downstream service itself, $t^{self}_j$, and the time the downstream service consumes continuing to invoke lower-level service chains, $t^{chain}_j$; that is,

$t_{ij} = t^{self}_j + t^{chain}_j$

The chain time $t^{chain}_j$ of each instance of the downstream service is converted into a weight factor with the softmax function, and the previous round of weights is then adjusted by the factor:

$f^{time}_j = \alpha \left( \frac{100}{|B|} - 100 \cdot softmax(t^{chain})_j \right)$

where $k$ is the number of the physical machine on which $B_j$ is located and $\alpha$ is a hyper-parameter. The CPU occupancy $P_k$ of a machine and its idleness $A_k$ are related by:

$A_k = \begin{cases} 1 - P_k / threshold, & P_k < threshold \\ 0, & P_k \ge threshold \end{cases}$

where threshold is an empirical threshold. The impact factor of machine idleness on instance $B_j$ of service B can then be simply modeled as:

$f^{idle}_j = 100 \cdot softmax(A)_j - \frac{100}{|B|}$

Since the results of the softmax function sum to 1 while the weights sum to 100, the softmax result is scaled by a factor of 100 and offset against the theoretical average of the set of weights, where $|B|$ is the number of instances of service B. Each request a service completes has an inherent CPU consumption, and excessive accesses bring additional CPU consumption, which is likewise modeled by the softmax function over the CPU consumption $P_B$ of all instances of service B:

$p_j = 100 \cdot softmax(P_B)_j - \frac{100}{|B|}$

The analogous normalization operation is not repeated here.

According to the above, the expression for the final weight update is:

$w^{(t+1)}_{ij} = w^{(t)}_{ij} + f^{time}_j + f^{idle}_j - p_j$

with the parameters given initial values by even distribution:

$w^{(0)}_{ij} = \frac{100}{|B|}$

Finally, the weight allocation of each group of services is obtained and provided to the service load layer, thereby realizing the adaptive weighted polling load balancing system.
The user interaction layer provides a command line tool and a visual interface. Since K8S services are deployed through Yaml files, the system modifies the original Yaml file so that additional scripts are injected into the service container, realizing the network proxy capability.
The technical effects to be realized by the invention are as follows:
the invention provides an algorithm strategy, which is used for calculating the optimal weight ratio of the request quantity which is required to be received among all instances of each group of services under the current state according to indexes such as physical resource load conditions, historical service call response time conditions and the like in a micro service cluster.
The invention provides a self-adaptive weighted polling load balancing system applied to a K8S framework, which comprises a service load layer, a service control layer and a user interaction layer. And the network agent container injected into the Pod where the service container is located in the service load layer completes the process of accessing the downstream flow by using the authorized polling algorithm. And the service control layer calculates the optimal access weight ratio of each service instance in the micro-service cluster according to the monitoring index. The user interaction layer provides a command line tool and a visual interface, and the usability of the system is improved.
Drawings
FIG. 1 is a schematic diagram of the overall architecture of the system;
FIG. 2 is a schematic diagram of service load layer traffic flow;
FIG. 3 is a schematic diagram of the monitoring sub-module of the service control layer.
Detailed Description
The following is a preferred embodiment of the present invention and is further described with reference to the accompanying drawings, but the present invention is not limited to this embodiment.
The invention provides an adaptive weighted polling load balancing system for a K8S micro service framework. The structure of the system is shown in fig. 1, and the system comprises a service load layer, a service control layer and a user interaction layer. In the service load layer, each service instance of the micro service accesses downstream services by using a weighted polling algorithm according to the weight information acquired from the service control layer; the service control layer generates weight information used by a weighted polling algorithm for each group of micro-services by using an algorithm strategy according to the monitoring data of the service load layer collected by the monitoring module; the user interaction layer provides a web interaction interface for monitoring the flow calling and weight information of the micro-service in the cluster and the service quality condition, and improves the usability of the system.
The system calculates the weight proportions each group of services should adopt when calling downstream services according to the load of the service instances in the current cluster and the algorithm strategy, and sends the proportions to the service instances. Each service instance then invokes services with the weighted polling algorithm according to the received weights, achieving the goal of reducing the total service response time. This provides an additional layer of guarantee on top of the cluster's container scaling strategy, while the adaptive weight adjustment strategy overcomes the shortcomings of setting weights manually.
The system realized by the invention is essentially a load balancer, and after the load balancer is deployed into a K8S cluster, service instances in the cluster access downstream services by adopting a self-adaptive weighted polling algorithm, thereby finally achieving the effect of reducing the response time of service call. The system needs to be deployed into the K8S microservice framework and become part of the K8S framework. In an actual production environment, micro-service developers do not need to perform additional operation, and all used auxiliary plug-ins such as auxiliary containers for monitoring service instances and the like can be automatically deployed.
Service load layer:
the service load layer is a place where the micro-service container is deployed and is also a place where the access traffic really forms a load. The service load layer will use the weighted polling algorithm to access the downstream services it needs to access according to the weight ratio received from the service control layer.
K8S provides the Pod abstraction, in which multiple containers (Containers) share one network stack. The network stack of the Pod holding an ordinary service container can therefore be modified by an additionally injected container, controlling the direction of the service container's network traffic and deploying a network proxy for the Pod: all traffic of the service is directed to the proxy container, and the proxy container then directs it to a specific instance of the downstream service. As in fig. 2, a container within an upstream Pod accesses a downstream service that has three instances. The load balancing capability natively provided by the K8S framework works as follows: when an upstream container accesses a downstream Service through its Service IP, the IP address is intercepted on the physical machine where the container is located and converted into an actual instance IP according to the Iptables routing policy table configured by K8S. Compared with this native method, the scheme adopted by the system moves the step inside the K8S Pod; since the Pod provides a virtualized Linux environment with more complete functions, the conversion from Service IP to Pod IP can be completed inside the Pod.
This solution requires two parts of work. The first part is to direct the service's outgoing traffic to the service agent. This can be accomplished through the Iptables tool provided by Linux: Iptables is a user-space command line tool provided with the Linux kernel that allows a user to operate the Netfilter firewall in kernel space. The project uses Iptables to hijack all traffic flowing out of a specific user's space to the proxy's listening port. The second part is for the service agent to send the traffic downstream to a particular instance. Here the choice of service agent must first be considered; it needs to satisfy the following characteristics. First, it must be lightweight: every service Pod carries a service agent, so the agent should be as light as possible. Second, it must be configurable: the agent's forwarding rules should be conveniently configurable from outside. Third, it should be cloud native: since it runs under a micro-service framework, its functions should be driven through a network interface rather than configuration files. In conclusion, the cloud-native network proxy Envoy was finally selected. Envoy can obtain its traffic forwarding strategy by sending requests to a server, thereby forwarding the service container's traffic according to the configured rules.
In conclusion, the scheme hijacks the traffic of the service container to the Envoy network proxy container inside the same Pod, and the weighted polling access policy is then implemented in the Envoy container. The weight information for accessing downstream services is likewise acquired by service requests from the Envoy container to the service control layer. Such container traffic hijacking under micro-services is a relatively mature solution in the industry.
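The patent does not list the Iptables rules themselves. The sketch below only builds illustrative rule strings for redirecting a Pod's outbound TCP traffic to a local proxy listener, in the spirit of sidecar injection; the chain name, port 15001 and UID 1337 are assumptions, not values from the patent:

```python
def outbound_redirect_rules(proxy_port=15001, proxy_uid=1337):
    """Build iptables command strings (hypothetical values) that hijack
    all outbound TCP traffic in a Pod's network namespace to a local
    proxy listener.  Traffic originated by the proxy's own UID is
    exempted to avoid a redirect loop."""
    return [
        # Dedicated NAT chain holding the redirect target.
        "iptables -t nat -N PROXY_REDIRECT",
        f"iptables -t nat -A PROXY_REDIRECT -p tcp -j REDIRECT --to-ports {proxy_port}",
        # Skip traffic emitted by the proxy's own user, redirect the rest.
        f"iptables -t nat -A OUTPUT -p tcp -m owner --uid-owner {proxy_uid} -j RETURN",
        "iptables -t nat -A OUTPUT -p tcp -j PROXY_REDIRECT",
    ]

rules = outbound_redirect_rules()
```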
Service control layer:
the service control layer in the architecture diagram of fig. 1 is basically composed of two parts in terms of functional logic, namely a monitoring part responsible for acquiring cluster state data and a decision part for making a decision according to monitoring data of the monitoring part.
The architecture of the monitoring sub-module is shown in fig. 3. The monitoring module monitors indicators in three dimensions. The first dimension is cluster resource monitoring: physical machine information, service information and the instance information contained in each service are acquired through the Api-server interface provided by K8S. The second dimension is physical resource information: with the first dimension known, the physical machines and container instances in the cluster are enumerated, and the MetricsServer plug-in provided by the K8S community collects the CPU load of the physical machines and instances. The third dimension is the service calling situation: by monitoring the flow direction and call times of service traffic, a directed graph of service call relations and the time consumption of service calls can be constructed; this indicator is monitored by the Envoy network proxy deployed in the service container, while data collection and historical storage are performed by Prometheus. Prometheus is an open-source, mature monitoring component that accesses a metrics data interface at intervals, saves the metrics in its own time-series database, and provides an SQL-like language to query indicators over time periods. Prometheus is also cloud native: it is compatible with the K8S framework, and newly created services can be brought into monitoring automatically through simple configuration. With these three types of indicator information, the monitoring sub-module of the service control layer obtains stable cluster-state monitoring data to assist the decision module in making decisions.
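As a stand-in for the queries the monitoring module would issue against Prometheus, the following sketch aggregates raw per-call latency records into the per-downstream-instance average response times the decision module consumes; the record shape (upstream, downstream, seconds) is an assumption for illustration:

```python
from collections import defaultdict

def average_call_times(samples):
    """Aggregate raw call records into per-downstream-instance mean
    response times.  In the real system these figures would come from
    Envoy access metrics scraped and stored by Prometheus."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, downstream, seconds in samples:
        sums[downstream] += seconds
        counts[downstream] += 1
    return {d: sums[d] / counts[d] for d in sums}

# Hypothetical records: two upstream instances calling two downstream ones.
samples = [("A-1", "B-1", 0.120), ("A-1", "B-2", 0.300),
           ("A-2", "B-1", 0.080), ("A-2", "B-2", 0.340)]
avg = average_call_times(samples)
```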
The other part of the service control layer, the decision module, is responsible for calculating a reasonable weight ratio for each group of services to use when calling downstream services, according to the cluster state collected by the monitoring part. For an instance $A_i$ of an upstream service A, the weight with which it calls instance $B_j$ of a downstream service B is denoted $w_{ij}$ and takes integer values. For each upstream instance, the weights over all instances of the downstream service sum to 100:

$\sum_{j=1}^{|B|} w_{ij} = 100$

The response time of upstream instance $A_i$ calling downstream instance $B_j$, $t_{ij}$, is formed of two parts: the processing time of the downstream service itself, $t^{self}_j$, and the time the downstream service consumes continuing to invoke lower-level service chains, $t^{chain}_j$; that is,

$t_{ij} = t^{self}_j + t^{chain}_j$
The update of the weight parameters must take both parts into account. The first is the time the downstream service consumes continuing to invoke lower-level service chains, $t^{chain}_j$. A dynamic-programming style of thinking is used: assume that instance $B_j$'s own downstream weights are already optimal, so only $t^{chain}_j$ needs to be modeled. The call time $t^{chain}_j$ of each instance of the downstream service is converted into a weight factor using the softmax function, chosen for its shift-invariance property:

$softmax(X) = softmax(X + C)$

Since each service has an inherent processing response time, this constant part can be left out of consideration, and only the response-time increment caused by excessive extra requests matters. The softmax over the per-instance downstream call times can thus be regarded as the time consumption caused by the impact of excessive access to each instance, from which a factor adjusting the previous round of weights is derived:

$f^{time}_j = \alpha \left( \frac{100}{|B|} - 100 \cdot softmax(t^{chain})_j \right)$

where $\alpha$ is a hyper-parameter; below, $k$ denotes the number of the physical machine on which $B_j$ is located.
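The chain-time adjustment just described can be sketched in a few lines. The constants follow the prose (average weight minus the softmax share scaled to 100, damped by the hyper-parameter α); since the original formula images are unavailable, the exact shape is a reconstruction:

```python
import math

def softmax(xs):
    # Shift by the maximum for numerical stability; softmax is
    # shift-invariant, as the patent notes.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def time_factor(chain_times, alpha=1.0):
    """Adjustment derived from downstream chain call times: instances
    slower than their peers (higher softmax share) receive a negative
    adjustment.  alpha is the hyper-parameter from the patent text."""
    n = len(chain_times)
    return [alpha * (100.0 / n - 100.0 * p) for p in softmax(chain_times)]

f_equal = time_factor([0.2, 0.2, 0.2])  # equal times -> no adjustment
f_skew = time_factor([0.1, 0.5])        # slower instance is penalized
```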
The second part is the modeling of the downstream service's own processing time. Apart from external IO time, the largest influence on processing time is the number of CPU time slices: in an environment where multiple services compete for resources, the more time slices a service can obtain, the shorter its time to process a request. The number of time slices a service obtains is related to how intense the competition on its physical machine is, so traffic should be distributed toward instances whose machines are relatively idle. A simple function describes the relationship between the CPU occupancy $P_k$ of machine $k$ and its idleness $A_k$:

$A_k = \begin{cases} 1 - P_k / threshold, & P_k < threshold \\ 0, & P_k \ge threshold \end{cases}$

where threshold is an empirical threshold: idleness decreases linearly as CPU occupancy grows, and once the threshold is reached the machine is considered fully busy. The impact factor of machine idleness on instance $B_j$ of service B can then be simply modeled as:

$f^{idle}_j = 100 \cdot softmax(A)_j - \frac{100}{|B|}$

where $A$ collects the idleness of the machine hosting each instance of B.
since the sum of the results of the Softmax function is 1 and the sum of the weights is 100, the result is here scaled up by a factor of 100 and the result of the Softmax function scaled up by a factor of 100 is subtracted from the average of the set of weights, where | B | is the number of instances of service B. However, it should be noted that distributing traffic to instances where the machine in the downstream service is idle may result in overloading the instance and thus affecting processing time, and an additional penalty is added to avoid this. This penalty term can be modeled by using softmax for the CPU occupancy that services all services. As with request time, there is an inherent CPU consumption per service completion request and an excess of accesses brings an additional CPU consumption, which can be modeled by the softmax function:
100 · Softmax(P_B)_j - 100 / |B|
where P_B denotes the CPU consumption values of all instances of service B. A normalization operation similar to the preceding formulas is also applied to these factors and is not repeated here.
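The idleness relationship and the softmax-based influence factor described above can be sketched as follows. This is a minimal illustration in Python; the function names and the default threshold value are assumptions for the example, not values disclosed by the system.

```python
import math

def idleness(p, threshold=0.8):
    """Piecewise-linear idleness A_i: decreases linearly with CPU
    occupancy p and is zero once the empirical threshold is reached."""
    return max(0.0, 1.0 - p / threshold)

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def idleness_factors(cpu_occupancy, threshold=0.8):
    """Influence factor for each downstream instance: the softmax of
    the idleness values, scaled to the weight budget of 100, minus the
    average weight 100/|B|, so the factors sum to zero."""
    a = [idleness(p, threshold) for p in cpu_occupancy]
    n = len(cpu_occupancy)
    return [100.0 * s - 100.0 / n for s in softmax(a)]
```

Because the average weight is subtracted, the factors redistribute weight among instances without changing the total: the idlest machine gains weight and the busiest machines lose it.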
According to the above, the expression for the final weight update is:

W'(A_i→B_j) = W(A_i→B_j) + F_time(B_j) + F_idle(B_j) - F_cpu(B_j)

where F_time, F_idle and F_cpu denote the request-time factor, the idleness factor and the CPU penalty term described above, and the result is normalized so that the weights still sum to 100.
Before learning, the parameters need to be given initial values, which are simply assigned by an even distribution:

W_0(A_i→B_j) = 100 / |B|
Finally, according to the above formulas, the weight allocation information of each group of services can be obtained and provided to the service load layer, thereby realizing the adaptive weighted polling load balancing system.
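Putting the pieces together, one round of the decision module's weight update might look like the following sketch. All names, the exact way the three factors are combined, and the normalization details are illustrative assumptions; the text above describes the ingredients but not concrete code.

```python
import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def centered_softmax_factor(values, invert=False):
    """Scale a softmax distribution to the weight budget of 100 and
    subtract the average weight 100/|B| so the factor sums to zero.
    With invert=True, larger values (e.g. response time or CPU
    consumption) reduce the weight instead of increasing it."""
    n = len(values)
    f = [100.0 * s - 100.0 / n for s in softmax(values)]
    return [-x for x in f] if invert else f

def update_weights(weights, resp_times, idleness, cpu_consumption):
    """One update round: previous weight plus the request-time factor
    and the idleness factor, minus the CPU-overload penalty, then
    renormalized so the integer weights sum to roughly 100."""
    raw = [w + t + a + p
           for w, t, a, p in zip(
               weights,
               centered_softmax_factor(resp_times, invert=True),
               centered_softmax_factor(idleness),
               centered_softmax_factor(cpu_consumption, invert=True))]
    raw = [max(1.0, r) for r in raw]  # keep every weight positive
    total = sum(raw)
    return [max(1, round(100.0 * r / total)) for r in raw]

def initial_weights(n_instances):
    """Even initial distribution: W_0 = 100 / |B| per instance."""
    return [100 // n_instances] * n_instances
```

In this sketch a downstream instance that is fast, idle and lightly loaded steadily accumulates weight across rounds, while rounding during normalization keeps the sum near the budget of 100.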
User interaction layer:
In view of ease of use, the proposed system also provides a related command line tool and a visualization interface. The command line tool helps deploy the present system into the K8S microservice framework as part of that framework. In addition, the command line tool provides a service deployment tool: since K8S services are deployed through Yaml files that describe the detailed attributes of each service, and the system needs to inject additional scripts into the service containers to realize the network proxy capability, the original Yaml files must be modified. With the command line tool, microservice developers need no extra operations; the tool converts the original Yaml file into the file required by the system, and all the plug-ins and auxiliary containers used are deployed automatically. The visual web interface presents resource information in the K8S cluster, the call relations of the services, and so on.
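As an illustration of the Yaml transformation the command line tool performs, a minimal sketch of injecting a network-proxy sidecar into a Deployment definition is shown below. The container name, image tag and port are invented for the example; the real tool's output may differ.

```python
import copy

# Assumed sidecar definition; name, image and port are illustrative.
ENVOY_SIDECAR = {
    "name": "envoy-proxy",
    "image": "envoyproxy/envoy:v1.20.0",
    "ports": [{"containerPort": 15001}],
}

def inject_sidecar(deployment):
    """Return a copy of a K8S Deployment dict (as parsed from Yaml)
    with the proxy sidecar appended, leaving the original intact.
    Injection is idempotent: an already-present sidecar is kept."""
    patched = copy.deepcopy(deployment)
    containers = patched["spec"]["template"]["spec"]["containers"]
    if all(c["name"] != ENVOY_SIDECAR["name"] for c in containers):
        containers.append(copy.deepcopy(ENVOY_SIDECAR))
    return patched
```

A real tool would additionally read and write the Yaml files themselves (for example with a Yaml parser) and inject the init scripts that set up traffic interception.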

Claims (7)

1. An adaptive weighted polling load balancing system for a K8S microservice framework, comprising a three-layer structure of, from bottom to top, a service load layer, a service control layer and a user interaction layer, wherein the service load layer provides the weight information obtained from the service control layer to each service instance of the microservices and accesses downstream services using a weighted polling algorithm; the service control layer, according to the monitoring data of the service load layer collected by a monitoring module and using an algorithm strategy, calculates the weight proportions that each group of microservices adopts when calling downstream services and sends the proportions to the microservice instances; and the user interaction layer provides a web interaction interface.
2. The adaptive weighted polling load balancing system for the K8S microservice framework of claim 1, wherein the service load layer is implemented as follows: when the K8S framework accesses a downstream Service through its Service IP, the IP address is intercepted on the physical machine where the upstream container is located and converted into an actual instance IP according to the IPtables routing policy table configured by K8S; the scheme completes two parts of work: the first part redirects the traffic flowing out of the service to the service proxy, which is realized by the Iptables tool provided by Linux, and the second part has the service proxy send the traffic to a specific downstream instance, which is realized with the cloud-native network proxy Envoy.
3. The adaptive weighted polling load balancing system for the K8S microservice framework of claim 2, wherein: the service control layer is composed of a monitoring module and a decision module.
4. The adaptive weighted polling load balancing system for the K8S microservice framework of claim 3, wherein the monitoring module monitors indicators in three dimensions: the first dimension is cluster resource monitoring, in which the physical machine information, service information and instance information contained in the services in the cluster are acquired through the Api-server interface provided by K8S; the second dimension is physical resource information: after the information of the first dimension is obtained and the situation of the physical machines and container instances in the cluster is known, the Metrics Server plug-in provided by the K8S community is used to collect the CPU load of the physical machines and instances; the third dimension is the service invocation situation: by monitoring the flow direction and invocation time of service traffic, a directed graph of service invocation relations and the time consumption of service invocations are constructed, with monitoring performed by the Envoy network proxy software deployed in the service containers, and data collection and historical data storage performed by the Prometheus software.
5. The adaptive weighted polling load balancing system for the K8S microservice framework of claim 4, wherein the decision module calculates the weight proportion by the following method: the weight with which a service instance A_i of an upstream service A calls an instance B_j of a downstream service B is expressed as W(A_i→B_j), an integer, where the weights with which the upstream instance calls all instances of the downstream service sum to 100, i.e. Σ_j W(A_i→B_j) = 100; the response time T(A_i→B_j) of upstream instance A_i calling downstream instance B_j consists of two parts, the processing time T_self(B_j) of the downstream service itself and the time T_chain(B_j) consumed by the downstream service continuing to call lower-level service chains, i.e. T(A_i→B_j) = T_self(B_j) + T_chain(B_j) represents the modeling of the service invocation time, where B is the downstream service called by A.
6. The adaptive weighted polling load balancing system for the K8S microservice framework of claim 5, wherein: for the time T_chain(B_j) consumed by the downstream service continuing to call lower-level service chains, a softmax function converts the call times T(A_i→B_j) of the instances of the downstream service into weight factors, and the previous round of weights is adjusted by this factor, which is computed from the request times measured for the physical machine k where B_j is located; the weight factor generated by the processing time T_self(B_j) of the service itself is calculated as follows: the relationship between the CPU occupancy P_i of a machine and its idleness A_i is A_i = max(0, 1 - P_i / threshold), where threshold is an empirical threshold, and the influence factor of the machine's idleness on instance B_j of service B is then simply modeled as 100 · Softmax(A)_j - 100/|B|, where |B| is the number of instances of service B; each request a service completes has an inherent CPU consumption and excessive accesses bring additional CPU consumption, which is modeled by the softmax function as the penalty term 100 · Softmax(P_B)_j - 100/|B|, where P_B denotes the CPU consumption values of all instances of service B; the final weight update adds the request-time factor and the idleness factor to the previous weights and subtracts the CPU penalty term, a normalization operation keeps the sum of the weights at about 100, and the parameters are assigned initial values by an even distribution, W_0(A_i→B_j) = 100/|B|; finally, the weight allocation information of each group of services is obtained and provided to the service load layer, thereby realizing the adaptive weighted polling load balancing system.
7. The adaptive weighted polling load balancing system for the K8S microservice framework of claim 6, wherein the user interaction layer provides a related command line tool and a visualization interface, and the command line tool modifies the Yaml file used for K8S service deployment so that additional scripts are injected into the service container to realize the network proxy capability.
CN202111293069.4A 2021-11-03 2021-11-03 Self-adaptive authorized polling load balancing system for K8S micro service framework Pending CN113946450A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111293069.4A CN113946450A (en) 2021-11-03 2021-11-03 Self-adaptive authorized polling load balancing system for K8S micro service framework


Publications (1)

Publication Number Publication Date
CN113946450A true CN113946450A (en) 2022-01-18

Family

ID=79337546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111293069.4A Pending CN113946450A (en) 2021-11-03 2021-11-03 Self-adaptive authorized polling load balancing system for K8S micro service framework

Country Status (1)

Country Link
CN (1) CN113946450A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115412530A (en) * 2022-08-30 2022-11-29 上海道客网络科技有限公司 Domain name resolution method and system for service in multi-cluster scene
CN115412530B (en) * 2022-08-30 2024-01-30 上海道客网络科技有限公司 Domain name resolution method and system for service under multi-cluster scene
CN117221323A (en) * 2023-11-09 2023-12-12 北京飞渡科技股份有限公司 Service dynamic forwarding method
CN117221323B (en) * 2023-11-09 2024-02-02 北京飞渡科技股份有限公司 Service dynamic forwarding method

Similar Documents

Publication Publication Date Title
Zhang et al. Adaptive interference-aware VNF placement for service-customized 5G network slices
CN103207814B (en) Managing and task scheduling system and dispatching method across cluster resource of a kind of decentration
Li et al. SSLB: self-similarity-based load balancing for large-scale fog computing
JP3658420B2 (en) Distributed processing system
CN103516807B (en) A kind of cloud computing platform server load balancing system and method
CN110231976B (en) Load prediction-based edge computing platform container deployment method and system
CN107066319A (en) A kind of multidimensional towards heterogeneous resource dispatches system
CN100440891C (en) Method for balancing gridding load
CN113946450A (en) Self-adaptive authorized polling load balancing system for K8S micro service framework
CN105245617A (en) Container-based server resource supply method
US10303128B2 (en) System and method for control and/or analytics of an industrial process
CN104050042A (en) Resource allocation method and resource allocation device for ETL (Extraction-Transformation-Loading) jobs
Ullah et al. Task classification and scheduling based on K-means clustering for edge computing
Al-Sinayyid et al. Job scheduler for streaming applications in heterogeneous distributed processing systems
CN113806018A (en) Kubernetes cluster resource hybrid scheduling method based on neural network and distributed cache
Petrov et al. Adaptive performance model for dynamic scaling Apache Spark Streaming
CN114356587B (en) Calculation power task cross-region scheduling method, system and equipment
KR101055548B1 (en) Semantic Computing-based Dynamic Job Scheduling System for Distributed Processing
CN116360972A (en) Resource management method, device and resource management platform
CN116244081B (en) Multi-core calculation integrated accelerator network topology structure control system
CN111324460B (en) Power monitoring control system and method based on cloud computing platform
Bali et al. Rule based auto-scalability of IoT services for efficient edge device resource utilization
CN103442087B (en) A kind of Web service system visit capacity based on response time trend analysis controls apparatus and method
Nunes et al. State of the art on microservices autoscaling: An overview
CN103049326A (en) Method and system for managing job program of job management and scheduling system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination