WO2019001092A1 - Load balancing engine, client, distributed computing system, and load balancing method - Google Patents

Load balancing engine, client, distributed computing system, and load balancing method

Info

Publication number
WO2019001092A1
WO2019001092A1 (application PCT/CN2018/083088, CN2018083088W)
Authority
WO
WIPO (PCT)
Prior art keywords
service
load balancing
policy
information
load
Prior art date
Application number
PCT/CN2018/083088
Other languages
English (en)
French (fr)
Inventor
迟建春
郑伟
王克敏
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to EP18825057.5A (granted as EP3637733B1)
Publication of WO2019001092A1
Priority to US16/725,854 (published as US20200137151A1)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/01: Protocols
    • H04L 67/10: Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001: Protocols for accessing one among a plurality of replicated servers
    • H04L 67/1004: Server selection for load balancing
    • H04L 67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L 67/1023: Server selection for load balancing based on a hash applied to IP addresses or costs
    • H04L 67/1029: Server selection using data related to the state of servers by a load balancer
    • H04L 67/50: Network services
    • H04L 67/51: Discovery or management of network services, e.g. service location protocol [SLP] or web services
    • H04L 67/56: Provisioning of proxy services
    • H04L 67/563: Data redirection of data network streams
    • H04L 67/568: Storing data temporarily at an intermediate stage, e.g. caching
    • H04L 67/5682: Policies or rules for updating, deleting or replacing the stored data
    • H04L 67/5683: Storage of data provided by user terminals, i.e. reverse caching
    • H04L 67/60: Scheduling or organising the servicing of application requests
    • H04L 67/61: Scheduling taking into account QoS or priority requirements
    • H04L 67/63: Routing a service request depending on the request content or context

Definitions

  • the present invention relates to the field of electronics, and in particular, to a load balancing engine, a client, a distributed computing system, and a load balancing method.
  • The task scheduling problem refers to the following: given a set of tasks and several computing nodes that can execute these tasks in parallel, find a schedule that effectively assigns the tasks to the computing nodes so as to obtain better task completion time, throughput, and resource utilization.
  • Load balancing (LB) is a key factor to consider when scheduling tasks and is also key to optimizing distributed computing performance.
  • How well the load balancing problem is solved directly determines the utilization efficiency of distributed computing resources and the performance level of applications.
  • the prior art provides a centralized load balancing scheme.
  • In this scheme, an independent load balancer 12 is set between the client 11 and the plurality of computing nodes 13, where the load balancer 12 can be a dedicated hardware device, such as the load balancing hardware provided by F5, or load balancing software such as LVS, HAProxy, and Nginx.
  • When the client 11 invokes a target service, it initiates a service request to the load balancer 12, and the load balancer 12 forwards the service request to a computing node 13 that provides the target service according to a certain load balancing policy.
  • Generally, the client 11 needs to discover the load balancer 12 by using a Domain Name System (DNS) 14: the domain name system 14 configures a DNS domain name for each service, and that domain name points to the load balancer 12.
  • The disadvantage of the centralized load balancing scheme is that the traffic of all service calls passes through the load balancer 12, so the load balancer 12 easily becomes a bottleneck restricting the performance of the distributed computing system 10; and once the load balancer 12 fails, the impact on the entire distributed computing system 10 is catastrophic.
  • The prior art also provides a client-side load balancing scheme, which may also be referred to as a soft load balancing scheme. As shown in FIG. 2, in the distributed computing system 20, the load balancing (LB) component 211 is integrated into the service process of the client 21 in the form of a library file.
  • The server 23 provides a service registry that supports service self-registration and self-discovery. When each computing node 22 starts, it first registers with the server 23, writing the addresses of the services it provides into the service registry; each computing node 22 can also periodically report a heartbeat to the service registry to indicate the liveness of its services.
  • When the service process in the client 21 wants to access a target service, the built-in LB component 211 first queries the registry for the address list corresponding to the target service, then selects a target service address based on a certain load balancing policy, and finally initiates a request to the computing node 22 indicated by that address. It should be noted that the load balancing policy used in this solution only needs to consider the load balancing of the computing nodes that provide the target service.
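  • The client-side lookup-and-select flow described above can be sketched as follows (a minimal illustration only; the registry contents, the service name, and the round-robin selection policy are assumptions for the example, not the patent's implementation):

```python
import itertools

# Hypothetical service registry: service name -> list of provider addresses.
REGISTRY = {
    "order-service": ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"],
}

class ClientLB:
    """Built-in LB component: query the registry, then pick an address."""

    def __init__(self, registry):
        self._registry = registry
        self._cursors = {}  # per-service round-robin cursor

    def pick_address(self, service):
        addresses = self._registry[service]  # query the address list
        cursor = self._cursors.setdefault(service, itertools.cycle(addresses))
        return next(cursor)                  # simple round-robin policy

lb = ClientLB(REGISTRY)
picks = [lb.pick_address("order-service") for _ in range(4)]
```

After selecting an address, a real client would open a connection to that computing node and send the service request; the sketch stops at address selection, which is the part the load balancing policy governs.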
  • The drawbacks of the client-side load balancing scheme are as follows: first, if the development enterprise uses many different language stacks, a variety of different clients must be developed accordingly, which significantly increases R&D and maintenance costs; second, after the client is delivered to the user, upgrading the library file or modifying its code requires the user's cooperation, which may not be forthcoming.
  • the present invention provides a new load balancing scheme to overcome many of the problems in the prior art.
  • The embodiments of the invention provide a new load balancing solution to solve the problem that the prior art cannot handle high-traffic service calls, while also saving development cost and easing upgrade and maintenance.
  • In a first aspect, an embodiment of the present invention provides a load balancing engine applied to a distributed computing system, including: a load information management module, configured to acquire global load information of the distributed computing system, the global load information indicating the load of each of the M computing nodes in the distributed computing system; a service information management module, configured to acquire global service information of the distributed computing system, the global service information indicating the types of services provided by the M computing nodes, where M is a natural number greater than 1; a policy calculation module, configured to perform, for a first service type, load balancing calculation using the global load information and the global service information, and generate a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates the distribution information of service messages corresponding to the first service type among the M computing nodes; and a policy issuing module, configured to issue the first load balancing policy to the client.
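  • The four modules of this aspect can be sketched as follows (illustrative only; the module interfaces and the least-loaded selection rule are assumptions, since the text leaves the concrete policy calculation open):

```python
class LoadBalancingEngine:
    """Sketch: load/service information management, policy calculation, policy issuing."""

    def __init__(self, get_global_load, get_global_services, publish):
        self.get_global_load = get_global_load          # load information management module
        self.get_global_services = get_global_services  # service information management module
        self.publish = publish                          # policy issuing module

    def compute_first_policy(self, service_type):
        """Policy calculation module: combine global load and service info."""
        loads = self.get_global_load()         # {node: load in [0, 1]}
        services = self.get_global_services()  # {node: set of service types}
        providers = [n for n, types in services.items() if service_type in types]
        # Illustrative rule: direct traffic to the least-loaded provider.
        target = min(providers, key=lambda n: loads[n])
        return {"service_type": service_type, "distribute_to": [target]}

# Example wiring with static information sources.
published = []
engine = LoadBalancingEngine(
    get_global_load=lambda: {"n1": 0.9, "n2": 0.2, "n3": 0.5},
    get_global_services=lambda: {"n1": {"svc"}, "n2": {"svc"}, "n3": {"other"}},
    publish=published.append,
)
engine.publish(engine.compute_first_policy("svc"))
```

The separation matters: the engine only computes and publishes policies; it never sits on the data path of service messages, which is what distinguishes this design from the centralized load balancer of FIG. 1.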
  • In this way, each client independently dispatches its service messages, which avoids the impact of massive service calls on the load balancing engine, and avoids the system failures that centralized processing of service calls can cause when the load balancer cannot cope with high call volumes.
  • When developers upgrade the distributed system, they only need to upgrade the load balancing engine, which makes upgrades convenient. In addition, even if developers use multiple language stacks, they only need to develop one load balancing engine for the different language stacks: the clients use common code to invoke the load balancing policies released by the load balancing engine, which can save a great deal of development cost.
  • In a possible implementation, the load balancing engine further includes: a service global view, configured to acquire the service invocation relationships among the M computing nodes; the policy calculation module is configured to perform, for the first service type, load balancing calculation using the global load information, the global service information, and the service invocation relationships to generate the first load balancing policy. Since a service may need to call other services to process the client's service message, even if the load of the computing node hosting the service of the first service type is low, a high load on the computing nodes hosting the other services it invokes will still degrade the quality of service. Therefore, when generating the first load balancing policy, taking into account the loads of both the computing node hosting the service of the first service type and the other computing nodes having calling relationships with it helps improve the overall computing performance of the distributed computing system and reduce service delay.
  • In a possible implementation, the policy calculation module is specifically configured to: determine, according to the global service information, the target computing nodes among the M computing nodes that provide the service of the first service type; determine, according to the service invocation relationships, the related computing nodes among the M computing nodes whose services have a calling relationship with the service of the first service type provided by the target computing nodes; and determine, according to the global load information, the loads of the target computing nodes and the related computing nodes and perform load balancing calculation to generate the first load balancing policy.
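  • The three determination steps can be sketched as follows (illustrative; the combined load metric used to rank target nodes is an assumption, not the patent's formula):

```python
def first_load_policy(service_type, services, calls, loads):
    """services: {node: set of provided service types}
    calls:    {service type: set of service types it invokes}
    loads:    {node: load in [0, 1]}
    """
    # Step 1: target computing nodes that provide the first service type.
    targets = {n for n, provided in services.items() if service_type in provided}
    # Step 2: related computing nodes hosting services that the first
    # service type invokes (the service calling relationship).
    invoked = calls.get(service_type, set())
    related = {n for n, provided in services.items() if provided & invoked}

    # Step 3: rank each target by its own load plus the average load of the
    # related nodes (an assumed, illustrative combination).
    def combined_load(node):
        rel = [loads[r] for r in related if r != node]
        avg_related = sum(rel) / len(rel) if rel else 0.0
        return loads[node] + avg_related

    best = min(targets, key=combined_load)
    return {"service_type": service_type, "distribute_to": [best]}
```

The point of step 2 is visible in the ranking: a lightly loaded target node can still lose to another target if the nodes hosting its downstream services are saturated.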
  • In a possible implementation, the policy calculation module is specifically configured to generate the first load balancing policy according to an objective function over the service delays, for example minimizing Σ_i t(S_i), where t(S_i) represents the service delay of the message chain of the i-th service message.
  • In a possible implementation, the policy calculation module is further configured to: based on a preset service delay, perform load balancing calculation using the global load information, the global service information, and the service calling relationships to generate a second load balancing policy, the second load balancing policy being used to instruct the M computing nodes to perform service adjustment; and the policy issuing module is further configured to issue the second load balancing policy to the M computing nodes.
  • the second load balancing policy indicates that a distribution ratio of service messages between at least two computing nodes that have a service invocation relationship is adjusted.
  • the second load balancing policy indicates that the service location between the computing nodes that have the service calling relationship is adjusted.
  • the second load balancing policy indicates that the service between the computing nodes that have the service calling relationship is expanded or deleted.
  • In a possible implementation, the global load information, the global service information, and the service calling relationships are all acquired periodically; accordingly, the policy calculation module periodically calculates the first load balancing policy or the second load balancing policy, and the policy issuing module periodically issues it.
  • In a second aspect, an embodiment of the present invention provides a client applied to a distributed computing system, where the distributed computing system includes a load balancing engine and M computing nodes, M being a natural number greater than 1. The client includes: a local cache, configured to acquire and cache the first load balancing policy issued by the load balancing engine, where the first load balancing policy indicates the distribution information of service messages of the first service type; a service management module, configured to receive a first service request; and a load policy calculation module, configured to query the local cache and, when the first load balancing policy stored in the local cache matches the first service request, determine from the M computing nodes, according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request. The service management module is further configured to send the service message corresponding to the first service request to the target computing node according to the distribution information indicated by the first load balancing policy.
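  • The client-side flow of this aspect (cache the policy, match it against an incoming request, select a target node) can be sketched as follows (illustrative; the policy layout and the weighted random dispatch are assumptions):

```python
import random

class Client:
    def __init__(self):
        self.local_cache = {}  # service type -> first load balancing policy

    def on_policy(self, policy):
        """Acquire and cache a policy issued by the load balancing engine."""
        self.local_cache[policy["service_type"]] = policy

    def dispatch(self, service_type, message):
        """Service management + load policy calculation: match, then select."""
        policy = self.local_cache.get(service_type)  # query the local cache
        if policy is None:
            raise LookupError(f"no cached policy matches service type {service_type!r}")
        # Distribution information: target nodes and their distribution ratios.
        nodes = list(policy["distribution"])
        ratios = [policy["distribution"][n] for n in nodes]
        target = random.choices(nodes, weights=ratios)[0]
        return target, message  # a real client would now send the message

client = Client()
client.on_policy({"service_type": "svc", "distribution": {"n2": 0.7, "n3": 0.3}})
target, _ = client.dispatch("svc", "request-1")
```

Note that the matching logic is trivial here (a dictionary lookup by service type), which mirrors the description: a policy matches a request exactly when the request's service belongs to the policy's service type.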
  • In this way, general-purpose code can be used to receive and query the load balancing policy, which saves a great deal of R&D cost and reduces resistance when upgrading the distributed computing system.
  • In a third aspect, an embodiment of the present invention further provides a distributed computing system, comprising the load balancing engine according to the first aspect or any possible implementation thereof, and M computing nodes coupled to the load balancing engine.
  • In this system, the load balancing engine is responsible for calculating the load balancing policy, and each client performs service calls according to that policy; that is, the calculation and execution of the load balancing policy are separated, which avoids the problem that the load balancer 12 can hardly handle high-traffic service calls and restricts system performance.
  • the distributed computing system further includes: the client as described in the second aspect.
  • the distributed computing system further includes: a registration server, configured to collect the global service information of the M computing nodes, and send the global service information to the load balancing engine.
  • In a possible implementation, the distributed computing system further includes: a monitoring module, configured to obtain the global load information by collecting the loads of the M computing nodes, and to send the global load information to the load balancing engine.
  • the distributed computing system further includes: a management node, configured to receive the second load balancing policy sent by the load balancing engine, and according to the second load balancing policy, The M computing nodes perform service adjustment.
  • An embodiment of the present invention further provides a load balancing method applied to a distributed computing system, including: acquiring global load information of the distributed computing system, the global load information indicating the respective loads of the M computing nodes in the distributed computing system; acquiring global service information of the distributed computing system, the global service information indicating the types of services provided by the M computing nodes, where M is a natural number greater than 1; performing, for a first service type, load balancing calculation using the global load information and the global service information to generate a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates the distribution information of service messages corresponding to the first service type; and issuing the first load balancing policy to the client.
  • In this method, the load balancing engine is responsible for calculating the load balancing policy, and each client performs service calls according to that policy; the calculation and execution of the policy are thus separated, which avoids the difficulty of handling high-traffic service calls that restricts system performance. In subsequent upgrades, only the load balancing engine needs to be updated, which saves development cost and reduces upgrade resistance.
  • In a possible implementation, the method further includes: acquiring the service invocation relationships among the M computing nodes; and the step of performing load balancing calculation for the first service type to generate the first load balancing policy corresponding to the first service type includes: performing, for the first service type, load balancing calculation using the global load information, the global service information, and the service calling relationships to generate the first load balancing policy.
  • In a possible implementation, the step of performing, for the first service type, load balancing calculation using the global load information, the global service information, and the service calling relationships to generate the first load balancing policy includes: determining, according to the global service information, the target computing nodes among the M computing nodes that provide the service of the first service type; determining, according to the service calling relationships, the related computing nodes among the M computing nodes whose services have a calling relationship with the service of the first service type provided by the target computing nodes; and determining, according to the global load information, the loads of the target computing nodes and the related computing nodes and performing load balancing calculation to generate the first load balancing policy.
  • In a possible implementation, the method further includes: based on a preset service delay, performing load balancing calculation using the global load information, the global service information, and the service calling relationships to generate a second load balancing policy, the second load balancing policy being used to instruct the M computing nodes to perform service adjustment; and advertising the second load balancing policy to the M computing nodes.
  • the second load balancing policy indicates that a distribution ratio of service messages between at least two computing nodes that have a service invocation relationship is adjusted.
  • the second load balancing policy indicates that the service location between the computing nodes that have the service calling relationship is adjusted.
  • the second load balancing policy indicates that the service between the computing nodes that have the service calling relationship is expanded or deleted.
  • An embodiment of the present invention further provides a load balancing method applied to a client in a distributed computing system, the method including: acquiring and caching a first load balancing policy issued by the load balancing engine, the first load balancing policy indicating the distribution information of service messages of a first service type; receiving a first service request; querying the cached first load balancing policy and, when the cached first load balancing policy matches the first service request, determining from the M computing nodes, according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request; and sending the service message corresponding to the first service request to the target computing node according to the distribution information indicated by the first load balancing policy.
  • An embodiment of the present invention further provides a load balancing method applied to a computing node or a management node in a distributed computing system, the method including: receiving a second load balancing policy sent by a load balancing engine; and performing service adjustment on the computing node according to the second load balancing policy.
  • FIG. 1 is a schematic diagram of a centralized load balancing solution provided by the prior art
  • FIG. 2 is a schematic diagram of another load balancing scheme provided by the prior art
  • FIG. 3 is a structural diagram of a distributed computing system according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of each device in the distributed computing system shown in FIG. 3;
  • FIG. 5 is a schematic diagram of a call relationship between services according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of load balancing for a computing node having a service invocation relationship according to an embodiment of the present invention
  • FIGS. 7a and 7b are schematic diagrams of adjusting the distribution ratio of service messages according to an embodiment of the present invention;
  • FIGS. 8a and 8b are schematic diagrams of adjusting a service location according to an embodiment of the present invention;
  • FIGS. 9a, 9b, and 9c are schematic diagrams of scaling out a service according to an embodiment of the present invention;
  • FIG. 10 is another schematic structural diagram of a load balancing engine according to an embodiment of the present disclosure;
  • FIG. 11 is a schematic diagram of a device of a client according to an embodiment of the present disclosure.
  • FIG. 12 is a schematic flowchart of a load balancing method applied to a load balancing engine according to an embodiment of the present disclosure
  • FIG. 13 is a schematic flowchart of a load balancing method applied to a client according to an embodiment of the present disclosure
  • FIG. 14 is a schematic flowchart of a load balancing method applied to a computing node (or a management node) according to an embodiment of the present invention.
  • Embodiments of the present invention provide an architectural diagram of a distributed computing system.
  • the distributed computing system 30 can include a client 31, a load balancing engine 32, and a service provider 33 including M computing nodes, wherein the client 31, the load balancing engine 32 And each of the computing nodes in the service provider 33 communicates with each other via the network 34, and M is an integer greater than one.
  • The network 34 may be a wired network, a wireless network, a local area network (LAN), a wide area network (WAN), or a mobile communication network.
  • the client 31 can access the network 34 through an Access Point (AP) 341 to communicate with any of the load balancing engine 32 or the service provider 33.
  • In practice, computing nodes are usually deployed in clusters; that is, all the computing nodes in the distributed computing system can be divided into multiple clusters, and the M computing nodes in this embodiment may be the computing nodes of all the clusters, or the computing nodes of one or more of the clusters.
  • FIG. 4 further illustrates the internal structure of the various devices in distributed computing system 30.
  • the distributed computing system 30 is further described below in conjunction with FIG.
  • the load balancing engine 32 may include: a load information management module 321, a service information management module 322, a policy calculation module 323, and a policy release module 324;
  • the load information management module 321 is configured to acquire global load information of the distributed computing system 30, where the global load information indicates respective loads of the M computing nodes in the service provider 33;
  • the service information management module 322 is configured to obtain global service information, where the global service information indicates a type of service provided by the M computing nodes, where each computing node can provide at least one type of service, Skilled artisans will appreciate that in a distributed computing system, a compute node can be a personal computer, workstation, server or other type of physical machine, or it can be a virtual machine. The services on the compute nodes usually run on physical or virtual machines in the form of processes. Therefore, multiple services can usually be provided on one compute node.
  • The policy calculation module 323 is configured to perform, for the first service type, load balancing calculation using the global load information and the global service information, to obtain a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates the distribution information of the service messages corresponding to the first service type among the M computing nodes. The distribution information includes a distribution object, or a distribution object together with a distribution ratio.
  • Exemplarily, in one application scenario, suppose the computing node 331, the computing node 332, and the computing node 333 all provide the service of the first service type, and the policy calculation module 323 learns from the global load information that the loads of the computing node 331 and the computing node 332 are too high; it then generates a first load balancing policy indicating the computing node 333 as the distribution object of service messages of the first service type. In another application scenario, again suppose that the computing nodes 331, 332, and 333 all provide the service of the first service type, and the policy calculation module 323 learns from the global load information that the load of the computing node 331 is too high while the computing nodes 332 and 333 can each contribute part of their processing capability; it may then generate a first load balancing policy indicating the computing nodes 332 and 333 as the distribution objects of service messages of the first service type, and the policy may also indicate the respective distribution ratios for the computing nodes 332 and 333.
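  • The two scenarios above can be reduced to a small ratio computation (illustrative; the overload threshold and the proportional split over spare capacity are assumptions made for the sketch):

```python
def distribution_from_loads(loads, overload=0.8):
    """Turn node loads into distribution information.

    loads: {node: load in [0, 1]} for the nodes providing the service type.
    Returns {node: distribution ratio} over nodes with spare capacity.
    """
    # Nodes at or above the threshold are considered too loaded to receive traffic.
    spare = {n: 1.0 - load for n, load in loads.items() if load < overload}
    total = sum(spare.values())
    # Split traffic in proportion to each remaining node's spare capacity.
    return {n: s / total for n, s in spare.items()}

# Scenario 1: nodes 331 and 332 overloaded, all traffic goes to node 333.
scenario1 = distribution_from_loads({"node331": 0.9, "node332": 0.85, "node333": 0.4})
# Scenario 2: only node 331 overloaded, traffic is split between 332 and 333.
scenario2 = distribution_from_loads({"node331": 0.95, "node332": 0.6, "node333": 0.2})
```

In scenario 1 the distribution information degenerates to a single distribution object; in scenario 2 it carries both the distribution objects and their ratios, matching the two forms of distribution information described above.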
  • The policy issuing module 324 is configured to issue the first load balancing policy to the client 31.
  • the client 31 may include:
  • a local cache 311, configured to acquire and cache a first load balancing policy issued by the load balancing engine 32, where the first load balancing policy indicates distribution information of a service message of the first service type;
  • the service management module 312 is configured to receive a first service request of the client.
  • The load policy calculation module 313 is configured to, in response to the first service request, determine by querying the local cache 311 whether the first load balancing policy stored there matches the first service request, and, when it matches, determine from the M computing nodes, according to the distribution information indicated by the first load balancing policy, the target computing node matching the first service request. It should be understood that, because the first load balancing policy is specific to the first service type, when the service corresponding to the first service request also belongs to the first service type, the load policy calculation module 313 considers the first load balancing policy to match the first service request; otherwise they do not match.
• the service management module 312 is further configured to send, according to the distribution information indicated by the first load balancing policy, the service message corresponding to the first service request to the target computing node, so that the target computing node responds to the first service request and provides the service.
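The client-side flow just described — cache the policy issued by the engine, check whether it matches an incoming service request, then pick a target node from the indicated distribution information — can be sketched as follows. All names (`LocalCache`, `dispatch`) and the dictionary layout of a policy are illustrative assumptions, not interfaces from this disclosure:

```python
import random

class LocalCache:
    """Caches load balancing policies issued by the engine, keyed by service type."""
    def __init__(self):
        self._policies = {}

    def store(self, policy):
        self._policies[policy["service_type"]] = policy

    def lookup(self, service_type):
        return self._policies.get(service_type)

def dispatch(cache, service_request):
    """Return the target node for a request, or None when no cached policy matches."""
    policy = cache.lookup(service_request["service_type"])
    if policy is None:
        return None  # no matching policy: the client would fall back to the engine
    nodes = list(policy["distribution"])              # candidate target nodes
    weights = list(policy["distribution"].values())   # per-node distribution ratios
    return random.choices(nodes, weights=weights)[0]
```

With a cached policy such as `{"service_type": "A", "distribution": {"node-332": 2, "node-333": 1}}`, roughly two thirds of matching requests would be dispatched to `node-332`.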
• the relationship between a service request and a service message is briefly described here: when the client 31 requests the service provider 33 to provide a service, it first needs to send a service request to the service provider 33; when the service provider 33 accepts the request, the client sends the corresponding service message to the service provider 33, the service provider 33 performs the corresponding processing, and finally the processed result is fed back to the client 31 to complete the service.
• the load balancing engine 32 is responsible for the calculation of the load balancing policy, and each client 31 performs service calls according to the load balancing policy; that is, the calculation and the execution of the load balancing policy are separated, which avoids the problem in the centralized load balancing scheme shown in FIG. 1 that the load balancer 12 has difficulty handling large-volume service calls and restricts system performance.
• the load balancing engine 32 may directly collect the load information of the M computing nodes and obtain the global load information by aggregating the collected load information.
• alternatively, a monitoring module 35 for monitoring the working status of the M computing nodes, for example a metric monitoring module, is deployed in the computing system 30, and the global load information is acquired from the monitoring module 35.
  • the global load information collected from the M computing nodes includes but is not limited to the following types of loads:
• resource occupancy information, such as: central processing unit (CPU) occupancy rate, memory occupancy rate, network bandwidth occupancy rate, and the like;
• Throughput information, for example: the number of service messages received by each service per unit time, the number of service messages sent by each service per unit time, and the number of objects to which they are sent;
  • Service delay information such as: average processing delay of service messages, average waiting delay of service messages before processing, communication delay between services, etc.
• the processing delay of a service message is related to the following factors: 1. the capability of the physical hardware, such as the central processing unit (CPU) or input/output (I/O) devices, of the computing node where the service is located; 2. whether other types of services on the computing node where the service is located occupy resources, which is determined by sampling within a certain period of time. The communication delay of a service message is related to the following factors: 1. the network capability of the computing node where the service is located, for example, whether the network bandwidth is 1 GB or 10 GB; 2. whether the network of the computing node where the service is located is preempted by other services; 3. the communication distance between the two services: for example, when the two services are on the same computing node, the communication delay is minimal; communication between computing nodes takes longer; and communication across data centers takes longer still;
• Remaining resource information, for example: the remaining physical resources of the computing node where the service is located.
• the service information management module 322 may separately collect service information from the M computing nodes and then aggregate the collected service information to obtain the global service information; alternatively, the global service information of the M computing nodes may be obtained from a service registry 34 configured in the distributed computing system 30, where each computing node registers its service information with the service registry 34 upon initialization.
• the global service information may specifically include service group information and deployment information, where the service group information indicates the services deployed on each computing node, grouped by service type, and the deployment information indicates the processing capability of the services deployed on each computing node and the total processing capability of each computing node.
• a service deployed on a computing node is usually called a service instance and refers to a specific running entity of the service on that computing node;
• a service group refers to a collection of several instances of the same service type (Service Type) that together provide a service.
• the policy calculation module 323 may, for the first service type, use the global load information and the global service information to perform load balancing calculation based on a preset load balancing algorithm, to obtain the load balancing policy corresponding to the first service type.
  • the load balancing algorithm adopted by the policy calculation module 323 can be generally divided into two types: a static load balancing algorithm and a dynamic load balancing algorithm.
  • the static load balancing algorithm may include:
• Round Robin: in each round of polling, the M computing nodes are queried in sequence. When one of the computing nodes is overloaded or faulty, it is removed from the sequential cyclic queue formed by the M computing nodes and does not participate in the next round of polling until it returns to normal;
• Ratio: set a weight for each computing node to represent the proportion of message allocation, and assign the service messages sent by the client to the computing nodes based on these proportions. When one of the computing nodes is overloaded or faulty, it is removed from the queue formed by the M computing nodes and is not allocated the next service message until it returns to normal;
• Priority: group the M computing nodes and set a different priority for each group, then assign the client's service messages to the computing node group with the highest priority (within the same computing node group, service messages are distributed by a polling or ratio algorithm). When the computing node group with the highest priority is overloaded or faulty, service requests are sent to the computing node group with the second-highest priority;
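As a rough illustration, the Round Robin and Ratio modes above might be sketched as follows; the class names, the `down` set used to model overloaded or faulty nodes, and the smooth-weighted variant chosen for the Ratio mode are assumptions for illustration, not the patent's implementation:

```python
class RoundRobin:
    """Polls nodes in a fixed cyclic order; overloaded/faulty nodes are skipped."""
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()   # nodes currently removed from the cyclic queue
        self._i = 0

    def next_node(self):
        for _ in range(len(self.nodes)):
            node = self.nodes[self._i % len(self.nodes)]
            self._i += 1
            if node not in self.down:
                return node
        return None  # every node is out of the queue

class Ratio:
    """Distributes messages in proportion to per-node weights (smooth weighted scheme)."""
    def __init__(self, weights):            # weights: {node: weight}
        self.weights = dict(weights)
        self.credit = {n: 0 for n in weights}

    def next_node(self):
        for n in self.credit:               # every node earns credit by its weight
            self.credit[n] += self.weights[n]
        best = max(self.credit, key=self.credit.get)
        self.credit[best] -= sum(self.weights.values())  # charge the chosen node
        return best
```

With weights `{"a": 2, "b": 1}`, the Ratio scheduler sends two of every three messages to node `a`.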
  • the dynamic load balancing algorithm can include:
• Least connections: the service message is allocated to the computing node with the fewest connections. When that computing node is overloaded or faulty, it does not participate in the allocation of the next service message until it returns to normal. Here, a connection refers to the communication connection maintained between the client and a computing node in order to receive or send service messages, and the number of connections is proportional to the throughput of the computing node;
• Fastest mode: the service message is assigned to the computing node that responds fastest. When the fastest-responding computing node is overloaded or faulty, it does not participate in the allocation of the next service message until it returns to normal. Here, the response time of each computing node includes the time to receive and send the service message and the time to process it. It should be understood that the faster the response, the shorter the time the computing node takes to process service messages, or the shorter the communication time between the computing node and the client;
• Predictive mode: collect the current performance indicators of the M computing nodes and perform predictive analysis, and allocate service messages in the next time period to the computing node predicted to perform best;
• Dynamic performance allocation (DynamicRatio-APM): collect and analyze the performance parameters of the M computing nodes in real time, and dynamically distribute service messages according to these performance parameters;
• Dynamic node supplementation: set a part of the M computing nodes as the primary computing node group and the rest as backup computing nodes. When the number of nodes in the primary computing node group decreases due to overload or failure, backup computing nodes are dynamically supplemented into the primary computing node group.
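A minimal sketch of the least-connections mode described above; the `down` set modeling overloaded or faulty nodes and the assumption that the client itself tracks connection counts are illustrative choices, not the patent's interfaces:

```python
class LeastConnections:
    """Assigns each message to the node with the fewest live connections;
    overloaded or faulty nodes are excluded until they recover."""
    def __init__(self, nodes):
        self.connections = {n: 0 for n in nodes}  # live connections per node
        self.down = set()

    def assign(self):
        candidates = [n for n in self.connections if n not in self.down]
        if not candidates:
            return None
        node = min(candidates, key=lambda n: self.connections[n])
        self.connections[node] += 1   # one more connection held toward this node
        return node

    def release(self, node):
        self.connections[node] -= 1   # connection closed after the reply arrives
```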
• the load balancing algorithm in this embodiment includes, but is not limited to, the above algorithms; it may be a combination of several of the above algorithms, an algorithm specified by a client according to customized rules, or any of the various algorithms used in the prior art.
• the load balancing algorithm may be imported into the policy calculation module 323 through the load balancing algorithm plug-in 326 and participate in the calculation of the load balancing policy.
• in this way, the operator of the distributed computing system can participate in maintenance in a more convenient manner; for example, the load balancing algorithm can be updated through the load balancing algorithm plug-in 326 to upgrade the system.
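The plug-in mechanism can be pictured as a registry of interchangeable algorithm callables that the policy calculation module invokes; the names below (`PolicyCalculationModule`, `register_plugin`, the toy inverse-load algorithm) are illustrative assumptions, not the actual interfaces of plug-in 326:

```python
class PolicyCalculationModule:
    """Computes a load balancing policy with whichever algorithm was plugged in."""
    def __init__(self):
        self._algorithms = {}

    def register_plugin(self, name, algorithm):
        # an operator installs or upgrades an algorithm without touching the engine core
        self._algorithms[name] = algorithm

    def compute_policy(self, name, load_info, service_info):
        return self._algorithms[name](load_info, service_info)

def inverse_load_ratio(load_info, service_info):
    """Toy plug-in: give each node a weight inversely proportional to its load."""
    return {node: round(1.0 / max(load, 0.01), 2)
            for node, load in load_info.items()}
```

Swapping in a new algorithm is then a single `register_plugin` call, which mirrors the upgrade path the paragraph above describes.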
• with the rapid development of distributed microservices in particular, a service is often both a provider of services and a consumer of services.
• the message chain depth of a microservice is generally greater than 1, where a message chain depth greater than 1 indicates that a service needs to call at least one other service.
• for example, when Service A is regarded as a consumer of services, Service A needs to call Service B, and Service B depends on Service C to provide the service; therefore, Service A has a message chain depth of 2.
• at present, each service only pays attention to the load of the next level of services; that is, when Service A calls two Service Bs, its load balancing policy only considers the loads of the two Service Bs, and when Service B calls Service C, it performs load balancing independently based on the loads of the three Service Cs. That is to say, when the current distributed computing system calculates a load balancing policy, it does not pay attention to the overall load balancing from Service A to Service C.
  • the distributed computing system 30 shown in FIG. 4 may further include: a service global view 325, configured to acquire a service invocation relationship between the M computing nodes.
• Service A, Service B, and Service C shown in FIG. 4 may be provided by the same computing node, or may be provided by different computing nodes. Therefore, the service calling relationship obtained by the service global view 325 includes both the calling relationships between different services on the same computing node and the service calling relationships between different computing nodes.
• here, Service A calling Service B means that Service A depends on part of the services provided by Service B in order to provide a complete service.
  • the policy calculation module 323 may be specifically configured to perform load balancing calculation on the first service type, using the global load information, the global service information, and the service calling relationship, to generate the first Load balancing strategy.
  • the policy calculation module 323 may be specifically configured to:
• the naming of the computing nodes here is only for convenience of description; it should be understood that the target computing node and the related computing node refer to computing nodes corresponding to the first service type among the M computing nodes.
• assume that the service of the first service type is Service A;
• the policy calculation module 323 in the load balancing engine 32 determines the computing node 1 and the computing node 2, which provide Service A, as the target computing nodes;
• because the Service A provided by the computing node 1 or the computing node 2 has a calling relationship with the Service C provided by the computing node 4 and the computing node 5, the policy calculation module 323 can also determine, according to the obtained service calling relationship, the computing node 4 and the computing node 5 as related computing nodes; next, the policy calculation module 323 can obtain, according to the global load information, the respective loads of the target computing nodes and the related computing nodes (that is, the computing node 1, the computing node 2, the computing node 4, and the computing node 5), and then generate the first load balancing policy through load balancing calculation. Subsequently, when the client 31 issues a service request for Service A, the load balancing engine 32 can respond to the service request and determine the distribution information of the service message corresponding to the service request according to the first load balancing policy.
• the load balancing engine may also generate a one-to-one corresponding load balancing policy for each service type provided by the distributed computing system, to support the scheduling of service messages of different service types from different clients;
• for the method of generating the load balancing policies corresponding to the other service types, refer to the method for generating the first load balancing policy; details are not described herein again.
  • the following uses the observation mode as a load balancing algorithm as an example to illustrate the calculation of the load balancing strategy:
• Φ = {φ1, φ2, ..., φn} represents a set of n service messages, i.e., a message flow, where φi represents the i-th service message, i and n are both natural numbers, and 1 ≤ i ≤ n;
• S denotes the set of message chains of the n service messages, Si denotes the message chain of the i-th service message, and si,k denotes the k-th service in the message chain of the i-th service message, where k is a natural number;
• a message chain refers to the link formed by all the services to be invoked when the distributed computing system processes a service message, and the message chain can be determined from the service calling relationship obtained by the service global view 325;
• t(Si) represents the total time taken by the message chain of the i-th service message, i.e., the service delay, and can be written as t(Si) = Σk tproc(si,k) + Σk tcomm(si,k, si,k+1), where the first sum represents the processing delay required in the message chain of the i-th service message and the second sum represents the communication delay required in that message chain; the processing delay and the communication delay may be determined from the global load information acquired by the load information management module 321;
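Reading the service delay of a message chain as the sum of per-service processing delays plus per-hop communication delays, t(Si) can be computed as below; the delay tables and function names are illustrative stand-ins for the global load information, not the patent's interfaces:

```python
def chain_delay(chain, proc_delay, comm_delay):
    """t(Si): total time spent on the message chain of one service message.

    chain       -- ordered services in the message chain, e.g. ["A", "B", "C"]
    proc_delay  -- per-service processing delay (ms), from global load information
    comm_delay  -- per-hop communication delay (ms), keyed by (caller, callee)
    """
    processing = sum(proc_delay[s] for s in chain)
    communication = sum(comm_delay[(a, b)] for a, b in zip(chain, chain[1:]))
    return processing + communication

def worst_chain(chains, proc_delay, comm_delay):
    """Pick the chain with the largest end-to-end delay (a candidate for rebalancing)."""
    return max(chains, key=lambda c: chain_delay(c, proc_delay, comm_delay))
```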
• the policy calculation module 323 is further configured to perform, based on a preset service delay, load balancing calculation using the global load information and the global service information, to generate a second load balancing policy.
• different from the first load balancing policy, the second load balancing policy is a policy for indicating service adjustment among the M computing nodes.
• correspondingly, the policy issuing module 324 may be further configured to advertise the second load balancing policy to the M computing nodes.
• the service delay refers to the entire process in which a node receives a service message in response to a service request, processes the service message, and returns the processed service message; the service delay can also be called end-to-end delay.
• the service delay may be specifically set according to the tolerance of different services to delay.
• for a delay-sensitive service, the service delay may be set based on the principle of minimum end-to-end delay.
• for a delay-tolerant service, the service delay may be set by weighing the overall performance of the distributed computing system against the tolerance of the service to delay; this is not specifically limited herein.
  • the second load balancing policy may indicate that a message distribution ratio of services of at least two computing nodes that have a service invocation relationship is adjusted. This is further explained below in connection with Figures 7a and 7b.
• Service A1 and Service A2 may be two different types of services, or two services of the same type, while Service B1, Service B2, and Service B3 are services of the same type.
• Service B1 in the computing node 331, Service B2 in the computing node 332, and Service B3 in the computing node 333 each have a calling relationship with Service A1 and Service A2; in addition, Service A1 and Service A2 each send 3000 messages per second (3000 msg/s) to Service B1, Service B2, and Service B3, with the 3000 messages equally distributed among Service B1, Service B2, and Service B3, while Service B1, Service B2, and Service B3 each have a processing capability of 2000 messages per second (2000 msg/s).
• in FIG. 7a, a total of 4000 messages need to be transmitted in a cross-node communication manner.
• generally, the communication delay between services in the same computing node is much smaller than the delay of cross-node communication; therefore, transmitting messages according to the message distribution ratios shown in FIG. 7a can result in large delays that affect the performance of the distributed computing system.
  • the policy calculation module may generate a second load balancing policy based on the preset service delay to indicate that the message distribution ratios of the services of the computing node 331, the computing node 332, and the computing node 333 are adjusted.
• for example, as shown in FIG. 7b, the second load balancing policy may indicate that Service A1 sends 2000 messages to Service B1 located in the same computing node and sends the remaining 1000 messages to Service B2 in the computing node 332;
• similarly, it may indicate that Service A2 sends 2000 messages to Service B3 located in the same computing node and sends the remaining 1000 messages to Service B2 in the computing node 332.
• in this way, only 2000 messages need to be sent in a cross-node communication manner, half of the 4000 cross-node messages in FIG. 7a, which significantly reduces the communication delay.
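The benefit of the adjusted distribution ratios can be checked with a few lines of arithmetic; the placement and flow tables below restate the FIG. 7a/7b example (node numbers and per-flow rates come from the description above, the function name is an assumption):

```python
def cross_node_traffic(placement, flows):
    """Sum the messages per second that must cross node boundaries.

    placement -- {service: node}
    flows     -- {(sender, receiver): msg_per_s}
    """
    return sum(rate for (src, dst), rate in flows.items()
               if placement[src] != placement[dst])

placement = {"A1": 331, "B1": 331, "B2": 332, "A2": 333, "B3": 333}

# FIG. 7a: A1 and A2 each spread 3000 msg/s evenly over B1, B2, B3
before = {("A1", "B1"): 1000, ("A1", "B2"): 1000, ("A1", "B3"): 1000,
          ("A2", "B1"): 1000, ("A2", "B2"): 1000, ("A2", "B3"): 1000}

# FIG. 7b: each sender keeps 2000 msg/s on its own node and sends 1000 to B2
after = {("A1", "B1"): 2000, ("A1", "B2"): 1000,
         ("A2", "B3"): 2000, ("A2", "B2"): 1000}
```

Evaluating both flow tables reproduces the 4000 → 2000 reduction in cross-node messages stated above, while every Service B instance stays within its 2000 msg/s capacity.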
  • the second load balancing policy may indicate that the service location between the computing nodes having the service invocation relationship is adjusted. This is further explained below in connection with Figures 8a and 8b.
• Service B1 and Service B2 are services of one type, and Service C1, Service C2, and Service C3 are services of another type.
  • Service A1 in the computing node 331 has a calling relationship with Service B1 and Service B2 in the computing node 332, respectively, and Service B1 and Service B2 respectively have a calling relationship with Service C1, Service C2 and Service C3 in the computing node 333.
• in the existing load balancing policy, each service can only perceive the load information of the services at the next level that it calls; that is, Service A can only perceive the loads of Service B1 and Service B2 (i.e., the computing node 332).
• under this constraint, equal allocation between Service B1 and Service B2 is the best load balancing strategy, so the 3000 messages sent by Service A per second are distributed equally to Service B1 and Service B2. Similarly, Service B1 and Service B2 each distribute their 1500 messages equally among Service C1, Service C2, and Service C3.
• in this embodiment, the load balancing engine may take into account the loads of the computing node 332 and the computing node 333 together with the preset service delay,
• and calculate a second load balancing policy to indicate adjustment of the locations of the services deployed on the computing node 332 and the computing node 333.
• for example, the second load balancing policy may indicate that the Service C1 originally deployed on the computing node 333 is redeployed to the computing node 332, and the Service B2 originally deployed on the computing node 332 is redeployed to the computing node 333.
• in this way, Service A can distribute 2000 messages to Service B2, which then distributes them equally to Service C2 and Service C3, and distribute the remaining 1000 messages to Service B1, which then forwards them to Service C1. It is not difficult to see that in FIG. 8a a total of 6000 messages require cross-node communication, whereas in FIG. 8b, after the service locations are adjusted according to the second load balancing policy, only 3000 messages need to cross nodes, which significantly reduces the communication delay of the distributed computing system.
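The service relocation example can be verified the same way, by comparing cross-node traffic under the two placements; the FIG. 8b placement below assumes that Service C1 moves next to Service B1 on the computing node 332 and Service B2 moves next to Service C2/C3 on the computing node 333, which reproduces the 6000 → 3000 reduction stated above (names and layout are illustrative):

```python
def cross_node_msgs(placement, flows):
    """Messages per second that leave their computing node under a given placement."""
    return sum(rate for (src, dst), rate in flows.items()
               if placement[src] != placement[dst])

# FIG. 8a: A on 331; B1, B2 on 332; C1, C2, C3 on 333; everything split evenly
fig_8a = {"A": 331, "B1": 332, "B2": 332, "C1": 333, "C2": 333, "C3": 333}
flows_8a = {("A", "B1"): 1500, ("A", "B2"): 1500,
            ("B1", "C1"): 500, ("B1", "C2"): 500, ("B1", "C3"): 500,
            ("B2", "C1"): 500, ("B2", "C2"): 500, ("B2", "C3"): 500}

# FIG. 8b (assumed): C1 relocated to 332, B2 relocated to 333, flows rerouted
fig_8b = {"A": 331, "B1": 332, "C1": 332, "B2": 333, "C2": 333, "C3": 333}
flows_8b = {("A", "B1"): 1000, ("A", "B2"): 2000,
            ("B1", "C1"): 1000, ("B2", "C2"): 1000, ("B2", "C3"): 1000}
```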
  • the second load balancing policy may further indicate that the service between the computing nodes that have the service calling relationship is expanded or deleted. This is further explained below in connection with Figures 9a to 9c.
  • the message sending path of Service A is: Service A ⁇ Service B ⁇ Service C ⁇ Service D ⁇ Service E.
• when Service B is overloaded, the load balancing engine may indicate that a Service B1 of the same type as Service B is expanded in the message sending path to share the load. Further, considering the principle of minimum service delay, the load balancing engine may determine, according to the global service information and the service calling relationship, whether to expand Service B1 on the computing node 331 or on the computing node 332: if Service B1 were expanded on the computing node 332, additional cross-node communication would be introduced;
• therefore, the load balancing engine may indicate, through the second load balancing policy, that Service B1 is expanded on the computing node 331 to share the load and avoid causing additional cross-node communication.
• alternatively, the load balancing engine may indicate that a Service B2 of the same type as Service B is expanded on the computing node 332 to share the load of Service B, and that a Service C2 of the same type as Service C is also expanded.
• as shown in FIG. 9b, no additional cross-node communication is added; only because a Service C2 has been expanded on the computing node 332, the intra-node communication delay increases slightly.
• in an optional manner, the global load information, the global service information, and the service calling relationship are periodically acquired.
• correspondingly, the policy calculation module 323 is also configured to periodically calculate the first load balancing policy and the second load balancing policy, which are periodically issued by the policy issuing module 324.
• the client 31 may also periodically acquire the first load balancing policy issued by the load balancing engine 32, and the computing node (or the management node) may also periodically acquire the second load balancing policy issued by the load balancing engine 32.
• each module in the load balancing engine 32 can be implemented by hardware such as a large-scale integrated circuit (LSI), a digital signal processor (DSP), a field programmable gate array (FPGA), or a digital circuit.
  • each module in the client 31 can also be implemented by the above hardware.
  • load balancing engine 400 can be implemented using a general purpose computing device as shown in FIG.
  • elements in load balancing engine 400 may include, but are not limited to, system bus 410, processor 420, and system memory 430.
  • System bus 410 may include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Extended ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
  • System memory 430 includes volatile and non-volatile memory such as read only memory (ROM) 431 and random access memory (RAM) 432.
  • a basic input/output system (BIOS) 433 is typically stored in ROM 431, which contains basic routines that facilitate the transfer of information between the various components through system bus 410.
  • RAM 432 typically contains data and/or program modules that are immediately accessible and/or immediately operational by processor 420.
  • the data or program modules stored in the RAM 432 include, but are not limited to, an operating system 434, an application 435, other program modules 436, program data 437, and the like.
  • the load balancing engine 400 may also include other removable/non-removable, volatile/non-volatile storage media.
• the hard disk storage 441 may be a non-removable, non-volatile, readable and writable magnetic medium, and the external memory 451 may be any of a variety of removable, non-volatile external storage such as a compact disc, a magnetic disk, or a flash memory.
• the hard disk storage 441 is generally connected to the system bus 410 through a non-removable storage interface 440, and the external memory is generally connected to the system bus 410 through a removable storage interface 450.
• as an example, the hard disk storage 441 is illustrated as storing an operating system 442, an application 443, other program modules 444, and program data 445. It should be noted that these elements may be the same as or different from the operating system 434, application 435, other program modules 436, and program data 437 stored in the system memory 430.
• the functions of the respective modules of the load balancing engine 32 shown in FIG. 4 in the foregoing embodiment may be implemented by the processor 420 reading and executing the code or readable instructions stored in the above storage media.
  • I/O device 471 typically communicates with processor 420 via input/output interface 460.
• for example, the client can provide a customized load balancing algorithm to the processor 420 through the I/O device 471, so that the processor 420 calculates a load balancing policy according to the customized load balancing algorithm.
• the load balancing engine 400 can include a network interface 460, through which the processor 420 communicates with remote computers 461 (i.e., the computing nodes in the distributed computing system 30) to obtain the global load information, the global service information, and the service calling relationship described in the previous embodiments; based on this information, it calculates the load balancing policies (the first load balancing policy and the second load balancing policy) by executing the instructions in the storage medium, and then publishes the calculated load balancing policies to the clients or the computing nodes.
• similarly, the client can be implemented using the structure shown in FIG. 11. As shown in FIG. 11, the client 500 can include a processor 51, a memory 52, and a network interface 53, where the processor 51, the memory 52, and the network interface 53 communicate with each other through the system bus 54.
  • the network interface 53 is configured to obtain a first load balancing policy issued by the load balancing engine and cached in the memory 52, where the first load balancing policy indicates the distribution information of the service message of the first service type;
  • the processor 51 is configured to receive the first service request, and query the memory 52 to determine whether the first load balancing policy stored by the memory 52 matches the first service request.
  • the processor 51 is further configured to: when the first load balancing policy stored by the memory 52 matches the first service request, according to the distribution information indicated by the first load balancing policy, from the M And determining, by the computing node, the target computing node that matches the first service request, and sending, according to the distribution information indicated by the first load balancing policy, the service message corresponding to the first service request by using the network interface 53 Calculate the node for the target.
  • the processor 51 performs the corresponding functions according to the instructions stored in the memory 52 or other storage devices.
• the client shown in FIG. 11 adopts a general-purpose computer structure, and the foregoing computing node may also be a physical machine adopting a general-purpose computer structure; therefore, the structure of the computing node may also refer to the structure shown in FIG. 11, with only the processor performing different functions, and is not described again here.
• it should be understood that the various services provided by a computing node are the various processes running on its processor.
  • an embodiment of the present invention further provides a load balancing method, which is applied to a distributed computing system as shown in FIG. 3 or FIG. 4.
  • the method includes the following steps:
  • S101 Acquire global load information of the distributed computing system, where the global load information indicates a load of each of the M computing nodes in the distributed computing system;
  • S102 Obtain global service information of the distributed computing system, where the global service information indicates a type of service provided by the M computing nodes, where M is a natural number greater than 1.
• S104 Perform load balancing calculation using the global load information and the global service information, and generate a first load balancing policy corresponding to a first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information of service messages corresponding to the first service type;
  • S105 Publish the first load balancing policy to the client.
• optionally, the method may further include:
• S103 Acquire the service calling relationship among the M computing nodes.
• in this case, step S104 may include:
• performing load balancing calculation using the global load information, the global service information, and the service calling relationship to generate the first load balancing policy.
• optionally, the method may further include:
• S106 Perform, based on a preset service delay, load balancing calculation using the global load information, the global service information, and the service calling relationship, to generate a second load balancing policy, where the second load balancing policy is used to indicate service adjustment in the M computing nodes. For example, the second load balancing policy may indicate that the distribution ratio of service messages between at least two computing nodes having a service calling relationship is adjusted, or that the service locations between the computing nodes having a service calling relationship are adjusted, or that services on the computing nodes having a service calling relationship are expanded or removed.
• the above load balancing methods are all implemented by the load balancing engine, and the sequence of the steps is not limited. Since the method corresponds to the foregoing device embodiment of the load balancing engine, for related method details refer to the device embodiment of the load balancing engine; they are not repeated here.
  • the embodiment further provides a load balancing method, which is applied to a client in a distributed computing system as shown in FIG. 3 or FIG. 4, and the method includes the following steps:
  • S201 Acquire and cache a first load balancing policy issued by the load balancing engine, where the first load balancing policy indicates distribution information of a service message of the first service type.
• S202 Receive a first service request;
  • S203 Query the cached first load balancing policy, if the cached first load balancing policy matches the first service request, according to the distribution information indicated by the first load balancing policy, Determining, in the M computing nodes, a target computing node that matches the first service request;
  • S204 Send, according to the distribution information indicated by the first load balancing policy, a service message corresponding to the first service request to the target computing node.
  • the embodiment further provides a load balancing method, which is applied to a computing node or a management node in a distributed computing system as shown in FIG. 3 or FIG. 4, and the method includes the following steps. :
  • S301 Receive a second load balancing policy sent by the load balancing engine.
  • S302 Perform service adjustment on the computing node according to the second load balancing policy.

Abstract

Disclosed herein are a load balancing engine, a client, a distributed computing system, and a load balancing method. The load balancing engine includes: a load information management module, configured to obtain global load information of the distributed computing system; a service information management module, configured to obtain global service information of the distributed computing system; a policy calculation module, configured to perform, for a first service type, load balancing calculation by using the global load information and the global service information, to generate a first load balancing policy corresponding to the first service type, where the first load balancing policy indicates distribution information, among the M computing nodes, of service messages corresponding to the first service type; and a policy publishing module, configured to publish the first load balancing policy to a client. Because policy calculation is separated from policy execution, the distributed computing system provided by the present invention can handle high-volume service invocation and is easy to upgrade.

Description

Load balancing engine, client, distributed computing system, and load balancing method

Technical Field

The present invention relates to the field of electronics, and in particular, to a load balancing engine, a client, a distributed computing system, and a load balancing method.
Background

With the development of computing technology, some tasks require very large computing power to complete. With centralized computing, such a task takes a long time to finish; with distributed computing, the task can be decomposed into many small subtasks that are then dispatched to multiple computers for processing, which saves overall computing time and greatly improves computing efficiency.

For distributed computing, task scheduling is a fundamental and challenging problem: given a set of tasks and a number of computing nodes that can execute these tasks in parallel, find a method that effectively schedules the tasks onto the computing nodes so as to obtain better task completion time, throughput, and resource utilization. Load balancing (LB) is a key factor to consider in task scheduling and the key to optimizing distributed computing performance; how well the load balancing problem is solved directly determines how efficiently distributed computing resources are used and how well applications perform.

To solve the load balancing problem, the prior art provides a centralized load balancing solution. As shown in FIG. 1, in a distributed computing system 10, an independent load balancer 12 is placed between a client 11 and multiple computing nodes 13, where the load balancer 12 may be dedicated hardware, such as the various load balancing appliances provided by F5, or load balancing software such as LVS, HAProxy, or Nginx. When the client 11 invokes a target service, it sends a service request to the load balancer 12, and the load balancer 12 forwards the service request, according to some load balancing policy, to a computing node 13 that provides the target service. The client 11 discovers the load balancer 12 through the Domain Name System (DNS) 14: specifically, the DNS 14 configures a DNS domain name for each service, and this domain name points to the load balancer 12. The drawback of the centralized load balancing solution is that all service invocation traffic passes through the load balancer 12; when the number of services and the invocation volume are large, the load balancer 12 easily becomes the bottleneck constraining the performance of the distributed computing system 10, and once the load balancer 12 fails, the impact on the entire distributed computing system 10 is catastrophic.

To address the shortcomings of the centralized solution, the prior art also provides a client-side load balancing solution, which may also be called soft load balancing. As shown in FIG. 2, in a distributed computing system 20, a load balancing (LB) component 211 is integrated into the service process of the client 21 in the form of a library file. In addition, a server 23 provides a service registry supporting service self-registration and self-discovery. When each computing node 22 starts, it first registers with the server 23, recording the addresses of the services it provides in the service registry; each computing node 22 may also periodically report heartbeats to the service registry to indicate the liveness of its services. When a service process in the client 21 wants to access a target service, it first queries the service registry through the built-in LB component 211 for the address list corresponding to the target service, then selects a target service address based on some load balancing policy, and finally sends a request to the computing node 22 indicated by that address. Note that the load balancing policy used in this solution only needs to consider the load balance of the computing nodes providing the target service. The drawbacks of client-side load balancing are: first, if multiple language stacks are used within the developing enterprise, correspondingly many different clients must be developed, which significantly increases development and maintenance costs; second, after the client is delivered to users, upgrading the library file and modifying its code requires the users' cooperation, and the upgrade may meet resistance if that cooperation is insufficient.

On this basis, the present invention provides a new load balancing solution to overcome the problems in the prior art.
Summary

Embodiments of the present invention provide a new load balancing solution to solve the prior-art problem of being unable to handle high-volume service invocation, while saving development costs and facilitating upgrade and maintenance.

According to a first aspect, an embodiment of the present invention provides a load balancing engine, applied to a distributed computing system, including: a load information management module, configured to obtain global load information of the distributed computing system, where the global load information indicates the respective loads of M computing nodes in the distributed computing system; a service information management module, configured to obtain global service information of the distributed computing system, where the global service information indicates the types of services provided by the M computing nodes, and M is a natural number greater than 1; a policy calculation module, configured to perform, for a first service type, load balancing calculation by using the global load information and the global service information, to generate a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information, among the M computing nodes, of service messages corresponding to the first service type; and a policy publishing module, configured to publish the first load balancing policy to a client. Because the load balancing engine is responsible only for calculating load balancing policies and sends the generated policies to the clients, and each client schedules service messages independently, the impact of massive service invocation on the load balancing engine is avoided, and the system will not fail because a centralized handler cannot cope with high-volume service invocation. Moreover, when developers upgrade the distributed system, only the load balancing engine needs to be upgraded, which makes upgrading easy. In addition, even if developers use multiple language stacks, only a load balancing engine needs to be developed for the different language stacks, while the clients use common code to consume the load balancing policies published by the engine, which saves considerable development cost.
In a possible implementation, the load balancing engine further includes: a service global view, configured to obtain service invocation relationships among the M computing nodes; the policy calculation module is then configured to perform, for the first service type, load balancing calculation by using the load information, the global service information, and the service invocation relationships, to generate the first load balancing policy. Because one service may need to invoke other services to process a client's service messages, even if the computing node hosting a service of the first service type is lightly loaded, a heavily loaded computing node hosting the other services it invokes will still degrade service quality. Therefore, when generating the first load balancing policy, taking into account the loads of both the computing nodes hosting services of the first service type and the other computing nodes having invocation relationships with them helps improve the overall computing performance of the distributed computing system and reduce service delay.

In a possible implementation, the policy calculation module is specifically configured to: determine, from the M computing nodes according to the global service information, a target computing node providing a service of the first service type; determine, from the M computing nodes according to the service invocation relationships, related computing nodes having invocation relationships with the service of the first service type provided by the target computing node; and determine, according to the global load information, the loads of the target computing node and the related computing nodes, and perform load balancing calculation to generate the first load balancing policy.
In a possible implementation, the policy calculation module is specifically configured to generate the first load balancing policy according to the following objective function:

Φ(S) = (1/n) Σ_{i=1}^{n} (t(S_i) − t̄(S))²

where t(S_i) denotes the service delay of the message chain of the i-th service message, and t̄(S) denotes the average of the service delays of the message chains of the n service messages. In this implementation, good load balancing can be achieved by balancing the relationship between throughput and response time.
In a possible implementation, the policy calculation module is further configured to perform, based on a preset service delay, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate a second load balancing policy, where the second load balancing policy instructs the M computing nodes to perform service adjustment; the policy publishing module is further configured to publish the second load balancing policy to the M computing nodes. Adjusting the existing services on the M computing nodes can further optimize the computing performance of the distributed computing system and reduce service delay.

In a possible implementation, the second load balancing policy indicates adjusting the distribution ratio of service messages between at least two computing nodes having a service invocation relationship.

In a possible implementation, the second load balancing policy indicates adjusting the locations of services on computing nodes having a service invocation relationship.

In a possible implementation, the second load balancing policy indicates scaling out or removing services on computing nodes having a service invocation relationship.

In a possible implementation, the global load information, the global service information, and the service invocation relationships are all obtained periodically; the policy calculation module is configured to periodically calculate the first load balancing policy or the second load balancing policy and publish it periodically through the policy publishing module. Periodically updating the load balancing policies keeps the performance of the distributed computing system at a consistently high level.
According to a second aspect, an embodiment of the present invention provides a client, applied to a distributed computing system, where the distributed computing system includes a load balancing engine and M computing nodes, M being a natural number greater than 1. The client includes: a local cache, configured to obtain and cache a first load balancing policy published by the load balancing engine, where the first load balancing policy indicates distribution information of service messages of a first service type; a service management module, configured to receive a first service request; and a load policy calculation module, configured to query the local cache and, if the first load balancing policy stored in the local cache matches the first service request, determine, from the M computing nodes according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request. The service management module is further configured to send, according to the distribution information indicated by the first load balancing policy, a service message corresponding to the first service request to the target computing node. Because the client only needs to receive and cache the load balancing policy corresponding to each service type, and scheduling a service message of a given type only requires querying the cached policy, clients need not be developed separately for each language stack even if developers use different stacks: common code suffices to receive and query the load balancing policies, which saves considerable development cost and reduces resistance when upgrading the distributed computing system.
According to a third aspect, an embodiment of the present invention further provides a distributed computing system, including the load balancing engine according to the first aspect or any possible implementation of the first aspect, and M computing nodes coupled to the load balancing engine. In the distributed computing system provided in this embodiment, the load balancing engine is responsible for calculating load balancing policies, and each client invokes services according to those policies; that is, the calculation and the execution of load balancing policies are separated. This avoids the problem in the centralized solution shown in FIG. 1, where the load balancer 12 cannot handle high-volume service invocation and thus constrains system performance. Meanwhile, when developers use multiple different language stacks, the service invocation code in the clients can remain identical, so multiple client versions need not be developed; only different load balancing engines need to be developed for the different language stacks. If the distributed computing system later needs to be upgraded, only the load balancing engine needs to be updated, which saves development cost and reduces upgrade resistance.

In a possible implementation, the distributed computing system further includes the client according to the second aspect.

In a possible implementation, the distributed computing system further includes: a registration server, configured to collect the global service information of the M computing nodes and send the global service information to the load balancing engine.

In a possible implementation, the distributed computing system further includes: a monitoring module, configured to obtain the global load information by monitoring the loads of the M computing nodes and send it to the load balancing engine.

In a possible implementation, the distributed computing system further includes: a management node, configured to receive the second load balancing policy sent by the load balancing engine and perform, according to the second load balancing policy, service adjustment on the M computing nodes.
According to a fourth aspect, an embodiment of the present invention further provides a load balancing method, applied to a distributed computing system, including: obtaining global load information of the distributed computing system, where the global load information indicates the respective loads of M computing nodes in the distributed computing system; obtaining global service information of the distributed computing system, where the global service information indicates the types of services provided by the M computing nodes, and M is a natural number greater than 1; performing, for a first service type, load balancing calculation by using the global load information and the global service information, to generate a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information of service messages corresponding to the first service type; and publishing the first load balancing policy to a client. In the method provided in this embodiment, the load balancing engine is responsible for calculating load balancing policies, and each client invokes services according to those policies; that is, the calculation and the execution of load balancing policies are separated, which avoids the problem of system performance being constrained by the inability to handle high-volume service invocation. Meanwhile, when developers use multiple different language stacks, multiple client versions need not be developed; only different load balancing engines need to be developed for the different language stacks, and later upgrades only require updating the load balancing engine, which saves development cost and reduces upgrade resistance.

In a possible implementation, the method further includes obtaining service invocation relationships among the M computing nodes; the step of performing, for the first service type, load balancing calculation by using the global service information and the service invocation relationships to generate the first load balancing policy corresponding to the first service type then includes: performing, for the first service type, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate the first load balancing policy.

In a possible implementation, the step of performing, for the first service type, load balancing calculation by using the global load information, the global service information, and the service invocation relationships to generate the first load balancing policy includes: determining, from the M computing nodes according to the global service information, a target computing node providing a service of the first service type; determining, from the M computing nodes according to the service invocation relationships, related computing nodes having invocation relationships with the service of the first service type provided by the target computing node; and determining, according to the global load information, the loads of the target computing node and the related computing nodes, and performing load balancing calculation to generate the first load balancing policy.

In a possible implementation, the method further includes: performing, based on a preset service delay, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate a second load balancing policy, where the second load balancing policy instructs the M computing nodes to perform service adjustment; and publishing the second load balancing policy to the M computing nodes.

In a possible implementation, the second load balancing policy indicates adjusting the distribution ratio of service messages between at least two computing nodes having a service invocation relationship.

In a possible implementation, the second load balancing policy indicates adjusting the locations of services on computing nodes having a service invocation relationship.

In a possible implementation, the second load balancing policy indicates scaling out or removing services on computing nodes having a service invocation relationship.
According to a fifth aspect, an embodiment of the present invention further provides a load balancing method, applied to a client in a distributed computing system, including: obtaining and caching a first load balancing policy published by the load balancing engine, where the first load balancing policy indicates distribution information of service messages of a first service type; receiving a first service request; querying the cached first load balancing policy and, if the cached first load balancing policy matches the first service request, determining, from the M computing nodes according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request; and sending, according to the distribution information indicated by the first load balancing policy, a service message corresponding to the first service request to the target computing node.

According to a sixth aspect, an embodiment of the present invention further provides a load balancing method, applied to a computing node or a management node in a distributed computing system, including: receiving a second load balancing policy sent by a load balancing engine; and performing, according to the second load balancing policy, service adjustment on the computing node.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art.

FIG. 1 is a schematic diagram of a centralized load balancing solution in the prior art;

FIG. 2 is a schematic diagram of another load balancing solution in the prior art;

FIG. 3 is an architectural diagram of a distributed computing system according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of the apparatuses in the distributed computing system shown in FIG. 3;

FIG. 5 is a schematic diagram of invocation relationships between services according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of load balancing across computing nodes having service invocation relationships according to an embodiment of the present invention;

FIGS. 7a and 7b are schematic diagrams of adjusting the distribution ratio of service messages according to an embodiment of the present invention;

FIGS. 8a and 8b are schematic diagrams of adjusting service locations according to an embodiment of the present invention;

FIGS. 9a, 9b, and 9c are schematic diagrams of scaling out services according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of another load balancing engine apparatus according to an embodiment of the present invention;

FIG. 11 is a schematic diagram of a client apparatus according to an embodiment of the present invention;

FIG. 12 is a schematic flowchart of a load balancing method applied to a load balancing engine according to an embodiment of the present invention;

FIG. 13 is a schematic flowchart of a load balancing method applied to a client according to an embodiment of the present invention;

FIG. 14 is a schematic flowchart of a load balancing method applied to a computing node (or a management node) according to an embodiment of the present invention.
Detailed Description

An embodiment of the present invention provides an architecture for a distributed computing system. As shown in FIG. 3, the distributed computing system 30 may include: a client 31, a load balancing engine 32, and a service provider 33 comprising M computing nodes, where the client 31, the load balancing engine 32, and the computing nodes in the service provider 33 communicate with one another through a network 34, and M is an integer greater than 1. In this embodiment, the network 34 may be a wired network, a wireless network, a local area network (LAN), a wide area network (WAN), a mobile communication network, or the like. For example, the client 31 may access the network 34 through an access point (AP) 341 in order to communicate with the load balancing engine 32 or any computing node in the service provider 33.

Note that for brevity only three computing nodes are shown in FIG. 3, namely computing node 331, computing node 332, and computing node 333; in practice, the number of computing nodes can be deployed according to the computing resource requirements of the distributed computing system and is not limited to three. Moreover, in a distributed computing system, computing nodes are usually deployed in clusters; that is, all computing nodes in the distributed computing system may be divided into multiple clusters, and the M computing nodes in this embodiment may be the computing nodes in all clusters, or the computing nodes in one or more of the clusters.

FIG. 4 further shows the internal structure of each apparatus in the distributed computing system 30; the distributed computing system 30 is further described below with reference to FIG. 4.
As shown in FIG. 4, the load balancing engine 32 may include a load information management module 321, a service information management module 322, a policy calculation module 323, and a policy publishing module 324.

The load information management module 321 is configured to obtain global load information of the distributed computing system 30, where the global load information indicates the respective loads of the M computing nodes in the service provider 33.

The service information management module 322 is configured to obtain global service information, where the global service information indicates the types of services provided by the M computing nodes, and each computing node may provide at least one type of service. Those skilled in the art will know that in a distributed computing system a computing node may be a personal computer, a workstation, a server, another type of physical machine, or a virtual machine. The services on a computing node usually run as processes on the physical or virtual machine, so one computing node can usually provide multiple services.

The policy calculation module 323 is configured to perform, for a first service type, load balancing calculation by using the global load information and the global service information, to obtain a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information, among the M computing nodes, of service messages corresponding to the first service type. The distribution information includes distribution targets, or distribution targets and distribution ratios. For example, in one scenario, if computing node 331, computing node 332, and computing node 333 all provide services of the first service type, and the policy calculation module 323 learns from the global load information that computing node 331 and computing node 332 are overloaded, it may generate a first load balancing policy indicating computing node 333 as the distribution target of the service messages of the first service type. In another scenario, again assuming that all three computing nodes provide services of the first service type, if the policy calculation module 323 learns from the global load information that computing node 331 is overloaded while computing node 332 and computing node 333 each have spare processing capacity, it may generate a first load balancing policy indicating computing node 332 and computing node 333 as the distribution targets of the service messages of the first service type; the first load balancing policy may further indicate the respective distribution ratios of computing node 332 and computing node 333.

The policy publishing module 324 is configured to publish the first load balancing policy to the client 31.
In this embodiment, the client 31 may include:

a local cache 311, configured to obtain and cache the first load balancing policy published by the load balancing engine 32, where the first load balancing policy indicates distribution information of service messages of the first service type;

a service management module 312, configured to receive a customer's first service request;

a load policy calculation module 313, configured to respond to the first service request by querying the local cache 311 to determine whether the first load balancing policy stored in the local cache 311 matches the first service request; if it matches, the module determines, from the M computing nodes according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request. It should be understood that, because the first load balancing policy is specified for the first service type, the load policy calculation module 313 considers the first load balancing policy to match the first service request when the service corresponding to the first service request also belongs to the first service type, and not to match otherwise.

The service management module 312 is further configured to send, according to the distribution information indicated by the first load balancing policy, the service message corresponding to the first service request to the target computing node, so that the target computing node responds to the first service request and provides the service.
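The client-side flow just described (cache the published policy, match it against an incoming request's service type, then pick a target node from the policy's distribution information) can be sketched as follows. This is a minimal illustration only: the class name `ClientPolicyCache` and the `(node, ratio)` representation of distribution information are assumptions made for the sketch, not structures defined by the patent.

```python
import random


class ClientPolicyCache:
    """Minimal sketch of the client-side flow: cache the policies published
    by the load balancing engine, and on each service request pick a target
    computing node from the matching policy's distribution information."""

    def __init__(self):
        # service type -> list of (target node, distribution ratio)
        self._policies = {}

    def publish(self, service_type, distribution):
        """Called when the engine publishes (or periodically refreshes) a policy."""
        self._policies[service_type] = distribution

    def dispatch(self, service_type):
        """Return the node the service message should be sent to, or None
        when no cached policy matches the request's service type."""
        distribution = self._policies.get(service_type)
        if distribution is None:
            return None
        nodes, ratios = zip(*distribution)
        return random.choices(nodes, weights=ratios, k=1)[0]
```

When `dispatch` returns None, a real client would fall back to contacting the load balancing engine for a policy covering that service type.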
Here the relationship between a service request and a service message is briefly explained. In the field of distributed computing, when the client 31 requests a service from the service provider 33, it first sends a service request to the service provider 33; after the service provider 33 responds to the request, the client sends the corresponding service message to the service provider 33, which performs the corresponding computation and finally returns the processed result to the client 31; only then has one service been completed.
In the distributed computing system 30 provided in this embodiment, the load balancing engine 32 is responsible for calculating load balancing policies, and each client 31 invokes services according to those policies; that is, the calculation and the execution of load balancing policies are separated. This avoids the problem in the centralized solution shown in FIG. 1, where the load balancer 12 cannot handle high-volume service invocation and thus constrains system performance. Meanwhile, when developers use multiple different language stacks, the service invocation code in the clients 31 can remain identical, so multiple client versions need not be developed; only different load balancing engines 32 need to be developed for the different language stacks. If the distributed computing system later needs to be upgraded, only the load balancing engine 32 needs to be updated, which saves development cost and reduces upgrade resistance.
In this embodiment, when obtaining the global load information, the load balancing engine 32 may directly collect the load information of each of the M computing nodes and aggregate the collected information to obtain the global load information; alternatively, a monitoring module 35 (for example, a metric monitor) that monitors the working state of the M computing nodes may be set up in the distributed computing system 30, and the global load information is obtained through the monitoring module 35.

Further, the global load information collected from the M computing nodes includes but is not limited to the following categories of load:

1. Resource occupancy information, for example, central processing unit (CPU) usage, memory usage, and network bandwidth usage.

2. Throughput information, for example, the number of service messages received by each service per unit time, and the number of service messages each service sends out per unit time together with the number of destinations.

3. Service delay information, for example, the average processing delay of service messages, the average waiting delay of service messages before processing, and the communication delay between services. Note that the processing delay of a service message depends on the following factors: (1) the capability of the physical hardware, such as the central processing unit (CPU) or input/output (I/O) devices, of the computing node hosting the service; (2) whether other types of services on the same computing node compete for the physical hardware and other resources; (3) the processing logic of the service itself: the more complex the logic, the larger the message processing delay. Those skilled in the art will know that the message processing delay related to processing logic can be determined by sampling over a period of time. The communication delay of a service message depends on the following factors: (1) the network capability of the computing node hosting the service, for example, whether its network bandwidth is 1 GB or 10 GB; (2) whether the network of that computing node is contended for by other services; (3) the communication distance between the two services: if two services are on the same computing node, the communication delay is minimal; communication across computing nodes takes longer, and communication across data centers takes longer still.

4. Remaining resource information, for example, the remaining physical resources of the computing node hosting a service.
In this embodiment, the service information management module 322 may collect service information from each of the M computing nodes and then aggregate the collected information to obtain the global service information; alternatively, the global service information of the M computing nodes may be obtained through a registration server (service register) 34 set up in the distributed computing system 30, where each computing node registers its service information with the registration server 34 at initialization.

Further, the global service information may specifically include service group information and deployment information, where the service group information indicates the services deployed on each computing node, grouped by service type, and the deployment information indicates the processing capability of the services deployed on each computing node and the total processing capability of each computing node. Note that, in computing terminology, a service deployed on a computing node is usually called a service instance, the concrete running entity of the service on the node, while a service group is a set of instances of a certain service type that jointly provide one service.
In this embodiment, the policy calculation module 323 may perform, for the first service type, load balancing calculation by using the global load information and the global service information and based on a preset load balancing algorithm, to obtain the load balancing policy corresponding to the first service type. Note that the load balancing algorithms used by the policy calculation module 323 can generally be divided into two kinds: static load balancing algorithms and dynamic load balancing algorithms.

The static load balancing algorithms may include:

1. Round robin: in each round, the M computing nodes are polled in order. When a computing node becomes overloaded or faulty, it is taken out of the circular queue formed by the M computing nodes and does not take part in the next round of polling until it recovers.

2. Ratio: each computing node is assigned a weight representing the proportion of messages allocated to it; according to these proportions, the service messages sent by clients are allocated to the computing nodes. When a computing node becomes overloaded or faulty, it is taken out of the queue formed by the M computing nodes and does not take part in the next allocation of service messages until it recovers.

3. Priority: the M computing nodes are grouped and each group is given a different priority; the client's service messages are then allocated to the computing node group with the highest priority (within a group, round robin or ratio is used to allocate the messages). When all computing nodes in the highest-priority group become overloaded or faulty, service requests are sent to the group with the next highest priority.
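The ratio (weighted) policy above can be sketched as follows. The helper name `ratio_dispatch` and the node names are illustrative only, and an overloaded or faulty node is modeled simply as a node whose weight is set to 0 until it recovers.

```python
import itertools


def ratio_dispatch(weights, n_messages):
    """Distribute n_messages across computing nodes in proportion to their
    weights. A node with weight 0 models an overloaded or faulty node that
    has been taken out of the rotation until it recovers."""
    active = {node: w for node, w in weights.items() if w > 0}
    # Expand each active node into `weight` slots and cycle over the slots.
    slots = [node for node, w in active.items() for _ in range(w)]
    rotation = itertools.cycle(slots)
    return [next(rotation) for _ in range(n_messages)]
```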
The dynamic load balancing algorithms may include:

1. Least connection: service messages are allocated to the computing nodes with the fewest connections. When the least-connected computing node becomes overloaded or faulty, it is excluded from the next allocation of service messages until it recovers. Here, a connection is a communication connection maintained between a client and a computing node for receiving or sending service messages, and the number of connections is proportional to the node's throughput.

2. Fastest: service messages are allocated to the computing nodes that respond fastest. When a fastest-responding computing node becomes overloaded or faulty, it is excluded from the next allocation of service messages until it recovers. The response time of each computing node includes the time to receive and send service messages as well as the time to process them; a faster response means the node processes service messages in less time, or the communication time between the node and the client is shorter.

3. Observed: based on the balance between connection count and response time, service messages are allocated to the best-balanced computing nodes. Those skilled in the art will know that connection count and response time are in conflict: more connections mean higher service message throughput and, correspondingly, more time needed to process the messages. A balance must therefore be sought between the two, processing as many service messages as possible without significantly reducing response speed.

4. Predictive: the current performance indicators of the M computing nodes are collected and analyzed for prediction, and in the next time period service messages are allocated to the computing nodes whose predicted performance is best.

5. Dynamic ratio (DynamicRatio-APM): the performance parameters of the M computing nodes are collected and analyzed in real time, and service messages are dynamically allocated according to these parameters.

6. Dynamic server replenishment (DynamicServer Act.): some of the M computing nodes are set up as the primary computing node group and the rest as backup computing nodes; when overload or failure reduces the number of nodes in the primary group, backup computing nodes are dynamically added to it.
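The least-connection policy above can be sketched in a few lines; the function name and the way overloaded or faulty nodes are represented (a set of excluded node names) are assumptions made for illustration.

```python
def least_connection_pick(connections, overloaded=frozenset()):
    """Pick the computing node with the fewest active connections, skipping
    nodes currently marked overloaded or faulty until they recover."""
    candidates = {n: c for n, c in connections.items() if n not in overloaded}
    if not candidates:
        raise RuntimeError("no healthy computing node available")
    return min(candidates, key=candidates.get)
```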
Note that the load balancing algorithms in this embodiment include but are not limited to the above; for example, a combination of several of the above algorithms may be used, or an algorithm specified by a customer according to custom rules, as well as the various algorithms used in the prior art.

In this embodiment, a customer-defined load balancing algorithm can be imported into the policy calculation module 323 through a load balancing algorithm plug-in 326 and take part in the calculation of load balancing policies. In this way, the operator of the distributed computing system can take part in maintenance more conveniently, for example, updating the load balancing algorithm through the load balancing algorithm plug-in 326 to upgrade the system.
With the development of distributed computing technology, services also invoke one another; that is, a service is at once a provider and a consumer of services. Especially with the rapid development of distributed microservices, the message chain depth of microservices is generally greater than 1, where a depth greater than 1 means that one service needs to invoke at least one other service. As shown in FIG. 5, when Service A is viewed as a service consumer, Service A needs to invoke Service B, and Service B relies on Service C to provide its service, so the message chain depth of Service A is 2. Further, in existing distributed computing systems, each service pays attention only to the load of the next-level service: when Service A invokes two instances of Service B, its load balancing policy considers only the loads of the two Service B instances and of the computing nodes hosting them, not the loads of the three Service C instances and their computing nodes, while Service B, when invoking Service C, independently performs load balancing once according to the loads of the three Service C instances. In other words, when calculating load balancing policies, current distributed computing systems do not attend to the overall Service A → Service C load balance.

On this basis, the distributed computing system 30 shown in FIG. 4 may further include: a service global view 325, configured to obtain the service invocation relationships among the M computing nodes. Note that Service A, Service B, and Service C shown in FIG. 4 may be provided by the same computing node or by different computing nodes; therefore, the service invocation relationships obtained by the service global view 325 include both invocation relationships between different services on the same computing node and invocation relationships between services on different computing nodes. In addition, Service A invoking Service B means that for Service A to deliver its complete service it depends on part of the service provided by Service B.
Correspondingly, the policy calculation module 323 may be specifically configured to perform, for the first service type, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate the first load balancing policy.

In a possible implementation, the policy calculation module 323 may be specifically configured to:

determine, from the M computing nodes based on the global service information, target computing nodes that provide services of the first service type, where there may be one or more target computing nodes;

determine, from the M computing nodes based on the service invocation relationships, related computing nodes whose services have invocation relationships with the services of the first service type provided by the target computing nodes (the terms "target computing node" and "related computing node" are used only for ease of description; both refer to computing nodes, among the M computing nodes, whose services correspond to the first service type);

determine, according to the global load information, the loads of the target computing nodes and the related computing nodes, and perform load balancing calculation based on a preset load balancing algorithm to generate the first load balancing policy, where for the load balancing algorithm refer to the foregoing description of the static and dynamic load balancing algorithms.

In this embodiment, with reference to FIG. 6, assume the service of the first service type is Service A. The policy calculation module 323 in the load balancing engine 32 determines computing node 1 and computing node 2, which each provide Service A, as target computing nodes. Meanwhile, because the Service A provided by either computing node 1 or computing node 2 has invocation relationships with the Service C provided by computing node 4 and computing node 5, the policy calculation module 323 may further determine, according to the obtained service invocation relationships, computing node 4 and computing node 5 as related computing nodes. Next, the policy calculation module 323 can obtain from the global load information the respective loads of the target and related computing nodes (that is, computing node 1, computing node 2, computing node 4, and computing node 5), and then generate the first load balancing policy through load balancing calculation. Subsequently, when the client 31 issues a service request for Service A, the load balancing engine 32 can respond to the request and determine, according to the first load balancing policy, the distribution information of the service messages corresponding to that request.

Note that although this embodiment describes only how to calculate the first load balancing policy matching the first service type, those skilled in the art will know that a distributed computing system usually provides services of more than one service type. In practice, the load balancing engine may also generate a one-to-one corresponding load balancing policy for each service type provided by the distributed computing system, to support scheduling of service messages of different service types from different clients; the method of generating the load balancing policies corresponding to the other service types may refer to the method of generating the first load balancing policy.
To better illustrate the technical solution of the present invention, the calculation of a load balancing policy is illustrated below, taking the observed mode as the load balancing algorithm.

Assume that at the current throughput level, the message flow to be scheduled by the distributed computing system 30 includes n service messages, as shown in formula (1):

σ = {σ_1, σ_2, ..., σ_i, ..., σ_n}  (1)

where σ denotes the set of the n service messages, that is, the message flow, σ_i denotes the i-th service message, and i and n are natural numbers with 1 ≤ i ≤ n.

The message chains of the n service messages are shown in formula (2):

S = {S_1, S_2, ..., S_i, ..., S_n}, where S_i = {S_i^1, S_i^2, ..., S_i^{k_i}}  (2)

where S denotes the set of message chains of the n service messages, S_i denotes the message chain of the i-th service message, S_i^k denotes the k-th service in the message chain of the i-th service message, and k is a natural number. A message chain is the chain formed by all the services that must be invoked to process a service message in the distributed computing system; it can be determined from the service invocation relationships obtained by the service global view 325.

The service delay of a message chain is shown in formula (3):

t(S_i) = Σ_k t_p(S_i^k) + Σ_k t_c(S_i^k)  (3)

where t(S_i) denotes the total time spent on the message chain of the i-th service message, that is, its service delay, Σ_k t_p(S_i^k) denotes the processing delay required in the message chain of the i-th service message, and Σ_k t_c(S_i^k) denotes the communication delay required in the message chain of the i-th service message; the processing delays and communication delays can be determined from the global load information obtained by the load information management module 321.

As shown in formula (4),

t̄(S) = (1/n) Σ_{i=1}^{n} t(S_i)  (4)

denotes the average of the service delays of the message chains of the n service messages.

Based on the above formulas, the objective function for evaluating throughput against response time is obtained, as shown in formula (5):

Φ(S) = (1/n) Σ_{i=1}^{n} (t(S_i) − t̄(S))²  (5)

When the value of Φ(S) is minimal, the balance between throughput and response time is best.
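Formulas (3) to (5) can be exercised with a short numeric sketch. Note that the original formula images are not reproduced in this text, so `balance_objective` implements one consistent reading of Φ(S): the variance of the per-chain service delays around their mean, which is minimal when the chains are best balanced.

```python
def chain_delay(processing_delays, communication_delays):
    """t(S_i), formula (3): total processing delay plus total communication
    delay of one message chain."""
    return sum(processing_delays) + sum(communication_delays)


def balance_objective(chain_delays):
    """Phi(S), formulas (4)-(5) as reconstructed here: the variance of the
    per-chain service delays around their mean t_bar(S). Smaller values mean
    throughput and response time are better balanced across the n chains."""
    n = len(chain_delays)
    mean = sum(chain_delays) / n  # t_bar(S), formula (4)
    return sum((t - mean) ** 2 for t in chain_delays) / n
```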
Further, in this embodiment, the policy calculation module 323 may also be configured to perform, based on a preset service delay, load balancing calculation by using the global load information and the global service information, to generate a second load balancing policy, where the second load balancing policy is a policy instructing service adjustment among the M computing nodes; correspondingly, the policy publishing module 324 may also be configured to publish the second load balancing policy to the M computing nodes. Note that when one node (for example, computing node A) provides a service to another node (for example, computing node B or a client), the service delay is the time the node spends on the whole flow of responding to the service request, receiving the service message, processing the service message, and returning the processed service message; the service delay may also be called the end-to-end delay.

In this embodiment, the preset service delay can be set according to different services' tolerance for delay. For example, for a low-delay service it can be set on the principle of minimal end-to-end delay; for a high-delay service it can be set by weighing the overall performance of the distributed computing system against the service's delay tolerance. This is not specifically limited here.
In a possible implementation, the second load balancing policy may indicate adjusting the message distribution ratios of services on at least two computing nodes having service invocation relationships. This is further explained below with reference to FIGS. 7a and 7b.

Assume the current service invocation relationships and message distribution ratios of the distributed computing system 30 are as shown in FIG. 7a, where Service A1 and Service A2 may be two services of different types or two services of the same type, while Service B1, Service B2, and Service B3 are services of the same type. Service A1 on computing node 331 has invocation relationships with Service B1 on computing node 331, Service B2 on computing node 332, and Service B3 on computing node 333; likewise, Service A2 on computing node 333 has invocation relationships with Service B1 on computing node 331, Service B2 on computing node 332, and Service B3 on computing node 333. In addition, Service A1 and Service A2 can each send 3000 messages per second (3000 msg/s) to Service B1, Service B2, and Service B3, with the 3000 messages distributed evenly among them, while the processing capability of each of Service B1, Service B2, and Service B3 is 2000 msg/s. In this scenario, sending messages from Service A1 to Service B2 and Service B3, and from Service A2 to Service B1 and Service B2, all require cross-node communication; in total, 4000 messages must be sent across nodes. It should be understood that the communication delay between services on the same computing node is much smaller than the cross-node communication delay, so sending messages according to the distribution ratios of FIG. 7a causes large delays and degrades the performance of the distributed computing system.

In this embodiment, the policy calculation module may generate, based on the preset service delay, a second load balancing policy indicating adjustment of the message distribution ratios of the services on computing node 331, computing node 332, and computing node 333. As shown in FIG. 7b, according to the second load balancing policy, Service A1 can be instructed to send 2000 messages to Service B1 on the same computing node and 1000 messages to Service B2 on computing node 332; similarly, Service A2 can be instructed to send 2000 messages to Service B3 on the same computing node and 1000 messages to Service B2 on computing node 332. After this adjustment, only 2000 messages must be sent across nodes, and the cross-node communication in FIG. 7b spans only one computing node, whereas in FIG. 7a it spans two. Clearly, after the message distribution ratios are adjusted on the principle of minimal service delay, unnecessary cross-node communication is reduced and the delay of the whole distributed computer system decreases, thereby improving the performance of the distributed computing system.
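The cross-node traffic counts in FIGS. 7a and 7b (4000 msg/s before the adjustment, 2000 msg/s after) can be checked with a small sketch; the node labels and the triple representation of a distribution plan are assumptions made for illustration.

```python
def cross_node_traffic(plan):
    """Given a distribution plan as (sender node, receiver node, msg/s)
    triples, return the message rate that must cross node boundaries."""
    return sum(rate for src, dst, rate in plan if src != dst)


# FIG. 7a: A1 on node 331 and A2 on node 333 each spread 3000 msg/s evenly
# over B1 (node 331), B2 (node 332), and B3 (node 333).
before = [
    ("n331", "n331", 1000), ("n331", "n332", 1000), ("n331", "n333", 1000),
    ("n333", "n331", 1000), ("n333", "n332", 1000), ("n333", "n333", 1000),
]
# FIG. 7b: keep 2000 msg/s node-local, send only 1000 msg/s each to B2.
after = [
    ("n331", "n331", 2000), ("n331", "n332", 1000),
    ("n333", "n333", 2000), ("n333", "n332", 1000),
]
```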
In another possible implementation, the second load balancing policy may indicate adjusting the locations of services on computing nodes having service invocation relationships. This is further explained below with reference to FIGS. 8a and 8b.

Assume the current service invocation relationships and message distribution ratios of the distributed computing system are as shown in FIG. 8a, where Service B1 and Service B2 are two services of one type, and Service C1, Service C2, and Service C3 are services of another type. Service A1 on computing node 331 has invocation relationships with Service B1 and Service B2 on computing node 332, and Service B1 and Service B2 in turn have invocation relationships with Service C1, Service C2, and Service C3 on computing node 333. In the prior art, each service can perceive only the load information of the service it invokes next; that is, Service A can perceive only the loads of Service B1 and Service B2 (that is, of computing node 332). Under the existing load balancing policy, even distribution is optimal, so the 3000 messages per second sent by Service A are distributed evenly to Service B1 and Service B2. Similarly, Service B1 and Service B2 each distribute their 1500 messages evenly to Service C1, Service C2, and Service C3.

In this embodiment of the present invention, because the service invocation relationships between different computing nodes are taken into account, the load balancing engine can, when calculating the load balancing policy, combine the loads of computing node 332 and computing node 333 with the preset service delay to calculate a second load balancing policy indicating adjustment of the locations of the services deployed on computing node 332 and computing node 333. As shown in FIG. 8b, the second load balancing policy may indicate deploying Service C1, originally deployed on computing node 333, onto computing node 332, and deploying Service C2, originally deployed on computing node 332, onto computing node 333. Meanwhile, because the processing capability of Service B1 and Service B2 can reach 2000 msg/s while that of Service C1, Service C2, and Service C3 is 1000 msg/s, Service A can distribute 2000 messages to Service B2, which distributes them evenly to Service C2 and Service C3, and distribute the remaining 1000 messages to Service B1, which passes them to Service C1. It can be seen from FIG. 8a that a total of 6000 messages require cross-node communication, whereas in FIG. 8b, after the service locations are adjusted according to the second load balancing policy, only 3000 messages require cross-node communication, which significantly reduces the communication delay of the distributed computing system.
In yet another possible implementation, the second load balancing policy may also indicate scaling out or removing services on computing nodes having service invocation relationships. This is further explained below with reference to FIGS. 9a to 9c.

As shown in FIG. 9a, the message path of Service A is: Service A → Service B → Service C → Service D → Service E. When the load balancing engine finds from the collected global load information that Service B is overloaded and that neither computing node 331 nor computing node 332 is fully loaded, it can indicate extending, on the message path, a Service B1 of the same type as Service B to share the load. Further, the load balancing engine can judge from the global service information and the service invocation relationships whether to extend Service B1 on computing node 331 or on computing node 332. Considering the principle of minimal service delay, if Service B1 were extended on computing node 332, then sending messages from Service A on computing node 331 to Service B1 on computing node 332, and from Service B1 to Service C on computing node 331, would both require cross-node communication, which does not help reduce the communication delay of the distributed computing system. Therefore, the load balancing engine can determine that extending Service B1 on computing node 331 is the best choice.

Correspondingly, as shown in FIG. 9b, the load balancing engine can indicate, through the second load balancing policy, extending Service B1 on computing node 331 to share the load, so as to avoid causing extra cross-node communication.

Further, as shown in FIG. 9c, if the load balancing engine finds from the global load information that Service B is overloaded and that computing node 331 is already fully loaded while computing node 332 is not, the load balancing engine can indicate extending, on computing node 332, a service Service B2 of the same type as Service B to share the load of Service B, and at the same time extending a service Service C2 of the same type as Service C. Compared with FIG. 9b, this likewise adds no extra cross-node communication; only because one more Service C2 is extended on computing node 332 does the intra-node communication delay increase slightly.
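The placement reasoning behind FIGS. 9a to 9c can be sketched as a simple rule: prefer a scale-out node that still has free capacity and adds the fewest cross-node hops between the upstream caller and the downstream callee. The function and its inputs are illustrative assumptions, not part of the patent.

```python
def pick_scale_out_node(free_capacity, caller_node, callee_node):
    """Choose where to start an extra service instance: prefer a node that
    still has free capacity and is co-located with the upstream caller and
    the downstream callee, so no extra cross-node hops are introduced."""
    def extra_hops(node):
        # Each differing placement adds one cross-node hop.
        return (node != caller_node) + (node != callee_node)

    candidates = [n for n, free in free_capacity.items() if free > 0]
    if not candidates:
        raise RuntimeError("no computing node has free capacity")
    return min(candidates, key=extra_hops)
```

With free capacity on node 331 this picks node 331 (the FIG. 9b case); with node 331 full it falls back to node 332 (the FIG. 9c case).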
With the development of distributed computing systems, advanced system frameworks such as Hadoop, Mesos, and Marathon are introduced when computing nodes are deployed in clusters. Correspondingly, as shown in FIG. 4, the distributed computing system 30 may also introduce a management node 36 (for example, a Mesos master) and manage the computing nodes through this management node. Therefore, the policy publishing module 324 may publish the second load balancing policy to the management node 36, and the management node 36 instructs the M computing nodes to perform service adjustment. Of course, the management functions of the management node 36 may also be distributed among the computing nodes, with each computing node performing service adjustment according to the second load balancing policy.

In the distributed computing system 30 shown in FIG. 4, because the load of each computing node changes in real time, and because the services deployed on the computing nodes can be adjusted through the second load balancing policy, the global load information, the global service information, and the service invocation relationships are all obtained periodically. Correspondingly, the policy calculation module 323 is also configured to periodically calculate the first load balancing policy and the second load balancing policy and publish them periodically through the policy publishing module 324. Likewise, the client 31 may periodically obtain the first load balancing policy published by the load balancing engine 32, and the computing nodes (or the management node) may periodically obtain the second load balancing policy published by the load balancing engine 32.
In this embodiment, the modules in the load balancing engine 32 may be implemented in hardware such as large-scale integrated circuits (LSI), digital signal processors (DSP), field-programmable gate arrays (FPGA), or digital circuits. Similarly, the modules in the client 31 may also be implemented by the foregoing hardware.

In another embodiment, the load balancing engine may be implemented by a general-purpose computing device as shown in FIG. 10. In FIG. 10, the elements of the load balancing engine 400 may include but are not limited to: a system bus 410, a processor 420, and a system memory 430.

The processor 420 is coupled through the system bus 410 to the various system elements including the system memory 430. The system bus 410 may include: an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Extended ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.

The system memory 430 includes volatile and nonvolatile memory, for example, read-only memory (ROM) 431 and random access memory (RAM) 432. A basic input/output system (BIOS) 433 is generally stored in the ROM 431; the BIOS 433 contains the basic routines that help transfer information between elements over the system bus 410. The RAM 432 generally contains data and/or program modules that can be immediately accessed and/or operated on by the processor 420. The data or program modules stored in the RAM 432 include but are not limited to: an operating system 434, application programs 435, other program modules 436, program data 437, and so on.

The load balancing engine 400 may also include other removable/non-removable, volatile/nonvolatile storage media. For example, a hard disk storage 441 may be a non-removable, nonvolatile, readable and writable magnetic medium, and an external storage 451 may be any of various kinds of removable, nonvolatile external storage, such as an optical disc, a magnetic disk, flash memory, or a removable hard disk. The hard disk storage 441 is generally connected to the system bus 410 through a non-removable storage interface 440, and the external storage is generally connected to the system bus 410 through a removable storage interface 450.

The storage media discussed above provide storage space for readable instructions, data structures, program modules, and other data of the load balancing engine 400. For example, the hard disk drive 441 is illustrated as storing an operating system 442, application programs 443, other program modules 444, and program data 445. Note that these elements may be the same as or different from the operating system 434, application programs 435, other program modules 436, and program data 437 stored in the system memory 430.

In this embodiment, the functions of the modules of the load balancing engine 32 in the foregoing embodiment and in FIG. 4 may be implemented by the processor 420 reading and executing the code or readable instructions stored in the foregoing storage media.

In addition, a customer can input commands and information to the load balancing engine 400 through various input/output (I/O) devices 471. The I/O devices 471 usually communicate with the processor 420 through the input/output interface 460. For example, when a customer wants to use a custom load balancing algorithm, the customer can provide the custom algorithm to the processor 420 through an I/O device 471, so that the processor 420 calculates the load balancing policy according to that custom algorithm.

The load balancing engine 400 may include a network interface 460, through which the processor 420 communicates with remote computers 461 (that is, the computing nodes in the distributed computing system 30) to obtain the global load information, the global service information, and the service invocation relationships described in the foregoing embodiments. Based on this information, by executing the instructions in the storage media, the processor 420 calculates the load balancing policies (the first load balancing policy and the second load balancing policy) and then publishes the calculated policies to the clients or the computing nodes.
In another embodiment, the client may be implemented by the structure shown in FIG. 11. As shown in FIG. 11, the client 500 may include a processor 51, a memory 52, and a network interface 53, which communicate with one another through a system bus 54.

The network interface 53 is configured to obtain a first load balancing policy published by the load balancing engine and cache it in the memory 52, where the first load balancing policy indicates distribution information of service messages of a first service type.

The processor 51 is configured to receive a first service request and query the memory 52 to determine whether the first load balancing policy stored in the memory 52 matches the first service request.

The processor 51 is further configured to, if the first load balancing policy stored in the memory 52 matches the first service request, determine, from the M computing nodes according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request, and send, through the network interface 53 according to the distribution information indicated by the first load balancing policy, the service message corresponding to the first service request to the target computing node.

Note that when performing the corresponding functions, the processor 51 acts according to instructions stored in the memory 52 or other storage apparatus. Further, the client shown in FIG. 11 adopts a general-purpose computer structure, and the foregoing computing nodes may also be physical machines adopting a general-purpose computer structure; therefore, the structure of a computing node may also refer to the structure shown in FIG. 11, with the processor simply performing different functions, which is not repeated here. The services provided by a computing node are the processes running on its processor.
As shown in FIG. 12, an embodiment of the present invention further provides a load balancing method, applied to the distributed computing system shown in FIG. 3 or FIG. 4. The method includes the following steps:

S101: Obtain global load information of the distributed computing system, where the global load information indicates the respective loads of M computing nodes in the distributed computing system.

S102: Obtain global service information of the distributed computing system, where the global service information indicates the types of services provided by the M computing nodes, and M is a natural number greater than 1.

S104: Perform, for a first service type, load balancing calculation by using the global load information and the global service information, to generate a first load balancing policy corresponding to the first service type, where the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information of service messages corresponding to the first service type.

S105: Publish the first load balancing policy to a client.

The method may further include:

S103: Obtain service invocation relationships among the M computing nodes.

Step S104 may then include:

performing, for the first service type, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate the first load balancing policy.

More specifically, step S104 may include:

determining, from the M computing nodes according to the global service information, a target computing node providing a service of the first service type;

determining, from the M computing nodes according to the service invocation relationships, related computing nodes having invocation relationships with the service of the first service type provided by the target computing node;

determining, according to the global load information, the loads of the target computing node and the related computing nodes, and performing load balancing calculation to generate the first load balancing policy.

The method may further include:

S106: Perform, based on a preset service delay, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate a second load balancing policy, where the second load balancing policy instructs the M computing nodes to perform service adjustment. For example, the second load balancing policy may indicate adjusting the distribution ratio of service messages between at least two computing nodes having a service invocation relationship, adjusting the locations of services on computing nodes having a service invocation relationship, or scaling out or removing services on computing nodes having a service invocation relationship.

S107: Publish the second load balancing policy to the M computing nodes.

Note that the foregoing load balancing methods are all implemented by the load balancing engine, and no order is imposed on the steps. Because the method corresponds to the foregoing apparatus embodiment of the load balancing engine, refer to that apparatus embodiment for related method details; they are not repeated here.
Further, as shown in FIG. 13, this embodiment also provides a load balancing method, applied to a client in the distributed computing system shown in FIG. 3 or FIG. 4. The method includes the following steps:

S201: Obtain and cache a first load balancing policy published by the load balancing engine, where the first load balancing policy indicates distribution information of service messages of a first service type.

S202: Receive a first service request.

S203: Query the cached first load balancing policy, and if the cached first load balancing policy matches the first service request, determine, from the M computing nodes according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request.

S204: Send, according to the distribution information indicated by the first load balancing policy, the service message corresponding to the first service request to the target computing node.
Further, as shown in FIG. 14, this embodiment also provides a load balancing method, applied to a computing node or a management node in the distributed computing system shown in FIG. 3 or FIG. 4. The method includes the following steps:

S301: Receive a second load balancing policy sent by the load balancing engine.

S302: Perform, according to the second load balancing policy, service adjustment on the computing node.
It should be understood that the specific embodiments described herein are merely ordinary embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (21)

  1. A load balancing engine, applied to a distributed computing system, wherein the load balancing engine comprises:
    a load information management module, configured to obtain global load information of the distributed computing system, wherein the global load information indicates respective loads of M computing nodes in the distributed computing system;
    a service information management module, configured to obtain global service information of the distributed computing system, wherein the global service information indicates types of services provided by the M computing nodes, and M is a natural number greater than 1;
    a policy calculation module, configured to perform, for a first service type, load balancing calculation by using the global load information and the global service information, to generate a first load balancing policy corresponding to the first service type, wherein the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information, among the M computing nodes, of service messages corresponding to the first service type; and
    a policy publishing module, configured to publish the first load balancing policy to a client.
  2. The load balancing engine according to claim 1, further comprising: a service global view, configured to obtain service invocation relationships among the M computing nodes;
    wherein the policy calculation module is configured to perform, for the first service type, load balancing calculation by using the load information, the global service information, and the service invocation relationships, to generate the first load balancing policy.
  3. The load balancing engine according to claim 2, wherein the policy calculation module is specifically configured to:
    determine, from the M computing nodes according to the global service information, a target computing node providing a service of the first service type;
    determine, from the M computing nodes according to the service invocation relationships, related computing nodes having invocation relationships with the service of the first service type provided by the target computing node; and
    determine, according to the global load information, loads of the target computing node and the related computing nodes, and perform load balancing calculation to generate the first load balancing policy.
  4. The load balancing engine according to claim 2, wherein the policy calculation module is further configured to: perform, based on a preset service delay, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate a second load balancing policy, wherein the second load balancing policy instructs the M computing nodes to perform service adjustment; and
    the policy publishing module is further configured to publish the second load balancing policy to the M computing nodes.
  5. The load balancing engine according to claim 4, wherein the second load balancing policy indicates adjusting a distribution ratio of service messages between at least two computing nodes having a service invocation relationship.
  6. The load balancing engine according to claim 4, wherein the second load balancing policy indicates adjusting service locations between computing nodes having a service invocation relationship.
  7. The load balancing engine according to claim 4, wherein the second load balancing policy indicates scaling out or removing services between computing nodes having a service invocation relationship.
  8. The load balancing engine according to any one of claims 1 to 7, wherein the global load information, the global service information, and the service invocation relationships are all obtained periodically; and the policy calculation module is configured to periodically calculate the first load balancing policy or the second load balancing policy and publish it periodically through the policy publishing module.
  9. A client, applied to a distributed computing system, wherein the distributed computing system comprises a load balancing engine and M computing nodes, M being a natural number greater than 1, and the client comprises:
    a local cache, configured to obtain and cache a first load balancing policy published by the load balancing engine, wherein the first load balancing policy indicates distribution information of service messages of a first service type;
    a service management module, configured to receive a first service request; and
    a load policy calculation module, configured to query the local cache and, if the first load balancing policy stored in the local cache matches the first service request, determine, from the M computing nodes according to the distribution information indicated by the first load balancing policy, a target computing node matching the first service request;
    wherein the service management module is further configured to send, according to the distribution information indicated by the first load balancing policy, a service message corresponding to the first service request to the target computing node.
  10. A distributed computing system, comprising: the load balancing engine according to any one of claims 1 to 8, and M computing nodes coupled to the load balancing engine.
  11. The distributed computing system according to claim 10, further comprising: the client according to claim 9.
  12. The distributed computing system according to claim 10 or 11, further comprising:
    a registration server, configured to collect the global service information of the M computing nodes and send the global service information to the load balancing engine.
  13. The distributed computing system according to any one of claims 10 to 12, further comprising:
    a monitoring module, configured to obtain the global load information by monitoring loads of the M computing nodes and send the global load information to the load balancing engine.
  14. The distributed computing system according to any one of claims 10 to 13, further comprising:
    a management node, configured to receive the second load balancing policy sent by the load balancing engine and perform, according to the second load balancing policy, service adjustment on the M computing nodes.
  15. A load balancing method, applied to a distributed computing system, comprising:
    obtaining global load information of the distributed computing system, wherein the global load information indicates respective loads of M computing nodes in the distributed computing system;
    obtaining global service information of the distributed computing system, wherein the global service information indicates types of services provided by the M computing nodes, and M is a natural number greater than 1;
    performing, for a first service type, load balancing calculation by using the global load information and the global service information, to generate a first load balancing policy corresponding to the first service type, wherein the first service type is at least one of the types of services provided by the M computing nodes, and the first load balancing policy indicates distribution information of service messages corresponding to the first service type; and
    publishing the first load balancing policy to a client.
  16. The load balancing method according to claim 15, further comprising:
    obtaining service invocation relationships among the M computing nodes;
    wherein the step of performing, for the first service type, load balancing calculation by using the global service information and the service invocation relationships to generate the first load balancing policy corresponding to the first service type comprises:
    performing, for the first service type, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate the first load balancing policy.
  17. The load balancing method according to claim 16, wherein the step of performing, for the first service type, load balancing calculation by using the global load information, the global service information, and the service invocation relationships to generate the first load balancing policy comprises:
    determining, from the M computing nodes according to the global service information, a target computing node providing a service of the first service type;
    determining, from the M computing nodes according to the service invocation relationships, related computing nodes having invocation relationships with the service of the first service type provided by the target computing node; and
    determining, according to the global load information, loads of the target computing node and the related computing nodes, and performing load balancing calculation to generate the first load balancing policy.
  18. The load balancing method according to claim 16 or 17, further comprising:
    performing, based on a preset service delay, load balancing calculation by using the global load information, the global service information, and the service invocation relationships, to generate a second load balancing policy, wherein the second load balancing policy instructs the M computing nodes to perform service adjustment; and
    publishing the second load balancing policy to the M computing nodes.
  19. The load balancing method according to claim 18, wherein the second load balancing policy indicates adjusting a distribution ratio of service messages between at least two computing nodes having a service invocation relationship.
  20. The load balancing method according to claim 18, wherein the second load balancing policy indicates adjusting service locations between computing nodes having a service invocation relationship.
  21. The load balancing method according to claim 18, wherein the second load balancing policy indicates scaling out or removing services between computing nodes having a service invocation relationship.
PCT/CN2018/083088 2017-06-30 2018-04-13 负载均衡引擎,客户端,分布式计算系统以及负载均衡方法 WO2019001092A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18825057.5A EP3637733B1 (en) 2017-06-30 2018-04-13 Load balancing engine, client, distributed computing system, and load balancing method
US16/725,854 US20200137151A1 (en) 2017-06-30 2019-12-23 Load balancing engine, client, distributed computing system, and load balancing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710526509.3A CN109218355B (zh) 2017-06-30 2017-06-30 负载均衡引擎,客户端,分布式计算系统以及负载均衡方法
CN201710526509.3 2017-06-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/725,854 Continuation US20200137151A1 (en) 2017-06-30 2019-12-23 Load balancing engine, client, distributed computing system, and load balancing method

Publications (1)

Publication Number Publication Date
WO2019001092A1 true WO2019001092A1 (zh) 2019-01-03

Family

ID=64740940

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083088 WO2019001092A1 (zh) 2017-06-30 2018-04-13 负载均衡引擎,客户端,分布式计算系统以及负载均衡方法

Country Status (4)

Country Link
US (1) US20200137151A1 (zh)
EP (1) EP3637733B1 (zh)
CN (1) CN109218355B (zh)
WO (1) WO2019001092A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110262872A (zh) * 2019-05-17 2019-09-20 平安科技(深圳)有限公司 负载均衡应用管理方法、装置、计算机设备及存储介质
CN111796768A (zh) * 2020-06-30 2020-10-20 中国工商银行股份有限公司 分布式服务协调方法、装置及系统
CN112202845A (zh) * 2020-09-10 2021-01-08 广东电网有限责任公司 一种面向配用电业务的边缘计算网关负荷系统、分析方法以及其配电系统
CN112764926A (zh) * 2021-01-19 2021-05-07 汉纳森(厦门)数据股份有限公司 一种基于负载感知的数据流动态负载均衡策略分析方法
CN115550368A (zh) * 2022-11-30 2022-12-30 苏州浪潮智能科技有限公司 一种元数据上报方法、装置、设备及存储介质
US11973823B1 (en) * 2023-01-11 2024-04-30 Dell Products L.P. Offloading namespace redirection to backup clients in a scale out cluster

Families Citing this family (21)

Publication number Priority date Publication date Assignee Title
CN110297699B (zh) * 2018-03-23 2021-09-14 华为技术有限公司 调度方法、调度器、存储介质及系统
US11194800B2 (en) * 2018-04-26 2021-12-07 Microsoft Technology Licensing, Llc Parallel search in program synthesis
CN108769271A (zh) * 2018-08-20 2018-11-06 北京百度网讯科技有限公司 负载均衡的方法、装置、存储介质和终端设备
CN110113399A (zh) * 2019-04-24 2019-08-09 华为技术有限公司 负载均衡管理方法及相关装置
CN110442447B (zh) * 2019-07-05 2023-07-28 中国平安人寿保险股份有限公司 基于消息队列的负载均衡方法、装置和计算机设备
CN110601994B (zh) * 2019-10-14 2021-07-16 南京航空航天大学 云环境下微服务链感知的负载均衡方法
CN112751897B (zh) * 2019-10-31 2022-08-26 贵州白山云科技股份有限公司 负载均衡方法、装置、介质及设备
CN112995265A (zh) * 2019-12-18 2021-06-18 中国移动通信集团四川有限公司 请求分发方法、装置及电子设备
CN111092948A (zh) * 2019-12-20 2020-05-01 深圳前海达闼云端智能科技有限公司 一种引导的方法、引导服务器、服务器及存储介质
CN111030938B (zh) * 2019-12-20 2022-08-16 锐捷网络股份有限公司 基于clos架构的网络设备负载均衡方法及装置
CN111522661A (zh) 2020-04-22 2020-08-11 腾讯科技(深圳)有限公司 一种微服务管理系统、部署方法及相关设备
CN113810443A (zh) * 2020-06-16 2021-12-17 中兴通讯股份有限公司 资源管理方法、系统、代理服务器及存储介质
CN111737017B (zh) * 2020-08-20 2020-12-18 北京东方通科技股份有限公司 一种分布式元数据管理方法和系统
US11245608B1 (en) * 2020-09-11 2022-02-08 Juniper Networks, Inc. Tunnel processing distribution based on traffic type and learned traffic processing metrics
CN113079504A (zh) * 2021-03-23 2021-07-06 广州讯鸿网络技术有限公司 5g消息dm多负载均衡器接入实现方法、装置及系统
CN113472901B (zh) * 2021-09-02 2022-01-11 深圳市信润富联数字科技有限公司 负载均衡方法、装置、设备、存储介质及程序产品
CN114466019B (zh) * 2022-04-11 2022-09-16 阿里巴巴(中国)有限公司 分布式计算系统、负载均衡方法、设备及存储介质
CN114827276B (zh) * 2022-04-22 2023-10-24 网宿科技股份有限公司 基于边缘计算的数据处理方法、设备及可读存储介质
US11917000B2 (en) * 2022-05-12 2024-02-27 Bank Of America Corporation Message queue routing system
CN115580901B (zh) * 2022-12-08 2023-05-16 深圳市永达电子信息股份有限公司 通信基站组网方法、通信系统、电子设备及可读存储介质
CN117014375B (zh) * 2023-10-07 2024-02-09 联通在线信息科技有限公司 Cdn设备自适应流量控制和快速上下线的方法及设备

Citations (4)

Publication number Priority date Publication date Assignee Title
CN101741907A (zh) * 2009-12-23 2010-06-16 金蝶软件(中国)有限公司 一种均衡服务器负载的方法、系统和主服务器
CN101753558A (zh) * 2009-12-11 2010-06-23 安徽科大讯飞信息科技股份有限公司 一种分布式mrcp服务器负载均衡系统及其均衡方法
CN102571849A (zh) * 2010-12-24 2012-07-11 中兴通讯股份有限公司 云计算系统及方法
WO2016054272A1 (en) * 2014-09-30 2016-04-07 Nicira, Inc. Inline service switch

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US9219686B2 (en) * 2006-03-31 2015-12-22 Alcatel Lucent Network load balancing and overload control
ATE471025T1 (de) * 2006-09-13 2010-06-15 Alcatel Lucent Verkettung von web services
CN101355522B (zh) * 2008-09-18 2011-02-23 中兴通讯股份有限公司 一种媒体服务器的控制方法和系统
CN103051551B (zh) * 2011-10-13 2017-12-19 中兴通讯股份有限公司 一种分布式系统及其自动维护方法
US8661136B2 (en) * 2011-10-17 2014-02-25 Yahoo! Inc. Method and system for work load balancing
US9667711B2 (en) * 2014-03-26 2017-05-30 International Business Machines Corporation Load balancing of distributed services
CN103945000B (zh) * 2014-05-05 2017-06-13 科大讯飞股份有限公司 一种负载均衡方法及负载均衡器
US9774537B2 (en) * 2014-09-30 2017-09-26 Nicira, Inc. Dynamically adjusting load balancing

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN101753558A (zh) * 2009-12-11 2010-06-23 安徽科大讯飞信息科技股份有限公司 一种分布式mrcp服务器负载均衡系统及其均衡方法
CN101741907A (zh) * 2009-12-23 2010-06-16 金蝶软件(中国)有限公司 一种均衡服务器负载的方法、系统和主服务器
CN102571849A (zh) * 2010-12-24 2012-07-11 中兴通讯股份有限公司 云计算系统及方法
WO2016054272A1 (en) * 2014-09-30 2016-04-07 Nicira, Inc. Inline service switch

Non-Patent Citations (1)

Title
See also references of EP3637733A4 *

Cited By (10)

Publication number Priority date Publication date Assignee Title
CN110262872A (zh) * 2019-05-17 2019-09-20 Ping An Technology (Shenzhen) Co., Ltd. Load balancing application management method and apparatus, computer device, and storage medium
CN110262872B (zh) * 2019-05-17 2023-09-01 Ping An Technology (Shenzhen) Co., Ltd. Load balancing application management method and apparatus, computer device, and storage medium
CN111796768A (zh) * 2020-06-30 2020-10-20 Industrial and Commercial Bank of China Distributed service coordination method, apparatus, and system
CN111796768B (zh) * 2020-06-30 2023-08-22 Industrial and Commercial Bank of China Distributed service coordination method, apparatus, and system
CN112202845A (zh) * 2020-09-10 2021-01-08 Guangdong Power Grid Co., Ltd. Edge computing gateway load system for power distribution and utilization services, analysis method, and power distribution system therefor
CN112202845B (zh) * 2020-09-10 2024-01-23 Guangdong Power Grid Co., Ltd. Edge computing gateway load system for power distribution and utilization services, analysis method, and power distribution system therefor
CN112764926A (zh) * 2021-01-19 2021-05-07 Hannasen (Xiamen) Data Co., Ltd. Load-aware dynamic load balancing policy analysis method for data flows
CN115550368A (zh) * 2022-11-30 2022-12-30 Suzhou Inspur Intelligent Technology Co., Ltd. Metadata reporting method, apparatus, device, and storage medium
CN115550368B (zh) * 2022-11-30 2023-03-10 Suzhou Inspur Intelligent Technology Co., Ltd. Metadata reporting method, apparatus, device, and storage medium
US11973823B1 (en) * 2023-01-11 2024-04-30 Dell Products L.P. Offloading namespace redirection to backup clients in a scale out cluster

Also Published As

Publication number Publication date
CN109218355B (zh) 2021-06-15
EP3637733A1 (en) 2020-04-15
EP3637733B1 (en) 2021-07-28
CN109218355A (zh) 2019-01-15
US20200137151A1 (en) 2020-04-30
EP3637733A4 (en) 2020-04-22

Similar Documents

Publication Publication Date Title
WO2019001092A1 (zh) Load balancing engine, client, distributed computing system, and load balancing method
Lu et al. Join-idle-queue: A novel load balancing algorithm for dynamically scalable web services
CN107087019B (zh) Task scheduling method and apparatus based on a device-cloud collaborative computing architecture
CN108776934B (zh) Distributed data computing method and apparatus, computer device, and readable storage medium
WO2020143164A1 (zh) Network resource allocation method and device
US9158586B2 (en) Systems and methods for managing cloud computing resources
WO2016119412A1 (zh) Resource scaling method on a cloud platform, and cloud platform
US20170141944A1 (en) Verifier for network function virtualization resource allocation
US20120117242A1 (en) Service linkage system and information processing system
US20210103456A1 (en) Virtualized network function deployment
JP2012079242A (ja) Composite event distribution device, composite event distribution method, and composite event distribution program
CN110149377A (zh) Video service node resource allocation method, system, apparatus, and storage medium
CN107430526B (zh) Method and node for scheduling data processing
US20230136612A1 (en) Optimizing concurrent execution using networked processing units
CN109614227A (zh) Task resource allocation method and apparatus, electronic device, and computer-readable medium
Al-Sinayyid et al. Job scheduler for streaming applications in heterogeneous distributed processing systems
CN111078516A (zh) Distributed performance testing method and apparatus, and electronic device
US11042413B1 (en) Dynamic allocation of FPGA resources
JP4834622B2 (ja) Business process operation management system, method, process operation management device, and program therefor
US9594596B2 (en) Dynamically tuning server placement
Zhang et al. Dynamic workload management in heterogeneous cloud computing environments
WO2020108337A1 (zh) CPU resource scheduling method and electronic device
KR20130060350A (ko) Method and apparatus for scheduling communication traffic in an ATCA-based device
US11513856B2 (en) Method, devices and computer program products for resource usage
Yang et al. High-performance docker integration scheme based on OpenStack

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 18825057
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
ENP Entry into the national phase
Ref document number: 2018825057
Country of ref document: EP
Effective date: 20200107