CN112866394A - Load balancing method, device, system, computer equipment and storage medium - Google Patents


Publication number
CN112866394A
Authority
CN
China
Prior art keywords
client
load
server
long connection
reset event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110103614.2A
Other languages
Chinese (zh)
Other versions
CN112866394B (en)
Inventor
陈键冬
李旦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Technology Co Ltd
Original Assignee
Guangzhou Huya Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Technology Co Ltd
Priority to CN202110103614.2A
Publication of CN112866394A
Application granted
Publication of CN112866394B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1012 Server selection for load balancing based on compliance of requirements or conditions with available server resources
    • H04L67/1014 Server selection for load balancing based on the content of a request

Abstract

An embodiment of the invention provides a load balancing method, device, system, computer equipment and storage medium. The method is applied to a client that collects running data of computer equipment and transmits the running data to a server in a service cluster over a long connection. The method comprises: receiving a reset event, the reset event indicating that the load of the service cluster is unbalanced; in response to the reset event, disconnecting the long connection between the client and the server; and requesting a load balancer to re-establish a long connection between the client and a server so as to balance the load of the service cluster. Because the client actively performs the Rebalance operation, the method does not depend on a third-party load balancer to perform it.

Description

Load balancing method, device, system, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computer processing, in particular to a load balancing method, a device, a system, computer equipment and a storage medium.
Background
Many platforms on the network provide users with services such as live broadcast, short video, instant messaging and shopping. Because the number of users is large and the amount of data to be processed is large, each platform deploys a large number of computer devices.
To ensure that the computer devices operate safely and reliably, a client is generally deployed on each computer device to monitor its operating state over a long period. The operating state is transmitted to a service cluster over a long connection for analysis, so that problems arising while the computer devices run are discovered in time.
The service cluster has a plurality of servers. Some servers may fail while running and rejoin the service cluster after restarting. The number of computer devices is constant over a given period, so the number of clients is also constant and clients are not frequently added or removed. Once a client has established a long connection with a server, the long connection is maintained indefinitely. As a result, only a small number of clients connect to a newly joined server while most clients remain connected to the original servers, and the load of the service cluster becomes unbalanced.
At present, the long connections between the clients and the servers can be disconnected at the load balancer, so that all clients re-establish long connections with the servers; the load balancer then schedules the new connections so that the service cluster is load balanced.
However, most load balancers do not support this function, so additional customized secondary development is required. This increases the development cost and complicates the logic of the load balancer, and the secondary development may affect the normal operation of the load balancer.
Disclosure of Invention
The embodiment of the invention provides a load balancing method, device, system, computer equipment and storage medium, so as to reduce cost and keep the load balancer operating normally when the load of a service cluster is balanced.
In a first aspect, an embodiment of the present invention provides a load balancing method, which is applied to a client, where the client is configured to collect operation data of a computer device and transmit the operation data to a server in a service cluster through a long connection, and the method includes:
receiving a reset event, the reset event representing the service cluster load imbalance;
in response to the reset event, disconnecting the long connection between the client and the server;
and requesting a load balancer to establish long connection between the client and the server again so as to balance the load of the service cluster.
In a second aspect, an embodiment of the present invention further provides a load balancing apparatus, which is applied to a client, where the client is configured to collect operation data of a computer device and transmit the operation data to a server in a service cluster through a long connection, and the apparatus includes:
a reset event receiving module to receive a reset event, the reset event representing the service cluster load imbalance;
a long connection disconnection module for disconnecting the long connection between the client and the server in response to the reset event;
and the long connection establishing module is used for requesting the load balancer to establish long connection for the client and the server again so as to balance the load of the service cluster.
In a third aspect, an embodiment of the present invention further provides a load balancing system, where the system includes a configuration center, multiple clients, a load balancer, and a service cluster, where the service cluster has multiple servers;
the client is used for acquiring the operating data of the computer equipment and transmitting the operating data to the server in the service cluster through long connection;
the configuration center is used for receiving the load amounts recorded while the plurality of servers provide services for the plurality of clients, determining from the load amounts that the load of the service cluster is unbalanced, and, in response to the load imbalance, sending a reset event to the plurality of clients;
the client is further configured to respond to the reset event, disconnect the long connection with the server, and request the load balancer to re-establish the long connection between the client and the server, so as to load balance the service cluster.
In a fourth aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the load balancing method of the first aspect.
In a fifth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the load balancing method according to the first aspect.
In this embodiment, the client collects running data of the computer device and transmits the running data to a server in the service cluster over a long connection. The client receives a reset event indicating that the load of the service cluster is unbalanced, disconnects the long connection between the client and the server in response to the reset event, and requests the load balancer to re-establish a long connection between the client and a server so as to balance the load of the service cluster. Because the client actively performs the Rebalance operation, the scheme does not depend on a third-party load balancer to perform it. The client logic is simple, so adding the Rebalance operation has little impact on the client, while flexibility and the degree of customization are higher. No additional development of the third-party load balancer is needed, which greatly reduces development cost.
Drawings
Fig. 1 is a flowchart of a load balancing method according to an embodiment of the present invention;
Fig. 2 is a topology diagram of a load balancing system according to an embodiment of the present invention;
Fig. 3 is a diagram illustrating an example of calculating an imbalance event according to an embodiment of the present invention;
Fig. 4 is a flowchart of a load balancing method according to a second embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a load balancing apparatus according to a third embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a load balancing system according to a fourth embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a load balancing method according to an embodiment of the present invention, where the embodiment is applicable to a situation where a client actively reconnects a server in a monitored business scenario, and the method may be executed by a load balancing apparatus, where the load balancing apparatus may be implemented by software and/or hardware, and may be configured in a computer device, such as a server, a workstation, a personal computer, and the like.
As shown in fig. 2, the following roles are typically included in a monitored business scenario:
1. Configuration center 201
In this embodiment, the configuration center 201 may be configured to monitor the load state of each server 206, determine from the load states of the servers 206 whether the service cluster 205 as a whole is load balanced or load unbalanced, and, when the load of the service cluster 205 is unbalanced, control the clients 203 to perform a Rebalance operation (that is, disconnecting and then reconnecting in order to achieve load balance).
Of course, besides the above functions, the configuration center 201 may also configure other functions in other service scenarios, for example, send metadata to the client 203, where the metadata includes configuration parameters for the client 203 to collect the operation data of the computer device 202, and the like, which is not limited in this embodiment.
2. Computer device 202
The computer device 202 may provide online services for the platform, such as live broadcast, video on demand, short video publishing and order placement, as well as offline services for the platform, such as clustering video data, selecting anchor users that audience users may be interested in, and training.
The volume of services provided by the computer devices 202 is large, so the number of computer devices 202 is also large, typically on the order of ten thousand.
3. Client 203
The computer device 202 serves as the monitored device, and a client 203 may be deployed on each computer device 202. The client 203 may collect, as operation data, data representing the operating state of each component of the computer device 202 (such as the memory, Central Processing Unit (CPU), disk and network interface) while the component is running, for example the memory usage, CPU usage, disk usage, disk Input/Output (I/O) volume and network interface bandwidth.
Thereafter, the client 203 transmits the operation data to the server 206 in the service cluster 205 through the long connection.
4. Load balancer (LB) 204
The load balancer 204, such as nginx, LVS or keepalived, is usually deployed as a cluster (LB Cluster). It has a global view and knows all connections between the clients 203 and the servers 206. The clients 203 connect to an IP address provided by the load balancer 204, and the load balancer 204 is responsible for scheduling the long connections between the clients 203 and the servers 206 so that the service cluster 205 is load balanced.
A long connection means that multiple data packets can be transmitted continuously over one connection (i.e., continuous communication), which reduces the overhead of frequently establishing connections. If no data packet is transmitted while the connection is held, both sides send link detection packets.
The load balancer 204 is well suited to HTTP (HyperText Transfer Protocol) services, which are essentially short connections and, from the point of view of the load balancer 204, stateless requests. Long connections, such as TCP (Transmission Control Protocol) services, are different: the load balancer 204 selects a server 206 for the client 203 through a load balancing algorithm and binds the client 203 to that instance of the server 206 for the transmission of subsequent operation data.
5. Service cluster 205
The service cluster 205 may collect the operation data monitored by the client 203 for the computer device 202, analyze the operation data, and monitor the operation state of the computer device 202.
Further, the service cluster 205 provides a plurality of instances, each denoted as a server 206. A server 206 establishes long connections with clients 203 and collects, over those long connections, the operation data that the clients 203 monitor on the computer devices 202. The services provided by the service cluster 205 are more centralized, so the number of servers 206 in the service cluster 205 is small, usually on the order of ten.
As shown in fig. 1, the method specifically includes the following steps:
Step 101, receiving a reset event.
In this embodiment, if the configuration center detects that the load of the service cluster is unbalanced, it may generate a reset event (RST). The reset event indicates that the load of the service cluster is unbalanced, that is, the clients are concentrated on some of the servers in the service cluster, so that the load of those servers is significantly higher than that of the other servers and the load among the servers is unbalanced.
In a specific implementation, each server may periodically (e.g., every 10 seconds) collect its own load amount and transmit it to the configuration center, so that the configuration center periodically receives the load amounts recorded while the plurality of servers provide services for the plurality of clients. Because collecting and transmitting the load amount incur a delay, the servers are not necessarily synchronized. The delay is low, however, typically within 1 second and far shorter than the collection period (e.g., 10 seconds), so the configuration center can usually collect the load amounts of all servers within one period. At the end of each period, the configuration center calls a customized load decision algorithm to determine, from the load amounts of that period, whether the service cluster is currently load balanced or load unbalanced.
The load is a numerical embodiment of a load state when the server provides service for the client.
For example, in a monitored business scenario, where each server instance provides clients with a service for collecting operation data of computer devices, the load amount may be Queries Per Second (QPS). QPS is fair to every server and describes a server's load condition fairly accurately, so it can be used to compare the load conditions of different servers.
Of course, other data besides QPS may be used as the load, for example, the usage rate of the CPU, the usage rate of the memory, and the like, which is not limited in this embodiment.
Further, because the kind of load amount collected by the servers in the service cluster may differ, the configuration center may define load balance or load imbalance of the service cluster differently, that is, use different load decision algorithms.
In an example, the configuration center may sort the load amounts of the servers and select the n (n is a positive integer, e.g., 2) highest load amounts and the m (m is a positive integer, e.g., 2) lowest load amounts reported by the servers, so as to calculate a first load amount and a second load amount, where the first load amount is the average of the n highest load amounts and the second load amount is the average of the m lowest load amounts.
If the ratio of the first load amount to the second load amount exceeds a preset difference threshold, the load difference among the servers is large and it is determined that an imbalance event has occurred, where the difference threshold is a constant greater than 1, such as 3.
At this time, the condition for the occurrence of an imbalance event may be expressed as:
avg(head_n)/avg(tail_m)>z
wherein avg represents the averaging function, head_n represents the n load amounts with the highest values, tail_m represents the m load amounts with the lowest values, and z represents the difference threshold.
For example, as shown in Fig. 3, suppose there are currently 10 servers, denoted Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9 and Q10 in descending order of QPS. The average QPS of Q1 and Q2 and the average QPS of Q9 and Q10 are calculated, and an imbalance event is considered to occur when avg(QPS_Q1, QPS_Q2)/avg(QPS_Q9, QPS_Q10) > z is satisfied.
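The imbalance condition above can be sketched as follows. This is a minimal illustration only: the function name `is_imbalance_event`, the sample QPS values, and the handling of a zero denominator are assumptions that do not come from the embodiment.

```python
from statistics import mean


def is_imbalance_event(loads, n=2, m=2, z=3.0):
    """Return True when avg(head_n) / avg(tail_m) > z for one period."""
    if len(loads) < max(n, m):
        raise ValueError("need at least max(n, m) load amounts")
    ordered = sorted(loads, reverse=True)
    first_load = mean(ordered[:n])    # average of the n highest load amounts
    second_load = mean(ordered[-m:])  # average of the m lowest load amounts
    if second_load == 0:
        # Assumption: an idle tail with a busy head is treated as imbalanced.
        return first_load > 0
    return first_load / second_load > z


# Hypothetical QPS values for ten servers Q1..Q10.
qps = [900, 850, 400, 380, 350, 300, 280, 250, 120, 100]
print(is_imbalance_event(qps))
```

With these sample values the head average is 875 and the tail average is 110, so the ratio is well above the example threshold of 3.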
The configuration center may set a global counter that is incremented by 1 each time an imbalance event occurs. Over T consecutive periods (T is a positive integer), the value R of the global counter is read, and the frequency of occurrence of the imbalance event is calculated as P = R/T.
If the frequency of occurrence of the imbalance event exceeds a preset frequency threshold, the imbalance event is a common phenomenon rather than an accidental one, and the load of the service cluster can then be determined to be unbalanced.
Thus, in this example, service cluster load imbalance is represented by the frequency of occurrence of the imbalance event exceeding a preset frequency threshold, where an imbalance event is the ratio of the first load amount to the second load amount exceeding a preset difference threshold, the first load amount is the average of the n highest load amounts reported by the servers, and the second load amount is the average of the m lowest load amounts reported by the servers.
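The frequency check can be sketched as a counter evaluated over a fixed window of T periods. The class name `ImbalanceFrequency`, the choice to reset the counter after each window, and the sample threshold of 0.5 are assumptions made for illustration; the embodiment does not specify them.

```python
class ImbalanceFrequency:
    """Counts imbalance events over T consecutive periods and compares R/T to a threshold."""

    def __init__(self, window_periods, frequency_threshold):
        self.window_periods = window_periods            # T
        self.frequency_threshold = frequency_threshold  # preset frequency threshold
        self.counter = 0                                # R, the global counter
        self.elapsed = 0                                # periods seen in this window

    def record(self, imbalance_event):
        """Record one period. Returns None until T periods have elapsed,
        then True (cluster load unbalanced) or False, and starts a new window."""
        self.elapsed += 1
        if imbalance_event:
            self.counter += 1
        if self.elapsed < self.window_periods:
            return None
        frequency = self.counter / self.window_periods  # P = R / T
        self.counter = 0
        self.elapsed = 0
        return frequency > self.frequency_threshold


tracker = ImbalanceFrequency(window_periods=5, frequency_threshold=0.5)
first_window = [tracker.record(e) for e in [True, True, False, True, False]]
second_window = [tracker.record(e) for e in [False, False, True, False, False]]
print(first_window[-1], second_window[-1])
```

Resetting the counter after each window keeps occasional spikes from accumulating across windows; a sliding window would be an equally plausible reading of the text.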
Of course, the above load decision algorithm is only an example, and other load decision algorithms may be set according to actual situations when implementing the embodiment of the present invention. For example, when the number of clients is known, a specified proportion (e.g., 40%) of the number of clients is taken as a number threshold, the load amount is the number of long connections, and the load of the service cluster is considered unbalanced when the total number of long connections maintained by the x (x is a positive integer, e.g., 4) servers with the most long connections exceeds the number threshold. The embodiment of the present invention is not limited in this respect, and a person skilled in the art may also use other load decision algorithms according to actual needs.
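The connection-count variant can be sketched in a few lines. The function name `is_imbalanced_by_connections` and the sample connection counts are hypothetical.

```python
def is_imbalanced_by_connections(connections_per_server, client_count, x=4, proportion=0.4):
    """True when the x busiest servers together hold more long connections
    than the specified proportion of all clients."""
    number_threshold = client_count * proportion
    busiest = sorted(connections_per_server, reverse=True)[:x]
    return sum(busiest) > number_threshold


# Hypothetical counts for ten servers and 1000 clients.
balanced = [100] * 10
skewed = [300, 250, 200, 100, 50, 40, 30, 20, 5, 5]
print(is_imbalanced_by_connections(balanced, 1000))
print(is_imbalanced_by_connections(skewed, 1000))
```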
Thereafter, the configuration center may send a reset event to each client, at which time the reset event, in addition to representing a service cluster load imbalance, may also be used to notify the client to actively perform a Rebalance operation.
Step 102, responding to a reset event, and disconnecting the long connection between the client and the server.
If the client receives the reset event issued by the configuration center, the client can actively disconnect the long connection between the current client and the server according to the indication of the reset event.
For long connections of different protocols, the manner of actively disconnecting the long connection by the client is different, and this embodiment does not limit this.
Taking a TCP long connection as an example, after the TCP long connection is established, both the client and the server are in the ESTABLISHED state. The client then initiates the disconnection:
1) After calling the close() function, the client sends a FIN data packet (Finish data packet, indicating a request to disconnect) to the server and enters the FIN_WAIT_1 state.
2) After receiving the FIN data packet and detecting that the FIN flag bit is set, the server does not disconnect immediately. Instead, it sends an acknowledgement (ACK) data packet to the client and enters the CLOSE_WAIT state.
3) After receiving the acknowledgement data packet, the client enters the FIN_WAIT_2 state and waits for the server to finish its preparation and send its own FIN data packet.
4) After a moment, once the server has finished its preparation and can disconnect, it sends a FIN data packet to the client to indicate that it is ready to disconnect, and enters the LAST_ACK state.
5) After receiving the FIN data packet from the server, the client sends an ACK data packet to the server to confirm the disconnection and enters the TIME_WAIT state.
6) After receiving the ACK data packet from the client, the server disconnects, closes the socket and enters the CLOSED state.
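The teardown sequence can be traced as a small table of state transitions. This is an illustrative model only, using the state names from the standard TCP state diagram; it does not open real sockets, and the names `TEARDOWN` and `trace_teardown` are hypothetical.

```python
# Active close initiated by the client.
TEARDOWN = [
    # (event, client state after the event, server state after the event)
    ("client calls close() and sends FIN", "FIN_WAIT_1", "ESTABLISHED"),
    ("server receives FIN and sends ACK", "FIN_WAIT_1", "CLOSE_WAIT"),
    ("client receives ACK", "FIN_WAIT_2", "CLOSE_WAIT"),
    ("server sends FIN", "FIN_WAIT_2", "LAST_ACK"),
    ("client receives FIN and sends ACK", "TIME_WAIT", "LAST_ACK"),
    ("server receives ACK", "TIME_WAIT", "CLOSED"),
]


def trace_teardown():
    """Return the (client, server) state pairs, starting from ESTABLISHED."""
    history = [("ESTABLISHED", "ESTABLISHED")]
    for _event, client_state, server_state in TEARDOWN:
        history.append((client_state, server_state))
    return history


history = trace_teardown()
for (event, _c, _s), (client_state, server_state) in zip(TEARDOWN, history[1:]):
    print(f"{event:40s} client={client_state:12s} server={server_state}")
```

Note that CLOSE_WAIT and LAST_ACK are states of the side that closes passively (here the server), while FIN_WAIT_1, FIN_WAIT_2 and TIME_WAIT belong to the side that closes actively (here the client).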
Further, when the number of clients is large (for example, on the order of ten thousand), if all clients disconnect their long connections at the same time, they will also request the load balancer to re-establish long connections at the same time, and the instantaneous pressure on the load balancer will be too high. For this reason, the configuration center may generate a separate reset event for each client and, when generating the reset event, randomly generate a value within a preset time range (for example, 0 to 4 seconds) as a delay time and encapsulate the value in the reset event.
Then, when a client receives the reset event, it can read the delay time from the reset event and start a timer. When the timer reaches the delay time, the client disconnects the long connection between the client and the server. Staggering the times at which the clients disconnect their long connections also staggers the times at which the clients request the load balancer to re-establish long connections, which prevents the instantaneous pressure on the load balancer from becoming too high.
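The staggered reset can be sketched as below. The event layout (`type`, `client_id`, `delay_s`) and the function names are assumptions for illustration; the embodiment only states that a random delay within a preset range is encapsulated in each reset event and that the client starts a timer on receipt.

```python
import random
import threading


def make_reset_events(client_ids, max_delay_s=4.0, rng=None):
    """Configuration-center side: one reset event per client, each with its own random delay."""
    rng = rng or random.Random()
    return [
        {"type": "RST", "client_id": cid, "delay_s": rng.uniform(0.0, max_delay_s)}
        for cid in client_ids
    ]


def handle_reset_event(event, disconnect, reconnect):
    """Client side: wait for the delay carried by the event, then disconnect and reconnect."""
    def rebalance():
        disconnect()   # close the current long connection
        reconnect()    # ask the load balancer for a new long connection

    timer = threading.Timer(event["delay_s"], rebalance)
    timer.start()
    return timer


events = make_reset_events(range(100), rng=random.Random(0))
calls = []
timer = handle_reset_event(
    {"type": "RST", "client_id": 0, "delay_s": 0.01},
    disconnect=lambda: calls.append("disconnect"),
    reconnect=lambda: calls.append("reconnect"),
)
timer.join(timeout=2)
print(calls)
```

Generating the delay at the configuration center, rather than at each client, keeps the spreading policy in one place, which matches the description above.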
Step 103, requesting the load balancer to re-establish a long connection between the client and a server so as to balance the load of the service cluster.
After the long connection between the client and the server is disconnected, the client can request the load balancer to re-establish a long connection between the client and a server. The load balancer can schedule the long connections between all clients and all servers using a load balancing algorithm, so that the load among the servers is balanced and the service cluster as a whole is load balanced.
For long connections of different protocols, the manner of establishing the long connection by the client is different, and this embodiment does not limit this.
Taking a TCP long connection as an example, when the client uses connect() to establish a connection, the client and the server exchange three data packets. After the client calls the socket() function to create a socket, the socket is in the CLOSED state because no connection has been established. After the server calls the listen() function, its socket enters the LISTEN state and starts listening for client requests.
At this time, the client starts to initiate a request:
1) After the client calls the connect() function, the TCP protocol builds a data packet and sets the SYN flag bit, indicating that the packet is used to establish a synchronized connection. It also generates a random number, for example 1000, to fill the "sequence number (Seq)" field, which indicates the sequence number of the packet. The client then sends the packet to the server and enters the SYN_SENT state.
2) The server receives the data packet, detects that the SYN flag bit is set, and learns that it is a connection request packet sent by the client. The server also builds a data packet and sets the SYN and ACK flag bits, where SYN indicates that the packet is used to establish a connection and ACK acknowledges receipt of the packet sent by the client.
The server generates a random number, for example 2000, to fill the "sequence number (Seq)" field. This number is unrelated to the client's data packet.
The server adds 1 to the sequence number of the client's packet (1000) to obtain 1001 and fills the "acknowledgement number (Ack)" field with this number.
The server sends the data packet and enters the SYN_RCVD state.
3) The client receives the data packet, detects that the SYN and ACK flag bits are set, and learns that it is an acknowledgement packet sent by the server. The client checks whether the value of the "acknowledgement number (Ack)" field is 1000+1; if so, the acknowledgement is valid and the connection can be established.
The client then builds another data packet and sets the ACK flag bit, indicating that the client has correctly received the acknowledgement packet sent by the server. It also adds 1 to the sequence number of the server's packet (2000) to obtain 2001 and fills the "acknowledgement number (Ack)" field with this number.
The client sends the data packet and enters the ESTABLISHED state, indicating that the connection has been established successfully.
4) The server receives the data packet, detects that the ACK flag bit is set, and learns that it is an acknowledgement packet sent by the client. The server checks whether the value of the "acknowledgement number (Ack)" field is 2000+1; if so, the server enters the ESTABLISHED state.
At this point both the client and the server are in the ESTABLISHED state, the connection has been established successfully, and data can then be sent and received.
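The sequence and acknowledgement numbers in the handshake can be checked with a short model. This is a sketch using plain dictionaries rather than real TCP segments; the function name `three_way_handshake` is hypothetical, and the initial sequence numbers 1000 and 2000 are the example values used in the text.

```python
def three_way_handshake(client_isn=1000, server_isn=2000):
    """Return the three packets exchanged, as dictionaries of flags, Seq and Ack."""
    # 1) Client -> server: SYN carrying the client's initial sequence number.
    syn = {"flags": {"SYN"}, "seq": client_isn}

    # 2) Server -> client: SYN+ACK, acknowledging client Seq + 1.
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": server_isn, "ack": syn["seq"] + 1}

    # 3) Client validates the acknowledgement number, then replies with ACK.
    if syn_ack["ack"] != client_isn + 1:
        raise ValueError("unexpected acknowledgement number from server")
    ack = {"flags": {"ACK"}, "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}

    # 4) Server validates the final acknowledgement number.
    if ack["ack"] != server_isn + 1:
        raise ValueError("unexpected acknowledgement number from client")
    return [syn, syn_ack, ack]


packets = three_way_handshake()
for p in packets:
    print(sorted(p["flags"]), p)
```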
For the load balancer, when a client sends a request for a long connection, the request is sent directly to a distributor (Director Server), and the distributor then distributes the client's request evenly to a server (Real Server) according to a preset load balancing algorithm.
The load balancing algorithm which can be used by the load balancer comprises at least one of the following:
1. polling method
The requests of the client are distributed to each server in turn in sequence without considering the actual connection number and the current system load of each server.
2. Stochastic method
And randomly distributing the request of the client to each server. It is known from the probability statistics theory that as the number of times that the client calls the server increases, the actual effect of the client is closer to the average distribution, that is, the polling result.
3. Source address hashing method
The source address hashing method computes a numerical value from the client's IP address by means of a hash function and takes this value modulo the number of server nodes; the result is the sequence number of the server to be accessed. When load balancing uses source address hashing and the list of servers is unchanged, clients with the same IP address are mapped to the same server on every access.
4. Weighted polling method
Different servers may differ in machine configuration and current system load, and therefore differ in how much pressure they can withstand. A machine with a high configuration and low load is assigned a higher weight so that it processes more requests; a machine with a low configuration and high load is assigned a lower weight to reduce its system load. The requests of the clients are then distributed to the servers in sequence according to the weights.
5. Weighted random method
The weighted random method likewise assigns different weights according to the machine configuration and system load of each server, and distributes the requests of the clients to the servers at random according to the weights.
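The five algorithms listed above can be illustrated with a minimal Python sketch. The server names, the weights and the use of CRC32 as the hash function are assumptions chosen for illustration only; the embodiment does not prescribe any of them.

```python
import itertools
import random
import zlib

SERVERS = ["rs-1", "rs-2", "rs-3"]            # hypothetical server names
WEIGHTS = {"rs-1": 3, "rs-2": 2, "rs-3": 1}   # hypothetical weights


def round_robin(servers):
    """Polling: hand out servers in a fixed, repeating order."""
    return itertools.cycle(servers)


def pick_random(servers, rng=random):
    """Random: every server is equally likely to be chosen."""
    return rng.choice(servers)


def pick_by_source_hash(client_ip, servers):
    """Source address hashing: hash the client IP, then take it modulo the server count."""
    digest = zlib.crc32(client_ip.encode("utf-8"))  # CRC32 is an assumed hash function
    return servers[digest % len(servers)]


def weighted_round_robin(weights):
    """Weighted polling: a server with weight w appears w times in each cycle."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def pick_weighted_random(weights, rng=random):
    """Weighted random: selection probability is proportional to the weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

With source address hashing, the mapping only stays stable while the server list is unchanged, which matches the condition stated above.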
Taking LVS as an example, LVS is an open-source software project for implementing load balancing clusters. Its architecture can be logically divided into a scheduling layer (Director), a server cluster layer (Real Server) and shared storage, and in terms of implementation LVS is divided into three modes.
LVS generally performs load balancing in NAT mode, that is, network address translation (VS/NAT): the scheduler changes the target IP address of a request, that is, the virtual IP address (VIP), to the IP address of a server; the returned data packet also passes through the scheduler, which modifies the source address to the VIP.
Specifically, when the scheduler (LB) receives a data packet requested by a client (the destination IP address of the request is the VIP), it determines, based on a load balancing algorithm, to which server (RS) the data is to be sent. The scheduler then changes the target IP address and port of the data packet sent by the client to the IP address (RIP) and port number of that server, so that the server (RS) can receive the client's data packet. After the server has handled the request, it consults its default route (in NAT mode the default route of the RS is set to the LB) and sends the response data packet to the LB; on receiving the data packet, the LB changes its source address to the VIP and then sends the packet back to the client.
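The address rewriting performed in NAT mode can be sketched as follows. This is a simplified model and not LVS code: the `Packet` and `NatScheduler` classes, the addresses and the round-robin choice of server are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace

VIP = ("203.0.113.10", 80)                                  # hypothetical virtual IP and port
REAL_SERVERS = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]   # hypothetical RIPs and ports


@dataclass(frozen=True)
class Packet:
    src: tuple      # (ip, port) of the sender
    dst: tuple      # (ip, port) of the receiver
    payload: bytes


class NatScheduler:
    """Toy model of the scheduler (LB) in NAT mode."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self.real_servers = real_servers
        self.next_index = 0
        self.connections = {}   # client address -> chosen real server

    def forward_request(self, packet):
        """Client -> RS: rewrite the destination from the VIP to the chosen RIP."""
        if packet.dst != self.vip:
            raise ValueError("packet is not addressed to the VIP")
        rs = self.connections.get(packet.src)
        if rs is None:
            # Round-robin is assumed here; any algorithm listed earlier could be used.
            rs = self.real_servers[self.next_index % len(self.real_servers)]
            self.next_index += 1
            self.connections[packet.src] = rs
        return replace(packet, dst=rs)

    def forward_response(self, packet):
        """RS -> client: rewrite the source from the RIP back to the VIP."""
        return replace(packet, src=self.vip)
```

Because the client only ever sees the VIP as the peer address, both directions of the long connection have to pass through the scheduler.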
In this embodiment, the client is used for acquiring running data of a computer device and transmitting the running data to a server in a service cluster through a long connection. The client receives a reset event, which represents load imbalance of the service cluster; in response to the reset event, the client disconnects the long connection with the server and requests the load balancer to establish a long connection between the client and a server again, so as to balance the load of the service cluster. Because the client actively executes the Rebalance operation, it does not depend on a third-party load balancer. The logic of the client is simple, adding the Rebalance operation has little influence on the client, flexibility and the degree of customization are higher, no additional development on the third-party load balancer is needed, and development cost is greatly reduced.
Example two
Fig. 4 is a flowchart of a load balancing method according to a second embodiment of the present invention. This embodiment is based on the foregoing embodiment and further adds the processing of local running data by the client when a Rebalance operation is performed. The method specifically includes the following steps:
Step 401, receive a reset event.
The reset event represents load imbalance of the service cluster.
Step 402, in response to the reset event, identify the state of a scheduler.
If the load balancer actively breaks the long connection between the client and the server, the running data being transmitted by the client is easily lost, which affects the monitoring of the computer device.
In contrast, in this embodiment the client actively disconnects the long connection with the server, and the client can control the disconnection according to the transmission state of the operation data, so as to ensure that the operation data acquired by the client is transmitted to the server normally and that the monitoring of the computer device operates normally.
In a monitoring service scenario, communication between the client and the server is not an uninterrupted stream; the client sends operation data in batches, and the size of each batch depends on the period at which the operation data is collected.
Specifically, an acquisition component and a scheduler are arranged in the client, and a queue for caching running data is created in memory. The acquisition component is used for acquiring the running data of components in the computer device and writing the running data into the queue, and the scheduler is used for reading the running data from the queue and transmitting the running data to the server through the long connection.
If the client receives a reset event issued by the configuration center, the current state of the scheduler can be identified, so that the operation of disconnecting the long connection can be triggered at a proper time.
Step 403, if the state of the scheduler is that transmission of the operation data is completed, notify the scheduler to suspend operation, so that it stops reading the operation data from the queue and transmitting the operation data to the server through the long connection.
If the state of the scheduler is that transmission of the running data is completed, that is, the scheduler is not executing any operation of "reading the running data from the queue and transmitting the running data to the server through the long connection", then no running data is in the course of being transmitted to the server, and disconnecting the long connection does not cause loss of running data. At this time, the scheduler can be notified to suspend running, that is, to stop reading the running data from the queue and transmitting it to the server through the long connection.
Step 404, if the scheduler has suspended operation, disconnect the long connection between the client and the server.
If the scheduler reports that it has stopped running, the client can disconnect the current long connection between the client and the server according to the protocol of the long connection.
Step 405, if the state of the scheduler is that the operation data is being transmitted, perform a wait operation and return to step 402.
If the state of the scheduler is that the running data is being transmitted, that is, the scheduler is executing an operation of "reading the running data from the queue and transmitting the running data to the server through the long connection", then running data is in the course of being transmitted to the server, and disconnecting the long connection at this time would cause loss of that running data. The client therefore waits and then identifies the state of the scheduler again.
Step 406, request the load balancer to re-establish a long connection between the client and the server, so as to balance the load of the service cluster.
Step 407, maintain the operation of the acquisition component.
In the process of executing the Rebalance operation (namely step 402-step 406) by the client, the collection component can be maintained to run, so that the running data of each component in the computer equipment is maintained to be collected, the running data is written into the queue, the running data is ensured to be complete and not lost, and the normal execution of monitoring on the computer equipment is ensured.
Generally, the Rebalance operation can be completed within several seconds, while the queue can hold several minutes of running data. Therefore, while the client executes the Rebalance operation, the collection component can continue to collect the running data of each component in the computer device and write it into the queue without the queue overflowing or causing an exception.
Step 408, if the client has established the long connection with the server again, notify the scheduler to continue to operate, so that it continues to read the operation data from the queue and transmit the operation data to the server through the long connection.
If the client side has re-established the long connection with the server side through the load balancer, at the moment, the scheduler can be informed to resume running, and the scheduler can continue to read the running data from the queue and transmit the running data to the server side through the long connection.
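Steps 402 to 408 can be sketched on the client side as below. This is only an illustrative outline under several assumptions: the `Scheduler`, `Client` and `FakeConnection` classes, the polling interval and the `connect` callable that stands in for "requesting the load balancer" are hypothetical and are not defined by the embodiment.

```python
import time
from collections import deque


class FakeConnection:
    """Hypothetical stand-in for a long connection obtained through the load balancer."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Scheduler:
    """Reads running data from the queue and sends it over the long connection."""

    def __init__(self):
        self.sending = False   # True while a batch is being transmitted
        self.paused = False

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False


class Client:
    def __init__(self, connect, poll_interval=0.01):
        self.connect = connect          # callable that asks the load balancer for a connection
        self.connection = connect()
        self.queue = deque()            # cache written by the acquisition component
        self.scheduler = Scheduler()
        self.poll_interval = poll_interval

    def collect(self, sample):
        """Acquisition component: keeps running during the Rebalance operation (step 407)."""
        self.queue.append(sample)

    def on_reset_event(self):
        # Steps 402 and 405: wait until no batch is being transmitted.
        while self.scheduler.sending:
            time.sleep(self.poll_interval)
        self.scheduler.pause()             # step 403: stop reading from the queue
        self.connection.close()            # step 404: disconnect the long connection
        self.connection = self.connect()   # step 406: reconnect through the load balancer
        self.scheduler.resume()            # step 408: continue draining the queue
```

A call such as `Client(FakeConnection).on_reset_event()` replaces the connection while anything already written to the queue stays in place for the scheduler to send afterwards.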
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Example three
Fig. 5 is a block diagram of a load balancing apparatus according to a third embodiment of the present invention, which is applied to a client, where the client is configured to collect operation data of a computer device and transmit the operation data to a server in a service cluster through a long connection, and the apparatus may specifically include the following modules:
a reset event receiving module 501, configured to receive a reset event, where the reset event represents the load imbalance of the service cluster;
a long connection disconnection module 502, configured to disconnect the long connection between the client and the server in response to the reset event;
a long connection establishing module 503, configured to request the load balancer to re-establish a long connection between the client and the server, so as to balance the load of the service cluster.
In one embodiment of the present invention, the service cluster load imbalance is represented by the frequency of occurrence of imbalance events exceeding a preset frequency threshold;
the unbalance event is that the ratio of the first load amount to the second load amount exceeds a preset difference threshold;
the first load capacity is an average value of n load capacities with the highest numerical value provided by the server, and the second load capacity is an average value of m load capacities with the lowest numerical value provided by the server.
In one example of embodiment of the present invention, the load amount is a query rate per second.
In one embodiment of the present invention, the long connection disconnection module 502 comprises:
a delay time reading module for reading the delay time from the reset event;
and the connection delay disconnection module is used for disconnecting the long connection between the client and the server when the delay time is exceeded.
In one embodiment of the present invention, the long connection disconnection module 502 comprises:
the state identification module is used for responding to the reset event and identifying the state of a scheduler, and the scheduler is used for reading the running data from a preset queue and transmitting the running data to a server through long connection;
a pause module, configured to notify the scheduler to pause if the state of the scheduler is that transmission of the running data is completed, so that the scheduler stops reading the running data from the queue and transmitting the running data to a server through a long connection;
a stop disconnection module, configured to disconnect the long connection between the client and the server if the scheduler has suspended operation.
In one embodiment of the present invention, the long connection disconnection module 502 further comprises:
and the acquisition maintaining module is used for maintaining the operation of an acquisition assembly, and the acquisition assembly is used for acquiring the operation data in the computer equipment and writing the operation data into a preset queue.
In one embodiment of the present invention, the long connection disconnection module 502 further comprises:
and the waiting module is used for executing a waiting operation and returning to call the state identification module if the state of the scheduler is that the running data is being transmitted.
In one embodiment of the present invention, the apparatus further comprises:
and the continuous operation module is used for informing the scheduler to continuously operate if the client reestablishes long connection with the server, so as to continuously read the operation data from the queue and transmit the operation data to the server through the long connection.
The load balancing device provided by the embodiment of the invention can execute the load balancing method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 6 is a block diagram of a load balancing system according to a fourth embodiment of the present invention, where the system includes a configuration center 610, a plurality of clients 620, a load balancer 630, and a service cluster 640, where the service cluster 640 has a plurality of servers 641;
the client 620 is configured to collect operation data of a computer device, and transmit the operation data to the server 641 in the service cluster 640 through a long connection;
the configuration center 610 is configured to receive load amounts recorded when the plurality of servers 641 provide services for the plurality of clients 620, determine load imbalance of the service cluster 640 according to the load amounts, and send a reset event to the plurality of clients 620 in response to the load imbalance;
the client 620 is further configured to disconnect the long connection with the server 641 in response to the reset event, and request the load balancer 630 to re-establish the long connection between the client 620 and the server 641, so as to balance the load of the service cluster 640.
In one embodiment of the present invention, the configuration center 610 is further configured to:
calculating a first load and a second load, wherein the first load is an average value of n load with the highest numerical value provided by the server, and the second load is an average value of m load with the lowest numerical value provided by the server;
if the ratio of the first load amount to the second load amount exceeds a preset difference threshold value, determining that an unbalance event occurs;
and if the frequency of the unbalance event exceeds a preset frequency threshold, determining that the load of the service cluster is unbalanced.
In one example of the embodiment of the present invention, the load amount is a query rate per second.
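The imbalance check performed by the configuration center can be sketched as follows. The parameters `n` and `m` come from the description above; `ratio_threshold` and `frequency_threshold` stand for the preset difference threshold and the preset frequency threshold. Treating "frequency" as the number of imbalance events within a window of consecutive load samples, and the handling of a zero denominator, are assumptions of this sketch.

```python
def is_imbalance_event(loads, n, m, ratio_threshold):
    """Return True if avg(top-n loads) / avg(bottom-m loads) exceeds the threshold.

    `loads` holds one load amount (for example, queries per second) per server.
    """
    ordered = sorted(loads)
    first_load = sum(ordered[-n:]) / n    # average of the n highest load amounts
    second_load = sum(ordered[:m]) / m    # average of the m lowest load amounts
    if second_load == 0:
        # Assumption: idle low-end servers count as imbalance whenever any load exists.
        return first_load > 0
    return first_load / second_load > ratio_threshold


def is_cluster_imbalanced(samples, n, m, ratio_threshold, frequency_threshold):
    """Return True if imbalance events in the sample window exceed the frequency threshold."""
    events = sum(
        1 for loads in samples if is_imbalance_event(loads, n, m, ratio_threshold)
    )
    return events > frequency_threshold
```

For example, with per-server loads `[100, 90, 20, 10]`, `n = m = 2` and a ratio threshold of 3, the first load amount is 95 and the second is 15, so the ratio of about 6.3 counts as one imbalance event.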
In an embodiment of the present invention, the client 620 is further configured to:
reading a delay time from the reset event;
and when the delay time is exceeded, disconnecting the long connection between the client and the server.
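A minimal sketch of the delayed disconnection is given below. The representation of the reset event as a dictionary and the field name `delay_seconds` are hypothetical; the embodiment only states that a delay time is read from the reset event and does not specify how that value is chosen.

```python
def should_disconnect(reset_event, received_at, now):
    """Return True once the delay carried by the reset event has elapsed.

    `received_at` and `now` are timestamps in seconds, for example from time.monotonic().
    """
    delay = float(reset_event.get("delay_seconds", 0.0))  # hypothetical field name
    return now - received_at >= delay
```

A client could evaluate this check periodically after receiving the reset event and disconnect the long connection the first time it returns True.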
In an embodiment of the present invention, the client 620 is further configured to:
responding to the reset event, and identifying the state of a scheduler, wherein the scheduler is used for reading the running data from a preset queue and transmitting the running data to a server through a long connection;
if the state of the scheduler is that the transmission of the running data is finished, notifying the scheduler to pause, so that the scheduler stops reading the running data from the queue and transmitting the running data to a server through a long connection;
and if the scheduler is suspended, disconnecting the long connection between the client and the server.
In an embodiment of the present invention, the client 620 is further configured to:
and maintaining the operation of an acquisition assembly, wherein the acquisition assembly is used for acquiring the operation data in the computer equipment and writing the operation data into a preset queue.
In an embodiment of the present invention, the client 620 is further configured to:
and if the state of the scheduler is that the running data is being transmitted, executing a waiting operation, returning to execute the response to the reset event, and identifying the state of the scheduler.
In an embodiment of the present invention, the client 620 is further configured to:
and if the client establishes long connection with the server again, the scheduler is informed to continue to operate so as to continue to read the operating data from the queue and transmit the operating data to the server through the long connection.
The load balancing system provided by the embodiment of the invention can execute the load balancing method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example five
Fig. 7 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention. FIG. 7 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 7 is only an example and should not bring any limitations to the functionality or scope of use of the embodiments of the present invention.
As shown in FIG. 7, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 7, and commonly referred to as a "hard drive"). Although not shown in FIG. 7, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, to implement the load balancing method provided by the embodiment of the present invention.
Example six
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the load balancing method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
A computer readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (12)

1. A load balancing method is applied to a client, wherein the client is used for collecting operation data of computer equipment and transmitting the operation data to a server in a service cluster through a long connection, and the method comprises the following steps:
receiving a reset event, the reset event representing the service cluster load imbalance;
in response to the reset event, disconnecting the long connection between the client and the server;
and requesting a load balancer to establish long connection between the client and the server again so as to balance the load of the service cluster.
2. The method of claim 1, wherein the service cluster load imbalance is manifested by a frequency of occurrence of an imbalance event exceeding a preset frequency threshold;
the unbalance event is that the ratio of the first load amount to the second load amount exceeds a preset difference threshold;
the first load capacity is an average value of n load capacities with the highest numerical value provided by the server, and the second load capacity is an average value of m load capacities with the lowest numerical value provided by the server.
3. The method of claim 2, wherein the load amount is a query rate per second.
4. The method of any of claims 1-3, wherein said disconnecting the long connection between the client and the server in response to the reset event comprises:
reading a delay time from the reset event;
and when the delay time is exceeded, disconnecting the long connection between the client and the server.
5. The method of claim 1, wherein a scheduler is configured in the client, and wherein the disconnecting the long connection between the client and the server in response to the reset event comprises:
responding to the reset event, and identifying the state of a scheduler, wherein the scheduler is used for reading the running data from a preset queue and transmitting the running data to a server through a long connection;
if the state of the scheduler is that the transmission of the running data is finished, the scheduler is informed to pause to stop reading the running data from the queue and transmit the running data to a server through long connection;
and if the scheduler is suspended, disconnecting the long connection between the client and the server.
6. The method of claim 5, wherein an acquisition component is further configured in the client, and wherein the disconnecting the long connection between the client and the server in response to the reset event further comprises:
and maintaining the operation of an acquisition assembly, wherein the acquisition assembly is used for acquiring the operation data in the computer equipment and writing the operation data into a preset queue.
7. The method of claim 5, wherein said breaking the long connection between the client and the server in response to the reset event further comprises:
and if the state of the scheduler is that the running data is being transmitted, executing a waiting operation, returning to execute the response to the reset event, and identifying the state of the scheduler.
8. The method of claim 5, 6 or 7, further comprising, after the requesting load balancer reestablishes a long connection between the client and the server to load balance the service cluster:
and if the client establishes long connection with the server again, the scheduler is informed to continue to operate so as to continue to read the operating data from the queue and transmit the operating data to the server through the long connection.
9. A load balancing device applied to a client, the client is used for collecting operation data of computer equipment and transmitting the operation data to a server in a service cluster through a long connection, and the device comprises:
a reset event receiving module to receive a reset event, the reset event representing the service cluster load imbalance;
a long connection disconnection module for disconnecting the long connection between the client and the server in response to the reset event;
and the long connection establishing module is used for requesting the load balancer to establish long connection for the client and the server again so as to balance the load of the service cluster.
10. A load balancing system is characterized by comprising a configuration center, a plurality of clients, a load balancer and a service cluster, wherein the service cluster is provided with a plurality of servers;
the client is used for acquiring the operating data of the computer equipment and transmitting the operating data to the server in the service cluster through long connection;
the configuration center is used for receiving load recorded when the plurality of service terminals provide services for the plurality of client terminals, determining load unbalance of the service cluster according to the load, responding to the load unbalance, and sending a reset event to the plurality of client terminals;
the client is further configured to respond to the reset event, disconnect the long connection with the server, and request the load balancer to re-establish the long connection between the client and the server, so as to load balance the service cluster.
11. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the load balancing method of any one of claims 1-8.
12. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements a load balancing method according to any one of claims 1-8.
CN202110103614.2A 2021-01-26 2021-01-26 Load balancing method, device, system, computer equipment and storage medium Active CN112866394B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110103614.2A CN112866394B (en) 2021-01-26 2021-01-26 Load balancing method, device, system, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110103614.2A CN112866394B (en) 2021-01-26 2021-01-26 Load balancing method, device, system, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112866394A true CN112866394A (en) 2021-05-28
CN112866394B CN112866394B (en) 2022-09-13

Family

ID=76009218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110103614.2A Active CN112866394B (en) 2021-01-26 2021-01-26 Load balancing method, device, system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112866394B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115174586A (en) * 2022-09-02 2022-10-11 常州尊尚信息科技有限公司 Automatic load balancing system and method based on cloud platform

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012159349A1 (en) * 2011-07-28 2012-11-29 华为技术有限公司 Load balancing method and device
US20150052249A1 (en) * 2013-08-13 2015-02-19 International Business Machines Corporation Managing connection failover in a load balancer
CN105227602A (en) * 2014-06-20 2016-01-06 北京新媒传信科技有限公司 A kind of method of load balancing, client, registrar and system
CN106506701A (en) * 2016-12-28 2017-03-15 北京奇艺世纪科技有限公司 A kind of server load balancing method and load equalizer
US20170171305A1 (en) * 2015-12-09 2017-06-15 International Business Machines Corporation Persistent connection rebalancing
CN112202918A (en) * 2020-10-16 2021-01-08 深圳乐播科技有限公司 Load scheduling method, device, equipment and storage medium for long connection communication

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012159349A1 (en) * 2011-07-28 2012-11-29 华为技术有限公司 Load balancing method and device
US20150052249A1 (en) * 2013-08-13 2015-02-19 International Business Machines Corporation Managing connection failover in a load balancer
CN105227602A (en) * 2014-06-20 2016-01-06 北京新媒传信科技有限公司 Load balancing method, client, registrar and system
US20170171305A1 (en) * 2015-12-09 2017-06-15 International Business Machines Corporation Persistent connection rebalancing
CN106506701A (en) * 2016-12-28 2017-03-15 北京奇艺世纪科技有限公司 Server load balancing method and load balancer
CN112202918A (en) * 2020-10-16 2021-01-08 深圳乐播科技有限公司 Load scheduling method, device, equipment and storage medium for long connection communication

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Tian Shouzhi (田守枝): "Consumer Rebalance Mechanism, RocketMQ Tutorial", Tian Shouzhi Java Technology Blog *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115174586A (en) * 2022-09-02 2022-10-11 常州尊尚信息科技有限公司 Automatic load balancing system and method based on cloud platform
CN115174586B (en) * 2022-09-02 2022-11-29 常州尊尚信息科技有限公司 Automatic load balancing system and method based on cloud platform

Also Published As

Publication number Publication date
CN112866394B (en) 2022-09-13

Similar Documents

Publication Publication Date Title
CN109274707B (en) Load scheduling method and device
US11418620B2 (en) Service request management
CN102187315B (en) Methods and apparatus to get feedback information in virtual environment for server load balancing
US6560717B1 (en) Method and system for load balancing and management
US8104042B2 (en) Load balancing of servers in a cluster
US8380843B2 (en) System and method for determining affinity groups and co-locating the affinity groups in a distributing network
CN111726415B (en) TCP long connection load balancing scheduling method and system based on negative feedback mechanism
US20040024861A1 (en) Network load balancing
US20020087612A1 (en) System and method for reliability-based load balancing and dispatching using software rejuvenation
CN101605092A (en) Content-based SiteServer LBS
CN112231075B (en) Cloud service-based server cluster load balancing control method and system
WO2023050901A1 (en) Load balancing method and apparatus, device, computer storage medium and program
CN109510878B (en) Long connection session keeping method and device
CN108933829A (en) Load balancing method and device
CN104158758A (en) Load balancing processing method and system based on user message time feedback in SDN network
JP2000276432A (en) Dynamic load distribution system for transaction message
CN110809060B (en) Monitoring system and monitoring method for application server cluster
CN109787827B (en) CDN network monitoring method and device
CN108055338B (en) iSCSI access load balancing method
CN109769029B (en) Communication connection method based on electricity consumption information acquisition system and terminal equipment
CN108234208A (en) Business-based visual load balancing deployment method and system for resource management
CN113268351A (en) Load balancing method and device for gateway service
CN112866394B (en) Load balancing method, device, system, computer equipment and storage medium
CN115633039A (en) Communication establishing method, load balancing device, equipment and storage medium
JP2005182702A (en) Access control system in ip network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant