CN105635199B

CN105635199B - A self-organizing cluster server supporting load balancing

Info

Publication number: CN105635199B
Application number: CN201410586235.3A
Authority: CN
Inventors: 杨国良; 陈楚函; 罗水连
Original assignee: Rui Zhe Polytron Technologies Inc
Current assignee: Rui Zhe Polytron Technologies Inc
Priority date: 2014-10-28
Filing date: 2014-10-28
Publication date: 2019-03-15
Anticipated expiration: 2034-10-28
Also published as: CN105635199A

Abstract

The self-organizing cluster server proposed by the present invention uses a distributed architecture in which multiple physical servers form one virtual server, so that it can run large-scale services that a single server cannot carry. Operating in a self-organizing (ad hoc) manner, the cluster server automatically discovers neighbor nodes joining and leaving, and dynamically elects a management access point and service access points. The management access point provides a unified management interface to the network management system, and the service access points provide a unified service access interface to users. Services are distributed evenly across the active physical nodes, which achieves load balancing and redundancy backup inside the cluster server. This removes the need of conventional cluster servers for dedicated load-balancing devices and heartbeat software, reduces equipment investment, improves the reliability and scalability of the system, and reduces the complexity of system deployment and maintenance.

Description

A self-organizing cluster server supporting load balancing

Technical field

The present invention relates to a server apparatus for running large-scale Internet applications, and in particular to a cluster server that runs multiple cooperating applications simultaneously and automatically performs load balancing and node management.

Background art

Cluster servers are mainly used to run large-scale Internet applications such as large websites, mail systems, and video services. The computational load of these applications is too large for a single physical server, so a cluster server is required. At present, a cluster server consists of two mutually redundant load-balancing devices and multiple physical servers. All user service requests are sent to the load-balancing device, which then dispatches them to the physical servers for processing. This architecture does not suit very large cluster servers, because the load-balancing device becomes the bottleneck and constrains the growth of the cluster. It also does not suit small cluster servers, because the load-balancing device takes no part in service processing and the standby load-balancing device sits idle; for a cluster server with only 2-3 physical servers, the investment in load-balancing devices is disproportionately large.

Heartbeat software must run on the two mutually redundant load-balancing devices. The standby load-balancing device monitors the operating state of the primary device in real time; as soon as it detects a failure of the primary device, it starts its local processes and takes over the service scheduling function. In addition, the load-balancing device must run a physical server detection program so that, once a failure is detected, it immediately stops dispatching services to the failed server. A conventional cluster server therefore needs several sets of tool software working together, which makes deployment difficult.

A load-balancing device cannot discover physical servers automatically; the parameters of every physical server must be configured on the load-balancing device in advance. If a physical server has to be added while services are running, the corresponding configuration of the load-balancing device must be modified and tested. Fast online capacity expansion is therefore hard to achieve, and operation and maintenance are complex.

Summary of the invention

Existing cluster servers must integrate load-balancing devices, heartbeat software, and node detection programs, which leads to high investment, poor scalability, and difficult deployment and maintenance. To overcome these deficiencies, the present invention provides a self-organizing cluster server. The self-organizing cluster server not only discovers nodes joining and leaving automatically, without the help of separate heartbeat software or node detection programs, but also achieves internal load balancing and redundancy backup without a dedicated load-balancing device.

The technical solution adopted by the present invention to solve this technical problem is as follows. The self-organizing cluster server automatically discovers the joining, leaving, and state changes of neighbor nodes, and dynamically elects a management access point and service access points. The management access point provides a unified management interface to the network management system, and the service access points provide a unified service access interface to users. Services are distributed evenly across the nodes, which achieves load balancing and redundancy backup inside the cluster server, removes the need of existing cluster servers for dedicated load-balancing devices and heartbeat software, and simplifies the installation and maintenance of the system.

As shown in Fig. 1, the self-organizing cluster server manages physical servers as nodes, which are divided into different roles: management node, service node, and driven node. The management node is the node responsible for managing the self-organizing cluster server system. The network port on the management node that sends and receives network management information is called the management access point, and the IP address used to send and receive network management information is called the cluster IP, which also corresponds to the domain name/IP address of the self-organizing cluster server. The self-organizing cluster server has exactly one management node and one management access point. A service node is a node responsible for receiving service requests. The network port on a service node that receives service requests is called a service access point, and the IP address that receives service requests is called a service IP, which also corresponds to the service domain name/IP address. The self-organizing cluster server has one or more service nodes, and each service node has one or more service access points. A driven node is a node that helps the service nodes process service requests. A node that is not elected as the management node or a service node automatically becomes a driven node.

As shown in Fig. 2, the self-organizing cluster server discovers the joining and leaving of neighbor nodes and changes in node state by actively sending and listening for node state messages. Each node periodically multicasts the state information of the local node, including the node name, service load, and active link parameters, and thereby announces the presence of the local node to the other nodes. At the same time, each node listens for the state messages sent by other nodes in order to discover neighbor nodes and to track dynamic information such as neighbors joining or leaving, changes in service load, and changes in link state. To reduce the performance overhead of neighbor discovery, the self-organizing cluster server does not establish neighbor relationships by handshake; instead, continuous listening keeps the state information of all nodes synchronized.
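
The listen-only discovery described above can be illustrated with a short Python sketch. It is not taken from the patent: the message fields, the JSON encoding, and the names NodeState and NeighborTable are assumptions made for illustration, and the multicast transport itself is left out.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class NodeState:
    """Hypothetical state message; the field names are assumptions."""
    name: str                                   # node name
    load: int                                   # current service load
    links: dict = field(default_factory=dict)   # link IP -> bandwidth


def encode(state: NodeState) -> bytes:
    """Serialize a state message for periodic multicast transmission."""
    return json.dumps(asdict(state)).encode("utf-8")


def decode(payload: bytes) -> NodeState:
    """Rebuild a state message from a received payload."""
    return NodeState(**json.loads(payload.decode("utf-8")))


class NeighborTable:
    """Neighbor records built purely by listening; no handshake is involved."""

    def __init__(self):
        self.neighbors = {}                     # node name -> latest NodeState

    def on_state_message(self, payload: bytes) -> bool:
        """Record the sender; return True if it is a newly discovered node."""
        state = decode(payload)
        is_new = state.name not in self.neighbors
        self.neighbors[state.name] = state
        return is_new
```

Because each message carries the sender's full state, a receiver only overwrites its record for that sender, so no per-neighbor session has to be kept.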

As shown in Fig. 3, each node of the self-organizing cluster server uses a defined management access point election algorithm to determine whether the local node is the management node. If it is, the node actively announces to the other nodes by multicast that the local node is the management node and which network link the management access point corresponds to. If the management node is not the local node, no action is taken. When a new node joins or a failed node leaves, the neighbor state held by each node is not fully synchronized, so the nodes may compute different management nodes, and several nodes may compete to become the management node. To avoid conflicts while still selecting the optimal management node, every node yields: even if a node considers itself the management node, it actively gives up the management node role when it hears another node announce itself as the management node.

As shown in Fig. 4, among the local node and all neighbor nodes, the node with the smallest bandwidth is elected as the management node, and within the management node the link with the smallest bandwidth is elected as the management access point. This election algorithm is mainly intended to keep system management overhead from consuming the resources of high-performance nodes. The election algorithm for the management access point is as follows:

Step 1: among all nodes, select the node with the smallest total active bandwidth. If only one node qualifies, go directly to Step 3;

Step 2: among the selected nodes, select the node with the smallest network link IP address. If a node has several active link IP addresses, its smallest link IP address is used for the comparison;

Step 3: take the selected node as the management node;

Step 4: within the management node, select the active link with the smallest bandwidth. If only one link qualifies, go directly to Step 6;

Step 5: among the selected links, select the link with the smallest IP address. If a link has several active IP addresses, its smallest IP address is used for the comparison;

Step 6: take the selected link as the management access point.
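
The six steps above can be sketched in Python as a pair of minimum searches with tie-breaking keys. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function names and the data layout (node name mapped to a dictionary of link IP and bandwidth) are hypothetical, each link is assumed to carry a single IPv4 address, and every node is assumed to have at least one active link.

```python
import ipaddress


def ip_key(ip: str) -> int:
    """Compare IPv4 addresses numerically rather than as strings."""
    return int(ipaddress.ip_address(ip))


def elect_management(nodes: dict) -> tuple:
    """nodes maps node name -> {link IP: bandwidth}. Returns (node, link IP)."""
    # Steps 1-3: smallest total bandwidth, ties broken by the smallest link IP.
    node = min(
        nodes,
        key=lambda n: (sum(nodes[n].values()),
                       min(ip_key(ip) for ip in nodes[n])),
    )
    # Steps 4-6: within that node, smallest link bandwidth, then smallest IP.
    link = min(nodes[node], key=lambda ip: (nodes[node][ip], ip_key(ip)))
    return node, link
```

Since the result depends only on the shared node and link data, every node that holds the same neighbor table computes the same management access point without exchanging votes.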

As shown in Fig. 5, the self-organizing cluster server can run multiple services simultaneously, and each service is bound to one service access point. Using a defined service access point election algorithm, each node elects the service access point for every service and determines whether the local node is a service node. If it is, the node actively announces to the other nodes by multicast which services have their service access points on the local node and which links those service access points correspond to. If the local node has no service access point, no action is taken. When a new node joins or a failed node leaves, the neighbor state held by each node is not fully synchronized, so the nodes may compute different service access points, and several nodes may compete to become the service node of the same service. To avoid conflicts while still selecting the optimal service node, every node yields: even if a node considers itself the service node of a service, it actively gives up that role when it hears another node announce itself as the service node of that service.

As shown in Fig. 6, among the local node and all neighbor nodes (including the node that has become the management node), the node with the lightest load and the largest bandwidth is elected as the service node for each service, and within the service node the link with the lightest load and the largest bandwidth is elected as the service access point. This election algorithm is mainly intended to distribute services evenly across all nodes and to make full use of high-performance nodes. The election algorithm for a service access point is as follows:

Step 1: among all nodes, select the node with the fewest existing service access points. If only one node qualifies, go directly to Step 4;

Step 2: among the selected nodes, select the node with the largest total active bandwidth. If only one node qualifies, go directly to Step 4;

Step 3: among the selected nodes, select the node with the largest network link IP address. If a node has several active link IP addresses, its largest link IP address is used for the comparison;

Step 4: take the selected node as the service node;

Step 5: within the service node, select the available link with the fewest bound service access points. If only one link qualifies, go directly to Step 8;

Step 6: among the selected links, select the link with the largest bandwidth. If only one link qualifies, go directly to Step 8;

Step 7: among the selected links, select the link with the largest IP address. If a link has several active IP addresses, its largest IP address is used for the comparison;

Step 8: take the selected link as the service access point.
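
The eight steps above amount to a lexicographic comparison: fewest bound service access points first, then largest bandwidth, then largest IP address. The Python sketch below illustrates that ordering for a single service. The names elect_service and bound, and the data layout, are assumptions for illustration; each link is assumed to carry one IPv4 address.

```python
import ipaddress


def ip_key(ip: str) -> int:
    """Compare IPv4 addresses numerically rather than as strings."""
    return int(ipaddress.ip_address(ip))


def elect_service(nodes: dict, bound: dict) -> tuple:
    """Elect the access point for one service.

    nodes: node name -> {link IP: bandwidth}
    bound: link IP -> number of service access points already bound to it
    Returns (service node, link IP).
    """
    def node_rank(name):
        links = nodes[name]
        points = sum(bound.get(ip, 0) for ip in links)
        # Steps 1-3: fewest access points, then largest total bandwidth,
        # then largest link IP (negated so that min() prefers larger values).
        return (points, -sum(links.values()),
                -max(ip_key(ip) for ip in links))

    node = min(nodes, key=node_rank)

    def link_rank(ip):
        # Steps 5-7: fewest bound access points, largest bandwidth, largest IP.
        return (bound.get(ip, 0), -nodes[node][ip], -ip_key(ip))

    link = min(nodes[node], key=link_rank)
    return node, link
```

Calling the function once per service and incrementing the count in bound for the chosen link after each call spreads successive services over different nodes and links.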

As shown in Fig. 7, the self-organizing cluster server balances load across its nodes. The load-sharing algorithm uses a stateless hash, which both ensures that the service requests of the same user are handled by the same node and avoids the cost of looking up a state table, improving the performance of the whole self-organizing cluster server. All service requests for a service are received by the service access point bound to that service. When a service node receives a service request message through a service access point, it first computes a hash over the destination IP address and source IP address of the message and maps the result onto the list of active nodes. If the mapping result is the local node, the service request is handed directly to the local service layer for processing, and the result is returned directly to the user. If the mapping result is another node, the service node first resolves the MAC address of the selected node, uses that MAC address as the destination address of the link-layer frame, and forwards the service request message to the selected node over the layer-2 link. A non-service node that receives a service request message hands it directly to its local service layer for processing and returns the result directly to the user. Service response messages returned by service nodes and driven nodes all use the service IP as the source IP address.
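
A stateless mapping of this kind can be sketched as follows. The patent does not name a hash function, so the use of SHA-256 here, the function name pick_node, and the choice to sort node names to obtain a common ordering are all assumptions made for illustration.

```python
import hashlib


def pick_node(src_ip: str, dst_ip: str, active_nodes: list) -> str:
    """Map a request to a node using only its source and destination IP."""
    ordered = sorted(active_nodes)      # every node must use the same order
    digest = hashlib.sha256(f"{src_ip}|{dst_ip}".encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]
```

The same address pair always yields the same node while the active node list is unchanged, so no per-connection table is needed. With a plain modulo mapping like this one, a change in the node list can move existing users to a different node.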

As shown in Fig. 8, the management node and the service nodes have redundancy backup capability. If any management/service node fails, the other nodes immediately elect a new management/service node to replace the failed node, which improves the reliability of the whole self-organizing cluster server. Each node continuously listens for the state messages of its neighbor nodes. If no state message is received from a management/service node within a certain period, that node is considered to have failed or left, and a new management/service node is elected among the active nodes. The elected node elects a management/service access point among its local active links, announces through its state message to the other nodes that it has become the new management/service node, and broadcasts an ARP response message for the management/service access point, forcing service requests addressed to the failed node to switch quickly to the new management/service node. If the management/service node has not been completely disconnected from the network and only the link bound to the management/service access point has gone down, the management/service node is not re-elected. Instead, a new management/service access point is elected among the active links of that node, and an ARP response message for the new management/service access point is broadcast, forcing service requests addressed to the former management/service access point to switch quickly to the new one.

As shown in Fig. 9, driven nodes also have redundancy backup capability. Each node continuously listens for the state messages of its neighbor nodes. If no state message is received from a driven node within a certain period, that driven node is considered to have failed or left; the service node stops forwarding service requests to the failed node, and the corresponding service requests are shared among the other active nodes.

Brief description of the drawings

The present invention is further described below with reference to the accompanying drawings and embodiments.

Fig. 1: node roles of the self-organizing cluster server.

Fig. 2: neighbor node discovery and state synchronization method.

Fig. 3: a node actively announces, according to the election algorithm, that the local node is the management node.

Fig. 4: management access point election algorithm.

Fig. 5: a node actively announces, according to the election algorithm, that the local node is a service node.

Fig. 6: service access point election algorithm.

Fig. 7: working principle of load balancing.

Fig. 8: working principle of management/service node redundancy backup.

Fig. 9: working principle of driven node redundancy backup.

Fig. 10: node software system architecture.

Fig. 11: state notification workflow.

Fig. 12: monitoring workflow.

Fig. 13: node timeout workflow.

Fig. 14: management access point election process.

Fig. 15: service access point election process.

Fig. 16: service scheduling workflow.

Fig. 17: link-down workflow.

Fig. 18: ARP message processing workflow.

Fig. 19: initialization workflow.

Detailed description of the embodiments

The present invention is described more fully below with reference to the accompanying drawings. Note that the following description is merely explanatory and exemplary in nature and is in no way intended to limit the invention, its application, or its uses. Unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in the embodiments do not limit the scope of the invention. In addition, techniques, methods, and apparatus known to those skilled in the art may not be discussed in detail, but where appropriate they are intended to be part of the specification.

As shown in Fig. 10, every node of the self-organizing cluster server has the same software structure. The main purpose is to realize a fully peer-to-peer mode: if any node fails, all functions of the system remain available as long as at least one active node exists, that is, every role in the system has 1:N redundancy backup. The main software modules of a node are the neighbor discovery, management access point election, service access point election, and service scheduling modules at the network layer, the link monitoring and ARP processing modules at the link layer, and the system management initialization module.

The neighbor discovery module is subdivided into three submodules: state notification, monitoring, and node timeout. As shown in Fig. 11, a node sets a clock interrupt, periodically checks its link state, and announces its local active links, management access point, and service access point information. The state notification workflow is as follows:

Step 1: read the local link state. If no active link has been added, go directly to Step 4;

Step 2: if the local node is the management node, re-elect the management access point among the local active links to ensure that the management access point is the optimal choice;

Step 3: if the local node is a service node, re-elect the existing local service access points among the local active links to ensure that each service access point is the optimal choice;

Step 4: construct the node state message, encapsulating the local active links, management access point, service access points, and related information;

Step 5: send the node state message by multicast, announcing the active link information of the local node and whether the local node is the management node or a service node.

As shown in Fig. 12, a node continuously listens for the state messages of its neighbor nodes to detect the joining of new nodes, link changes on existing nodes, and the distribution of the management node and service nodes. The node state monitoring workflow is as follows:

Step 1: receive a neighbor node state message and read the node name, active links, management/service access points, and other information in the message. If the neighbor node is an existing node, go directly to Step 3;

Step 2: create a neighbor record for the new node and save its active link state. If the new node competes with the local node for the management node role, go directly to Step 4; if the new node competes with the local node for a service access point, go directly to Step 5; otherwise end monitoring;

Step 3: update the active link information of the existing node: delete old link records that no longer appear in the state message, create records for newly added links, and reset the node timeout counter. If the neighbor node competes with the local node for the management node role, go to Step 4; if the neighbor node competes with the local node for a service access point, go directly to Step 5; otherwise end monitoring;

Step 4: the local node gives up the management node role, releases the binding of the former management access point, and notifies the ARP module to stop answering ARP requests for the management access point. If the neighbor node does not compete with the local node for a service access point, end monitoring;

Step 5: the local node gives up the contested service access point, releases the binding of the former service access point, and notifies the ARP module to stop answering ARP requests for the former service access point.

As shown in Fig. 13, if no state message is received from a neighbor node within a certain period, that neighbor node is considered to have timed out. When a neighbor node times out, the local node deletes the record of the timed-out node and re-elects the management/service access points. The node timeout workflow is as follows:

Step 1: delete the record of the timed-out node, including its link information and the records of the management access point and service access points bound to it;

Step 2: re-elect the management access point among the active nodes to ensure that the management access point is valid and is the optimal choice;

Step 3: re-elect all service access points among the active nodes to ensure that every service access point is valid and is the optimal choice. If the local node is neither the management node nor a service node, end the node timeout workflow;

Step 4: announce the local management/service access point information to all nodes by multicast, and notify the ARP module to answer ARP requests for the corresponding management/service access points.

As shown in Fig. 14, each node independently elects the node with the smallest bandwidth and IP address as the management node, and within the management node selects the link with the smallest bandwidth and IP address as the management access point. The election process for the management access point is as follows:

Step 1: sort all active nodes (the local node and the neighbor nodes) in ascending order of total bandwidth and IP address;

Step 2: take the first-ranked node as the management node;

Step 3: sort all active links of the management node in ascending order of bandwidth and IP address;

Step 4: take the first-ranked link as the management access point.

As shown in Fig. 15, each node independently elects the nodes with the largest bandwidth and IP address as service nodes, and within each service node selects the links with the largest bandwidth and IP address as service access points, while ensuring that the service access points are distributed evenly over all nodes and links. The election process for the service access points is as follows:

Step 1: sort all active nodes in descending order of total bandwidth and IP address, and number the nodes 0 to (n-1);

Step 2: sort all service IPs in descending order, and number them with index i starting from 0;

Step 3: take node (i mod n) as the service node for the i-th service IP;

Step 4: for each service node, sort all of its active links in descending order of bandwidth and IP address, and number the links on each service node 0 to (m-1);

Step 5: for each service node, sort the service IPs assigned to it in descending order, and number them with index j starting from 0;

Step 6: take link (j mod m) as the service access point bound to the j-th service IP of that service node.
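
The modulo-based assignment above can be sketched in Python as follows. The function name assign_services and the data layout are assumptions for illustration; each link is assumed to carry one IPv4 address, and the largest link IP of a node is used as its tie-breaker.

```python
import ipaddress


def ip_key(ip: str) -> int:
    """Compare IPv4 addresses numerically rather than as strings."""
    return int(ipaddress.ip_address(ip))


def assign_services(nodes: dict, service_ips: list) -> dict:
    """nodes: name -> {link IP: bandwidth}. Returns service IP -> (node, link IP)."""
    # Step 1: nodes in descending order of total bandwidth, then largest link IP.
    ranked = sorted(
        nodes,
        key=lambda n: (sum(nodes[n].values()),
                       max(ip_key(ip) for ip in nodes[n])),
        reverse=True,
    )
    # Step 2: service IPs in descending order.
    services = sorted(service_ips, key=ip_key, reverse=True)
    per_node = {name: [] for name in ranked}
    for i, svc in enumerate(services):            # Step 3: node (i mod n)
        per_node[ranked[i % len(ranked)]].append(svc)

    result = {}
    for name, owned in per_node.items():
        # Step 4: links in descending order of bandwidth, then IP.
        links = sorted(nodes[name],
                       key=lambda ip: (nodes[name][ip], ip_key(ip)),
                       reverse=True)
        # Steps 5-6: `owned` is already in descending order; link (j mod m).
        for j, svc in enumerate(owned):
            result[svc] = (name, links[j % len(links)])
    return result
```

For example, three service IPs over two nodes give the higher-bandwidth node the first and third service, each on a different link if that node has two links.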

As shown in Fig. 16, when a node receives a service request message, it decides how to schedule the request according to its own role, ensuring that service requests are distributed evenly inside the self-organizing cluster server. The service scheduling workflow is as follows:

Step 1: on receiving a service request message, look up the corresponding service access point. If the corresponding service access point is not on the local node, go directly to Step 3;

Step 2: arrange the active links of all active nodes into one continuous one-dimensional space in which the length of each link is proportional to its bandwidth, then map the source and destination IP addresses of the service request message onto this link space using a defined hash algorithm. If the mapped link is not on the local node, go directly to Step 4;

Step 3: deliver the service request message to the local application layer for processing, return the result to the user, and end;

Step 4: look up the MAC address of the mapped link, forward the service request message over the layer-2 network to the node that owns the mapped link, and end.
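
The bandwidth-proportional link space of Step 2 can be sketched with cumulative sums and a binary search. The hash function (SHA-256), the function name pick_link, and the use of sorted link IP strings as the common ordering are assumptions; bandwidths are assumed to be positive integers.

```python
import bisect
import hashlib
from itertools import accumulate


def pick_link(src_ip: str, dst_ip: str, links: dict) -> str:
    """links: link IP -> bandwidth. Returns the link that the request maps to."""
    ordered = sorted(links)                       # identical order on every node
    # Right edge of each link's segment in the one-dimensional space.
    edges = list(accumulate(links[ip] for ip in ordered))
    digest = hashlib.sha256(f"{src_ip}|{dst_ip}".encode()).digest()
    point = int.from_bytes(digest[:8], "big") % edges[-1]   # in [0, total)
    # The first segment whose right edge lies beyond the point owns it.
    return ordered[bisect.bisect_right(edges, point)]
```

A link with ten times the bandwidth of another covers ten times as much of the space, so it can be expected to receive roughly ten times as many distinct address pairs.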

Node link state changes are of two kinds: link up and link down. To suppress frequent flapping of link state, a node handles link-down events in real time and ignores link-up events; new links are detected by the node state notification submodule. As shown in Fig. 17, when a link goes down, the node must quickly switch the management/service access points bound to the failed link and notify the other nodes. The link-down workflow is as follows:

Step 1: delete the record of the failed link. If the node is completely disconnected from the network, go directly to Step 4; if the failed link is not a management/service access point, go directly to Step 3;

Step 2: re-elect the management/service access point among the local active links, and notify the ARP module to broadcast an ARP response message for the new management/service access point and to answer ARP requests for it;

Step 3: announce the local active links and management/service access point information through a node state message so that the other nodes immediately stop sending service requests to the failed link, then end;

Step 4: delete all neighbor node records and management/service access point records, force the node into the initialization state, and wait for the node to rejoin the network.
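
The branching in the link-down workflow can be sketched as a single function. This is an illustration under assumptions: the function name, the returned action labels, and the elect callback (which stands in for whichever election rule applies to the affected access point) are hypothetical, and the ARP and multicast side effects are left to the caller.

```python
def on_link_down(local_links: dict, access_points: dict, failed: str, elect) -> str:
    """Handle a link-down event on the local node.

    local_links:   link IP -> bandwidth (active links of the local node)
    access_points: access point address -> link IP it is bound to
    elect:         callable(address, local_links) -> link IP (assumed helper)
    Returns "reinitialize", "reelected", or "announce".
    """
    local_links.pop(failed, None)                 # Step 1: drop the failed link
    if not local_links:                           # node is fully disconnected
        access_points.clear()                     # Step 4: forget local roles
        return "reinitialize"
    affected = [a for a, link in access_points.items() if link == failed]
    if not affected:
        return "announce"                         # Step 3 only
    for address in affected:                      # Step 2: rebind locally
        access_points[address] = elect(address, local_links)
    return "reelected"
```

After a "reelected" result the caller would broadcast the ARP response for each rebound access point and then send the state message of Step 3.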

As shown in Fig. 18, a node must answer ARP requests for its local management/service access points. The ARP message processing workflow is as follows:

Step 1: read the content of the ARP request. If it is not for a local management/service access point, pass it to the operating system for handling and end;

Step 2: read the MAC address of the link bound to the corresponding management/service access point and broadcast it in an ARP response message.
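
The decision made in these two steps can be sketched as a lookup. The function name and both dictionaries are assumptions for illustration; building and broadcasting the actual ARP frame is outside the sketch.

```python
def handle_arp_request(target_ip: str, access_points: dict, link_macs: dict):
    """Decide how to answer an ARP request.

    access_points: local access point IP -> name of the bound link
    link_macs:     link name -> MAC address
    Returns the MAC to place in a broadcast ARP response, or None to leave
    the request to the operating system.
    """
    link = access_points.get(target_ip)
    if link is None:
        return None                 # Step 1: not a local access point
    return link_macs[link]          # Step 2: MAC of the bound link
```

Because the answer is taken from the current binding, rebinding an access point to another link changes the MAC address that later requests receive.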

A node starts the initialization process whenever it is powered on or rejoins the network after being disconnected. As shown in Fig. 19, the node must initialize each software module, discover its neighbors, and elect the management/service access points. The initialization workflow is as follows:

Step 1: start the link monitoring and ARP processing modules to ensure that the local active link information is accurate;

Step 2: start the service scheduling and neighbor discovery modules to ensure that service requests forwarded by neighbor nodes are not dropped;

Step 3: after the local node has joined the network, wait for a further period so that the local node and its neighbor nodes can fully synchronize neighbor state information through the periodically sent node state messages;

Step 4: start the management/service access point election module and elect the management/service access points;

Step 5: announce the local active links and management/service access point information by multicast.

The description of the present invention has been given for the purposes of illustration and description; it is not exhaustive and does not limit the invention to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments with various modifications suited to particular uses.

Claims

1. A self-organizing cluster server supporting load balancing, formed by multiple physical servers, characterized in that: the self-organizing cluster server manages the physical servers as nodes; each node automatically discovers neighbor nodes joining and leaving and dynamically elects a management access point and service access points; the management access point provides a unified management interface to the network management system; the service access points provide a unified service access interface to users; and services are distributed evenly across the active physical nodes, so that load balancing and redundancy backup inside the cluster are achieved without dedicated load-balancing devices or heartbeat software;

each node periodically multicasts the state information of the local node while listening for the state messages sent by other nodes, tracks the dynamic information of neighbor nodes joining and leaving, and keeps the state information of all nodes synchronized by continuous listening.

2. The self-organizing cluster server supporting load balancing according to claim 1, characterized in that: the physical servers are managed as nodes and divided into a management node, service nodes, and driven nodes; system management requests are received at the management access point on the management node, and the management node is responsible for managing system operation; service requests are received at the service access points on the service nodes, and the service nodes are responsible for service scheduling and service processing; the service load of the service nodes is shared by the driven nodes; there is exactly one management node and one management access point, and there are one or more service nodes, each with one or more service access points.

3. The self-organizing cluster server supporting load balancing according to claim 1, characterized in that: all nodes use the same management access point election algorithm and service access point election algorithm to determine independently whether the local node is the management node or a service node and which links are the management access point and service access points; each node actively announces its local management access point and service access point information to the other nodes; when a node joins, only the joining node re-elects and announces the management and service access points; when a node leaves, all other nodes re-elect and announce the management access point and service access points; conflicts are avoided by yielding: as soon as a node hears another node announce that it holds a management access point or service access point, it actively releases the binding of its own former management access point or service access point.

4. The self-organizing cluster server supporting load balancing according to claim 1, characterized in that: the load-sharing algorithm between nodes uses a stateless hash algorithm that requires no state table lookup and ensures that the service requests of the same user are processed by the same node.

5. The self-organizing cluster server supporting load balancing according to claim 1, characterized in that: a fully peer-to-peer mode is used in which every node has the same software structure and every node has 1:N redundancy backup, so that all functions of the system remain available even if only one node is left.