CN104283948A - Server cluster system and load balancing implementation method thereof - Google Patents

Info

Publication number
CN104283948A
Authority
CN
Grant status
Application
Patent type
Prior art keywords
node
load
information
standby
module
Prior art date
Application number
CN 201410512754
Other languages
Chinese (zh)
Inventor
张珠华
张霞
徐丽丽
张骞
Original Assignee
东软集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10: Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
    • H04L67/1002: Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers, e.g. load balancing
    • H04L67/1004: Server selection in load balancing

Abstract

The invention provides a server cluster system and a load balancing implementation method thereof. The system comprises a master node, a standby node and ordinary nodes. The master node comprises an information collection module, a load calculation module and a connection distribution decision module. The standby node reports its own load information to the master node, synchronizes information with the master node, and screens the ordinary nodes so that, after the standby node becomes the master node, the selected ordinary node serves as the new standby node. The ordinary nodes report their own information to the master node, and an ordinary node that is selected as the standby node switches to the standby role. By using a mesh-networked server cluster with master and standby nodes, and by using the master/standby structure to distribute connections, the invention spreads connections across the cluster and addresses connection distribution delay and unstable, unreliable connections.

Description

Server cluster system and load balancing implementation method thereof

Technical Field

[0001] The present invention relates to the technical field of server clusters and, more particularly, to a server cluster system and a load balancing implementation method thereof.

Background

[0002] With the rise in car ownership, worsening road congestion and frequent traffic accidents, more and more organizations and individuals recognize that a deeper understanding of drivers' behavior helps to formulate more reasonable traffic rules and to design more effective intelligent driving and navigation systems, thereby reducing traffic accidents and improving traffic efficiency. Such analysis must be based on a large amount of driving data, which requires a large number of intelligent terminals (including in-vehicle terminals, mobile phones and similar devices) to continuously collect driving data in real time and upload it to a vehicle server for analysis. As the number of intelligent terminals and their traffic keep growing, a single server can no longer meet the requirements, and a server cluster becomes the necessary solution.

[0003] The intelligent terminal bears the task of data collection: it must continuously and efficiently collect the vehicle's driving data, stay connected to the server, and transmit the collected data to the server in real time. This process is referred to as "data ferrying". In real-time data ferrying, long-lived connections are generally used, that is, the intelligent terminal stays online for a long time to keep the data real-time. Because the data volume depends on how much the vehicles are used, a large amount of data is necessarily exchanged with the server. Under this architecture, how to design the server side so that it both satisfies the long-connection demands of many intelligent terminals and ensures that large volumes of data are sent, received and stored correctly has become a problem to be solved in this field.

[0004] One solution to the above problem is to use a reverse proxy to achieve load balancing.

[0005] In a reverse-proxy load balancing scheme, a reverse proxy server distributes the connections. The reverse proxy server first accepts a connection request from the network and establishes a connection with the client, then forwards the access requests subsequently received from the client terminal to an internal server, and finally forwards the result returned by the internal server to the client terminal that initiated the connection. In this process the reverse proxy server appears externally as a single server, while internally it is the single entry and exit point for connection requests and is responsible for distributing connections to the internal servers.

[0006] The drawback of this solution is that, because the reverse proxy server works at the application layer of the seven-layer OSI model, a dedicated reverse proxy server must be developed for each application service, which limits the scope of reverse-proxy load balancing. At present, reverse proxies are mostly used to load balance web servers. In addition, for every proxied request the proxy server must open two connections, one external and one internal, so when the number of concurrent connection requests is very large, the load on the proxy server is also very large. In such networks the proxy server itself usually becomes the bottleneck of service performance.

[0007] Another very common solution is load balancing based on network address translation (NAT). NAT was originally created to cope with the shrinking pool of public IPv4 addresses: it maps several internal servers to what appears externally as a single server, so that multiple internal servers can provide services through one public IP address.

[0008] In the NAT-based solution, a connection request initiated from the network undergoes address translation at the address translation server and is then distributed to one of the internal servers. This approach removes the reverse proxy's limitation of working at the application layer and needing a dedicated proxy for each application service. However, NAT-based load balancing still needs to open two connections, one internal and one external, for each request, so when the data volume is very large the address translation server itself becomes a performance bottleneck.

[0009] Moreover, NAT-based load balancing dynamically selects an internal server each time a request arrives and must find the request's origin when the result is returned, so a translation mapping entry has to be kept for every request. When the request volume grows, maintaining the mapping table also consumes a lot of resources, and compared with reverse proxying this is even more likely to cause performance problems on the address translation server. Finally, in the NAT load balancers in actual use today, the address translation function is mostly integrated into hardware switches, which implement only simple selection strategies, cannot support better-optimized load balancing strategies or more complex application protocols, and are difficult to extend.

Summary

[0010] In view of the above problems, an object of the present invention is to provide a server cluster system and a load balancing implementation method thereof. By using a mesh-networked server cluster system with master and standby nodes, and by using the master/standby structure to distribute connections, the invention spreads connections across nodes and addresses connection distribution delay and unstable, unreliable connections.

[0011] In one aspect, the present invention provides a server cluster system comprising a master node, a standby node and ordinary nodes, wherein:

[0012] the master node comprises an information collection module, a load calculation module and a connection distribution decision module;

[0013] the information collection module stores the load information reported by the standby node and by each ordinary node connected to the master node;

[0014] the load calculation module obtains, from the reported load information, the load value of the node to which the load information corresponds;

[0015] the connection distribution decision module distributes connections by HTTP redirection, selecting, according to the node load values obtained by the load calculation module, the node with the best processing capacity as the node to which the connection is distributed;

[0016] the standby node reports its load information to the master node, synchronizes information with the master node, and screens the ordinary nodes, the selected ordinary node serving as the new standby node;

[0017] each ordinary node reports its information to the master node and, when it is selected as the standby node, switches itself to the standby node.
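
The connection distribution decision in [0015] can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `Node` fields, the convention that a lower load value means more spare capacity, and the choice of status codes 302 and 503 are all hypothetical, since the text only says that distribution uses HTTP redirection.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    url: str     # base URL of the node (hypothetical field)
    load: float  # load value from the load calculation module; lower = more spare capacity (assumption)


def choose_node(nodes):
    """Return the node with the lowest load value, or None if there are no nodes."""
    return min(nodes, key=lambda n: n.load, default=None)


def redirect_response(nodes, path):
    """Build a (status, headers) pair that redirects the client to the chosen node."""
    target = choose_node(nodes)
    if target is None:
        return 503, {}  # no node available (status code is an assumption)
    return 302, {"Location": target.url.rstrip("/") + path}
```

Because the master only answers with a redirect, the long-lived connection itself is opened between the client terminal and the chosen node, and the master does not relay the traffic.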

[0018] Preferably, the master node further comprises a session management module, which manages the session history between client terminals and the cluster system and deletes history session records between client terminals and the cluster system that have been inactive for a long time. Preferably, a nodelist is created in the information collection module; the nodelist stores the load information of each node connected to the master node, wherein:

[0019] the nodelist includes an update mechanism that deletes offline nodes.

[0020] Preferably, the load information comprises the CPU utilization, memory utilization, network bandwidth occupancy and socket connection occupancy of each node connected to the master node;

[0021] in the load calculation module, a specifically weighted load calculation algorithm is applied to each node to obtain its load capacity for CPU-intensive, memory-intensive, network-bandwidth-intensive and socket-intensive requests respectively, and the results are updated in the nodelist.
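
The text does not give the weights of the load calculation algorithm, so the following Python sketch uses made-up weights purely to show the idea of one weighted score per request class. The `WEIGHTS` table, the class names and the linear combination are assumptions.

```python
# Hypothetical weights: each request class emphasizes its own resource.
# Every row sums to 1 so the result stays in the same 0..1 range as the inputs.
WEIGHTS = {
    "cpu":     {"cu": 0.7, "mu": 0.1, "nu": 0.1, "su": 0.1},
    "memory":  {"cu": 0.1, "mu": 0.7, "nu": 0.1, "su": 0.1},
    "network": {"cu": 0.1, "mu": 0.1, "nu": 0.7, "su": 0.1},
    "socket":  {"cu": 0.1, "mu": 0.1, "nu": 0.1, "su": 0.7},
}


def load_values(cu, mu, nu, su):
    """Return one weighted load value per request class.

    cu, mu, nu, su are the reported CPU, memory, network bandwidth and
    socket utilizations, each as a fraction between 0 and 1.
    """
    usage = {"cu": cu, "mu": mu, "nu": nu, "su": su}
    return {
        cls: sum(weights[key] * usage[key] for key in usage)
        for cls, weights in WEIGHTS.items()
    }
```

With weights like these, a node whose CPU is saturated scores poorly for CPU-intensive requests but can still be a reasonable target for socket-intensive ones.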

[0022] Preferably, the standby node comprises a standby node information reporting module, an information synchronization module and an ordinary node screening module;

[0023] the standby node information reporting module periodically reports load information to the master node;

[0024] the information synchronization module periodically synchronizes and stores the load information held in the information collection module;

[0025] the ordinary node screening module selects, when the standby node becomes the master node, the ordinary node with the smallest load as its standby candidate.

[0026] Preferably, the ordinary node comprises an ordinary node information reporting module and a standby node switching module;

[0027] the ordinary node information reporting module periodically reports load information to the master node;

[0028] the standby node switching module works as follows: after the standby node becomes the new master node, it notifies the new standby node it has selected, and the notified ordinary node switches to the standby node accordingly and takes over the standby node's duties.

[0029] In another aspect, the present invention further provides a load balancing implementation method based on a server cluster system, the server cluster system comprising a master node, a standby node and ordinary nodes, wherein:

[0030] the master node periodically receives the load information reported by the standby node and each ordinary node connected to it, and obtains from the reported load information the load value of each corresponding node;

[0031] when a client terminal requests a connection, the master node distributes the connection by HTTP redirection, selecting, according to the obtained load values, the node with the best processing capacity as the node to which the connection is distributed;

[0032] when the master node fails, the standby node performs a master/standby switchover, becomes the new master node and performs the master node's duties, and a new standby node is enabled at the same time;

[0033] the new standby node reports its load information to the new master node, synchronizes information with the new master node, and screens the ordinary nodes to select a candidate for the next standby node.
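
The switchover in [0032] and [0033] can be sketched as a pure function. This is a simplified illustration: the function name, the dictionary of load values and the returned tuple are hypothetical, and failure detection and information synchronization are left out.

```python
def fail_over(standby, ordinary_loads):
    """Promote the standby and pick the least-loaded ordinary node as new standby.

    ordinary_loads maps an ordinary node id to its load value (lower = less loaded).
    Returns (new_master, new_standby, remaining_ordinary_loads).
    """
    # The least-loaded ordinary node is the candidate; None if no ordinary node is left.
    new_standby = min(ordinary_loads, key=ordinary_loads.get, default=None)
    remaining = {n: load for n, load in ordinary_loads.items() if n != new_standby}
    return standby, new_standby, remaining
```

Applied repeatedly, this gives the tiered succession the text describes: each failure promotes the current standby and draws the next standby from the ordinary nodes until only one server remains.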

[0034] Preferably, a session management mechanism is provided in the master node. It keeps the session history between client terminals and the cluster system and deletes history session records between the client terminal and the master node that have been inactive for a long time, so that the node selected for connection distribution is the best choice.
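
One way to picture the session management mechanism is a table that remembers which node last served each client and forgets entries that have been idle too long. The sketch below is an assumption-laden illustration: the class name, the `max_idle` threshold and the idea of mapping a client id to a node id are not specified in the text.

```python
class SessionTable:
    """Remembers the node that last served each client terminal."""

    def __init__(self, max_idle=300.0):
        self.max_idle = max_idle  # seconds of inactivity before a record is stale (assumed value)
        self.sessions = {}        # client id -> (node id, time of last activity)

    def touch(self, client_id, node_id, now):
        """Record that client_id was served by node_id at time now."""
        self.sessions[client_id] = (node_id, now)

    def lookup(self, client_id, now):
        """Return the remembered node for a client, or None if unknown or stale."""
        entry = self.sessions.get(client_id)
        if entry is None or now - entry[1] > self.max_idle:
            self.sessions.pop(client_id, None)
            return None
        return entry[0]

    def purge(self, now):
        """Delete all stale records and return how many were removed."""
        stale = [c for c, (_, seen) in self.sessions.items() if now - seen > self.max_idle]
        for c in stale:
            del self.sessions[c]
        return len(stale)
```

A returning client that still has a fresh record can be redirected without a new load comparison, which is one plausible reading of how the mechanism reduces distribution delay.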

[0035] Preferably, a nodelist is created in the master node and the load information of each node is stored in the nodelist. Before a client terminal requests a connection,

[0036] it is determined whether the nodelist contains an offline node. If so, that node is deleted from the nodelist; if the offline node is the standby node, a new standby node is selected from the ordinary nodes and synchronizes information with the master node;

[0037] if the nodelist contains no offline node, the connection is distributed according to the client terminal's connection request.
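
The offline check in [0036] and [0037] can be sketched as follows. The text does not say how a node is judged offline; this sketch assumes a node is offline when it has not reported within a timeout, and it picks the least-loaded remaining node as the replacement standby. The class, field names and timeout value are hypothetical.

```python
import time


class NodeList:
    def __init__(self, timeout=15.0):
        self.timeout = timeout  # seconds without a report before a node counts as offline (assumed)
        self.entries = {}       # node id -> {"load": float, "seen": float}
        self.standby = None     # id of the current standby node

    def report(self, node_id, load, now=None):
        """Store a periodic load report from a node."""
        seen = time.time() if now is None else now
        self.entries[node_id] = {"load": load, "seen": seen}

    def prune(self, now=None):
        """Delete offline nodes; reselect the standby if it went offline.

        Returns the list of node ids that were removed.
        """
        now = time.time() if now is None else now
        offline = [n for n, e in self.entries.items() if now - e["seen"] > self.timeout]
        for n in offline:
            del self.entries[n]
        if self.standby in offline:
            # Replacement rule (assumption): the least-loaded remaining node.
            self.standby = min(
                self.entries, key=lambda n: self.entries[n]["load"], default=None
            )
        return offline
```

Calling `prune()` before each distribution decision keeps the master from redirecting a client to a node that has stopped reporting.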

[0038] Preferably, when the new standby node screens the ordinary nodes, it selects the ordinary node with the smallest load as its candidate.

[0039] Preferably, the load information comprises the CPU utilization, memory utilization, network bandwidth occupancy and socket connection occupancy of each node connected to the master node;

[0040] using a specifically weighted load calculation algorithm, the load capacity of each node for CPU-intensive, memory-intensive, network-bandwidth-intensive and socket-intensive requests is calculated respectively, by the following formula:

[0041] rx = f(cu, mu, nu, su)

[0042] where rx denotes the remaining resources of the node;

[0043] cu, mu, nu and su denote the CPU utilization, memory utilization, network bandwidth occupancy and socket connection occupancy uploaded by the node, respectively.

[0044] Preferably, the remaining resource percentage is calculated from the load information uploaded by each node, by the following formula:

[0045] [Formula image CN104283948AD00081: expression for ri(k); not reproduced in this text]

[0046] where

[0047] ri(k) denotes the remaining resource percentage at time k;

[0048] cu(k) denotes the CPU utilization at time k;

[0049] cu(k-1) denotes the CPU utilization at time k-1;

[0050] in (1 - tu(k) - t_st), when t = s, tu(k) denotes the socket connection utilization at time k, t_st denotes the amount of socket resources held in reserve, and (1 - tu(k) - t_st) denotes the node's remaining available socket resources;

[0051] when t = m, tu(k) denotes the memory utilization at time k, t_st denotes the amount of memory held in reserve, and (1 - tu(k) - t_st) denotes the node's remaining available memory;

[0052] when t = n, tu(k) denotes the network bandwidth occupancy at time k, t_st denotes the amount of network bandwidth held in reserve, and (1 - tu(k) - t_st) denotes the node's remaining available network bandwidth.
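
The full expression for ri(k) is in a formula image that this text does not reproduce, so only the per-resource term (1 - tu(k) - t_st) is sketched here in Python. The function names and the floor at zero are assumptions added for illustration.

```python
def headroom(usage, reserve):
    """Remaining usable fraction of one resource: 1 - tu(k) - t_st.

    usage is tu(k), the utilization at time k; reserve is t_st, the reserved
    share. The result is floored at zero (the floor is an assumption).
    """
    return max(0.0, 1.0 - usage - reserve)


def headroom_by_type(su, mu, nu, reserves):
    """Apply the term for t = s (socket), m (memory) and n (network bandwidth).

    reserves maps "s", "m" and "n" to the reserved share of each resource.
    """
    usage = {"s": su, "m": mu, "n": nu}
    return {t: headroom(usage[t], reserves[t]) for t in ("s", "m", "n")}
```

The reserve keeps a node from being treated as available right up to full utilization: with a 10% reserve, a node at 95% socket usage already counts as having no spare socket capacity.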

[0053] As the above technical solution shows, compared with conventional schemes, the server cluster system and load balancing implementation method of the present invention have the following beneficial effects:

[0054] First, the cluster uses a mesh network structure and the nodes are related in a tiered succession. This improves service stability at the networking level and makes it easier to select the best-performing node in the whole network as the standby node, which becomes the master node if the master node fails, ensuring a globally optimal election;

[0055] Second, the cluster networking mode lays out servers so that the master server is on duty, the standby server assists in management, and ordinary servers wait as candidates. The master and standby synchronize information directly, which keeps the synchronization overhead low. When the master node fails, switching between master and standby is seamless: the standby node takes over the master node's functions, which shields against a single point of failure and improves the availability of the services the cluster provides;

[0056] Third, the connection distribution algorithm proposed for the master node takes the resources of every node into account and selects the node with the best processing capacity to receive service requests, making maximal use of each node's processing capacity. The distribution mechanism allocates connections reasonably while greatly reducing the burden on the master node server, so that the master server can coordinate master/standby backup and contribute to the overall stability of the cluster. The session management mechanism further reduces the delay of connection distribution and ensures the reliability and timeliness of the service.

[0057] To achieve the foregoing and related ends, one or more aspects of the present invention comprise the features described in detail below and particularly pointed out in the claims. The following description and the drawings set forth certain illustrative aspects of the invention in detail. These aspects are, however, indicative of only some of the various ways in which the principles of the invention may be employed. Furthermore, the present invention is intended to include all such aspects and their equivalents.

Brief Description of the Drawings

[0058] Other objects and results of the present invention will become clearer and easier to understand from the following description taken in conjunction with the drawings and the claims, and as the invention is more fully understood. In the drawings:

[0059] FIG. 1 is a schematic diagram of the load balancing network structure implemented with reverse proxy and network address translation;

[0060] FIG. 2 is a schematic diagram of the network structure of the server cluster system according to an embodiment of the present invention;

[0061] FIG. 3 is a block diagram of the logical structure of the server cluster system according to an embodiment of the present invention;

[0062] FIG. 4 is a block diagram of the logical structure of the master node according to an embodiment of the present invention;

[0063] FIG. 5 is a schematic diagram of an application example of the server cluster system according to an embodiment of the present invention;

[0064] FIG. 6 is a schematic flowchart of the load balancing implementation method based on the server cluster system according to an embodiment of the present invention;

[0065] FIG. 7 is a schematic flowchart of the connection request process between a client terminal and the cluster system according to an embodiment of the present invention;

[0066] FIG. 8 is a schematic diagram of the workflow of load balancing in the server cluster system according to an embodiment of the present invention.

[0067] The same reference numerals indicate similar or corresponding features or functions throughout the drawings.

Detailed Description

[0068] In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It is evident, however, that these embodiments may be practiced without these specific details.

[0069] In load balancing networks built with reverse proxy or network address translation, the topology is a tree with a single entry point. FIG. 1 shows such a structure: one server acts as the single entry for traffic, distributes the traffic to the other internal servers, and passes the returned traffic back to the initiator of the connection. Because the entry server carries a heavy workload, it is generally hard to provide master/standby backup for it, and the master/standby switchover time can hardly meet requirements. Here, master/standby backup means that the standby node periodically backs up the load information received by the master node.

[0070] The address-translation load balancers in actual use today offer only simple selection strategies and cannot support better-optimized load balancing strategies or more complex application protocols. After analyzing load balancing schemes such as reverse proxy and network address translation, weighing their advantages and disadvantages, and considering the service scenario, the present invention proposes a new server cluster system and load balancing implementation method. The server cluster system uses a mesh network structure and, based on a careful analysis of the protocol in use, exploits a particular feature of that protocol to distribute connections and spread them across nodes. Compared with conventional connection distribution algorithms, it has low delay and stable connections, and suits the fast-moving nature of in-vehicle client terminals.

[0071] Specific embodiments of the present invention are described in detail below with reference to the drawings.

[0072] To illustrate the mesh structure of the server cluster system provided by the present invention, FIG. 2 shows the mesh network structure of the server cluster system according to an embodiment of the present invention.

[0073] As shown in FIG. 2, the server cluster system proposed by the present invention uses a mesh network structure, in which the cluster service architecture with master/standby capability contains three node types: master node, standby node and ordinary node.

[0074] The cluster networking of the present invention improves on conventional networking in two respects. First, it lightens the workload of the entry server so that master/standby networking becomes feasible, giving a layout in which the master server is on duty, the standby server assists in management and ordinary servers wait as candidates; this shields against a single point of failure and greatly improves network stability. Second, it replaces the tree structure with a mesh structure, which improves service stability at the networking level and makes it easier to elect the best-performing node in the whole network as the standby node, which becomes the master node if the master node fails, ensuring a globally optimal election.

[0075] The cluster mesh networking architecture used by the present invention is described in detail below.

[0076] As FIG. 2 suggests, the standby node is introduced mainly because the entry server in current load balancing schemes carries a heavy workload, is generally hard to back up in master/standby fashion, and cannot switch over quickly enough, so that the proxy server inevitably becomes a single point of failure. In the present invention, information is synchronized between master and standby so that the standby node can quickly take over when the master node fails. If the network contained only a master node and ordinary nodes, seamless switchover would require the master node to synchronize information with every ordinary node, and the synchronization overhead would be larger than synchronizing with the standby node alone. Each node type is briefly introduced below to explain the overall networking.

[0077] 主节点负责搜集集群内各个节点的节点信息,并经过负载计算,计算出各个节点的负载值,主节点管理客户终端的连接请求,对大量的客户终端连接按照一定的规则进行连接分发;与此同时,主节点增设会话管理机制,建立职能客户终端与服务器集群系统间的快速通道,进一步减低时延。 [0077] The master node is responsible for collecting the node information of each node in the cluster, and through the load is calculated, the calculated load value of each node, a connection request the master node management client terminal, a large number of client terminals connected to connect distributed according to certain rules ; At the same time, the master node adding session management mechanism, the establishment of fast-track functions between the client terminal and the server cluster system, further reducing latency.

[0078] The standby node has three main functions. First, it periodically reports its own load information to the master node, which uses it when selecting the node to which a connection is distributed. Second, it periodically synchronizes the cluster load information from the master node and, together with the master, implements master/standby backup: if the master becomes abnormal, the standby switches to master, so that a master failure does not leave the cluster system unable to provide service. Third, it elects the best-performing ordinary node, so that this node can become the standby node once the standby itself has become master.

[0079] An ordinary node implements a periodic load reporting mechanism, reporting its load to the master node at regular intervals. Its most important task is to handle the client terminal connections redirected to it by the master node and to serve those client terminals.

[0080] From the functional description of the server cluster system in FIG. 2, in the server cluster system provided by the present invention the master node spreads the long-lived service connections of smart client terminals evenly across the nodes of the cluster, avoiding the performance bottleneck of a single node. In addition, the session management mechanism on the master node allows connections to be redirected quickly, reducing latency. The joint master/standby backup mechanism lets the standby node quickly become the new master when the master fails, so the cluster system can still provide stable service when a node fails. After the standby node becomes master, a new standby node is selected; the cluster thus forms a layered, progressive system that can provide normal service as long as one server is working, maximizing cluster availability.

[0081] With the mesh topology provided by the present invention, the nodes in the server cluster system are divided into three classes: master node, standby node, and ordinary nodes. The classes cooperate with a clear division of labor, so that the cluster provides service externally while balancing load internally, which improves overall system stability. FIG. 3 shows the logical structure of a server cluster system according to an embodiment of the present invention.

[0082] As shown in FIG. 3, the present invention provides a server cluster system 300 comprising a master node 310, a standby node 320, and ordinary nodes 330.

[0083] The master node 310 comprises an information collection module 311, a load calculation module 312, and a connection distribution decision module 313.

[0084] The standby node 320 reports its load information to the master node, synchronizes with the master node's information, and screens the ordinary nodes; the ordinary node it selects serves as the new standby node.

[0085] An ordinary node 330 reports its load information to the master node and, when it is selected as the standby node, switches to the standby role.

[0086] To describe the logical structure of the master node in detail, FIG. 4 shows the logical structure of the master node according to an embodiment of the present invention. As shown in FIG. 4, the master node 310 further comprises an information collection module 311, a load calculation module 312, a connection distribution decision module 313, and a session management module 314.

[0087] Specifically, the information collection module 311 stores the load information reported by the standby node and the ordinary nodes connected to the master node.

[0088] The load calculation module 312 derives, from the reported load information, the load value of the node to which that load information corresponds.

[0089] The connection distribution decision module 313 distributes connections by HTTP redirection: based on the node load values obtained by the load calculation module, it selects the node with the best processing capacity as the node to which the connection is distributed.

[0090] The session management module 314 manages the session history between client terminals and the cluster system and deletes the history records of sessions that have been inactive for a long time.

[0091] A nodelist (node list) is created in the information collection module 311 to store the load information reported by each node connected to the master node. The nodelist includes an update mechanism that deletes offline nodes.

[0092] The load information comprises, for each node connected to the master node, its CPU utilization, memory utilization, network bandwidth occupancy, and socket connection occupancy. (A socket is the software abstraction layer between the application layer and the TCP/IP protocol suite; it is a set of interfaces that hides the protocol details from the user and organizes data to conform to the specified protocol.) In the load calculation module 312, a specifically weighted load calculation algorithm is applied to each node to obtain its load capacity for CPU-intensive, memory-intensive, network-bandwidth-intensive, and socket-intensive requests, and the results are written to the nodelist.

[0093] The standby node 320 further comprises a standby node information reporting module, an information synchronization module, and an ordinary node screening module.

[0094] Specifically, the standby node information reporting module periodically reports load information to the master node.

[0095] The information synchronization module periodically synchronizes the load information held in the information collection module and stores it.

[0096] The ordinary node screening module selects the least-loaded ordinary node as the candidate that takes over the standby role when the standby node becomes the master node.

[0097] An ordinary node comprises an ordinary node information reporting module and a standby node switching module.

[0098] Specifically, the ordinary node information reporting module periodically reports load information to the master node.

[0099] The standby node switching module handles the case in which the standby node has become the new master node and notifies the ordinary node it selected as the new standby; on receiving this notification, the ordinary node switches to the standby role and takes on the standby node's duties.

[0100] In addition, the present invention is explained in detail using a server cluster system of five nodes as an example. FIG. 5 shows the structure of an application example of the server cluster system according to an embodiment of the present invention.

[0101] As shown in FIG. 5, A is the master node, B is the standby node, and C, D, and E are ordinary nodes. Master node A comprises an information collection module, a load calculation module, a connection distribution decision module, and a session management module. Standby node B comprises a standby node information reporting module, an information synchronization module, and an ordinary node screening module. Ordinary nodes C, D, and E each comprise an ordinary node information reporting module and a standby node switching module.

[0102] In the information collection module of master node A shown in FIG. 5, master node A maintains stable long-lived connections with standby node B and ordinary nodes C, D, and E, and obtains each node's status information over them. A nodelist is created on node A to store the load information reported by each connected node. Whenever node A receives the status information periodically reported by B, C, D, or E (the status information comprises each node's load information: CPU utilization, memory utilization, network bandwidth occupancy, and socket connection occupancy), it updates the corresponding entry in its nodelist, and it uses this information as the basis for selecting a node when deciding where to distribute a connection. The nodelist includes an update mechanism that removes dead (offline) nodes in a timely manner. For example, when master node A has not received a report from a node for a long time, it considers that node failed and removes the node's information from the nodelist, so that node A does not distribute connections to that node.
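
The nodelist and its update mechanism can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the class and method names (`NodeList`, `report`, `prune`), the 30-second timeout, and the injectable clock are all assumptions made for the example; the description only states that a node is removed when no report has been received "for a long time".

```python
import time
from dataclasses import dataclass


@dataclass
class NodeEntry:
    cpu: float        # CPU utilization, 0.0 to 1.0
    mem: float        # memory utilization
    net: float        # network bandwidth occupancy
    sock: float       # socket connection occupancy
    last_seen: float  # clock value at the time of the last report


class NodeList:
    """Hypothetical nodelist kept by the master node."""

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self._timeout_s = timeout_s   # assumed staleness threshold
        self._clock = clock
        self._entries = {}

    def report(self, node_id, cpu, mem, net, sock):
        # Called whenever a periodic report arrives from a node.
        self._entries[node_id] = NodeEntry(cpu, mem, net, sock, self._clock())

    def prune(self):
        # Remove nodes that have not reported within the timeout.
        now = self._clock()
        stale = [n for n, e in self._entries.items()
                 if now - e.last_seen > self._timeout_s]
        for n in stale:
            del self._entries[n]
        return stale

    def nodes(self):
        return dict(self._entries)
```

Calling `prune()` before each distribution decision keeps failed nodes out of the candidate set.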

[0103] In the load calculation module of master node A shown in FIG. 5, the node load information gathered by the information collection module comprises CpuUseRate, MemUseRate, NetworkUseRate, and SocketUseRate, which represent each node's CPU utilization, memory utilization, network bandwidth occupancy, and socket connection occupancy respectively. Node A computes each node's load value from the collected load information using a specifically weighted load calculation algorithm: for each node, it separately computes the node's load capacity for CPU-intensive, memory-intensive, network-bandwidth-intensive, and socket-intensive requests, so that different request types are handled effectively and connections are distributed more sensibly. The results are written to the corresponding node's entry in the nodelist so that later connection distribution is fast.

[0104] In the connection distribution decision module of master node A shown in FIG. 5, when a smart terminal requests a connection, the node with the best processing capacity is selected and the connection is distributed to it. Suppose that among the four nodes B, C, D, and E, the load calculation module finds node D to have the best processing capacity; node A then redirects the smart terminal's connection to node D. In this way connections are spread evenly across the nodes and processing delay is reduced. The specific connection distribution process and the related algorithm are described further below.
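
The decision step can be illustrated with a short sketch. It assumes that the nodelist already holds, for each node, a capacity value per request type in the range 0 to 1 (larger meaning more spare capacity); the function name `pick_node`, the dictionary layout, and the request-type keys are hypothetical.

```python
def pick_node(capacities, request_type):
    """Return the node with the largest capacity for the request type.

    capacities: node_id -> {request_type: capacity}; hypothetical layout.
    Nodes whose capacity is 0 (or missing) are never chosen.
    """
    candidates = {
        node: caps[request_type]
        for node, caps in capacities.items()
        if caps.get(request_type, 0.0) > 0.0
    }
    if not candidates:
        return None  # no node can take the request
    return max(candidates, key=candidates.get)


capacities = {
    "B": {"cpu": 0.20, "mem": 0.50},
    "C": {"cpu": 0.35, "mem": 0.10},
    "D": {"cpu": 0.60, "mem": 0.40},
    "E": {"cpu": 0.00, "mem": 0.70},
}
target = pick_node(capacities, "cpu")  # node D in this example
```

With these example figures a CPU-intensive request goes to node D, while a memory-intensive request would go to node E.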

[0105] In the session management module of master node A shown in FIG. 5, the master node adds a session management mechanism that manages the session history between smart terminals and the server cluster system and distributes a smart terminal's subsequent connections quickly, minimizing request latency. The session management module deletes session records that have been inactive for a long time, so that a revived connection is redistributed by the connection distribution algorithm, which makes connection distribution more sensible.

[0106] That is, the session management module deletes long-inactive session records because, when a smart terminal has not connected to its assigned node for a long time, the loads of the nodes in the cluster system will have changed, including the load of the originally assigned node, so that node can no longer be assumed to have the best processing capacity. The session record is therefore deleted and a new node is selected, ensuring that the selected node is the one with the best processing capacity. The session management module thus further reduces the latency of connection distribution and ensures the reliability and timeliness of service between smart terminals and the cluster system.
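
A session table with idle expiry, as described in the two paragraphs above, might look like the sketch below. The names (`SessionTable`, `remember`, `lookup`, `purge`) and the 600-second idle timeout are assumptions for illustration; the description does not specify how long "a long time" is.

```python
import time


class SessionTable:
    """Hypothetical session history kept by the master node."""

    def __init__(self, idle_timeout_s=600.0, clock=time.monotonic):
        self._idle_timeout_s = idle_timeout_s  # assumed threshold
        self._clock = clock
        self._sessions = {}  # client_id -> (node_id, last_active)

    def remember(self, client_id, node_id):
        self._sessions[client_id] = (node_id, self._clock())

    def lookup(self, client_id):
        # Fast path: return the previously assigned node if the session
        # is still fresh; otherwise drop it so the caller re-runs the
        # connection distribution algorithm.
        record = self._sessions.get(client_id)
        if record is None:
            return None
        node_id, last_active = record
        now = self._clock()
        if now - last_active > self._idle_timeout_s:
            del self._sessions[client_id]
            return None
        self._sessions[client_id] = (node_id, now)
        return node_id

    def purge(self):
        # Periodic sweep that deletes all long-inactive records.
        now = self._clock()
        stale = [c for c, (_, last) in self._sessions.items()
                 if now - last > self._idle_timeout_s]
        for c in stale:
            del self._sessions[c]
        return stale
```

A `None` result from `lookup` means the master should select a node afresh and then call `remember` with the new assignment.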

[0107] In the standby node information reporting module of standby node B shown in FIG. 5, standby node B periodically reports its own load information to master node A. The load information includes CPU load, memory load, network bandwidth load, and socket load, which master node A uses for connection distribution and node management.

[0108] In the information synchronization module of standby node B shown in FIG. 5, standby node B periodically synchronizes the nodelist from master node A and keeps it in its own memory, so that master and standby hold the same information about the cluster system. Standby node B uses both a timer mechanism and an event-triggered mechanism to detect whether master node A is online. First, standby node B checks master node A's online status at regular intervals. Second, when a periodic report from standby node B to master node A fails, a check of master node A's online status is triggered. If the check finds that master node A is offline, standby node B switches to master, broadcasts a notification to every node in the cluster system, and notifies the pre-selected ordinary node to become the standby node, so that service for smart terminal connection requests continues.
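
The two detection triggers and the switchover can be sketched as a small state machine. Everything named here is an assumption for illustration: `probe_master` stands in for whatever liveness check is used, the message strings are invented, and notifications are collected in a list rather than sent over the network.

```python
class StandbyMonitor:
    """Hypothetical failover logic running on the standby node."""

    def __init__(self, probe_master, candidate):
        self.probe_master = probe_master  # callable returning True if master is online
        self.candidate = candidate        # pre-selected ordinary node, or None
        self.role = "standby"
        self.outbox = []                  # (recipient, message) pairs to send

    def on_timer(self):
        # Timer mechanism: periodic check of the master's online status.
        self._check_master()

    def on_report_failed(self):
        # Event trigger: a periodic load report to the master failed.
        self._check_master()

    def _check_master(self):
        if self.role != "standby" or self.probe_master():
            return
        # Master is offline: take over, announce it, promote the candidate.
        self.role = "master"
        self.outbox.append(("*", "master-changed"))
        if self.candidate is not None:
            self.outbox.append((self.candidate, "become-standby"))
```

Both triggers funnel into the same check, so a failed report only leads to switchover if the subsequent probe also finds the master offline.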

[0109] In the ordinary node screening module of standby node B shown in FIG. 5, standby node B selects, on a relatively long cycle, the least-loaded node as its own candidate, and when it becomes master it notifies that node to become the standby node. This repairs the intended network structure after a master failure, keeps the network stable, and keeps the service available externally.

[0110] In the ordinary node information reporting module of the ordinary nodes shown in FIG. 5, ordinary nodes C, D, and E periodically report their own load information to the master node, including CPU load, memory load, network bandwidth load, and socket load, which the master node uses for connection distribution and node management.

[0111] In the standby node switching module of the ordinary nodes shown in FIG. 5, after standby node B becomes the new master node, it notifies the new standby node it selected. Ordinary node C, for example, then switches to standby: it takes on the functions and services of the standby node, completes its current connection requests and returns their results, and then communicates with the master node and continues managing the nodes in the cluster.

[0112] The above describes the functions of the nodes in the server cluster system. To explain how load balancing is implemented across those nodes, FIG. 6 shows the flow of a load balancing implementation method based on the server cluster system according to an embodiment of the present invention.

[0113] As shown in FIG. 6, the present invention provides a load balancing implementation method based on a server cluster system, the server cluster system comprising a master node, a standby node, and ordinary nodes. The method proceeds as follows.

[0114] S610: The master node periodically receives the load information reported by the standby node and each ordinary node connected to it.

[0115] It should be noted that a nodelist is created in the master node and the load information of each node is stored in it. The load information comprises, for each node connected to the master node, its CPU utilization, memory utilization, network bandwidth occupancy, and socket connection occupancy.

[0116] S620: Based on the reported load information, the master node obtains the load value of each node to which the load information corresponds.

[0117] Before a client terminal requests a connection, it is determined whether the nodelist contains an offline node. If it does, that node is deleted from the nodelist; if the offline node is the standby node, a new standby node is selected from the ordinary nodes and synchronized with the master node. If the nodelist contains no offline node, connections are distributed according to the client terminal's connection request.
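
The check described above can be sketched as a single function. The sketch assumes the nodelist is a dictionary mapping node identifiers to one scalar load value and that it contains only the standby node and the ordinary nodes (not the master itself); the function name and the choice of the least-loaded node as replacement standby are illustrative, following the screening rule stated elsewhere in the description.

```python
def refresh_before_distribution(nodelist, offline, standby_id):
    """Drop offline nodes and reselect the standby if it went offline.

    nodelist:   dict node_id -> load value (hypothetical layout),
                modified in place.
    offline:    set of node ids detected as offline.
    standby_id: id of the current standby node.
    Returns the id of the standby node after the check (or None).
    """
    for node in offline:
        nodelist.pop(node, None)
    if standby_id in offline:
        # Only ordinary nodes remain; pick the least-loaded one.
        standby_id = min(nodelist, key=nodelist.get) if nodelist else None
    return standby_id


nodelist = {"B": 0.3, "C": 0.5, "D": 0.2, "E": 0.7}
new_standby = refresh_before_distribution(nodelist, {"B"}, "B")
```

In this example standby node B is offline, so it is removed and node D, the least-loaded remaining node, is chosen as the new standby; synchronizing D with the master is outside the scope of the sketch.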

[0118] S630: When a client terminal requests a connection, the master node distributes the connection by HTTP redirection: based on the load values obtained for the nodes, the master node selects the node with the best processing capacity as the node to which the connection is distributed.

[0119] The master node is provided with a session management mechanism, which holds the session history between client terminals and the cluster system and deletes the history records of sessions between a client terminal and the master node that have been inactive for a long time, so that the node selected for connection distribution is the best choice.

[0120] This is because, when a client terminal has had no traffic with its assigned node for a long time, the loads of the nodes in the cluster system will have changed. To ensure that the node to which a service connection is distributed is the one with the best processing capacity, the session record is deleted and a new node is selected. The session management mechanism further reduces the latency of connection distribution and ensures the reliability and timeliness of service between smart terminals and the cluster system.

[0121] S640: When the master node fails, the standby node performs the master/standby switchover, becomes the new master node and carries out the master node's duties, and at the same time activates a new standby node.

[0122] S650: The new standby node reports its load information to the new master node and synchronizes with the new master node's information, and at the same time screens the ordinary nodes to select the candidate for the next standby node.

[0123] Steps S640 and S650 above form the master/standby switchover workflow. They describe how the standby node replaces the master node and takes over its duties when the master node fails. The joint backup mechanism of the master and standby nodes lets the standby node quickly become the new master when the master fails, so that the server cluster system can still provide stable service when a problem occurs.

[0124] The specific process by which the master node selects the node with the best processing capacity as the node to which a connection is distributed is described in detail below.

[0125] The present invention backs up the master node of the server cluster system. Compared with load balancing implemented by reverse proxy or by network address translation, this backup mechanism has the following advantages. On the one hand, the master node of the server cluster system does not need to take on application-layer work as a reverse proxy server does. On the other hand, it does not need to establish connections in both directions as network address translation does. The workload of the master node is therefore reduced, the risk of single-node failure is lowered, and master/standby backup becomes feasible, further improving cluster stability. The working principle of connection distribution at the entry server (that is, the master node) and the corresponding algorithm are described below.

[0126] As is well known, in the HTTP protocol an HTTP request that reaches the server is processed by the server, which returns an HTTP response. The response carries an HTTP status code, and each status code has a different meaning, as shown in Table 1.

[0127] Table 1

[0128]

Figure CN104283948AD00141

[0129] HTTP status codes beginning with 3 indicate that the requested service has changed its network location and that the client terminal must issue the request again; the new link address is returned in the HTTP response.

[0130] The master node uses status code 307, which denotes a temporary redirect and indicates that the requested service has temporarily changed location. A temporary redirect has two benefits. First, the desired network address can be returned in the HTTP response, so that the client terminal issues its connection request to the new address. Second, because the redirect is temporary, the client terminal does not save the redirected address when it issues a new connection request; each connection request therefore reaches the master node first and is then distributed according to the load of each node, which realizes the designed function of the cluster.
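
A minimal sketch of such a redirecting entry point, using Python's standard `http.server` module, is given below. The target address `http://node-d.example:8080`, the helper names `choose_target` and `build_redirect`, and the decision to redirect every GET request are assumptions for illustration; in the described system the target would come from the load-based node selection.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def choose_target():
    # Hypothetical placeholder: a real master node would consult its
    # nodelist and return the address of the best node.
    return "http://node-d.example:8080"


def build_redirect(target_base, path):
    # 307 Temporary Redirect with the new address in the Location header.
    return 307, {"Location": target_base + path, "Content-Length": "0"}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = build_redirect(choose_target(), self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def make_server(port=0):
    # Port 0 lets the operating system pick a free port.
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

Calling `make_server(8000).serve_forever()` would start the redirector; the master node does no application-level work beyond producing the redirect response.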

[0131] FIG. 7 shows the flow of a connection request between a client terminal and the cluster system according to an embodiment of the present invention. In this embodiment, an in-vehicle client serves as the client terminal.

[0132] As shown in FIG. 7, the in-vehicle client sends a connection request to the master node, and the master node uses the connection distribution algorithm to obtain ordinary node C as the node with the best processing capacity. Using the information about ordinary node C returned by the master node, the in-vehicle client establishes a long-lived service connection to ordinary node C, and the in-vehicle client and ordinary node C then begin service processing.

[0133] In the present invention the master node distributes connections by HTTP redirection, which has the following benefits. The master node only has to process a single HTTP request; it neither serves the application nor establishes connections into the cluster, which greatly saves the master node's processing resources and reduces its burden. In addition, because the new address carried by the HTTP redirect is the address of an internal node obtained by calculation, it is unpredictable. Its function is comparable to that of a verification code used in network access: it can, to a certain extent, block invalid access initiated by automated clients, and therefore provides some protection against DDoS network attacks and improves the overall security of the system.

[0134] In the connection distribution algorithm, selecting the least-loaded node (that is, the node with the best processing capacity) is the most important step. It determines not only whether a connection request is handled quickly but also the overall stability of the cluster system. The least-loaded node selection algorithm is described below.

[0135] Because a nodelist is created on the master node and the load figures of every node are updated in it in real time, the data collected for each node is represented as a 4-tuple (CPUUsage, MemUsage, NetworkUsage, SocketUsage), whose elements denote the node's consumption of CPU, memory, network bandwidth, and socket connections respectively. Connection requests are classified by type as CPU-intensive, memory-intensive, or bandwidth-intensive. Using the specifically weighted load calculation algorithm, each node's capacity to handle the different types of connection request is computed separately, as follows:

[0136] r_i = f(cu, mu, nu, su)

[0137] Here r_i denotes the remaining resources of a node. The more CPU resource remains, the more work the node can take on, and the more connections can be distributed to it. cu, mu, nu, and su denote the CPU utilization, memory utilization, network bandwidth occupancy, and socket connection occupancy reported by the node; the larger a value, the more of that resource is in use and the heavier the node's load. The final remaining-resource percentage is computed from the received data by the following formula:

[0138]

Figure CN104283948AD00151

[0139] Here r_i(k) denotes the remaining-resource percentage at time k; cu(k) denotes the CPU utilization at time k; cu(k-1) denotes the CPU utilization at time (k-1). The CPU influence factor considers both times and uses a discrete derivative to estimate the CPU utilization curve.

[0140] In the term (1 - tu(k) - t_st): when t = s, tu(k) denotes the socket connection occupancy at time k, t_st denotes the reserved amount of socket resource, and (1 - tu(k) - t_st) denotes the node's remaining available socket resource at time k; when t = m, tu(k) denotes the memory utilization at time k, t_st denotes the reserved amount of memory, and (1 - tu(k) - t_st) denotes the node's remaining available memory at time k; when t = n, tu(k) denotes the network bandwidth occupancy at time k, t_st denotes the reserved amount of network bandwidth, and (1 - tu(k) - t_st) denotes the node's remaining available network bandwidth at time k.

[0141] It should be noted that the reserved amounts for socket resources and network bandwidth are generally 0, meaning that utilization of these resources may reach 100%, whereas the reserved amount for memory cannot be 0, meaning that memory utilization may not reach 100% and some space must be kept free.

[0142] That is, the remaining-resource percentage at time k, r_i(k), is affected by the four influence factors cu, mu, nu, and su, each in a different way. If any one influence factor reaches its limit, for example if the remaining CPU is 0, the computed final value becomes 0 and the node receives no further allocation, which protects the node's stability and ensures that access requests are answered correctly. Among the four influence factors, the form of influence differs according to the characteristics of each resource type.

[0143] Taking socket connections as an example, 1 - s_u(k) - s_st is the impact factor for socket connections, where s_u(k) is the socket connection occupancy at time k and s_st is the reserved amount of socket resources. In general, the reserved amounts for socket and network resources are both 0, so their usage may reach 100%, whereas the reserved amount for memory cannot be 0, so memory usage cannot reach 100%. This keeps the node running normally: some memory must always be held back so that the node does not crash.

[0144] The CPU factor is a special case. At the system level, CPU is a very valuable resource whose usage also changes from moment to moment. CPU capacity should not be wasted by simply setting aside a fixed reserve, yet the trend of the CPU usage curve must be taken into account. The computation must also remain cheap, so that the master node saves computing resources across a large number of periodic calculations.

[0145] For these reasons, a discrete derivative is used to estimate the CPU usage curve, and that estimate serves as the CPU reserve. When the CPU curve is rising, its first derivative c_u(k) - c_u(k-1) is greater than 0 and is used as the CPU reserve, so the impact factor is 1 - c_u(k) - (c_u(k) - c_u(k-1)) = 1 - 2c_u(k) + c_u(k-1). Conversely, when the CPU usage curve is falling, the first derivative is negative; more CPU is about to become available, so no usage reserve is set aside and the factor is 1 - c_u(k). Estimating CPU usage by predicting the curve's trajectory in this way meets the need to estimate CPU usage while making the fullest use of the CPU and improving overall system performance, and it has given good results.
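
The remaining-resource calculation described in paragraphs [0140] to [0145] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name, the dictionary keys, the default memory reserve of 0.1 and the flooring of each factor at 0 are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical reserves t_st: only memory keeps headroom, as the text suggests.
DEFAULT_RESERVE = {"s": 0.0, "m": 0.1, "n": 0.0}


def remaining_resources(usage: Dict[str, float],
                        cpu_now: float,
                        cpu_prev: float,
                        reserve: Dict[str, float] = DEFAULT_RESERVE) -> float:
    """Return r_i(k), the remaining-resource fraction of one node at time k.

    usage    -- t_u(k) for t in {"s", "m", "n"} (socket, memory, network), each in [0, 1]
    cpu_now  -- c_u(k), CPU usage at time k
    cpu_prev -- c_u(k-1), CPU usage at time k-1
    """
    product = 1.0
    for t in ("s", "m", "n"):
        # Factor 1 - t_u(k) - t_st; flooring at 0 is an assumption of this sketch,
        # so that an exhausted resource drives the whole result to 0.
        product *= max(0.0, 1.0 - usage[t] - reserve[t])

    if cpu_now > cpu_prev:
        # Rising CPU: reserve the discrete derivative c_u(k) - c_u(k-1).
        cpu_factor = 1.0 - 2.0 * cpu_now + cpu_prev
    else:
        # Flat or falling CPU: no extra reserve is set aside.
        cpu_factor = 1.0 - cpu_now
    return product * max(0.0, cpu_factor)
```

For instance, with socket, memory and network usage of 0.5, 0.4 and 0.2 and CPU usage rising from 0.2 to 0.3, the sketch yields about 0.12; if CPU usage instead fell from 0.4 to 0.3, it yields about 0.14.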

[0146] To further illustrate how load balancing is achieved among the nodes of the server cluster system, Fig. 8 shows the workflow of the load balancing implementation method according to an embodiment of the present invention. In this embodiment, an in-vehicle client serves as the client terminal.

[0147] As shown in Fig. 8, the load balancing implementation method of the server cluster system in this embodiment comprises the following steps:

[0148] S801: Start;

[0149] S802: Has the master node failed? If so, go to step S803; if not, go to step S808;

[0150] S803: Master/standby switchover: the standby node becomes the new master node;

[0151] S804: A new standby node is enabled;

[0152] S805: The standby node reports its information;

[0153] S806: The standby node synchronizes information with the master node;

[0154] S807: The standby node screens the common nodes; once the switchover is complete, the new master node takes up the master's duties; then go to step S808;

[0155] S808: The master node periodically receives the load information reported by the other nodes;

[0156] S809: The master node calculates the load value of each node from the reported load information;

[0157] S810: Does the nodelist contain any offline node? If so, go to step S811; if not, go to step S815;

[0158] S811: Is the offline node the standby node? If so, go to step S812; if not, go to step S814;

[0159] S812: Remove this node from the nodelist;

[0160] S813: Select a new standby node and synchronize information, then go to step S815;

[0161] S814: Remove this node from the nodelist, then go to step S815;

[0162] S815: When an in-vehicle client requests a connection, select the node with the best processing capacity as the node to which the connection is distributed;

[0163] S816: The in-vehicle client and the node selected by the master node begin the business process;

[0164] S817: End.
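
Steps S810 to S815 can be sketched as a single routine run by the master node. This is an illustration under stated assumptions: the `NodeInfo` type and its fields are hypothetical, the new standby in S813 is taken to be the node with the most remaining resources (in line with the "smallest load" rule in the claims), and every node left in the nodelist, including the standby, is treated as eligible to receive connections.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeInfo:
    name: str
    remaining: float          # r_i(k) computed by the master in S809
    online: bool = True
    is_standby: bool = False


def prune_and_select(nodelist: Dict[str, NodeInfo]) -> Optional[str]:
    """Apply S810-S815 to nodelist (keyed by node name) and return the chosen node."""
    # S810: collect offline nodes before mutating the dictionary.
    offline = [node for node in nodelist.values() if not node.online]
    # S811: remember whether the standby node is among them.
    standby_lost = any(node.is_standby for node in offline)
    # S812 / S814: remove every offline node from the nodelist.
    for node in offline:
        del nodelist[node.name]

    # S813: pick a replacement standby (assumed criterion: most remaining resources).
    if standby_lost and nodelist:
        max(nodelist.values(), key=lambda n: n.remaining).is_standby = True

    if not nodelist:
        return None
    # S815: the node with the best processing capacity receives the connection.
    return max(nodelist.values(), key=lambda n: n.remaining).name
```

Information synchronization with the new standby (the second half of S813) is left out, since the text does not specify how it is carried out.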

[0165] The above is the flow of one specific embodiment of the load balancing implementation method of the server cluster system. In summary, in view of the highly concurrent, long-lived connections typical of intelligent terminals, the present invention proposes a server-side load balancing implementation method, a server cluster system and a connection distribution algorithm for long-lived, highly concurrent connections. The server cluster system uses a mesh network topology, and its nodes stand in a hierarchical, progressive relationship. On the one hand, this topology improves service stability; on the other hand, it makes it easier to elect the best-performing node in the whole network as the standby node, which becomes the master node if the master fails, thereby ensuring a globally optimal election.

[0166] In addition, the cluster networking mode uses a master/standby backup mechanism that allows seamless switchover between master and standby: the standby node can take over the master node's functions, which masks the single point of failure and improves the availability of the services the cluster provides.
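
The master/standby switchover (steps S803, S804 and S807) might be modelled as below. This is a simplified sketch: the `Cluster` class and its fields are hypothetical, the load values are assumed to have been synchronized to the standby before the failure, and "smallest load" is used as the screening rule as stated in the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Cluster:
    master: str
    standby: Optional[str]
    # Load value of each common node; a lower value means a less loaded node.
    common_load: Dict[str, float] = field(default_factory=dict)

    def fail_over(self) -> None:
        """Promote the standby and enable the least-loaded common node as new standby."""
        if self.standby is None:
            raise RuntimeError("no standby node to promote")
        # S803: the standby node becomes the new master node.
        self.master = self.standby
        if self.common_load:
            # S807: screen the common nodes for the one with the smallest load.
            candidate = min(self.common_load, key=self.common_load.get)
            del self.common_load[candidate]
            # S804: the selected common node is enabled as the new standby.
            self.standby = candidate
        else:
            self.standby = None
```

After `fail_over()` the cluster again has one master and, if any common node was available, one standby, so a second failure can be handled the same way.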

[0167] Furthermore, the proposed connection distribution algorithm selects the node with the best processing capacity to receive service requests, making maximum use of each node's processing capacity, while the session management mechanism further reduces the delay in distributing connections and ensures the reliability and timeliness of the service. The load balancing implementation method proposed by the present invention can therefore be applied widely and can play an important role in similar projects.
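
One way to picture connection distribution combined with session management is the sketch below. The text only states that session history is kept, that long-inactive records are deleted, and that distribution uses HTTP redirection. The idea that a returning client is redirected to the node it used before, the 302 status code, the URL form and all names here are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class SessionTable:
    """Remember which node served each client and drop long-idle records."""

    def __init__(self, idle_limit: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_limit = idle_limit
        self.clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # client -> (node, last seen)

    def lookup(self, client: str) -> Optional[str]:
        entry = self._entries.get(client)
        if entry is None:
            return None
        node, last_seen = entry
        if self.clock() - last_seen > self.idle_limit:
            # The session has been inactive for too long: delete its record.
            del self._entries[client]
            return None
        return node

    def record(self, client: str, node: str) -> None:
        self._entries[client] = (node, self.clock())


def redirect(client: str, remaining: Dict[str, float],
             sessions: SessionTable) -> Tuple[int, str]:
    """Return an HTTP status code and Location value for a connection request."""
    node = sessions.lookup(client)
    if node is None or node not in remaining:
        # No usable session: choose the node with the most remaining resources.
        node = max(remaining, key=remaining.get)
    sessions.record(client, node)
    return 302, f"http://{node}/"
```

Reusing a recent session avoids recomputing the selection for a client that reconnects shortly after a drop, which is one plausible reading of how the session mechanism lowers distribution delay.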

[0168] The server cluster system and its load balancing implementation method according to the present invention have been described above by way of example with reference to the accompanying drawings. Those skilled in the art will understand, however, that various improvements may be made to the server cluster system and load balancing implementation method proposed above without departing from the substance of the present invention. The scope of protection of the present invention should therefore be determined by the appended claims.

Claims (12)

  1. A server cluster system, comprising a master node, a standby node and common nodes, wherein: the master node comprises an information collection module, a load calculation module and a connection distribution decision module; the information collection module is configured to store the load information reported by the standby node and by each common node connected to the master node; the load calculation module is configured to obtain, from the reported load information, the load value of the node to which that load information corresponds; the connection distribution decision module is configured to distribute connections by HTTP redirection, selecting, according to the node load values obtained by the load calculation module, the node with the best processing capacity as the node to which a connection is distributed; the standby node is configured to report its load information to the master node, to synchronize with the information of the master node, and to screen the common nodes, the selected common node serving as the new standby node; and each common node is configured to report its load information to the master node and, when it becomes the standby node, to switch to being the standby node.
  2. The server cluster system of claim 1, wherein the master node further comprises a session management module configured to manage the session history between client terminals and the cluster system and to delete session history records between a client terminal and the cluster system that have long been inactive.
  3. The server cluster system of claim 1, wherein a nodelist is created in the information collection module, the nodelist being used to store the load information reported by each node connected to the master node; and wherein the nodelist includes an update mechanism for deleting offline nodes.
  4. The server cluster system of claim 3, wherein the load information comprises the CPU utilization, memory utilization, network bandwidth occupancy and socket connection occupancy of each node connected to the master node; and wherein the load calculation module uses a specifically weighted load calculation algorithm to obtain, for each node, its load capacity for CPU-intensive, memory-intensive, network-bandwidth-intensive and socket-intensive requests, and updates the nodelist with the results.
  5. The server cluster system of claim 1, wherein the standby node comprises a standby node information reporting module, an information synchronization module and a common node screening module; the standby node information reporting module is configured to report load information to the master node periodically; the information synchronization module is configured to periodically synchronize and store the load information held in the information collection module; and the common node screening module is configured to select, when the standby node becomes the master node, the common node with the smallest load as its alternate.
  6. The server cluster system of claim 1, wherein each common node comprises a common node information reporting module and a standby node switching module; the common node information reporting module is configured to report load information to the master node periodically; and the standby node switching module is configured so that, after the standby node becomes the new master node and notifies the new standby node it has selected, the common node switches to being the standby node accordingly and takes on the standby node's duties.
  7. A load balancing implementation method based on a server cluster system, the server cluster system comprising a master node, a standby node and common nodes, wherein: the master node periodically receives the load information reported by the standby node and by each common node connected to it and, from the reported load information, obtains the load value of each node to which that load information corresponds; when a client terminal requests a connection, the master node distributes the connection by HTTP redirection, selecting, according to the obtained load values, the node with the best processing capacity as the node to which the connection is distributed; when the master node fails, the standby node performs a master/standby switchover, becomes the new master node and performs the master node's duties, and a new standby node is enabled; and the new standby node reports its load information to the new master node, synchronizes with the information of the new master node, and screens the common nodes to select an alternate for the new standby node.
  8. The load balancing implementation method of claim 7, wherein a session management mechanism is provided in the master node, the session management mechanism holding the session history between client terminals and the cluster system and deleting session history records between a client terminal and the master node that have long been inactive, so that the node selected for connection distribution is the best choice.
  9. The load balancing implementation method of claim 7, wherein a nodelist is created in the master node and the load information of each node is stored in the nodelist; before a client terminal requests a connection, it is determined whether the nodelist contains an offline node; if an offline node exists, that node is deleted from the nodelist; if the offline node is the standby node, a new standby node is selected from the common nodes and synchronizes information with the master node; and if the nodelist contains no offline node, connections are distributed according to the connection requests of the client terminals.
  10. The load balancing implementation method of claim 7, wherein, when screening the common nodes, the new standby node selects the common node with the smallest load as its alternate.
  11. The load balancing implementation method of claim 7, wherein the load information comprises the CPU utilization, memory utilization, network bandwidth occupancy and socket connection occupancy of each node connected to the master node; a specifically weighted load calculation algorithm is used to calculate, for each node, its load capacity for CPU-intensive, memory-intensive, network-bandwidth-intensive and socket-intensive requests, according to the formula r_i = f(c_u, m_u, n_u, s_u), where r_i denotes the remaining resources of the node, and c_u, m_u, n_u and s_u denote the CPU utilization, memory utilization, network bandwidth occupancy and socket connection occupancy reported by the node.
  12. The load balancing implementation method of claim 11, wherein the percentage of remaining resources is calculated from the load information reported by each node according to the following formula:
\[
r_i(k) =
\begin{cases}
\Bigl[\prod_{t \in \{s,\,m,\,n\}} \bigl(1 - t_u(k) - t_{st}\bigr)\Bigr] \cdot \bigl(1 - 2c_u(k) + c_u(k-1)\bigr), & \text{if } c_u(k) > c_u(k-1) \\
\Bigl[\prod_{t \in \{s,\,m,\,n\}} \bigl(1 - t_u(k) - t_{st}\bigr)\Bigr] \cdot \bigl(1 - c_u(k)\bigr), & \text{otherwise}
\end{cases}
\]
where r_i(k) denotes the percentage of remaining resources at time k; c_u(k) denotes the CPU usage at time k; c_u(k-1) denotes the CPU usage at time k-1; in (1 - t_u(k) - t_st), when t = s, t_u(k) denotes the socket connection usage at time k, t_st denotes the reserved amount of socket resources, and (1 - t_u(k) - t_st) denotes the node's remaining available socket resources; when t = m, t_u(k) denotes the memory usage at time k, t_st denotes the reserved amount of memory, and (1 - t_u(k) - t_st) denotes the node's remaining available memory; when t = n, t_u(k) denotes the network bandwidth occupancy at time k, t_st denotes the reserved amount of network bandwidth, and (1 - t_u(k) - t_st) denotes the node's remaining available network bandwidth.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201410512754 CN104283948A (en) 2014-09-26 2014-09-26 Server cluster system and load balancing implementation method thereof

Publications (1)

Publication Number Publication Date
CN104283948A (en) 2015-01-14

Family

ID=52258421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201410512754 CN104283948A (en) 2014-09-26 2014-09-26 Server cluster system and load balancing implementation method thereof

Country Status (1)

Country Link
CN (1) CN104283948A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1549978A (en) * 2001-07-16 2004-11-24 Bea系统公司 Method and apparatus for session replication and failover
CN1925444A (en) * 2006-09-14 2007-03-07 华为技术有限公司 Method for establishing point-to-point collection in P2P network and nodes in P2P network
US20090177772A1 (en) * 2006-09-14 2009-07-09 Huawei Technologies Co., Ltd. Method, system and device for establishing a peer to peer connection in a p2p network
CN101521679A (en) * 2009-04-03 2009-09-02 南京邮电大学 Self-organizing method based on composite structured peer-to-peer network
CN101605092A (en) * 2009-07-10 2009-12-16 浪潮电子信息产业股份有限公司 Load balancing system based on content
CN102387218A (en) * 2011-11-24 2012-03-21 浪潮电子信息产业股份有限公司 Multimachine hot standby load balance system for computer
CN103117876A (en) * 2013-01-24 2013-05-22 中兴通讯股份有限公司 User state information synchronizing method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104768196A (en) * 2015-04-21 2015-07-08 四川西结微波科技发展有限责任公司 Cluster group seamless switching method
CN105007233A (en) * 2015-07-13 2015-10-28 互联网域名系统北京市工程研究中心有限公司 Method for distributing address based on DHCP (dynamic host configuration protocol) server cluster load
CN105007233B (en) * 2015-07-13 2018-02-27 互联网域名系统北京市工程研究中心有限公司 Method for distributing address based on DHCP (dynamic host configuration protocol) server cluster load
CN105338078A (en) * 2015-10-26 2016-02-17 北京百度网讯科技有限公司 Data storage method and device used for storing system
CN105516245A (en) * 2015-11-25 2016-04-20 国家计算机网络与信息安全管理中心 Flow-based load balancing system and realization method thereof
CN105577759A (en) * 2015-12-15 2016-05-11 东软熙康健康科技有限公司 Server node allocation method and device
CN106027649A (en) * 2016-05-20 2016-10-12 深圳市永兴元科技有限公司 Data collection method of distributed data system and the distributed data system
CN106060123A (en) * 2016-05-20 2016-10-26 深圳市永兴元科技有限公司 Distributed data system data acquisition method and distributed data system

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination