CN111277629A

CN111277629A - High-availability-based web high-concurrency system and method

Info

Publication number: CN111277629A
Application number: CN202010030614.XA
Authority: CN
Inventors: 孟利民; 王斌; 应颂翔; 蒋维; 林梦嫚
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2020-01-13
Filing date: 2020-01-13
Publication date: 2020-06-12

Abstract

A high-availability-based web high-concurrency system comprises a load balancer cluster adopting a high-availability scheme, a web server cluster with dynamic/static separation, and a database server supporting session sharing. The load balancer cluster comprises a main load balancer and a plurality of additional high-availability standby load balancers; the web server cluster is implemented with Nginx combined with Tomcat; and the database server comprises the disk-based database MySQL and the in-memory database Memcached, the latter being used for session sharing. A high-availability-based web high-concurrency method is also provided. The invention improves the processing capacity of the web system under highly concurrent load, schedules server resources appropriately for different types of work requests, shortens the average response time seen by clients, and improves the overall throughput of the servers.

Description

High-availability-based web high-concurrency system and method

Technical Field

The invention relates to World Wide Web applications, currently the most widely used application on the Internet, and in particular to a system and a method for handling high concurrency on web servers.

Background

With the improvement of Internet infrastructure and the growing number of mobile terminals, Internet services are becoming richer and the number of users keeps increasing. On the one hand, social, media, shopping and other websites keep growing; e-commerce sites and well-known search engines, for example, must already serve very large daily and peak volumes of concurrent requests. On the other hand, during promotions, holiday ticket sales and similar events, bursts of peak traffic place even higher demands on the concurrency performance of a web system.

On the peak ticket-buying days around the Spring Festival each year, the 12306 platform, the official train ticketing platform, frequently suffers from server stalls and crashes, which makes for a poor user experience. This shows that, as user numbers and data volumes grow, conventional web system architectures can no longer meet current needs, which poses new challenges for the ability of web systems to handle high concurrency.

Nginx is commonly used as a high-performance load balancer. Because all requests reach Nginx first, the load balancer occupies a critical position: if the Nginx server goes down, the back-end web servers can no longer provide service, with serious consequences. Under peak loads such as ticket sales on the 12306 platform, one or a few standalone load balancers are not enough, so a load balancer cluster must be built. The cluster must adopt a high-availability scheme in which the cluster servers communicate with one another: once the main load balancer fails, a standby load balancer takes over as the main load balancer, so that one load balancer is always working normally and the service remains continuous. The load balancer cluster is thereby highly available.

Disclosure of Invention

To address these problems, a high-concurrency web system architecture is designed and implemented around a load balancer cluster with a high-availability module, covering high availability, a dynamic load balancing algorithm, session sharing and related aspects. The architecture supports high concurrency in the web system and a more balanced utilization of system resources. It is of practical value for applied research on high availability and for the design of high-concurrency system architectures.

The purpose of the invention can be realized by the following technical scheme:

A high-availability-based web high-concurrency system comprises a load balancer cluster adopting a high-availability scheme, a web server cluster with dynamic/static separation, and a database server supporting session sharing.

The load balancer cluster comprises a main load balancer and a plurality of additional high-availability standby load balancers. The main and standby load balancers together form a high-availability cluster, which prevents the load balancing module from becoming a single point of failure.

The high-availability scheme is implemented with Keepalived, which is relatively simple to configure, easy to deploy, and convenient to manage and test.

Keepalived is an implementation of the Virtual Router Redundancy Protocol (VRRP), which addresses the single point of failure inherent in statically configured routing. While Keepalived is running, the main node sends advertisement packets and the standby nodes receive them. When a standby node stops receiving packets from the main node, it starts a takeover procedure and takes over the main node's resources. There may be several standby nodes; which one becomes the new main node is decided by priority. Keepalived high-availability peers communicate via VRRP: in operation the main server holds all resources and continuously broadcasts VRRP packets to inform the standby servers of its health; when the main server becomes unavailable, a standby server starts the relevant services and takes over the resources, ensuring continuity of service.

While the main load balancer (master) is working normally, the standby load balancers (backups) remain idle and continuously monitor the master's health. If a backup detects an abnormality, that is, it no longer receives VRRP packets and therefore considers the master to be down, all resources of the main load balancer must be taken over immediately; one backup is selected as the new master according to VRRP priority. So that it can take over service effectively, a standby load balancer is generally configured identically to the main load balancer.

Keepalived has three main modules: core, check and vrrp. The core module is the core of Keepalived and is responsible for starting and maintaining the main process and for loading and parsing the global configuration file. The check module is responsible for health checking and supports the common checking methods. The vrrp module implements the VRRP protocol.
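The failover behaviour described above can be illustrated with a minimal Python sketch. It is not Keepalived's implementation: the `Node` class, the node names, the one-second advertisement interval and the "three missed advertisements" rule are all assumptions made for illustration. The sketch only shows the two decisions a backup makes, namely whether the master should be treated as down and which live node has the highest priority.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int        # higher priority wins the election, as in VRRP
    alive: bool = True


def master_is_down(last_advert_time, now, advert_interval=1.0, missed_allowed=3):
    """A backup treats the master as down once several adverts in a row are missed.

    The interval and the allowed number of misses are illustrative values.
    """
    return (now - last_advert_time) > advert_interval * missed_allowed


def elect_master(nodes):
    """Return the live node with the highest priority, or None if all are down."""
    live = [n for n in nodes if n.alive]
    return max(live, key=lambda n: n.priority) if live else None


# Hypothetical cluster: one master and two backups.
cluster = [Node("lb-master", 100), Node("lb-backup-1", 90), Node("lb-backup-2", 80)]
cluster[0].alive = False                      # the master stops advertising
if master_is_down(last_advert_time=0.0, now=3.5):
    new_master = elect_master(cluster)        # lb-backup-1 takes over
```

Because the election depends only on priority among live nodes, adding further standby load balancers requires no change to the logic.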

The load balancing cluster is the front-most part of the system. The load balancing module is implemented with Nginx acting as a reverse proxy and forwards client requests using a dynamic load balancing algorithm, so that the back-end servers can cooperate more effectively in processing user requests.

The dynamic load balancing module calculates the dynamic weights of the web server cluster in real time by weighted summation, based on the specific workload type and the performance monitoring data of the web cluster, so that traffic is forwarded sensibly and processed efficiently.

The performance index data of the web cluster is collected by monitoring, mainly covering the CPU, memory, IO and network of each web server. The specific performance indices are defined as follows:

(1) cpu_i represents the available CPU performance of the web server, where cpu_limit is the maximum number of cores available to the web server, cpu_freq is the CPU clock frequency, and cpu_utilization is the real-time CPU utilization. The formula is defined as follows:

(2) mem_i represents the available memory performance of the web server, where mem_limit is the maximum memory available to the web server, mem_freq is the memory frequency, and mem_utilization is the real-time memory utilization. The formula is defined as follows:

(3) fs_i represents the available hard disk performance of the web server, where fs_read_speed is the hard disk read speed, fs_write_speed is the hard disk write speed, and fs_utilization is the real-time hard disk utilization. The formula is defined as follows:

(4) net_i represents the network send/receive rate of the web server and reflects the current network communication state of the server, where net_receive_speed is the rate at which the server receives network traffic and net_transmit_speed is the rate at which it sends network traffic:

net_i = net_receive_speed + net_transmit_speed

Assuming that the web server cluster contains N servers, the performance index vector of the i-th web server is defined as follows:

P_i = [cpu_i, mem_i, fs_i, net_i]

At the start of each period a load value is calculated. If any one of the four hardware occupancy rates has reached its threshold, P_i is set to 0; otherwise the four performance index values are calculated in turn. If all three occupancy rates of a server are 0, the server is down and P_i is 0; no requests reach this server until it has recovered.
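A minimal Python sketch of this per-period check follows. It assumes the rule is read as "any occupancy rate at its threshold, or CPU, memory and disk occupancy all at zero, sets P_i to 0". The function name, the dictionary keys and the threshold values are hypothetical, and the index values cpu_i, mem_i, fs_i and net_i are taken as already computed, since their formulas are not reproduced in this text.

```python
def performance_vector(indices, occupancy, thresholds):
    """Return P_i = [cpu_i, mem_i, fs_i, net_i], or all zeros if the server is unusable.

    indices:    already-computed cpu_i, mem_i, fs_i, net_i values (dict)
    occupancy:  real-time occupancy rates in [0, 1], keyed by resource (dict)
    thresholds: occupancy limit per resource (dict, hypothetical values)
    """
    zero = [0.0, 0.0, 0.0, 0.0]
    # Rule 1: any occupancy rate at or above its threshold disables the server.
    if any(occupancy[name] >= limit for name, limit in thresholds.items()):
        return zero
    # Rule 2: CPU, memory and disk occupancy all at 0 is read as "server is down".
    if all(occupancy[name] == 0 for name in ("cpu", "mem", "fs")):
        return zero
    return [indices["cpu"], indices["mem"], indices["fs"], indices["net"]]


# Illustrative values only.
indices = {"cpu": 6.0, "mem": 12.0, "fs": 150.0, "net": 80.0}
limits = {"cpu": 0.9, "mem": 0.9, "fs": 0.9, "net": 0.9}
healthy = performance_vector(
    indices, {"cpu": 0.3, "mem": 0.4, "fs": 0.2, "net": 0.1}, limits
)
```

A zero vector yields a zero dynamic weight downstream, which is how the server is kept out of rotation until it recovers.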

To eliminate the effect of differing units on the weighted summation, the CPU, memory, hard disk and network index data are normalized. The normalized data are:

P_i' = [cpu_i', mem_i', fs_i', net_i']

cpu_i' denotes the normalized real-time CPU performance index of the web server; the larger its value, the better the current CPU performance of the server.

mem_i' denotes the normalized real-time memory performance index of the web server; the larger its value, the better the current memory performance of the server.

fs_i' denotes the normalized real-time hard disk performance index of the web server; the larger its value, the better the current hard disk performance of the server.

net_i' denotes the normalized real-time network performance index of the web server; the larger its value, the better the current network communication performance of the server.
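The text does not state which normalization method is used. As one possible reading, the sketch below scales each index by the largest value observed across the cluster, so that every normalized value lies in [0, 1], larger still means better, and a server whose P_i is 0 keeps a value of 0. The function name and the sample figures are hypothetical.

```python
def normalize_columns(P):
    """Scale each column of P by its maximum across the cluster.

    P is a list of [cpu_i, mem_i, fs_i, net_i] rows, one row per web server.
    Max-scaling is an assumption; the source only says the data are normalized.
    """
    maxima = [max(column) for column in zip(*P)]
    return [
        [value / top if top > 0 else 0.0 for value, top in zip(row, maxima)]
        for row in P
    ]


# Two hypothetical servers with raw index values in different units.
P = [
    [8.0, 16.0, 200.0, 50.0],
    [4.0, 8.0, 100.0, 100.0],
]
P_norm = normalize_columns(P)
```

After scaling, the four indices are dimensionless and can be combined in a single weighted sum without one unit dominating the others.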

The weights of the indices are set according to the corresponding performance monitoring data:

K = [K_c, K_m, K_f, K_n], where K_c + K_m + K_f + K_n = 1

The weights of the performance indices can be set according to the characteristics of the workload, so that the load weights of the web server cluster are adjusted dynamically, traffic is forwarded to the most suitable web server, and the concurrent processing performance of the system is improved. Depending on the service scenario, the workload of a web system can be divided into the following three types:

(1) CPU-intensive: a large amount of computation consumes CPU resources, for example high-definition decoding of video resources or data analysis and computation on social networking sites. In this case the weights of CPU and memory should be increased appropriately, so that most requests are forwarded to servers with ample CPU and memory.

(2) IO-intensive: a large number of IO operations are required, such as network data transmission and frequent database operations. In this case the weights of network and IO should be increased appropriately, so that most requests are forwarded to servers with ample IO resources.

(3) Response-time type: most online websites care about user experience and require the system to respond quickly. In this case all factors can be taken into account, and requests are forwarded as far as possible to the servers with the best average performance, so that the response time of the system is kept low.
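The mapping from workload type to weight vector can be sketched as a lookup table. The numeric weights below are purely illustrative assumptions (the text gives no values); they only respect the stated constraints that CPU-intensive work favours CPU and memory, IO-intensive work favours disk and network, response-time work treats all factors equally, and each vector sums to 1.

```python
# Hypothetical presets for K = [K_c, K_m, K_f, K_n]; the values are not from the source.
WEIGHT_PRESETS = {
    "cpu_intensive": {"cpu": 0.40, "mem": 0.30, "fs": 0.15, "net": 0.15},
    "io_intensive": {"cpu": 0.15, "mem": 0.15, "fs": 0.35, "net": 0.35},
    "response_time": {"cpu": 0.25, "mem": 0.25, "fs": 0.25, "net": 0.25},
}


def weights_for(workload):
    """Return K as a list ordered [K_c, K_m, K_f, K_n] for the given workload type."""
    preset = WEIGHT_PRESETS[workload]
    K = [preset["cpu"], preset["mem"], preset["fs"], preset["net"]]
    # Enforce the constraint K_c + K_m + K_f + K_n = 1 (within float tolerance).
    if abs(sum(K) - 1.0) > 1e-9:
        raise ValueError(f"weights for {workload!r} do not sum to 1")
    return K


K_cpu = weights_for("cpu_intensive")
```

In practice the presets would be tuned per deployment; the table form simply makes it easy to switch K when the detected workload type changes.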

The real-time dynamic weight W_i of the i-th web server is computed by weighted summation as follows:

W_i = P_i' × K^T

The overall dynamic weight W of the web server cluster is:

W = P × K^T = [P_1', P_2', ..., P_N'] × K^T
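As a worked illustration, the weighted sum W_i = P_i' × K^T can be computed per server with a plain dot product. The normalized index values and the equal weights used below are made-up inputs, not measurements from the source.

```python
def dynamic_weights(P_norm, K):
    """Compute W_i = P_i' x K^T for every server.

    P_norm: list of normalized rows [cpu_i', mem_i', fs_i', net_i']
    K:      weight vector [K_c, K_m, K_f, K_n] summing to 1
    """
    return [sum(p * k for p, k in zip(row, K)) for row in P_norm]


# Hypothetical normalized indices for two servers and equal weights.
P_norm = [
    [1.0, 1.0, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
]
K = [0.25, 0.25, 0.25, 0.25]
W = dynamic_weights(P_norm, K)   # [0.875, 0.625]
```

Here the first server receives the larger weight because it leads on three of the four indices; with a network-heavy K the ordering could reverse.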

The dynamic load balancing algorithm assigns the index weights according to the workload characteristics and periodically recalculates the dynamic weights of the web server cluster from the server index data collected in real time. The load balancer then loads these dynamic weights, which realizes dynamic load balancing of the cluster.

The dynamic load balancing uses Consul for service registration and discovery and uses Upsync so that the Nginx server dynamically obtains the latest upstream list. Each time the Nginx configuration information is modified, Nginx reads it in real time, which eliminates the redundant step of restarting Nginx after its configuration is modified.

Consul uses the Raft algorithm to keep cluster data consistent. Upsync is a third-party Nginx module for dynamic configuration, open-sourced by Sina Weibo. The function of the nginx-upsync-module is to pull the list of back-end servers from Consul and dynamically update the routing information of Nginx. The module does not depend on any other third-party module. Consul serves as the database for Nginx: the KV service of Consul is used, and each Nginx worker process independently pulls the configuration of each upstream and updates its own routes.
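A hedged sketch of how a computed weight might be published for Upsync to pick up is given below. It builds, but deliberately does not send, an HTTP PUT to the Consul KV endpoint using the key layout and JSON fields (`weight`, `max_fails`, `fail_timeout`) documented for nginx-upsync-module. The Consul address, the upstream name `tomcat_pool`, the `upstreams/` key prefix, the `max_fails` and `fail_timeout` values and the scaling of a fractional weight to an integer are all assumptions for illustration.

```python
import json
import urllib.request


def to_nginx_weight(dynamic_weight, scale=100):
    """Map a dynamic weight in [0, 1] to a positive integer server weight.

    The scale factor and the floor of 1 are illustrative choices.
    """
    return max(1, round(dynamic_weight * scale))


def build_upsync_put(consul_addr, upstream, backend, weight):
    """Build (without sending) the Consul KV write for one back-end server.

    The key prefix must match the path configured in the Nginx upsync directive;
    "upstreams/" is assumed here.
    """
    url = f"http://{consul_addr}/v1/kv/upstreams/{upstream}/{backend}"
    body = json.dumps({"weight": weight, "max_fails": 2, "fail_timeout": 10})
    return urllib.request.Request(url, data=body.encode("utf-8"), method="PUT")


request = build_upsync_put(
    "127.0.0.1:8500", "tomcat_pool", "10.0.0.11:8080", to_nginx_weight(0.5)
)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```

Since each Nginx worker polls Consul on its own, updating the key is sufficient; no reload signal has to be sent to Nginx.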

The web server cluster is implemented with Nginx combined with Tomcat. Nginx handles concurrent requests from many users with an event-driven model, is particularly strong at serving static resources, and is both scalable and stable, so Nginx is used as the web server for static resources. Tomcat offers good scalability and security when handling dynamic resources, so Tomcat is used as the web server for dynamic resources. The web server cluster is built by configuring different ports.

The database server comprises a hard disk database Mysql and a Memcached memory database for realizing session sharing.

Session sharing keeps session data consistent across the nodes of the cluster; here, session sharing based on the Memcached cache is adopted. That is, a cacheDB is used to store the session information: when a back-end server receives a new request, it stores the session information in the cacheDB. When a back-end server fails, the scheduler traverses the nodes to find an available one and dispatches the request to it. When that application server finds that the session is not in its local memory, it looks the session up in the cacheDB and, if it is found, copies it to the local machine, thereby achieving session sharing and high availability.

Sessions are stored in Memcached so that the sessions of multiple Tomcat instances are managed centrally, while the front end uses Nginx for load balancing and for separating dynamic and static resources. This allows the system to scale horizontally while maintaining high performance. Specifically, the Tomcat session is serialized by the MSM tool and stored in Memcached, which realizes session sharing.

Because the web server cluster contains several Tomcat servers, the non-sticky session mode is used. When a request arrives, the session is loaded into Tomcat from the standby node Memcached2; when the request ends, Tomcat writes the session back to both the main node Memcached1 and the standby node Memcached2 and then clears its local copy, which keeps the main and standby nodes synchronized. The non-sticky mode must be selected when several Tomcat instances form a cluster, that is, sticky is set to "false".
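The non-sticky request cycle can be sketched in Python as follows. This is not the MSM tool: plain dictionaries stand in for the Memcached1 and Memcached2 nodes, `pickle` stands in for MSM's session serialization, and the class and function names are hypothetical. The fallback to the main node when the standby node has no entry is an added assumption.

```python
import pickle


class NonStickySessionStore:
    """Session access for one Tomcat instance; dicts stand in for the Memcached nodes."""

    def __init__(self, primary, backup):
        self.primary = primary   # stand-in for Memcached1 (main)
        self.backup = backup     # stand-in for Memcached2 (standby)

    def load(self, session_id):
        """Load the session at request start, reading the standby node first."""
        raw = self.backup.get(session_id)
        if raw is None:
            raw = self.primary.get(session_id)   # assumed fallback
        return pickle.loads(raw) if raw is not None else {}

    def save(self, session_id, session):
        """At request end, write the serialized session to both nodes."""
        raw = pickle.dumps(session)
        self.primary[session_id] = raw
        self.backup[session_id] = raw


def handle_request(store, session_id, changes):
    """One request: load, modify, write back; nothing is kept in the Tomcat itself."""
    session = store.load(session_id)
    session.update(changes)
    store.save(session_id, session)
    return dict(session)


memcached1, memcached2 = {}, {}
tomcat_a = NonStickySessionStore(memcached1, memcached2)
tomcat_b = NonStickySessionStore(memcached1, memcached2)
handle_request(tomcat_a, "sid-1", {"user": "alice"})
seen_by_b = handle_request(tomcat_b, "sid-1", {"cart": 2})
```

Because no Tomcat holds session state between requests, consecutive requests from the same user can land on different servers without losing data, which is what allows the load balancer to choose servers purely by dynamic weight.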

A web high-concurrency access processing method comprises the following steps:

S1, when the load balancer receives a client request, it classifies the requested resource by type. Static resources are handled by the static resource servers: the load balancer acts as a forward proxy and uses the url_hash algorithm, and a cache is added on the static resource servers, which greatly improves the response speed of the back-end servers;

S2, when the request is for a dynamic resource, the load balancer acts as a reverse proxy, dynamically adjusts the upstream list of the Nginx server through the dynamic load balancing algorithm, and hands the user request to a dynamic resource server for processing;

S3, the load balancer cluster uses Keepalived to ensure that one main load balancer is always working normally;

S4, the dynamic resource server cluster obtains database data by interacting with the disk-based database MySQL and shares session data through the in-memory database Memcached.
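Steps S1 and S2 amount to a routing decision, sketched below in Python. The list of static file extensions, the server names and the use of an MD5 digest for the URL hash are assumptions made for illustration; Nginx's own hashing differs, and the dynamic branch is represented only by a callback that would apply the dynamic weights.

```python
import hashlib
from urllib.parse import urlparse

# Hypothetical set of extensions treated as static resources.
STATIC_EXTENSIONS = (".html", ".css", ".js", ".png", ".jpg", ".gif", ".ico")


def is_static(url):
    """Classify a request by the file extension of its path."""
    return urlparse(url).path.lower().endswith(STATIC_EXTENSIONS)


def url_hash_pick(url, servers):
    """Map a URL to a server so that the same URL always hits the same server."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def route(url, static_servers, pick_dynamic):
    """S1: static requests go by URL hash; S2: dynamic requests go by dynamic weight."""
    if is_static(url):
        return url_hash_pick(url, static_servers)
    return pick_dynamic()


static_pool = ["static-1", "static-2", "static-3"]
target = route("http://example.com/login", static_pool, lambda: "tomcat-1")
```

Pinning each URL to one static server is what makes the per-server cache effective, since a given resource is cached once rather than on every node.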

The invention has the beneficial effects that:

(1) The load balancer cluster adopts a high-availability strategy: when a load balancer suffers a single point of failure, a standby machine with the same function always takes over its resources, which ensures continuity of service.

(2) The new dynamic load balancing algorithm assigns the index weights according to the workload characteristics and dynamically calculates the dynamic weights of the cluster from the real-time performance data collected from the web servers. The load balancer then loads these dynamic weights, so that the resources of the web server cluster are allocated sensibly.

(3) Nginx does not need to be restarted each time its configuration information is modified; it reads the configuration information in real time.

(4) In the Tomcat cluster environment, sessions are synchronized across the application servers and thus kept consistent, which removes the risk introduced by the ip_hash algorithm, which always sends the requests of one user to the same fixed back-end server.

Drawings

FIG. 1 is a system framework diagram of the load balancer cluster, static resource server cluster, dynamic resource server cluster, and database server cluster of the present invention.

FIG. 2 is a flow chart of the high-availability scheme for the load balancer cluster.

FIG. 3 is a flow chart of the dynamic load balancing algorithm based on workload characteristics.

FIG. 4 is a schematic diagram of the dynamic resource servers sharing sessions through the Memcached database.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

Referring to FIGS. 1 to 4, a high-availability-based web high-concurrency system, as shown in FIG. 1, comprises a load balancer cluster adopting a high-availability scheme, a web server cluster with dynamic/static separation, and a database server supporting session sharing.

The load balancer cluster comprises a main load balancer, to which a plurality of high-availability standby load balancers are added to form a high-availability cluster, which prevents the load balancing module from becoming a single point of failure.

The high-availability scheme, as shown in FIG. 2, is implemented with Keepalived, because Keepalived is relatively simple to configure, easy to deploy, and convenient to manage and test. The main server uses the VRRP protocol to send multicast packets to all standby servers to inform them of its health status. Once a standby server stops receiving the multicast packets, it considers the main server to be abnormal and takes its place as the new main server to continue the work.

The dynamic load balancing algorithm calculates the dynamic weights of the web server cluster in real time by weighted summation, based on the specific workload type and the performance monitoring data of the web cluster, so that traffic is forwarded sensibly and processed efficiently.

Session sharing, as shown in FIG. 4, keeps session data consistent across the nodes of the cluster; here, session sharing based on the Memcached cache is adopted.

A high-availability-based web high-concurrency method comprises the following steps:

S1, when the user requests a static resource, the load balancer acts as a forward proxy and uses the url_hash algorithm to distribute requests according to the hash of the requested URL, so that different URLs are directed to different servers; a cache is also added on the static resource servers, which greatly improves the response speed of the back-end servers. Static resources are typically pre-built HTML pages that require no program processing involving a database;

S2, when the request is for a dynamic resource, the load balancer acts as a reverse proxy, dynamically adjusts the upstream list of the Nginx server through the dynamic load balancing algorithm based on workload characteristics, and hands the user request to a dynamic resource server for processing. A dynamic resource request is typically first submitted to a web server, which connects to the database; after the database has processed the data, the content is passed back to the web server, which returns it to the client for parsing and rendering;

S3, the load balancer cluster uses Keepalived to ensure that one main load balancer is always working normally;

S4, the dynamic resource server cluster obtains database data by interacting with the disk-based database MySQL and shares session data through the in-memory database Memcached.

The dynamic load balancing algorithm in step S2, whose flow is shown in FIG. 3, assigns the index weights according to the workload characteristics and periodically recalculates the dynamic weights of the web server cluster from the server index data collected in real time. The load balancer then loads these dynamic weights, which realizes dynamic load balancing of the cluster.

The performance index data of the web cluster is collected by monitoring, mainly covering the CPU, memory, IO and network of each web server.

The weights of the performance indices can be set according to the characteristics of the workload, so that the load weights of the web server cluster are adjusted dynamically, traffic is forwarded to the most suitable web server, and the concurrent processing performance of the system is improved. Depending on the service scenario, web system workloads can generally be divided into three types: CPU-intensive, IO-intensive, and response-time type.

The algorithm is implemented in the following steps:

(1) initialize the Nginx servers, initialize the performance indices of each server, and write them into the configuration information;

(2) determine whether any performance index of a server has reached its specified threshold; if so, mark the server as unavailable and repeat step (1); otherwise continue with the next step;

(3) start receiving client requests and determine the load type of each request; set the weights of the corresponding performance indices according to the load characteristics of the request, and then assign a web server to process the request according to the calculation result;

(4) after the request has been forwarded, determine whether the server has started processing it. If so, the back-end server returns the result to the corresponding client once processing is complete; otherwise the server is down, so set the corresponding flag bit and mark the server as unavailable. Then determine whether all servers are down: if so, the procedure ends; otherwise perform the load balancing calculation again and assign a new server to process the request.
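Steps (3) and (4) can be sketched as a dispatch loop in Python. Weighted random selection is used here as one way to make "a larger weight means a larger probability of being chosen" concrete; the source does not say how Nginx turns the weights into choices. The function names, the `send` callback and the use of `ConnectionError` to signal a down server are all hypothetical.

```python
import random


def pick_server(weights, rng=random):
    """Pick a server with probability proportional to its dynamic weight.

    weights maps server name to W_i; servers with weight 0 are unavailable.
    """
    available = {name: w for name, w in weights.items() if w > 0}
    if not available:
        return None
    names = list(available)
    return rng.choices(names, weights=[available[n] for n in names], k=1)[0]


def dispatch(request, weights, send):
    """Forward a request, flagging down servers and retrying until one responds.

    send(server, request) is a caller-supplied function that raises
    ConnectionError when the server does not start processing.
    """
    weights = dict(weights)            # work on a copy of the current weights
    while True:
        server = pick_server(weights)
        if server is None:
            return None                # every server is down: the procedure ends
        try:
            return send(server, request)
        except ConnectionError:
            weights[server] = 0.0      # set the flag: server unavailable, rebalance
```

A caller would pass in the weights from the latest period together with a function that performs the actual proxying; a `None` result corresponds to the case in step (4) where every server is down.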

Claims

1. A high-availability-based web high-concurrency system, characterized in that: the web high-concurrency system comprises a load balancer cluster adopting a high-availability scheme, a web server cluster with dynamic/static separation, and a database server supporting session sharing;

the high-availability scheme is implemented with Keepalived, which is relatively simple to configure, easy to deploy, and convenient to manage and test; the main server uses the VRRP protocol to send multicast packets to all standby servers to inform them of its health state, and once a standby server stops receiving the multicast packets, it considers the main server to be abnormal and takes its place as the new main server to continue the work;

the dynamic load balancing uses Consul for service registration and discovery and uses Upsync so that the Nginx server dynamically obtains the latest upstream list; each time the Nginx configuration information is modified, Nginx reads it in real time, which eliminates the redundant step of restarting Nginx after its configuration is modified;

the web server cluster is implemented with Nginx combined with Tomcat; Nginx handles concurrent requests from many users with an event-driven model, is particularly strong at serving static resources, and is both scalable and stable, so Nginx is used as the web server for static resources; Tomcat offers good scalability and security when handling dynamic resources, so Tomcat is used as the web server for dynamic resources, and the web server cluster is built by configuring different ports;

the database server comprises the disk-based database MySQL and the in-memory database Memcached used for session sharing; session sharing keeps session data consistent across the nodes of the cluster, and session sharing based on the Memcached cache is adopted here.

2. The high-availability-based web high-concurrency system of claim 1, wherein: in the dynamic load balancing module, the real-time utilization rate of each server is taken into account when the performance indices of the servers are periodically calculated; the four performance indices are normalized to obtain the real-time performance indices of each server; work requests are divided into three types, namely CPU-intensive, IO-intensive and response-time type, from which the corresponding index weights are obtained; and the real-time dynamic weight of each server is finally obtained, a larger value meaning a higher probability of being assigned requests.

3. The high-availability-based web high-concurrency system framework according to claim 1 or 2, wherein: the real-time dynamic weight is obtained as the weighted sum of the four performance indices of the server and the index weights; the four performance indices are real-time performance data obtained by monitoring and reflect the real-time load capacity of the server; the index weights are the optimal allocation ratios determined by the type of work request; and the process is as follows:

(3) fs_i represents the available hard disk performance of the web server, where fs_read_speed is the hard disk read speed, fs_write_speed is the hard disk write speed, and fs_utilization is the real-time hard disk utilization; the formula is defined as follows:

net_i = net_receive_speed + net_transmit_speed

P_i = [cpu_i, mem_i, fs_i, net_i]

at the start of each period a load value is calculated; if any one of the four hardware occupancy rates has reached its threshold, P_i is set to 0; if all three occupancy rates of a server are 0, the server is down and P_i is 0, and no requests reach the server until it has recovered;

P_i' = [cpu_i', mem_i', fs_i', net_i']

cpu_i' denotes the normalized real-time CPU performance index of the web server; the larger its value, the better the current CPU performance of the server;

mem_i' denotes the normalized real-time memory performance index of the web server; the larger its value, the better the current memory performance of the server;

fs_i' denotes the normalized real-time hard disk performance index of the web server; the larger its value, the better the current hard disk performance of the server;

net_i' denotes the normalized real-time network performance index of the web server; the larger its value, the better the current network communication performance of the server;

K = [K_c, K_m, K_f, K_n], where K_c + K_m + K_f + K_n = 1

(1) CPU-intensive: a large amount of computation consumes CPU resources, for example high-definition decoding of video resources or data analysis and computation on social networking sites; in this case the weights of CPU and memory should be increased appropriately, so that most requests are forwarded to servers with ample CPU and memory;

(2) IO-intensive: a large number of IO operations are required, such as network data transmission and frequent database operations; in this case the weights of network and IO should be increased appropriately, so that most requests are forwarded to servers with ample IO resources;

(3) response-time type: most online websites care about user experience and require the system to respond quickly; in this case all factors can be taken into account, and requests are forwarded as far as possible to the servers with the best average performance, so that the response time of the system is kept low;

W_i = P_i' × K^T

the overall dynamic weight W of the web server cluster is:

W = P × K^T = [P_1', P_2', ..., P_N'] × K^T

4. The high-availability-based web high-concurrency system framework according to claim 1 or 2, wherein: the dynamic load balancing uses Consul for service registration and discovery and uses Upsync so that the Nginx server dynamically obtains the latest upstream list; each time the Nginx configuration information is modified, Nginx reads it in real time, which eliminates the redundant step of restarting Nginx after its configuration is modified.

5. The high-availability-based web high-concurrency system framework according to claim 1 or 2, wherein: in the high-availability scheme, while the main load balancer (master) is working normally, the standby load balancers (backups) remain idle and continuously monitor the master's health; if a backup detects an abnormality, that is, it no longer receives VRRP packets and therefore considers the master to be down, all resources of the main load balancer must be taken over immediately, and one backup is selected as the new master according to VRRP priority; so that it can take over service effectively, a standby load balancer is generally configured identically to the main load balancer.

6. The high-availability-based web high-concurrency system framework according to claim 1 or 2, wherein: in the session sharing, a cacheDB is used to store the session information; when a back-end server receives a new request, it stores the session information in the cacheDB; when a back-end server fails, the scheduler traverses the nodes to find an available one and dispatches the request to it; when that application server finds that the session is not in its local memory, it looks the session up in the cacheDB and, if it is found, copies it to the local machine, thereby achieving session sharing and high availability.

7. A method implemented by the high-availability-based web high-concurrency system according to claim 1, characterized in that the method comprises the following steps: