CN112463451B - Buffer disaster recovery cluster switching method and soft load balancing cluster device - Google Patents

Buffer disaster recovery cluster switching method and soft load balancing cluster device

Info

Publication number
CN112463451B
Authority
CN
China
Prior art keywords
cache
cluster
disaster recovery
node
switching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011389545.8A
Other languages
Chinese (zh)
Other versions
CN112463451A (en)
Inventor
傅兵
刘静
武文斌
朱文涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202011389545.8A
Publication of CN112463451A
Application granted
Publication of CN112463451B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1479 Generic software techniques for error detection or fault masking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The cache disaster recovery cluster switching method and the soft load balancing cluster device can be used in the technical field of information security. The method first checks the health status of each cache proxy node in the original cache cluster. If the health status of at least one cache proxy node is unavailable, that node is isolated; once the health status of all cache proxy nodes is unavailable, application client requests are connected to the cache disaster recovery cluster. For applications whose cached data is regenerable and is not the primary data source, whose service must not be interrupted, and whose data requires strong consistency, this provides application-level disaster recovery for the application client when the cache system suffers a disaster. The method achieves a recovery time objective (RTO) of zero, automatically restores the cache system from the disaster state to normal operation, and ensures the continuity of the application service.

Description

Buffer disaster recovery cluster switching method and soft load balancing cluster device
Technical Field
The application relates to the technical field of cache disaster recovery cluster switching, in particular to a cache disaster recovery cluster switching method and a soft load balancing cluster device.
Background
With distributed systems now widespread, a large number of distributed session applications use distributed cache services, and a growing number of important applications with high access volume and large data volume, such as mobile banking, personal internet banking, enterprise internet banking and e-commerce purchasing, are connected to them. The cached data of these applications has the following characteristics:
1. the cache is not the primary data source, and the data can be regenerated;
2. the service must not be interrupted;
3. the data requires strong consistency.
At present, session applications commonly cope with a catastrophic failure of the cache service using one of the following disaster recovery measures, each with its own pain point:
1. Falling back to the local database. The database comes under heavy load, and the high performance and good user experience provided by the cache service are lost.
2. Switching to a disaster recovery cluster. The application must modify its cache configuration and restart the application service, so the application service is interrupted.
3. Configuring a cache disaster recovery cluster in advance and switching to it automatically when the cache service is unavailable. This measure also has a problem: the application service is deployed in a distributed manner, and there is no guarantee that all application servers switch over, so application servers may access the two cache clusters at the same time and cause data inconsistency.
To sum up: when the cache service fails and cannot be recovered, applications that use the cache usually face pain points such as service degradation, service interruption and loss of data consistency, and a flawless disaster recovery design is difficult to achieve.
Disclosure of Invention
To address the problems in the prior art, the application provides a cache disaster recovery cluster switching method and a soft load balancing cluster device, which can automatically forward application service requests to the cache disaster recovery cluster and ensure that the application provides continuous 7×24-hour service.
In order to solve the technical problems, the application provides the following technical scheme:
In a first aspect, the present application provides a cache disaster recovery cluster switching method, executed by a soft load balancing cluster device, the method including:
checking the health status of each cache proxy node in the original cache cluster; and
if the health status of at least one cache proxy node is unavailable, isolating that cache proxy node; and once the health status of all cache proxy nodes is unavailable, connecting application client requests to the cache disaster recovery cluster.
In a preferred embodiment, checking the health status of each cache proxy node in the original cache cluster includes:
establishing a TCP connection with the cache proxy node and sending it a preset request protocol message, so that the cache proxy node receives the message, forwards it to be written to the storage node, and returns the result to the soft load balancing cluster device, thereby checking the complete link from the cache proxy node to the cache storage node with a simulated request.
In a preferred embodiment, the method further includes:
performing security authentication on application client request connections made to the cache disaster recovery cluster.
In a preferred embodiment, if the health status of all cache proxy nodes is unavailable, the method further includes:
turning on a preset disaster recovery switch and issuing an alarm instruction.
In a preferred embodiment, the method further includes:
setting a disaster recovery switch, where, when the disaster recovery switch is on, application client requests are blocked from connecting to the original cache cluster.
In a second aspect, the present application provides a soft load balancing cluster apparatus, including:
a health status checking module, configured to check the health status of each cache proxy node in the original cache cluster; and
a switching module, configured to isolate a cache proxy node if its health status is unavailable and, once the health status of all cache proxy nodes is unavailable, to connect application client requests to the cache disaster recovery cluster.
In a preferred embodiment, the health status checking module is specifically configured to establish a TCP connection with the cache proxy node and send it a preset request protocol message, so that the cache proxy node receives the message, forwards it to be written to the storage node, and returns the result to the soft load balancing cluster device, thereby checking the complete link from the cache proxy node to the cache storage node with a simulated request.
In a preferred embodiment, the device further includes:
a security authentication module, configured to perform security authentication on application client request connections made to the cache disaster recovery cluster.
In a preferred embodiment, the device further includes:
an alarm module, configured to turn on a preset disaster recovery switch and issue an alarm instruction.
In a preferred embodiment, the device further includes:
a disaster recovery switch setting module, configured to set a disaster recovery switch, where, when the disaster recovery switch is on, application client requests are blocked from connecting to the original cache cluster.
In a third aspect, the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements a method for switching a cache disaster recovery cluster when executing the program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for switching cache disaster recovery clusters.
According to the above technical scheme, the cache disaster recovery cluster switching method and the soft load balancing cluster device provided by the application first check the health status of each cache proxy node in the original cache cluster. If the health status of at least one cache proxy node is unavailable, that node is isolated; once the health status of all cache proxy nodes is unavailable, application client requests are connected to the cache disaster recovery cluster. For applications whose cached data is regenerable and is not the primary data source, whose service must not be interrupted, and whose data requires strong consistency, this provides application-level disaster recovery for the application client when the cache system suffers a disaster. The method achieves a recovery time objective (RTO) of zero, automatically restores the cache system from the disaster state to normal operation, and ensures the continuity of the application service.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a method for switching a cache disaster recovery cluster in an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a switching system according to an embodiment of the present application.
Fig. 3 is a schematic diagram of an original cache cluster structure in an embodiment of the present application.
Fig. 4 is a schematic diagram of a cache disaster recovery cluster structure in an embodiment of the present application.
Fig. 5 is a schematic structural frame diagram of a client routing request interaction system in an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a soft load balancing cluster device in an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that the present invention may be used in the technical field of information security, but may also be used in the technical field of cloud computing and big data, and the present invention is not limited thereto.
With distributed systems now widespread, a large number of distributed session applications use distributed cache services, and a growing number of important applications with high access volume and large data volume, such as mobile banking, personal internet banking, enterprise internet banking and e-commerce purchasing, are connected to them. The cached data of these applications has the following characteristics:
1. the cache is not the primary data source, and the data can be regenerated;
2. the service must not be interrupted;
3. the data requires strong consistency.
When a catastrophic failure hits such a cache service, one of the following disaster recovery measures is commonly adopted, each with its own pain point:
1. Falling back to the local database. The database comes under heavy load, and the high performance and good user experience provided by the cache service are lost.
2. Switching to a disaster recovery cluster. The application must modify its cache configuration and restart the application service, so the application service is interrupted.
3. Configuring a cache disaster recovery cluster in advance and switching to it automatically when the cache service is unavailable. This measure also has a problem: the application service is deployed in a distributed manner, and there is no guarantee that all application servers switch over, so application servers may access the two cache clusters at the same time and cause data inconsistency.
To sum up: when the cache service fails and cannot be recovered, applications that use the cache usually face pain points such as service degradation, service interruption and loss of data consistency, and a flawless disaster recovery design is difficult to achieve.
Based on this, the embodiment of the invention provides a technical scheme for cache disaster recovery cluster switching. Its main principle is to provide, on the cache server side, a disaster recovery design and method that is imperceptible to the application client, guarantees the consistency of application data, and has zero system recovery time. By linking with the in-line soft load balancing system (SLB), it performs deep health checks on the cache service nodes and applies a customized request forwarding policy to them, so that application service requests are automatically forwarded to the cache disaster recovery cluster when a catastrophic, unrecoverable abnormality occurs in the cache service, ensuring continuous 7×24-hour application service.
The present invention will be described in detail below.
An embodiment of the first aspect of the present invention provides a cache disaster recovery cluster switching method, as shown in fig. 1. The method is performed by a soft load balancing cluster device and includes:
S1: checking the health status of each cache proxy node in the original cache cluster;
S2: if the health status of at least one cache proxy node is unavailable, isolating that cache proxy node; and once the health status of all cache proxy nodes is unavailable, connecting application client requests to the cache disaster recovery cluster.
The cache disaster recovery cluster switching method provided by the invention first checks the health status of each cache proxy node in the original cache cluster. If the health status of at least one cache proxy node is unavailable, that node is isolated; once the health status of all cache proxy nodes is unavailable, application client requests are connected to the cache disaster recovery cluster. For applications whose cached data is regenerable and is not the primary data source, whose service must not be interrupted, and whose data requires strong consistency, this provides application-level disaster recovery for the application client when the cache system suffers a disaster. The method achieves a recovery time objective (RTO) of zero, automatically restores the cache system from the disaster state to normal operation, and ensures the continuity of the application service.
Specifically, a corresponding disaster recovery cache cluster is built for the proxy nodes and storage nodes of the original cloud cache cluster. Relying on the in-line soft load balancing system (SLB), customized deep health checks are performed on all cache proxy nodes on the cloud so that abnormal proxy nodes are isolated. When all cache proxy nodes of the original cluster have failed and are unavailable, application requests are automatically switched to the disaster recovery cluster, guaranteeing the continuity of the accessing application's service.
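As an illustration only, steps S1 and S2 can be sketched in Python as a backend selection function. The function name `select_backends`, the node names and the dictionary-based health map are assumptions made for this sketch and do not come from the patent; a real soft load balancer would hold this state internally.

```python
from typing import Dict, List


def select_backends(original_health: Dict[str, bool], dr_nodes: List[str]) -> List[str]:
    """Return the proxy nodes that may receive client requests.

    original_health maps each original-cluster proxy node (hypothetical
    names) to the result of its latest health evaluation (S1).
    """
    # S2, first half: unavailable nodes are isolated simply by leaving
    # them out of the pool of eligible backends.
    healthy = [node for node, ok in original_health.items() if ok]
    if healthy:
        return healthy
    # S2, second half: every original proxy node is unavailable, so
    # requests are connected to the disaster recovery cluster instead.
    return list(dr_nodes)


if __name__ == "__main__":
    print(select_backends({"proxy-1": True, "proxy-2": False}, ["dr-1"]))
    print(select_backends({"proxy-1": False, "proxy-2": False}, ["dr-1"]))
```

The sketch fails over only when no original proxy node remains healthy, which mirrors the "isolate until all are unavailable" wording of S2.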
In an embodiment of the present invention, checking the health status of each cache proxy node in the original cache cluster includes:
establishing a TCP connection with the cache proxy node and sending it a preset request protocol message, so that the cache proxy node receives the message, forwards it to be written to the storage node, and returns the result to the soft load balancing cluster device, thereby checking the complete link from the cache proxy node to the cache storage node with a simulated request. The invention is described below in connection with the actual switching scenario of fig. 2.
Fig. 2 shows a disaster recovery switching system according to an embodiment of the present invention, which includes the original cache proxy node cluster devices 4, 5, 6 and 7, the cache disaster recovery cluster (not shown in fig. 2) and the soft load balancing cluster device 2.
In the embodiment of fig. 2, the system is deployed on the basis of the PaaS cloud distributed coordinator etcd cluster device 8, and further includes a load balancing device 1 that balances the overall traffic.
In operation, as shown in fig. 5, the cache proxy (Proxy) cluster devices 4, 5, 6 and 7 are deployed in PaaS cloud containers and register information such as node status with the etcd cluster.
The soft load balancer (SLB), HAProxy, periodically performs customized deep health checks on the cloud nodes (cache proxy nodes) on the PaaS.
Specifically, the SLB establishes a TCP connection with each cache proxy node and periodically sends it a RESP (Redis serialization protocol) request as a deep health check. The request content is as follows:
*4\r\n$5\r\nSETEX\r\n$25\r\ncache_server_health_check\r\n$2\r\n30\r\n$25\r\ncache_server_health_check\r\n
This request initiates a write to the cache service with a TTL, simulating an application request to the cache service. The cache proxy node receives the request, forwards it to be written to the storage node, and finally returns the result to the SLB. This simulates a request over the complete link from the proxy node to the storage node and thereby implements a deep health check of the cache service.
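For illustration, the health check request can be built with a small RESP encoder. The helper names `encode_resp_command` and `is_ok_reply` are hypothetical. The sketch also assumes that the proxy answers a successful SETEX with the simple string `+OK`, as a Redis server does; the patent does not state the reply format.

```python
from typing import Sequence

HEALTH_KEY = "cache_server_health_check"  # key and value used by the probe
TTL_SECONDS = "30"                        # TTL argument of SETEX


def encode_resp_command(args: Sequence[str]) -> bytes:
    """Encode a command as a RESP array of bulk strings.

    Each bulk string is "$<byte length>\r\n<data>\r\n"; the array header
    is "*<element count>\r\n".
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def is_ok_reply(reply: bytes) -> bool:
    """Assumption: a healthy link replies like Redis, with '+OK'."""
    return reply.startswith(b"+OK\r\n")


# SETEX <key> <seconds> <value>: a write with a TTL, as in the probe above.
payload = encode_resp_command(["SETEX", HEALTH_KEY, TTL_SECONDS, HEALTH_KEY])

if __name__ == "__main__":
    print(payload)
```

Because the probe is a write with a short TTL, it exercises the storage node as well as the proxy, and the test key expires on its own.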
The SLB probes each cache proxy node once every 3 seconds. After three consecutive failed probes, the proxy node is considered unavailable and is isolated.
When a proxy node is found to be abnormal and unable to provide the cache service, the node is isolated, its status is updated, and it no longer receives client requests.
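The probing rule can be sketched as a per-node failure counter. The class name `NodeHealth` is hypothetical, and the recovery rule (a single successful probe returns the node to service) is an assumption of this sketch; the patent only specifies when a node is isolated.

```python
from dataclasses import dataclass

CHECK_INTERVAL_SECONDS = 3  # one probe every 3 seconds (from the description)
FAILURE_THRESHOLD = 3       # three consecutive failures isolate the node


@dataclass
class NodeHealth:
    consecutive_failures: int = 0
    isolated: bool = False

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result; return True if the node may serve requests."""
        if probe_ok:
            # Assumption: one successful probe clears the failure streak
            # and brings the node back into service.
            self.consecutive_failures = 0
            self.isolated = False
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= FAILURE_THRESHOLD:
                self.isolated = True
        return not self.isolated


if __name__ == "__main__":
    node = NodeHealth()
    for result in (False, False, False, True):
        print(result, node.record(result))
```

With a 3-second interval and a threshold of three, a node that stops responding is isolated roughly 9 seconds after its last successful probe.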
When all nodes of the cache proxy (Proxy) cluster devices 4, 5, 6 and 7 are abnormal and unavailable, the cache service can no longer provide service. The cache disaster recovery cluster is provided specifically as the switching target for this case, to guarantee the continuity of the application service.
It will be appreciated that the servers may communicate using any suitable network protocol, including protocols not yet developed at the filing date of this application. The network protocols may include, for example, TCP/IP, UDP/IP, HTTP and HTTPS. The network protocols may also include, for example, RPC (Remote Procedure Call) and REST (Representational State Transfer) protocols used on top of the above protocols.
Specifically, it can be understood that in the present invention each node is a node cluster device. The original cache cluster includes cache proxy nodes and cache storage nodes, as shown in fig. 3, and the cache disaster recovery cluster likewise includes cache proxy nodes and cache storage nodes, as shown in fig. 4.
Referring to fig. 3 to 5, the present invention requires two clusters to be built, including the following devices:
original cluster cache proxy node cluster devices 1, 2 and 3;
original cluster cache storage node cluster devices 1, 2 and 3;
disaster recovery cluster cache proxy node cluster devices 1, 2 and 3;
disaster recovery cluster cache storage node cluster devices 1, 2 and 3.
1. In fig. 3 and 4, the number and specifications of the proxy nodes and storage nodes of the two clusters are kept consistent, and the tenant ID of the disaster recovery cluster must be the same as or similar to the tenant ID of the original cluster.
2. If devices 1, 2 and 3 of the disaster recovery cluster proxy nodes in fig. 4 receive a connection establishment request, the original tenant cluster is unavailable and disaster recovery switching has occurred. The disaster recovery cluster must then check whether the tenant ID in the request is the same as or similar to its own tenant ID.
2.1 If the tenant IDs are the same or similar, the connection is allowed and authenticated without a password, the request is processed, and the result is returned to the application side, so that the application side does not perceive the disaster recovery switch.
2.2 If the tenant IDs do not match, the request was not forwarded from the original tenant cluster, and the connection is refused.
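The tenant check in items 2.1 and 2.2 can be sketched as follows. The patent does not define what makes two tenant IDs "similar", so the rule used here (the disaster recovery tenant ID is the original tenant ID plus a `-dr` suffix) is purely a hypothetical convention, as are the function name `accept_connection` and the example IDs.

```python
DR_SUFFIX = "-dr"  # hypothetical naming convention, not from the patent


def accept_connection(request_tenant_id: str, dr_tenant_id: str) -> bool:
    """Decide whether the disaster recovery cluster accepts a connection.

    Returns True when the tenant ID carried by the request is identical
    to the disaster recovery cluster's tenant ID, or "similar" under the
    assumed suffix convention; otherwise the connection is refused.
    """
    if request_tenant_id == dr_tenant_id:
        return True  # 2.1: identical IDs, allow password-free access
    if dr_tenant_id == request_tenant_id + DR_SUFFIX:
        return True  # 2.1: "similar" IDs under the assumed convention
    return False     # 2.2: not forwarded from the original tenant cluster


if __name__ == "__main__":
    print(accept_connection("tenant42", "tenant42-dr"))
    print(accept_connection("tenant7", "tenant42-dr"))
```

Any deterministic similarity rule would serve; what matters is that only requests belonging to the paired original tenant are admitted without credentials.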
As the above description shows, the present invention builds a corresponding disaster recovery cache cluster for the proxy nodes and storage nodes of the original cloud cache cluster, and relies on the in-line soft load balancing system (SLB) to perform customized deep health checks on all cache proxy nodes on the cloud so that abnormal proxy nodes are isolated. When all cache proxy nodes of the original cluster have failed and are unavailable, application requests are automatically switched to the disaster recovery cluster, guaranteeing the continuity of the accessing application's service.
The overall process of client request routing in the present invention is described below with reference to fig. 5.
The full flow of client request routing, referring to fig. 5:
1. The cluster devices of the application sides 101, 102 and 103 initiate connection establishment requests to the designated F5 device 201.
2. F5 device 201 forwards the request to the SLB HAProxy cluster device 301 attached beneath it.
3. The SLB obtains the health status of the proxy nodes through the health check mechanism and applies the pre-configured disaster recovery forwarding policy 401, matching the rules as follows:
3.1 If the disaster recovery cluster is available and the disaster recovery switch is on, the request is forwarded to the disaster recovery cluster.
3.2 If all cache proxy nodes of the original cluster are unavailable, the original cluster has suffered a catastrophic failure and is unavailable, and the request must be forwarded to the disaster recovery cluster.
3.3 If neither rule 3.1 nor rule 3.2 matches, the default policy applies, that is, the original cache cluster is accessed.
4. If the disaster recovery cluster receives a transaction request forwarded from the original cluster, the original cluster is unavailable and disaster recovery switching has occurred, so the disaster recovery switch is turned on and an alarm is raised. The switch is off by default, meaning the disaster recovery cluster is not accessed. Once the switch is on, the original cluster is treated as unavailable and the disaster recovery cluster must be accessed; this prevents requests from being forwarded to the original cluster again after it returns to normal, and so guarantees that the application's cluster access is monotonic, that is, that its data stays consistent.
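The forwarding rules 3.1 to 3.3 and the latching behaviour of the switch in step 4 can be sketched as a small policy object. This is a simplification under stated assumptions: the class name `ForwardingPolicy` and its fields are hypothetical, the switch is latched by the policy itself the first time it routes to the disaster recovery cluster (in the description, the disaster recovery cluster turns the switch on when it receives the request), and the alarm is omitted.

```python
from dataclasses import dataclass
from typing import List

ORIGINAL = "original"
DISASTER_RECOVERY = "disaster_recovery"


@dataclass
class ForwardingPolicy:
    switch_on: bool = False  # the disaster recovery switch is off by default

    def route(self, original_nodes_healthy: List[bool], dr_available: bool = True) -> str:
        """Pick the target cluster for one client request."""
        # Rule 3.1: switch on and disaster recovery cluster available.
        if dr_available and self.switch_on:
            return DISASTER_RECOVERY
        # Rule 3.2: every original proxy node is unavailable.
        if not any(original_nodes_healthy):
            # Step 4 (simplified): latch the switch so later requests do
            # not return to the original cluster after it recovers.
            self.switch_on = True
            return DISASTER_RECOVERY
        # Rule 3.3: default policy, use the original cache cluster.
        return ORIGINAL


if __name__ == "__main__":
    policy = ForwardingPolicy()
    print(policy.route([True, False, True]))    # original cluster still usable
    print(policy.route([False, False, False]))  # fail over and latch
    print(policy.route([True, True, True]))     # stays on disaster recovery
```

Keeping the switch latched until an operator resets it is what makes cluster access monotonic: once any request has been served by the disaster recovery cluster, no later request is served by the recovered original cluster, so the two clusters are never read and written alternately.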
It can be understood that the invention mainly targets applications whose cached data is regenerable and is not the primary data source, whose service must not be interrupted, and whose data requires strong consistency, and helps the application client achieve application-level disaster recovery that it does not perceive when the cache system suffers a disaster.
The disaster recovery design and method achieve a recovery time objective (RTO) of zero, automatically restore the cache system from the disaster state to normal operation, and ensure the continuity of the application service.
When the cache system suffers a catastrophic failure, the invention switches over automatically and transparently, without the application side perceiving it and without any emergency operation on the application side such as restarting, switching or disaster recovery handling, thereby meeting requirements such as uninterrupted application service and strong data consistency.
To ensure that the switch is imperceptible to the application client, that the consistency of application data is guaranteed, and that the system recovery time is zero, the application further provides a soft load balancing cluster device, as shown in fig. 6, which includes:
a health status checking module 1, configured to check the health status of each cache proxy node in the original cache cluster; and
a switching module 2, configured to isolate a cache proxy node if its health status is unavailable and, once the health status of all cache proxy nodes is unavailable, to connect application client requests to the cache disaster recovery cluster.
According to the above technical scheme, the soft load balancing cluster device provided by the application first checks the health status of each cache proxy node in the original cache cluster. If the health status of at least one cache proxy node is unavailable, that node is isolated; once the health status of all cache proxy nodes is unavailable, application client requests are connected to the cache disaster recovery cluster. For applications whose cached data is regenerable and is not the primary data source, whose service must not be interrupted, and whose data requires strong consistency, this provides application-level disaster recovery for the application client when the cache system suffers a disaster. The method achieves a recovery time objective (RTO) of zero, automatically restores the cache system from the disaster state to normal operation, and ensures the continuity of the application service.
In one or more embodiments of the present application, the health status checking module is specifically configured to establish a TCP connection with the cache proxy node and send it a preset request protocol message, so that the cache proxy node receives the message, forwards it to be written to the storage node, and returns the result to the soft load balancing cluster device, thereby checking the complete link from the cache proxy node to the cache storage node with a simulated request.
In one or more embodiments of the present application, the device further includes:
a security authentication module, configured to perform security authentication on application client request connections made to the cache disaster recovery cluster.
In one or more embodiments of the present application, the device further includes:
an alarm module, configured to turn on a preset disaster recovery switch and issue an alarm instruction.
In one or more embodiments of the present application, the device further includes:
a disaster recovery switch setting module, configured to set a disaster recovery switch, where, when the disaster recovery switch is on, application client requests are blocked from connecting to the original cache cluster.
From the above description, it can be seen that the present invention builds a corresponding disaster recovery cache cluster for the proxy nodes and storage nodes of the original cloud cache cluster, and relies on the in-line soft load balancing (SLB) system to perform customized deep health checks on all cache proxy nodes on the cloud so as to isolate abnormal proxy nodes. When all cache proxy nodes of the original cluster have failed and are unavailable, application requests are automatically switched to the disaster recovery cluster, thereby guaranteeing the continuity of the accessing application's service.
In terms of hardware, the present application provides an embodiment of an electronic device for implementing all or part of the cache disaster recovery cluster switching method. The electronic device specifically includes:
a processor, a memory, a communications interface, and a bus. The processor, the memory, and the communications interface communicate with each other through the bus; the communications interface is used for information transmission among the server, the device, the distributed message middleware cluster device, various databases, user terminals, and other related equipment. The electronic device may be a desktop computer, a tablet computer, a mobile terminal, or the like, and this embodiment is not limited thereto. In this embodiment, the electronic device may be implemented with reference to the embodiments of the cache disaster recovery cluster switching method and of the cache disaster recovery cluster switching device described above, the contents of which are incorporated herein; repeated description is omitted.
Fig. 7 is a schematic block diagram of a system configuration of an electronic device 9600 of an embodiment of the present application. As shown in Fig. 7, the electronic device 9600 may include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Note that Fig. 7 is exemplary; other types of structures may be used in addition to or in place of this structure to implement telecommunications functions or other functions.
In one embodiment, the cache disaster recovery cluster switching function may be integrated into the central processor 9100. For example, the central processor 9100 may be configured to perform the following control:
S1: checking the health status of each cache proxy node in the original cache cluster;
S2: if the health status of at least one cache proxy node is unavailable, isolating the corresponding cache proxy node, and, once the health status of all the cache proxy nodes is unavailable, connecting the request of the application client to the cache disaster recovery cluster.
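Steps S1 and S2 amount to a simple target-selection rule, sketched below under stated assumptions. The function name `route_targets` and the representation of health results as a mapping from node name to a boolean are hypothetical, chosen only for illustration; the result of S1 is assumed to be available already.

```python
from typing import Dict, List


def route_targets(proxy_health: Dict[str, bool],
                  dr_endpoints: List[str]) -> List[str]:
    """Pick the endpoints that application client requests may be sent to.

    proxy_health: result of S1, mapping each cache proxy node of the
                  original cluster to True (available) or False.
    dr_endpoints: endpoints of the cache disaster recovery cluster.
    """
    # S2, first part: isolate unavailable nodes by excluding them.
    healthy = [node for node, ok in proxy_health.items() if ok]
    if healthy:
        return healthy
    # S2, second part: every node is unavailable, so switch to the
    # cache disaster recovery cluster.
    return list(dr_endpoints)
```

For example, `route_targets({"p1": True, "p2": False}, ["dr1"])` returns `["p1"]`, while `route_targets({"p1": False, "p2": False}, ["dr1"])` returns `["dr1"]`.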
As can be seen from the above description, the electronic device provided in the embodiments of the present application first checks the health status of each cache proxy node in the original cache cluster. If the health status of at least one cache proxy node is unavailable, the corresponding cache proxy node is isolated; once the health status of all cache proxy nodes is unavailable, the requests of the application client are connected to the cache disaster recovery cluster. In this way, for applications whose cached data is reproducible and is not the primary data source, whose service must not be interrupted, and which require strong data consistency, application-level disaster recovery for the application client is achieved when a disaster-level abnormality occurs in the cache system. The disaster recovery method achieves a recovery time objective (RTO) of zero, enables the cache system to recover automatically from the disaster state to the normal running state, and ensures the continuity of the application service.
In another embodiment, the cache disaster recovery cluster switching device may be configured separately from the central processor 9100. For example, the cache disaster recovery cluster switching device may be configured as a chip connected to the central processor 9100, with the cache disaster recovery cluster switching function implemented under the control of the central processor.
As shown in fig. 7, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 need not include all of the components shown in fig. 7; in addition, the electronic device 9600 may further include components not shown in fig. 7, and reference may be made to the related art.
As shown in fig. 7, the central processor 9100, sometimes referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device, which central processor 9100 receives inputs and controls the operation of the various components of the electronic device 9600.
The memory 9140 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. It may store the above failure-related information as well as a program for processing that information. The central processor 9100 can execute the program stored in the memory 9140 to implement information storage, processing, and the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. The power supply 9170 is used to provide power to the electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, but not limited to, an LCD display.
The memory 9140 may be a solid-state memory such as a read-only memory (ROM), a random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off and that can be selectively erased and provided with further data, an example of which is sometimes referred to as an EPROM or the like. The memory 9140 may also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage portion 9142, which stores application programs and function programs, or a flow for executing operations of the electronic device 9600 by the central processor 9100.
The memory 9140 may also include a cache disaster recovery cluster switch 9143, where the cache disaster recovery cluster switch 9143 is configured to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by the electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers of the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, address book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. A communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, as in the case of conventional mobile communication terminals.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, etc., may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and to receive audio input from the microphone 9132 to implement usual telecommunications functions. The audio processor 9130 can include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100 so that sound can be recorded locally through the microphone 9132 and sound stored locally can be played through the speaker 9131.
The embodiments of the present application further provide a computer-readable storage medium capable of implementing all the steps of the cache disaster recovery cluster switching method of the above embodiments, in which the execution subject is a server. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all the steps of the cache disaster recovery cluster switching method of the above embodiments, in which the execution subject is the soft load balancing cluster device.
As can be seen from the above description, the computer-readable storage medium provided in the embodiments of the present application is intended for applications whose cached data is reproducible and is not the primary data source, whose service must not be interrupted, and which require strong data consistency. It helps to implement application-level disaster recovery that is imperceptible to the application client when a disaster-level abnormality occurs in the cache system. At the same time, the disaster recovery method achieves a recovery time objective (RTO) of zero, enables fast automatic recovery of the cache system from the disaster state to the normal running state, and ensures the continuity of the application service.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present invention have been described above with reference to specific examples, which are provided only to help in understanding the method and core ideas of the present invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the contents of this description should not be construed as limiting the present invention.

Claims (10)

1. A cache disaster recovery cluster switching method, characterized in that the method is executed by a soft load balancing cluster device and comprises the following steps:
checking the health status of each cache proxy node in the original cache cluster;
if the health status of at least one cache proxy node is unavailable, isolating the corresponding cache proxy node, and, once the health status of all the cache proxy nodes is unavailable, connecting the request of the application client to the cache disaster recovery cluster;
wherein the checking of the health status of each cache proxy node in the original cache cluster comprises:
establishing a TCP connection with the cache proxy node and sending a preset request protocol to the cache proxy node, so that the cache proxy node receives the preset request protocol, forwards it to the cache storage node to be written, and returns the result to the soft load balancing cluster device, thereby implementing a simulated-request check over the complete link from the cache proxy node to the cache storage node.
2. The cache disaster recovery cluster switching method according to claim 1, further comprising:
performing security authentication on the request connection of the application client connected to the cache disaster recovery cluster.
3. The cache disaster recovery cluster switching method according to claim 1, wherein if the health status of all cache proxy nodes is unavailable, the method further comprises:
enabling a preset disaster recovery switch and sending an alarm instruction.
4. The cache disaster recovery cluster switching method according to claim 1, further comprising:
setting a disaster recovery switch, wherein when the disaster recovery switch is enabled, the connection between the request of the application client and the original cache disaster recovery cluster is blocked.
5. A soft load balancing cluster apparatus, comprising:
a health status checking module, configured to check the health status of each cache proxy node in the original cache cluster;
a switching module, configured to isolate the corresponding cache proxy node if the health status of at least one cache proxy node is unavailable, and, once the health status of all the cache proxy nodes is unavailable, connect the request of the application client to the cache disaster recovery cluster;
wherein the health status checking module is specifically configured to establish a TCP connection with the cache proxy node and send a preset request protocol to the cache proxy node, so that the cache proxy node receives the preset request protocol, forwards it to the cache storage node to be written, and returns the result to the soft load balancing cluster device, thereby implementing a simulated-request check over the complete link from the cache proxy node to the cache storage node.
6. The soft load balancing cluster apparatus of claim 5, further comprising:
a security authentication module, configured to perform security authentication on the request connection of the application client connected to the cache disaster recovery cluster.
7. The soft load balancing cluster apparatus of claim 5, further comprising:
an alarm module, configured to enable a preset disaster recovery switch and send an alarm instruction.
8. The soft load balancing cluster apparatus of claim 5, further comprising:
a disaster recovery switch setting module, configured to set a disaster recovery switch, wherein when the disaster recovery switch is enabled, the connection between the request of the application client and the original cache disaster recovery cluster is blocked.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program executable on the processor, characterized in that the processor implements the cache disaster recovery cluster switching method according to any of claims 1 to 4 when executing the program.
10. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the cache disaster recovery cluster switching method of any of claims 1 to 4.
CN202011389545.8A 2020-12-02 2020-12-02 Buffer disaster recovery cluster switching method and soft load balancing cluster device Active CN112463451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011389545.8A CN112463451B (en) 2020-12-02 2020-12-02 Buffer disaster recovery cluster switching method and soft load balancing cluster device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011389545.8A CN112463451B (en) 2020-12-02 2020-12-02 Buffer disaster recovery cluster switching method and soft load balancing cluster device

Publications (2)

Publication Number Publication Date
CN112463451A CN112463451A (en) 2021-03-09
CN112463451B true CN112463451B (en) 2024-01-26

Family

ID=74806411

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011389545.8A Active CN112463451B (en) 2020-12-02 2020-12-02 Buffer disaster recovery cluster switching method and soft load balancing cluster device

Country Status (1)

Country Link
CN (1) CN112463451B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113190364A (en) * 2021-04-30 2021-07-30 平安壹钱包电子商务有限公司 Remote call management method and device, computer equipment and readable storage medium
CN113434340B (en) * 2021-06-29 2022-11-25 聚好看科技股份有限公司 Server and cache cluster fault rapid recovery method
CN113590386B (en) * 2021-07-30 2023-03-03 深圳前海微众银行股份有限公司 Disaster recovery method, system, terminal device and computer storage medium for data
CN114070716B (en) * 2021-11-29 2024-02-13 中国工商银行股份有限公司 Application management system, application management method and server
CN114189547B (en) * 2022-02-14 2022-05-03 北京安盟信息技术股份有限公司 SSL tunnel fast switching method under cluster
CN115378962B (en) * 2022-08-18 2023-04-21 北京志凌海纳科技有限公司 High-availability communication method and system for storage cluster based on iSCSI protocol
CN115484149B (en) * 2022-09-13 2024-04-02 中国建设银行股份有限公司 Network switching method, network switching device, electronic equipment and storage medium
CN116233137B (en) * 2023-02-17 2023-11-17 通明智云(北京)科技有限公司 Cluster-based load sharing and backup method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111756841A (en) * 2020-06-23 2020-10-09 中国平安财产保险股份有限公司 Service implementation method, device, equipment and storage medium based on micro-service cluster

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10191824B2 (en) * 2016-10-27 2019-01-29 Mz Ip Holdings, Llc Systems and methods for managing a cluster of cache servers

Also Published As

Publication number Publication date
CN112463451A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
CN112463451B (en) Buffer disaster recovery cluster switching method and soft load balancing cluster device
CN111031058A (en) Websocket-based distributed server cluster interaction method and device
JP6275693B2 (en) Binding CRUD type protocol in distributed agreement protocol
US11917023B2 (en) Fast session restoration for latency sensitive middleboxes
US10146525B2 (en) Supporting hitless upgrade of call processing nodes in cloud-hosted telephony system
WO2021057526A1 (en) Disaster recovery method for gateway device, and communication device
CN110764881A (en) Distributed system background retry method and device
CN111371695B (en) Service flow limiting method and device
CN112953908A (en) Network isolation configuration method, device and system
CN112069154A (en) Automatic operation and maintenance method and related device for etcd distributed database
CN112929438B (en) Business processing method and device of double-site distributed database
CN112612851B (en) Multi-center data synchronization method and device
CN114257532A (en) Server side state detection method and device
CN111352959B (en) Data synchronous remedying and storing method and cluster device
CN110597467B (en) High-availability data zero-loss storage system and method
EP3896931A1 (en) Spark shuffle-based remote direct memory access system and method
CN116193481A (en) 5G core network processing method, device, equipment and medium
CN114697339A (en) Load balancing method and device under centralized architecture
CN113452776B (en) PaaS platform service scheduling method and device and PaaS platform
CN116185755A (en) Data processing method and device for distributed load balancing system
CN111698337B (en) Method, device and equipment for establishing communication connection
CN114979234A (en) Session control sharing method and system in distributed cluster system
CN110851526B (en) Data synchronization method, device and system
CN113050974B (en) Online upgrading method and device for cloud computing infrastructure
CN111885148B (en) Session synchronization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant