CN110351122A - Disaster recovery method, device, system and electronic equipment - Google Patents

Info

Publication number
CN110351122A
Authority
CN
China
Prior art keywords
node
information
service
service node
new demand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910521769.0A
Other languages
Chinese (zh)
Other versions
CN110351122B (en)
Inventor
侯焯明
刘林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910521769.0A
Publication of CN110351122A
Application granted
Publication of CN110351122B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1456: Hardware arrangements for backup
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668: Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network

Abstract

The disclosure provides a disaster recovery method, device and electronic equipment, and relates to the field of computer technology. The disaster recovery method includes: sending first information from a client to N peer service nodes, where N is an integer greater than or equal to 2; receiving M pieces of second information from the N peer service nodes, where M is a positive integer less than or equal to N; and sending one of the M pieces of second information to the client. The disaster recovery method provided by the disclosure can improve the reliability of data transmission between the client and the server and improve redundancy.

Description

Disaster recovery method, device, system and electronic equipment
Technical field
The present disclosure relates to the field of computer technology, and in particular to a disaster recovery method, device, system and electronic equipment.
Background
Disaster recovery is a management plan designed to keep an information system running normally and to maintain business continuity when a disaster occurs. In the related art, disaster recovery is usually implemented by switching between main and backup service nodes: when the main service node fails, a backup service node is promoted to be the new main service node and takes over the work of the former main service node. Main/backup switching is generally performed manually or through a third-party service.
While the main and backup service nodes are being switched manually, every service node is unavailable. Because manual operation is slow, the service nodes remain unavailable for a long time, which is a significant drawback.
Main/backup switching based on a third-party service requires the service nodes to register with the third-party service. The third-party service selects the main service node, or elects a new one when the main service node fails; the service nodes monitor the state-change information of the main service node, and the corresponding state information is restored when a backup service node is promoted to main service node. This approach requires adding, to the service node code, listening logic for main-node switchover events and recovery logic for restoring service state at switchover. It is highly intrusive to the original business code and increases operation and maintenance risk. Whenever a new business is added to a service node, state recovery code for that business must also be added to the node's state recovery procedure, which makes it harder to keep the system stable. In addition, a third-party service has to be maintained, which increases operation and maintenance cost. Finally, as with manual switching, the service is unavailable during the main/backup switchover, and for stateful service nodes the state information of the main service node must also be synchronized to the backup service node at switchover, so data reliability is low.
It should be noted that the information disclosed in the Background section above is only intended to aid understanding of the background of the disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
Embodiments of the present disclosure provide a disaster recovery method, device, system and electronic equipment, to overcome, at least to some extent, the problems of service unavailability and low reliability in disaster recovery schemes caused by the limitations and defects of the related art.
According to one aspect of the disclosure, a disaster recovery method is provided. The method is executed by a service cluster deployed in a server. The service cluster includes a main management node and one or more backup management nodes; the main management node executes the disaster recovery method, and a backup management node serves as a candidate for the new main management node when the main management node fails. The disaster recovery method includes:
Sending first information from a client to N peer service nodes, where N is an integer greater than or equal to 2;
Receiving M pieces of second information from the N peer service nodes, where M is a positive integer less than or equal to N;
Sending one of the M pieces of second information to the client.
In an exemplary embodiment of the disclosure, sending one of the M pieces of second information to the client includes:
Sending the second information with the earliest arrival time to the client.
In an exemplary embodiment of the disclosure, sending one of the M pieces of second information to the client further includes:
If a later-arriving second information is identical to the earliest-arriving second information, discarding the later-arriving second information;
If a later-arriving second information differs from the earliest-arriving second information, notifying the service node corresponding to the later-arriving second information to shut down.
In an exemplary embodiment of the disclosure, sending the first information from the client to the N peer service nodes includes: sending the first information to all backup management nodes; and sending one of the M pieces of second information to the client further includes: sending that one of the M pieces of second information to all backup management nodes.
In an exemplary embodiment of the disclosure, the method further includes:
In response to a service node registration request, determining the identifier of the new service node to be registered, where the service node registration request includes the executable file encryption value of the new service node;
If no registered service node corresponds to the identifier, starting the isolated node registration flow for the new service node;
If a registered service node corresponds to the identifier, comparing whether the executable file encryption value of the new service node is consistent with the executable file encryption value of the registered service node;
If they are consistent, starting the redundant node registration flow for the new service node;
If they are inconsistent, rejecting the service node registration request.
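The registration decision above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the `registry` dictionary, the function names, and the use of a SHA-256 digest as the "executable file encryption value" are all hypothetical choices made for the example.

```python
import hashlib
from typing import Dict


def executable_digest(executable_bytes: bytes) -> str:
    """Assumed stand-in for the 'executable file encryption value' (SHA-256 hex digest)."""
    return hashlib.sha256(executable_bytes).hexdigest()


def classify_registration(registry: Dict[str, str], node_id: str, digest: str) -> str:
    """Decide which flow a registration request enters.

    registry is a hypothetical map from a service node identifier to the
    digest of the node(s) already registered under that identifier.
    """
    registered_digest = registry.get(node_id)
    if registered_digest is None:
        return "isolated"    # no registered node with this identifier
    if registered_digest == digest:
        return "redundant"   # same executable as the registered peer
    return "reject"          # same identifier but a different executable
```

Comparing the executable's digest is what lets the management node treat two nodes as peers: only nodes running the same program under the same identifier enter the redundant flow.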
In an exemplary embodiment of the disclosure, the isolated node registration flow includes:
Determining the absolute value x of the difference between the maximum executed instruction serial number n1 of the new service node and the maximum received instruction serial number n2 that the cache holds for the identifier of the new service node;
If x equals zero, registering the new service node;
If x is less than or equal to a first preset value, retrieving from the cache the x pieces of first information with serial numbers from n1 to n2 and sending them to the new service node, and registering the new service node after it has executed the x pieces of first information;
If x is greater than the first preset value, rejecting the service node registration request, where n1 and n2 are positive integers.
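A minimal Python sketch of the isolated node registration flow follows. Several details are assumptions rather than statements from the text: the cache is modelled as a dictionary keyed by serial number, the replayed range is taken to be n1+1 through n2 (which yields exactly x messages when n1 ≤ n2), and the case where the node is ahead of the cache is treated as an error because the text does not describe it.

```python
from typing import Dict, List, Tuple


def isolated_registration(n1: int, n2: int, cache: Dict[int, bytes],
                          first_preset: int) -> Tuple[str, List[bytes]]:
    """Return the action to take and the first information to replay, if any."""
    x = abs(n1 - n2)
    if x == 0:
        return "register", []
    if x <= first_preset:
        if n1 > n2:
            # Not covered by the text; rejected here as an assumption.
            raise ValueError("node is ahead of the cache")
        # Replay the x messages the node has not executed yet.
        replay = [cache[seq] for seq in range(n1 + 1, n2 + 1)]
        return "replay_then_register", replay
    return "reject", []
```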
In an exemplary embodiment of the disclosure, the redundant node registration flow includes:
Determining the absolute value y of the difference between the maximum executed instruction serial number n1 of the new service node and the maximum executed instruction serial number n3 of the registered service node;
If y equals zero, registering the new service node;
If y is less than or equal to a second preset value, retrieving from the cache the y pieces of first information with serial numbers from n1 to n3 and sending them to the new service node, and registering the new service node after it has executed the y pieces of first information;
If y is greater than the second preset value, copying state information from the registered service node and sending it to the new service node, and then registering the new service node, where n1 and n3 are positive integers.
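The redundant node registration flow can be illustrated with a small Python simulation. The `Node` class, the key/value state, and the cache layout are hypothetical; the sketch also assumes that a new node which is ahead of the registered node falls back to the state copy, a case the text does not spell out.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Node:
    executed: int = 0                          # maximum executed instruction serial number
    state: Dict[str, int] = field(default_factory=dict)

    def execute(self, seq: int, key: str, value: int) -> None:
        self.state[key] = value
        self.executed = seq


def redundant_registration(new: Node, registered: Node,
                           cache: Dict[int, Tuple[str, int]],
                           second_preset: int) -> str:
    y = abs(new.executed - registered.executed)
    if y == 0:
        return "registered"
    if y <= second_preset and new.executed < registered.executed:
        # Catch up by replaying the cached first information.
        for seq in range(new.executed + 1, registered.executed + 1):
            key, value = cache[seq]
            new.execute(seq, key, value)
        return "registered after replay"
    # Gap too large (or node ahead, by assumption): copy the peer's state.
    new.state = copy.deepcopy(registered.state)
    new.executed = registered.executed
    return "registered after state copy"
```

Replay is cheap when the gap is small; copying the whole state avoids replaying a long instruction history when the gap is large.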
In an exemplary embodiment of the disclosure, registering the new service node includes:
Issuing initialization information to the new service node, where the initialization information includes a random seed and self-driving logic, and the self-driving logic includes the timestamp of the management node.
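One way to read this step is that every peer receives the same seed and the same management-node timestamp so that their behaviour stays deterministic and identical. The Python sketch below illustrates that reading only; the `InitInfo` and `PeerNode` names, the dice-roll method, and the interpretation of the timestamp as the node's clock source are assumptions, not details from the text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class InitInfo:
    random_seed: int          # issued by the management node
    manager_timestamp: float  # management node's clock, used instead of local time


class PeerNode:
    def __init__(self, init: InitInfo) -> None:
        # A private generator seeded from the issued seed, so that every
        # peer produces the same pseudo-random sequence.
        self._rng = random.Random(init.random_seed)
        self.now = init.manager_timestamp

    def roll(self) -> int:
        return self._rng.randint(1, 6)
```

Two peers built from the same `InitInfo` return the same sequence of `roll()` values, which is what allows their outputs to be compared byte for byte.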
In an exemplary embodiment of the disclosure, sending the first information from the client to the N peer service nodes includes:
Determining the identifier of the service node corresponding to the first information;
Sending the first information to the N peer service nodes according to the identifier.
In an exemplary embodiment of the disclosure, the second information includes the serial number of the second information sent by the service node, and receiving the M pieces of second information from the N peer service nodes includes:
Among multiple pieces of information from the service nodes, determining the second information corresponding to the first information according to the serial number of the second information sent by the service node.
In an exemplary embodiment of the disclosure, the N peer service nodes use shared memory, and each service node has an independent storage space in the shared memory.
In an exemplary embodiment of the disclosure, when the new service node is a service node restarted after being shut down, the restarted new service node continues to use the storage space in the shared memory that it used before being shut down.
According to another aspect of the disclosure, a disaster recovery device is provided, comprising:
An information distribution module, configured to send first information from a client to N peer service nodes, where N is an integer greater than or equal to 2;
An information receiving module, configured to receive M pieces of second information from the N peer service nodes, where M is a positive integer less than or equal to N;
An information sending module, configured to send one of the M pieces of second information to the client.
According to another aspect of the disclosure, a disaster recovery system is provided, comprising:
At least one client;
A server cluster coupled to the client and provided with a service cluster, where the service cluster includes a main management node and one or more backup management nodes, a backup management node serves as a candidate for the new main management node when the main management node fails, and the main management node executes the disaster recovery method according to any one of the above.
According to yet another aspect of the disclosure, electronic equipment is provided, comprising:
A memory; and
A processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the disaster recovery method according to any one of the above.
By deploying multiple peer service nodes, the embodiments of the disclosure distribute the first information to be processed to the multiple peer service nodes simultaneously and determine the second information to return according to the processing results of the service nodes. When one or more service nodes fail, the normal service nodes keep business processing going, which eliminates the service unavailability that exists in the related main/backup switching disaster recovery methods.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the disclosure.
Brief description of the drawings
The drawings here are incorporated into and form part of this specification. They show embodiments consistent with the disclosure and, together with the specification, serve to explain the principles of the disclosure. The drawings described below are evidently only some embodiments of the disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the disaster recovery system in an exemplary embodiment of the disclosure.
Fig. 2 is a flow chart of the disaster recovery method in an exemplary embodiment of the disclosure.
Fig. 3 is a flow chart of determining the feedback information for the first information in an embodiment of the disclosure.
Fig. 4 is a schematic diagram of implementing the disaster recovery method through a service cluster in an embodiment of the disclosure.
Fig. 5 is a schematic diagram of a service cluster implemented on the Paxos protocol in an embodiment of the disclosure.
Fig. 6 is a flow chart of service node registration in an embodiment of the disclosure.
Fig. 7A is a flow chart of the isolated node registration flow for a stateful service node in an embodiment of the disclosure.
Fig. 7B is a flow chart of the redundant node registration flow for a stateful service node in an embodiment of the disclosure.
Fig. 8 is a schematic diagram of the interaction during service node registration in an embodiment of the disclosure.
Fig. 9 is a schematic diagram of node registration in an embodiment of the disclosure.
Fig. 10 is a schematic diagram of initializing a service node in an embodiment of the disclosure.
Fig. 11 is a schematic diagram of the main management node (Master Gdriver) managing upstream data packets in an embodiment of the disclosure.
Fig. 12 is a schematic diagram of the main management node (Master Gdriver) managing downstream data packets in an embodiment of the disclosure.
Fig. 13 is a schematic diagram of an application scenario of the disclosure.
Fig. 14 is a block diagram of a disaster recovery device in an exemplary embodiment of the disclosure.
Fig. 15 is a block diagram of electronic equipment in an exemplary embodiment of the disclosure.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the disclosure. Those skilled in the art will appreciate, however, that the technical solutions of the disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, devices, steps and so on. In other cases, well-known solutions are not shown or described in detail, to avoid distracting from and obscuring aspects of the disclosure.
In addition, the drawings are only schematic illustrations of the disclosure. The same reference numerals in the drawings denote the same or similar parts, so repeated description of them is omitted. Some of the block diagrams shown in the drawings are functional entities that do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1 is a schematic diagram of disaster recovery system 100 in an embodiment of the disclosure. The disaster recovery method or disaster recovery device of the embodiments of the disclosure can be applied to disaster recovery system 100.
With reference to Fig. 1, disaster recovery system 100 may include:
At least one client A;
Server cluster B, coupled to client A and provided with a service cluster, where the service cluster includes a main management node and one or more backup management nodes, a backup management node serves as a candidate for the new main management node when the main management node fails, and the main management node executes the disaster recovery method of the following embodiments.
As shown in Fig. 1, client A may be, for example, any of various electronic devices with a display screen, including but not limited to smartphones, tablet computers, portable computers and desktop computers. Client A and server cluster B may be coupled through a network that provides a communication link, so that client A can send first information to server cluster B or receive second information sent by server cluster B. The network may include various connection types, such as wired or wireless communication links or fiber-optic cables.
It should be understood that the numbers of clients A, networks and server clusters B in Fig. 1 are only illustrative. There may be any number of terminals, networks and servers as the implementation requires.
Multiple peer service nodes may be deployed in server cluster B. These service nodes may be, for example, processes of an application program, and the multiple service nodes are deployed on one or more servers.
The disaster recovery method provided by the embodiments of the disclosure may be executed by server cluster B; correspondingly, the disaster recovery device may be deployed in server cluster B.
Example embodiments of the disclosure are described in detail below with reference to the drawings.
Fig. 2 is a flow chart of the disaster recovery method in an exemplary embodiment of the disclosure. With reference to Fig. 2, disaster recovery method 200 may include:
Step S1: sending first information from a client to N peer service nodes, where N is an integer greater than or equal to 2.
Step S2: receiving M pieces of second information from the N peer service nodes, where M is a positive integer less than or equal to N.
Step S3: sending one of the M pieces of second information to the client.
In the embodiments of the disclosure, the first information and the second information may take the form of, for example, messages or data packets. When they are data packets, the first information may be called an upstream data packet and the second information a downstream data packet.
By deploying multiple peer service nodes, the embodiments of the disclosure distribute the information to be processed to the multiple peer service nodes simultaneously and determine the information to send to the client from the redundant information sent by those nodes. When some service nodes fail, the remaining normal service nodes keep business processing going, which eliminates the service unavailability that exists in the related main/backup switching disaster recovery methods. Moreover, because only the input and output of information are managed, the intrusion into business code is small. And because the multiple service nodes process identical information, business processing and state updates continue when one or more service nodes fail, which avoids system unavailability caused by service node failure and loss of service state.
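Steps S1 to S3 can be sketched in a few lines of Python. This is a simplified, synchronous illustration under stated assumptions: each peer service node is modelled as a plain function (the hypothetical `ServiceNode` type) that either returns second information or raises an exception when it has failed, and "earliest arrival" is approximated by list order.

```python
from typing import Callable, List, Optional

# Hypothetical model: a peer service node maps first information to second information.
ServiceNode = Callable[[bytes], bytes]


def dispatch(first_info: bytes, peers: List[ServiceNode]) -> Optional[bytes]:
    """Fan the first information out to N peers and return one reply."""
    if len(peers) < 2:
        raise ValueError("N must be at least 2")
    replies: List[bytes] = []
    for node in peers:                  # S1: send to every peer
        try:
            replies.append(node(first_info))   # S2: collect M <= N replies
        except Exception:
            continue                    # a failed node simply contributes no reply
    return replies[0] if replies else None     # S3: forward one reply
```

As long as at least one peer answers, the client still receives a reply, which is the property the paragraph above describes.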
Each step of the embodiment shown in Fig. 2 is described in detail below.
In step S1, the first information from the client is sent to N peer service nodes, where N is an integer greater than or equal to 2.
In the embodiments of the disclosure, peer service nodes are service nodes that have the same execution logic and the same initialization conditions. In some embodiments, the N peer service nodes may be deployed on one or more servers.
When a service node is a process of an application program, starting an application program starts a corresponding service node. At any given time, the server may host multiple service nodes corresponding to multiple application programs. In that case, the identifier of the service node corresponding to the first information (for example the number or name of the application program) can be determined first, and the first information is then sent to the N peer service nodes corresponding to that identifier.
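Routing by identifier can be pictured as a lookup from a service identifier to its group of peers. In the Python sketch below, the `PeerRegistry` class, the string addresses, and the rule that a lookup fails when fewer than two peers are registered are assumptions made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List


class PeerRegistry:
    """Hypothetical map from a service identifier to its peer node addresses."""

    def __init__(self) -> None:
        self._groups: DefaultDict[str, List[str]] = defaultdict(list)

    def register(self, service_id: str, address: str) -> None:
        self._groups[service_id].append(address)

    def targets(self, service_id: str) -> List[str]:
        """Return the N peers that should receive first information for service_id."""
        peers = self._groups.get(service_id, [])
        if len(peers) < 2:
            raise LookupError(f"fewer than 2 peers registered for {service_id!r}")
        return list(peers)
```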
In step S2, M pieces of second information are received from the N peer service nodes, where M is a positive integer less than or equal to N.
In the embodiments of the disclosure, the second information may be either information that a service node actively sends to the client or feedback information that the service node sends to the client in response to the first information; the disclosure does not limit this.
In an exemplary embodiment, the second information includes the identifier of the service node that sent it and the serial number of the second information sent by that service node. Among multiple pieces of information from the service nodes, the second information corresponding to the first information can be determined according to the serial number of the second information sent by the service node.
For example, the pieces of second information sent by service nodes 1, 2, 3, ..., N carry the serial numbers 99, 99, 99, ..., 99 respectively, where 99 indicates that the second information is the 99th piece of information sent by service node 1, and likewise for the other service nodes. These values are only examples and do not limit the disclosure.
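Because every peer numbers its outgoing messages in the same way, replies that belong together can be found by grouping on the serial number. The following Python sketch illustrates this; the `SecondInfo` record and its field names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SecondInfo:
    node_id: str    # identifier of the sending service node
    seq: int        # serial number of this message at the sender
    payload: bytes


def group_by_seq(messages: Iterable[SecondInfo]) -> Dict[int, List[SecondInfo]]:
    """Collect the redundant copies of each message, keyed by serial number."""
    groups: Dict[int, List[SecondInfo]] = defaultdict(list)
    for msg in messages:
        groups[msg.seq].append(msg)
    return dict(groups)
```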
In step S3, one of the M pieces of second information is sent to the client.
In the embodiments of the disclosure, the second information that arrives earliest may be sent to the client. If a later-arriving second information is identical to the earliest-arriving second information, the later-arriving second information is discarded; otherwise, the service node corresponding to the later-arriving second information is notified to shut down.
Because the service nodes are peers, all second information (downstream data packets) should be identical if every service node is running normally; in that case the redundant copies are discarded and the first second information to arrive is sent to the client. If some service nodes fail, for example by sending erroneous second information or by sending none, the first second information to arrive is used as the reference against which later-arriving second information is checked. Whenever a later-arriving second information differs from the earliest-arriving second information, it is judged to be erroneous, and the service node corresponding to it is notified to shut down.
Fig. 3 is a flow chart of determining the feedback information for the first information in an embodiment of the disclosure.
With reference to Fig. 3, step S3 may include:
Step S31: sending the second information with the earliest arrival time to the client.
Step S32: judging whether a later-arriving second information is identical to the earliest-arriving second information; if so, going to step S33; otherwise, going to step S34.
Step S33: discarding the later-arriving second information.
Step S34: notifying the service node corresponding to the later-arriving second information to shut down.
The service node corresponding to the later-arriving second information can be notified to shut down according to the service node identifier included in the second information.
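Steps S31 to S34 amount to a small filter that remembers the first reply seen for each serial number. The Python sketch below is one possible shape for it; the `ReplyFilter` class and its two callbacks are hypothetical, and the order in which `on_second_info` is called stands in for arrival time.

```python
from typing import Callable, Dict


class ReplyFilter:
    def __init__(self, send_to_client: Callable[[bytes], None],
                 close_node: Callable[[str], None]) -> None:
        self._send = send_to_client
        self._close = close_node
        self._first: Dict[int, bytes] = {}   # serial number -> earliest payload

    def on_second_info(self, node_id: str, seq: int, payload: bytes) -> str:
        if seq not in self._first:
            self._first[seq] = payload
            self._send(payload)              # S31: forward the earliest reply
            return "forwarded"
        if payload == self._first[seq]:      # S32: compare with the earliest reply
            return "discarded"               # S33: redundant copy
        self._close(node_id)                 # S34: mismatching node is told to close
        return "closed"
```

Note that this scheme trusts the earliest reply as the reference, exactly as the text describes; it does not take a majority vote among the peers.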
If the second information is a service node's feedback on the first information, then because the service nodes are peers and receive identical input, the second information output by each peer service node should be identical. If the second information is information that a service node actively sends to the client, the second information output by each peer service node should likewise be identical.
In some cases, however, a service node failure may cause errors in receiving, processing or sending information, so that the second information sent by the failed service node differs from the second information sent by the normal service nodes. The failed service node then needs to be identified from its erroneous second information and notified to shut down.
Notifying the failed service node to shut down can improve the efficiency of identifying the second information the next time. The notification may, for example, tell the service node to go offline or remove the registration information of the service node. In some other embodiments, the service node corresponding to the erroneous information may instead be notified to restart; those skilled in the art can configure this according to the actual situation.
In the method provided by the above embodiments, the second information is redundant, so the normal delivery of feedback information is unaffected unless all service nodes fail. Even if the service nodes are stateful, peer service nodes receive the same input, so their state changes and final states are also the same; the failure of some service nodes does not cause the state of the other service nodes to be lost or to fall out of sync. Therefore, compared with the related art, where switching between main and backup service nodes makes the service unavailable or the state of a service node cannot be synchronized accurately when it fails, the technical solution provided by the embodiments of the disclosure has higher reliability and a better user experience.
In an embodiment of the disclosure, disaster recovery method 200 may be executed by a service cluster deployed in a server.
Fig. 4 is a schematic diagram of implementing disaster recovery method 200 through service cluster 400 in an embodiment of the disclosure.
With reference to Fig. 4, service cluster 400 includes one main management node 41 and one or more backup management nodes 42. Main management node 41 executes disaster recovery method 200, and a backup management node 42 serves as a candidate for the new main management node when main management node 41 fails.
Service cluster 400 may be deployed on one or more servers, and the management nodes may be placed on the same or different servers. When the management nodes are on different servers, service cluster 400 may set up communication channels between the servers to synchronize data among the management nodes.
In operation, service cluster 400 first selects one of the management nodes as main management node 41 and sets the other management nodes as backup management nodes 42. So that a backup management node 42 can take over the work and synchronize data promptly when main management node 41 fails, main management node 41 may, upon receiving information sent by client 1 or service node 2, synchronize it to all backup management nodes through the service cluster protocol.
As shown in Fig. 4, client 1 and service node 2 interact only with main management node 41. Main management node 41 outputs information to client 1 and service node 2 and receives the information they send. At the same time, main management node 41 submits the input and output information of client 1 and service node 2 to service cluster 400 through the service cluster protocol; for example, it may submit that information to service cluster 400 through the Paxos protocol so that the first information or the second information is synchronized to all backup management nodes. In some embodiments, when main management node 41 receives first information, it may first synchronize the first information to all backup management nodes and, after receiving their responses, distribute the first information to the N peer service nodes 2. When main management node 41 receives second information, it may first synchronize the second information to all backup management nodes and, after receiving their responses, send the second information to client 1. Alternatively, sending data to the client or service nodes and synchronizing data to the backup management nodes may be carried out at the same time.
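The "synchronize first, then forward" ordering described above can be sketched as follows. This Python example is not an implementation of Paxos: replication is reduced to appending to an in-memory list and returning an acknowledgement, and the `BackupManager` and `MainManager` classes are hypothetical names used only to show the ordering.

```python
from typing import Callable, List


class BackupManager:
    def __init__(self) -> None:
        self.log: List[bytes] = []

    def replicate(self, entry: bytes) -> bool:
        """Store the entry and acknowledge it (stand-in for the cluster protocol)."""
        self.log.append(entry)
        return True


class MainManager:
    def __init__(self, backups: List[BackupManager],
                 deliver: Callable[[bytes], None]) -> None:
        self._backups = backups
        self._deliver = deliver   # sends to the peer service nodes or to the client

    def handle(self, entry: bytes) -> None:
        # Synchronize to every backup before forwarding the information.
        acks = [backup.replicate(entry) for backup in self._backups]
        if not all(acks):
            raise RuntimeError("replication not acknowledged by all backups")
        self._deliver(entry)
```

Forwarding only after acknowledgement means a backup that takes over already holds every message the failed main node forwarded; the alternative mentioned in the text, doing both at the same time, trades that guarantee for lower latency.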
The service cluster 400 monitors the working state of the main management node 41 in real time. When the main management node 41 fails, the service cluster 400 immediately elects a new main management node from among the backup management nodes 42 and automatically switches the routing address of the client 1 and the service node 2 to the new main management node.
Since the service cluster 400 itself also supports disaster tolerance, it further guarantees the reliability of the system. In addition, adding the service cluster 400 only requires modifying the information sending and receiving logic, so the amount of code change is small and the intrusion into the original service logic is minor, which effectively avoids problems in the related art such as large code changes and an increased risk of system failure.
In some embodiments, the service cluster 400 can be implemented based on the Paxos protocol, and each management node can be implemented, for example, by a Gdriver node.
Fig. 5 is a schematic diagram of a service cluster implemented based on the Paxos protocol in one embodiment.
With reference to Fig. 5, the Paxos cluster 500 elects one Gdriver node as the main management node 51 (Master Gdriver), and the routing addresses of the clients and service processes are set to the address of the main management node (Master Gdriver). The main management node 51 receives the first information from the client 1 (a Log, which is both the Paxos binLog and the service message) and distributes it to the multiple peer service nodes 20, and sends one of the multiple pieces of second information (Log) sent by the service nodes 20 to the client 10. The main management node 51 and the backup management nodes 52 synchronize data through the Paxos protocol: the main management node 51 synchronizes the first information received from the client 10, and the one piece of second information received from the service nodes 20, to all backup management nodes 52.
Each management node has a local Log pool in which message sequence numbers strictly increase, and the Paxos cluster guarantees that the Log pools of all management nodes are fully consistent.
To ensure that the multiple peer service nodes are identical, in the embodiments of the present disclosure the service cluster is also used to manage the synchronization, registration, and driving of each service node.
After a service node starts, it can register with the service cluster, so that when the service cluster receives first information corresponding to the identifier of that service node, it can send the first information to all registered service nodes. To ensure that the registered service nodes corresponding to the same identifier are fully consistent, the service node can be verified during its registration.
Fig. 6 is a flowchart of service node registration in an embodiment of the present disclosure.
With reference to Fig. 6, the registration process of a service node may include:
Step S61: in response to a service node registration request, determine the identifier of the new service node to be registered, where the service node registration request includes the executable file encryption value of the new service node;
Step S62: determine whether a registered service node corresponding to the identifier exists; if not, proceed to step S63 and start the isolated node registration flow for the new service node; otherwise proceed to step S64;
Step S64: compare whether the executable file encryption value of the new service node is consistent with that of the registered service node; if consistent, proceed to step S65; otherwise proceed to step S66;
Step S65: start the redundant node registration flow for the new service node;
Step S66: reject the service node registration request.
The executable file encryption value can be, for example, an MD5 (Message-Digest Algorithm) value; the present disclosure places no particular limitation on this.
Executing the flow shown in Fig. 6 rejects every node whose executable file differs from that of the registered nodes, which preliminarily ensures the consistency of the new service node and the registered service nodes. When the service node is stateless, the isolated node registration flow in step S63 and the redundant node registration flow in step S65 can simply register the new service node directly; when the service node is stateful, however, the state of the new service node also needs to be adjusted.
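The decision made in steps S61 to S66 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names executable_digest and decide_registration, the dictionary that maps an identifier to the digest of its registered nodes, and the returned labels are hypothetical and do not appear in the disclosure. The digest uses MD5 from the Python standard library, matching the example value mentioned above.

```python
import hashlib


def executable_digest(executable_bytes: bytes) -> str:
    """Compute the MD5 value of an executable file's contents."""
    return hashlib.md5(executable_bytes).hexdigest()


def decide_registration(identifier: str, digest: str, registered: dict) -> str:
    """Return which flow applies to a registration request.

    `registered` maps an identifier to the digest shared by the service
    nodes already registered under that identifier (assumed structure).
    """
    if identifier not in registered:       # step S62 -> step S63
        return "isolated"
    if registered[identifier] == digest:   # step S64 -> step S65
        return "redundant"
    return "reject"                        # step S66


registered = {"battle": executable_digest(b"build-v1")}
flow = decide_registration("battle", executable_digest(b"build-v1"), registered)
```

Comparing digests rather than whole files keeps the registration request small while still rejecting a node built from a different executable.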
Fig. 7A is a flowchart of the isolated node registration flow when the service node is stateful.
With reference to Fig. 7A, when the service node is stateful, the isolated node registration flow may include:
Step S71: determine the absolute value x of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum received instruction sequence number n2 in the cache corresponding to the identifier of the new service node;
Step S72: if x is equal to zero, register the new service node;
Step S73: if x is less than or equal to a first preset value, fetch from the cache the x pieces of first information with sequence numbers from n1 to n2 and send them to the new service node, and register the new service node after it has executed the x pieces of first information;
Step S74: if x is greater than the first preset value, reject the service node registration request.
Whether the new service node is cold-started (its executed instruction record is zero) or hot-started (restarted after being shut down, loading the executed instruction record from before the shutdown), before the new service node is registered its state must be adjusted according to the first information (instructions) that the main management node has received for the new service node. This ensures that, after successful registration, the new service node is in a normal state and can give correct feedback to new first information.
To make it easier to manage and record the state of each service node before a restart, an embodiment of the present disclosure can provide a shared memory used jointly by multiple service nodes, with an independent storage space set up for each service node in the shared memory. Thus, when a service node restarts, it can continue to use the storage space in the shared memory that it used before it was shut down, which helps the restarted service node restore its state promptly.
The main management node numbers the first information it receives and stores it in a cache (the Log cache pool). The service node numbers the first information it has processed and stores it in the shared memory; on restart it reads the maximum executed instruction sequence number and restores its state to that before the shutdown according to the records in the shared memory.
If the maximum executed instruction sequence number of the new service node is the same as the maximum received instruction sequence number in the cache, the main management node did not receive any new first information while the new service node was being shut down and restarted or before the new service node was cold-started, and the new service node can be registered directly.
If the maximum executed instruction sequence number of the new service node differs only slightly from the maximum received instruction sequence number in the cache (x is less than or equal to the first preset value), the main management node received a small amount of new first information while the new service node was being shut down and restarted or before it was cold-started. These x pieces of new first information can then be read from the cache and sent to the new service node in order; the new service node reaches the latest state after processing them in order and is then registered. The first preset value can be set by those skilled in the art and can be, for example, the maximum number of pieces of first information that the cache can store.
If the maximum executed instruction sequence number of the new service node differs greatly from the maximum received instruction sequence number in the cache (x is greater than the first preset value), the main management node received a large amount of new first information while the new service node was being shut down and restarted or before it was cold-started. Because the cache capacity is limited, not all of this new first information can be obtained, so the state of the new service node cannot be restored to the latest state, and the registration of the new service node can only be rejected.
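The three outcomes of steps S71 to S74 can be sketched as below. This is only an illustration: the function isolated_registration, the dictionary used as the cache, and the returned labels are assumptions, and the sketch further assumes that n1 does not exceed n2 and that the pieces of first information to replay are those numbered n1 + 1 through n2.

```python
def isolated_registration(n1: int, n2: int, cache: dict, first_preset: int):
    """Decide how to register a new service node that has no registered peer.

    n1: maximum executed instruction sequence number of the new node.
    n2: maximum received instruction sequence number in the cache.
    cache: maps a sequence number to its first information (assumed layout).
    Returns (decision, list of first information to replay in order).
    """
    x = abs(n2 - n1)                       # step S71
    if x == 0:                             # step S72
        return "register", []
    if x <= first_preset:                  # step S73
        replay = [cache[seq] for seq in range(n1 + 1, n2 + 1)]
        return "register", replay
    return "reject", []                    # step S74


cache = {5: "e", 6: "f", 7: "g"}
decision, replay = isolated_registration(n1=5, n2=7, cache=cache, first_preset=3)
```

Choosing the first preset value no larger than the cache capacity, as the text suggests, is what guarantees in this sketch that every sequence number in the replay range is still present in the cache.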
Fig. 7B is a flowchart of the redundant node registration flow when the service node is stateful.
With reference to Fig. 7B, when the service node is stateful, the redundant node registration flow may include:
Step S75: determine the absolute value y of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum executed instruction sequence number n3 of the registered service node;
Step S76: if y is equal to zero, register the new service node;
Step S77: if y is less than or equal to a second preset value, fetch from the cache the y pieces of first information with sequence numbers from n1 to n3 and send them to the new service node, and register the new service node after it has executed the y pieces of first information;
Step S78: if y is greater than the second preset value, copy state information from the registered service node, send it to the new service node, and then register the new service node.
When the registered service nodes have already processed some first information, the state of each registered service node has changed. The new service node must then execute the same first information that the registered service nodes have processed, so that its state is brought in line with the registered service nodes before it is registered; only then do all registered service nodes remain identical.
In some cases, the new service node is a service node restarted after being shut down; it can then read data such as its node state from before the shutdown and restore itself to that state according to the data. In other cases, the new service node can be cold-started, with no information execution record (that is, n1 = 0).
When calculating the absolute value y of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum executed instruction sequence number n3 of the registered service node, if the other peer service nodes did not process any new first information while the new service node was being shut down and restarted or cold-started, n1 may be the same as n3, that is, y = 0. In this case the new service node can be registered directly.
In other cases, if the other peer service nodes processed only a small amount of new first information while the new service node was being shut down and restarted or cold-started, for example no more than the second preset value, these y pieces of new first information can be read from the cache and sent to the new service node in order, so that after processing them in order the new service node reaches a state consistent with the registered nodes; the new service node is then registered.
Because the amount of data that the cache can store is small, when the other peer service nodes processed a large amount of new first information while the new service node was being shut down and restarted or cold-started, for example more than the second preset value, state information can be copied from a registered service node into the new service node to quickly synchronize the state of the new service node with that of the registered service node.
In some embodiments, the second preset value can be the same as or different from the first preset value. The second preset value can be set by those skilled in the art. For example, it can be the maximum number of pieces of first information that the cache can store, or it can be determined from that maximum together with the critical value between the time T1 needed to read first information from the cache and synchronize it to the new service node and the time T2 needed to copy state information from a registered service node to the new service node.
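The choice between replaying cached first information and copying state, as in steps S75 to S78, can be sketched as follows. Everything here is a toy model chosen for illustration: the node dictionaries with "seq" and "state" keys, the apply_info transition that adds a number to a counter, and the returned labels are assumptions, not details from the disclosure.

```python
import copy


def apply_info(state: dict, info: int) -> None:
    """Toy deterministic state transition: add a delta to a counter."""
    state["total"] += info


def redundant_registration(new_node, registered_node, cache, second_preset):
    """Bring new_node in line with registered_node and report how.

    Each node is a dict {"seq": max executed sequence number, "state": {...}}.
    cache maps a sequence number to its first information (assumed layout).
    """
    n1, n3 = new_node["seq"], registered_node["seq"]
    y = abs(n3 - n1)                                   # step S75
    if y == 0:                                         # step S76
        return "register"
    if y <= second_preset:                             # step S77
        for seq in range(n1 + 1, n3 + 1):
            apply_info(new_node["state"], cache[seq])
        new_node["seq"] = n3
        return "register-after-replay"
    # step S78: too far behind, copy the state of a registered node
    new_node["state"] = copy.deepcopy(registered_node["state"])
    new_node["seq"] = n3
    return "register-after-copy"


registered = {"seq": 4, "state": {"total": 10}}
cache = {1: 1, 2: 2, 3: 3, 4: 4}
restarted = {"seq": 2, "state": {"total": 3}}
outcome = redundant_registration(restarted, registered, cache, second_preset=2)
```

Either path leaves the new node with the same state and sequence number as the registered node; the second preset value only decides which path is expected to be cheaper, which matches the T1 and T2 comparison described above.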
Fig. 8 is a schematic diagram of the interaction during service node registration in an embodiment of the present disclosure.
With reference to Fig. 8, during service node registration the new service node first sends a service node registration request to the main management node. The service node registration request includes the identifier (ID, Identity document) of the new service node and its executable file encryption value (MD5).
The main management node determines whether the ID corresponds to a registered node. If not, it determines whether the absolute value x of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum received instruction sequence number n2 in the cache corresponding to the identifier of the new service node is equal to zero. If x is equal to zero, it registers the new service node and sends a service node registration message to the new service node. If x is not equal to zero, it determines whether x is less than or equal to the first preset value. If not, it sends a service node registration rejection message to the service node. If so, it reads the most recent x pieces of first information from the cache and sends them to the new service node; after the new service node has executed the first information and sent a message indicating that execution of the first information is complete, the main management node registers the new service node and sends a service node registration message to the new service node.
If the ID corresponds to a registered node, the main management node determines whether the MD5 values of the new service node and the registered service node are consistent. If they are inconsistent, it sends a service node registration rejection message to the new service node. If they are consistent, it determines whether the absolute value y of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum executed instruction sequence number n3 of the registered service node is equal to zero. If y is equal to zero, it registers the new service node and sends a service node registration message to the new service node. If y is not equal to zero, it determines whether y is less than or equal to the second preset value. If y is less than or equal to the second preset value, it reads the most recent y pieces of first information from the cache and sends them to the new service node; after the new service node has executed the first information and sent a message indicating that execution of the first information is complete, the main management node registers the new service node and sends a service node registration message to the new service node. If y is greater than the second preset value, the main management node sends a state information copy instruction to the registered service node; the registered service node copies its state information and sends it to the new service node; the new service node fills in the state information and sends feedback that the state information has been filled in; the main management node then registers the new service node and sends a service node registration message to the new service node.
Fig. 9 is a schematic diagram of node registration in an embodiment of the present disclosure.
With reference to Fig. 9, during node registration the service node 92 can transmit its executable file encryption value (MD5 value) and its maximum executed instruction sequence number to the main management node 91 (Gdriver), so that the main management node 91 can check whether the service node is identical to the registered nodes.
After deciding to register the new service node, in order to keep the peer service nodes with the same identifier fully consistent, the service cluster can issue initialization information to the new service node. The initialization information includes a random seed and self-driving logic, and the self-driving logic includes the timestamp of the management node.
For example, a service node (Application Node) internally has some self-driven logic (Tick operations) and some time-based decision logic. To keep the state of every service node consistent, the time and the Tick logic of every service node need to be unified. In one embodiment of the present disclosure, both are placed under the unified management of the service cluster: the main management node (Master Gdriver) drives the Tick operations of the service nodes and issues the timestamps.
Figure 10 is a schematic diagram of initializing a service node in an embodiment of the present disclosure.
With reference to Figure 10, the main management node 101 (Master Gdriver) sends Tick requests to the service node 102 through the uplink Log ring buffer pool, and each Tick request contains the timestamp of the service cluster. The interval between Tick requests can be configured, for example, as one request every 10 ms. The service node 102 uses the time sent by the main management node 101 (Master Gdriver) in place of the system time functions (hooking time/gettimeofday); the "management set time" in the figure is the time (accurate to the millisecond) issued by the main management node 101.
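The idea of replacing local system time with the timestamps carried by Tick requests can be sketched in Python. The class ServiceNodeClock and the helper make_ticks are hypothetical names; the 10 ms interval is the example value from the text, and the sketch does not model the actual hooking of time/gettimeofday.

```python
class ServiceNodeClock:
    """Clock that advances only when a Tick request arrives (assumed design)."""

    def __init__(self) -> None:
        self.now_ms = 0

    def on_tick(self, timestamp_ms: int) -> None:
        # The service node adopts the time issued by the main management node.
        self.now_ms = timestamp_ms

    def time_ms(self) -> int:
        # Stand-in for the hooked system time function.
        return self.now_ms


def make_ticks(start_ms: int, count: int, interval_ms: int = 10) -> list:
    """Timestamps the main management node would place in successive Ticks."""
    return [start_ms + i * interval_ms for i in range(count)]


clock_a, clock_b = ServiceNodeClock(), ServiceNodeClock()
for stamp in make_ticks(start_ms=1_000, count=5):
    clock_a.on_tick(stamp)
    clock_b.on_tick(stamp)
```

Because both clocks read only what the Tick requests carry, two peer service nodes that receive the same Tick sequence make the same time-based decisions regardless of their hosts' local clocks.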
After initialization is complete, all service nodes are identical. Combined with the above embodiments, each service node becomes an independent "state machine"; each state machine has the same initialization and the same execution logic, so as long as the inputs are the same, the responses and data will be the same, which achieves disaster tolerance for stateful services.
Figure 11 is a schematic diagram of the main management node (Master Gdriver) managing uplink data packets in an embodiment of the present disclosure.
With reference to Figure 11, the client 111 can keep a small local Log ring buffer pool (mainly used for retransmission, which is generally needed only when the management node is switched). After a Log generated by the client 111 (one piece of first information corresponds to one Log) is sent to the server, the main management node 112 (Master Gdriver) stamps it with an incrementing first sequence number. The main management node 112 can therefore record the maximum first sequence number of the information sent by each client 111 and detect packet loss by checking whether the first sequence number of a Log sent by a client is contiguous with the maximum first sequence number recorded for that client; if the first sequence numbers are not contiguous, it determines that a packet was lost and notifies the client to retransmit the Log. In some embodiments, the main management node can skip packets lost by a client, so the Log queue inside the client can be small and is not essential.
It should be noted that, in the cache of the main management node, each client has one first sequence number and the identifier of each service node has one maximum received instruction sequence number; the two sequence numbers correspond to different objects.
The main management node 112 (Master Gdriver) gathers the Logs sent by all clients into the uplink Log queue for sending and stamps each Log with a new incrementing second sequence number. The service node 113 records the maximum second sequence number of the Logs it has executed, checks the Logs sent by the main management node 112 for packet loss, and notifies the main management node 112 to retransmit if a packet is lost. The service node 113 has zero tolerance for packet loss and cannot skip any data packet. Because the most recent request packets of all clients must be cached, and the Log buffer pool of every management node matches that of the main management node in capacity and content, the Log buffer pool of each management node can be set relatively large.
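The contiguity check on sequence numbers, with and without tolerance for skipped packets, can be sketched as below. The class SequenceChecker, its tolerate_gaps flag, and the returned labels are assumptions introduced for illustration; the disclosure only states that the main management node may skip lost client packets while a service node may not skip any.

```python
class SequenceChecker:
    """Tracks the maximum sequence number seen and classifies arrivals."""

    def __init__(self, tolerate_gaps: bool) -> None:
        self.max_seen = 0
        self.tolerate_gaps = tolerate_gaps

    def accept(self, seq: int) -> str:
        expected = self.max_seen + 1
        if seq <= self.max_seen:
            return "duplicate"            # already handled, ignore
        if seq == expected:
            self.max_seen = seq
            return "ok"
        if self.tolerate_gaps:
            self.max_seen = seq           # lost packets are skipped
            return "skipped"
        return "retransmit"               # gap found, ask the sender to resend


# A service node cannot skip any packet; a lenient receiver can.
strict = SequenceChecker(tolerate_gaps=False)
lenient = SequenceChecker(tolerate_gaps=True)
strict_results = [strict.accept(s) for s in (1, 3, 2, 3)]
lenient_results = [lenient.accept(s) for s in (1, 3, 2)]
```

The strict checker does not advance past a gap, so the sender must keep enough recent Logs to retransmit, which is consistent with the larger Log buffer pool described for the management nodes.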
Figure 12 is a schematic diagram of the main management node (Master Gdriver) managing downlink data packets in an embodiment of the present disclosure.
With reference to Figure 12, each service node 121 keeps a local Log ring buffer pool (used for retransmitting downlink Logs; its capacity can be small and it is not essential). Each service node 121 locally stamps its downlink data packets (second information) with an incrementing third sequence number, and the main management node 122 (Master Gdriver) determines from identical third sequence numbers that the downlink data packets are responses to the same uplink data packet (first information) or active transmissions from service nodes 121 in the same state.
For multiple downlink data packets with the same third sequence number, the main management node 122 (Master Gdriver) can deliver only the first downlink data packet to arrive to the client 123 and compare each later-arriving downlink data packet with the first data packet of the same sequence number (for example, using the memcmp function). If a later-arriving downlink data packet is inconsistent with the first downlink data packet of the same sequence number, the state of the service node corresponding to the later downlink data packet is considered faulty, and that service node can be notified to go offline. The client 123 can locally record the maximum third sequence number of the downlink data packets it has received and detect packet loss by checking whether the third sequence numbers are contiguous; if a packet is lost, it notifies the main management node (Master Gdriver) to retransmit (packets can be skipped).
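The first-arrival forwarding and byte-for-byte comparison described above can be sketched as follows. The class DownlinkArbiter and its fields are hypothetical; equality of Python bytes objects stands in for the memcmp comparison mentioned in the text.

```python
class DownlinkArbiter:
    """Forwards the first packet per third sequence number and checks the rest."""

    def __init__(self) -> None:
        self.first_seen = {}   # third sequence number -> first payload
        self.offline = set()   # service nodes whose reply disagreed

    def on_packet(self, node_id: str, seq: int, payload: bytes):
        """Return the payload to forward to the client, or None."""
        if seq not in self.first_seen:
            self.first_seen[seq] = payload
            return payload                       # earliest arrival is forwarded
        if payload != self.first_seen[seq]:      # plays the role of memcmp
            self.offline.add(node_id)            # notify this node to go offline
        return None                              # later arrivals are not forwarded


arbiter = DownlinkArbiter()
forwarded = [
    arbiter.on_packet("node-a", 1, b"hp=90"),
    arbiter.on_packet("node-b", 1, b"hp=90"),
    arbiter.on_packet("node-c", 1, b"hp=80"),
]
```

Here the client receives exactly one reply per sequence number, and a node whose reply diverges from the first one is flagged, which mirrors the consistency check among peer service nodes.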
Figure 13 is a schematic diagram of an application scenario of the present disclosure.
With reference to Figure 13, in one application scenario the embodiments of the present disclosure can be applied to the service cluster 11 of a mobile game. The service cluster 11 is implemented by multiple servers 13 and connects to multiple clients 12 through the servers 13; the clients 12 can be mobile terminals such as mobile phones and tablet computers. In the embodiment shown in Fig. 12, each server 13 is provided, for example, with one management node of the service cluster 11 (the main management node 111 or a backup management node 112) and a service node 131.
The main management node 111 in the service cluster 11 receives the first information sent by a client 12 and distributes it to the multiple peer service nodes 131 located on the multiple servers 13. The main management node 111 sends one of the multiple pieces of second information sent by the multiple peer service nodes 131 to the client 12. Thus, when some servers or service nodes fail, the instructions of the client 12 can still be executed correctly and receive correct feedback, and the messages that the service nodes send to the client 12 can still be delivered normally.
In addition, when the main management node 111 receives the first information or the second information, it synchronizes that information to all backup management nodes 112, so that when the service cluster 11 detects a failure of the main management node 111, it can elect a new main management node from among the multiple backup management nodes 112 and the backup management node can enter the working state promptly. The dual redundancy of service nodes and management nodes improves the redundancy capability and increases the reliability of the system.
In the architecture design of mobile games, some processes in a game zone are often stateful, and their state cannot be removed. Throughout the operation of the whole system these stateful processes exist as single points; when one of them fails, part of the service becomes unavailable, and in serious cases the game zone may have to be rolled back.
When the disaster recovery method provided by the embodiments of the present disclosure is used to manage communication between clients and servers, the service cluster 11 distributes the messages sent by the client 12 to multiple fully equivalent active service nodes 131 and sends information to the client 12 according to the multiple pieces of information sent by these service nodes 131. Even if some service nodes or servers fail, the service cluster 11 can still select the correct feedback information from the feedback of the remaining active service nodes; the service does not become unavailable, the state of the service nodes is not lost, and players' normal game experience is not affected. Disaster tolerance for stateful service processes can thus be achieved without removing state, which improves the availability of the system. The method is also applicable to other businesses in which a stateful service is a single point.
In conclusion, the method provided by the embodiments of the present disclosure can achieve disaster tolerance for stateful services, solve the problem that a single point of failure in a stateful service causes a system failure, and improve the availability of the whole system.
Corresponding to the above method embodiments, the present disclosure also provides a disaster tolerance device that can be used to execute the above method embodiments.
Figure 14 is a block diagram of a disaster tolerance device in an exemplary embodiment of the present disclosure.
With reference to Figure 14, the disaster tolerance device 1400 provided by the embodiments of the present disclosure may include: an information distribution module 1402, which can be configured to send first information from a client to N peer service nodes, N being an integer greater than or equal to 2; an information receiving module 1404, which can be configured to receive M pieces of second information from the N peer service nodes, M being a positive integer less than or equal to N; and an information sending module 1406, which can be configured to send one of the M pieces of second information to the client.
In an exemplary embodiment, the disaster tolerance device 1400 is provided in a service cluster, the service cluster includes a main management node and one or more backup management nodes, the main management node is used to execute the disaster recovery method, and the backup management nodes serve as candidates for a new main management node when the main management node fails.
In an exemplary embodiment, the information distribution module 1402 is configured to: determine the identifier of the service node corresponding to the first information, and send the first information to the N peer service nodes according to the identifier.
In an exemplary embodiment, the second information includes the sequence number of the second information sent by the service node, and the information receiving module 1404 is configured to: among multiple pieces of information from the service nodes, determine the second information corresponding to the first information according to the sequence number of the second information sent by the service node.
In an exemplary embodiment, the information sending module 1406 is configured to: send the second information with the earliest arrival time to the client.
In an exemplary embodiment, the information sending module 1406 is configured to: if a later-arriving piece of second information is identical to the second information with the earliest arrival time, discard the later-arriving second information; otherwise, notify the service node corresponding to the later-arriving second information to shut down.
In an exemplary embodiment, the disaster tolerance device 1400 further includes a node registration module 1408, which is configured to: in response to a service node registration request, determine the identifier of the new service node to be registered, where the service node registration request includes the executable file encryption value of the new service node; if no registered service node corresponding to the identifier exists, start the isolated node registration flow for the new service node; otherwise, compare whether the executable file encryption value of the new service node is consistent with that of the registered service node; if consistent, start the redundant node registration flow for the new service node; if inconsistent, reject the service node registration request.
In an exemplary embodiment, the isolated node registration flow includes: determining the absolute value x of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum received instruction sequence number n2 in the cache corresponding to the identifier of the new service node; if x is equal to zero, registering the new service node; if x is less than or equal to a preset value, fetching from the cache the x pieces of first information with sequence numbers from n1 to n2, sending them to the new service node, and registering the new service node after it has executed the x pieces of first information; if x is greater than the preset value, rejecting the service node registration request.
In an exemplary embodiment, the redundant node registration flow includes: determining the absolute value y of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum executed instruction sequence number n3 of the registered service node; if y is equal to zero, registering the new service node; if y is less than or equal to a preset value, fetching from the cache the y pieces of first information with sequence numbers from n1 to n3, sending them to the new service node, and registering the new service node after it has executed the y pieces of first information; if y is greater than the preset value, copying state information from the registered service node, sending it to the new service node, and then registering the new service node.
In an exemplary embodiment, the node registration module 1408 is configured to: issue initialization information to the new service node, where the initialization information includes a random seed and self-driving logic, and the self-driving logic includes the timestamp of the management node.
In the exemplary embodiment, information distribution module 1402 is arranged are as follows: the first information is sent to described in whole Backup management node;Information sending module 1406 is arranged are as follows: is sent to one in the M the second information all described standby Part management node.
In the exemplary embodiment, the service node of N number of equity uses shared drive, and each service node exists There is independent memory space in the shared drive.
In the exemplary embodiment, when the service node restarted after being to be closed in new demand servicing node, the new demand servicing section The memory space used in the shared drive before being closed is continued to use when point restarting.
Since each function of the device 1400 has been described in detail in the corresponding method embodiments, the details are not repeated here.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by multiple modules or units.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
A person of ordinary skill in the art will understand that various aspects of the present invention may be implemented as a system, a method, or a program product. Accordingly, various aspects of the present invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
The electronic device 1500 according to this embodiment of the present invention is described below with reference to Figure 15. The electronic device 1500 shown in Figure 15 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Figure 15, the electronic device 1500 takes the form of a general-purpose computing device. The components of the electronic device 1500 may include, but are not limited to: at least one processing unit 1510, at least one storage unit 1520, and a bus 1530 connecting different system components (including the storage unit 1520 and the processing unit 1510).
The storage unit stores program code that can be executed by the processing unit 1510, so that the processing unit 1510 performs the steps of the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification. For example, the processing unit 1510 may perform the steps shown in Figure 2.
The storage unit 1520 may include readable media in the form of volatile storage units, such as a random access storage unit (RAM) 15201 and/or a cache storage unit 15202, and may further include a read-only storage unit (ROM) 15203.
The storage unit 1520 may also include a program/utility 15204 having a set of (at least one) program modules 15205. Such program modules 15205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 1530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 1500 may also communicate with one or more external devices 1600 (such as a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1500, and/or with any device (such as a router, a modem, etc.) that enables the electronic device 1500 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 1550. In addition, the electronic device 1500 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 1560. As shown, the network adapter 1560 communicates with the other modules of the electronic device 1500 through the bus 1530. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 1500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to perform the method according to the embodiments of the present disclosure.
The above drawings are merely schematic illustrations of the processing included in the method according to exemplary embodiments of the present invention and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be executed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the claims.

Claims (15)

1. A disaster recovery method, characterized in that it is executed by a service cluster provided in a server, the service cluster including a main management node and one or more backup management nodes, the main management node being used to execute the disaster recovery method, and the backup management nodes serving as candidates for a new main management node when the main management node fails, the disaster recovery method including:
Sending first information from a client to N peer service nodes, N being an integer greater than or equal to 2;
Receiving M pieces of second information from the N peer service nodes, M being a positive integer less than or equal to N;
Sending one of the M pieces of second information to the client.
2. The disaster recovery method according to claim 1, characterized in that sending the first information from the client to the N peer service nodes includes: sending the first information to all of the backup management nodes; and sending one of the M pieces of second information to the client further includes: sending one of the M pieces of second information to all of the backup management nodes.
3. The disaster recovery method according to claim 1, characterized in that sending one of the M pieces of second information to the client includes:
Sending the second information with the earliest arrival time to the client.
4. The disaster recovery method according to claim 3, characterized in that sending one of the M pieces of second information to the client further includes:
If second information with a later arrival time is identical to the second information with the earliest arrival time, discarding the second information with the later arrival time;
If second information with a later arrival time differs from the second information with the earliest arrival time, notifying the service node corresponding to the second information with the later arrival time to shut down.
5. The disaster recovery method according to claim 1, characterized by further including:
In response to a service node registration request, determining an identifier of a new service node to be registered, the service node registration request including an executable file encryption value of the new service node;
If no registered service node corresponding to the identifier exists, starting an isolated node registration flow for the new service node;
If a registered service node corresponding to the identifier exists, comparing whether the executable file encryption value of the new service node is consistent with the executable file encryption value of the registered service node;
If they are consistent, starting a redundant node registration flow for the new service node;
If they are inconsistent, rejecting the service node registration request.
6. The disaster recovery method according to claim 5, characterized in that the isolated node registration flow includes:
Determining the absolute value x of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum received instruction sequence number n2 in the cache corresponding to the identifier of the new service node;
If x is equal to zero, registering the new service node;
If x is less than or equal to a first preset value, retrieving from the cache the x pieces of first information with sequence numbers n1 to n2, sending them to the new service node, and registering the new service node after the new service node has executed the x pieces of first information;
If x is greater than the first preset value, rejecting the service node registration request, wherein n1 and n2 are positive integers.
7. The disaster recovery method according to claim 5, characterized in that the redundant node registration flow includes:
Determining the absolute value y of the difference between the maximum executed instruction sequence number n1 of the new service node and the maximum executed instruction sequence number n3 of the registered service node;
If y is equal to zero, registering the new service node;
If y is less than or equal to a second preset value, retrieving from the cache the y pieces of first information with sequence numbers n1 to n3, sending them to the new service node, and registering the new service node after the new service node has executed the y pieces of first information;
If y is greater than the second preset value, copying state information from the registered service node, sending it to the new service node, and then registering the new service node, wherein n1 and n3 are positive integers.
8. The disaster recovery method according to any one of claims 5 to 7, characterized in that registering the new service node includes:
Issuing initialization information to the new service node, the initialization information including a random seed and self-driving logic, the self-driving logic including a timestamp of the management node.
9. The disaster recovery method according to claim 1, characterized in that sending the first information from the client to the N peer service nodes includes:
Determining the identifier of the service node corresponding to the first information;
Sending the first information to the N peer service nodes according to the identifier.
10. The disaster recovery method according to claim 1, characterized in that the second information includes a sequence number of the second information sent by the service node, and receiving the M pieces of second information from the N peer service nodes includes:
Among multiple pieces of information from the service nodes, determining the second information corresponding to the first information according to the sequence number of the second information sent by the service node.
11. The disaster recovery method according to claim 1, characterized in that the N peer service nodes use shared memory, and each service node has an independent storage space in the shared memory.
12. The disaster recovery method according to claim 11, characterized in that when the new service node is a service node that is restarted after being shut down, the new service node, on restart, continues to use the storage space in the shared memory that it used before being shut down.
13. A disaster recovery device, characterized by including:
An information distribution module, configured to send first information from a client to N peer service nodes, N being an integer greater than or equal to 2;
An information receiving module, configured to receive M pieces of second information from the N peer service nodes, M being a positive integer less than or equal to N;
An information sending module, configured to send one of the M pieces of second information to the client.
14. A disaster recovery system, characterized by including:
At least one client;
A server cluster, coupled to the client and provided with a service cluster, the service cluster including a main management node and one or more backup management nodes, the backup management nodes serving as candidates for a new main management node when the main management node fails, and the main management node being used to execute the disaster recovery method according to any one of claims 1 to 12.
15. An electronic device, characterized by including:
A memory; and
A processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the disaster recovery method according to any one of claims 1 to 12.
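For illustration only, the reply handling recited in claims 1, 3, 4, and 10 can be sketched as a single arbitration step on the management node. The function name `arbitrate`, the tuple layout of a reply, and the byte-string payloads are assumptions introduced for this sketch and are not part of the claims.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical reply layout: (service node id, sequence number, payload)
Reply = Tuple[str, int, bytes]


def arbitrate(replies: Iterable[Reply], seq: int) -> Tuple[Optional[bytes], List[str]]:
    """Pick the reply to forward for request `seq` and list the nodes to shut down.

    `replies` must be given in arrival order.
    """
    first: Optional[bytes] = None
    to_close: List[str] = []
    for node_id, reply_seq, payload in replies:
        if reply_seq != seq:
            continue                  # answers a different piece of first information
        if first is None:
            first = payload           # earliest matching reply is forwarded to the client
        elif payload != first:
            to_close.append(node_id)  # diverged from the earliest reply
        # a later reply identical to the earliest one is simply discarded
    return first, to_close
```

For example, with replies `[("s2", 7, b"ok"), ("s1", 6, b"old"), ("s1", 7, b"ok"), ("s3", 7, b"bad")]` and `seq` 7, the payload `b"ok"` from `s2` is forwarded, the duplicate from `s1` is dropped, and `s3` is flagged for shutdown.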
CN201910521769.0A 2019-06-17 2019-06-17 Disaster recovery method, device, system and electronic equipment Active CN110351122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910521769.0A CN110351122B (en) 2019-06-17 2019-06-17 Disaster recovery method, device, system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910521769.0A CN110351122B (en) 2019-06-17 2019-06-17 Disaster recovery method, device, system and electronic equipment

Publications (2)

Publication Number Publication Date
CN110351122A true CN110351122A (en) 2019-10-18
CN110351122B CN110351122B (en) 2022-02-25

Family

ID=68182216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910521769.0A Active CN110351122B (en) 2019-06-17 2019-06-17 Disaster recovery method, device, system and electronic equipment

Country Status (1)

Country Link
CN (1) CN110351122B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110740045A (en) * 2019-10-28 2020-01-31 支付宝(杭州)信息技术有限公司 Instruction multicast method and system
CN111147567A (en) * 2019-12-23 2020-05-12 中国银联股份有限公司 Service calling method, device, equipment and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102394807A (en) * 2011-08-23 2012-03-28 北京京北方信息技术有限公司 System and method for decentralized scheduling of autonomous flow engine load balancing clusters
CN102938778A (en) * 2012-10-19 2013-02-20 浪潮电子信息产业股份有限公司 Method for realizing multi-node disaster tolerance in cloud storage
CN103580902A (en) * 2012-08-07 2014-02-12 腾讯科技(深圳)有限公司 Computer information system and dynamic disaster recovery method thereof
US20150169414A1 (en) * 2013-12-14 2015-06-18 Netapp, Inc. Techniques for lif placement in san storage cluster synchronous disaster recovery
CN106874142A (en) * 2015-12-11 2017-06-20 华为技术有限公司 A kind of real time data fault-tolerance processing method and system
CN109194718A (en) * 2018-08-09 2019-01-11 玄章技术有限公司 A kind of block chain network and its method for scheduling task
CN109656911A (en) * 2018-12-11 2019-04-19 江苏瑞中数据股份有限公司 Distributed variable-frequencypump Database Systems and its data processing method
CN109739685A (en) * 2018-11-22 2019-05-10 广州市保伦电子有限公司 A kind of principal and subordinate's hot backup data synchronous method and storage medium

Also Published As

Publication number Publication date
CN110351122B (en) 2022-02-25

Similar Documents

Publication Publication Date Title
US7225356B2 (en) System for managing operational failure occurrences in processing devices
CN105511805B (en) The data processing method and device of cluster file system
CN109951331B (en) Method, device and computing cluster for sending information
CN100591031C (en) Methods and apparatus for implementing a high availability fibre channel switch
US9934242B2 (en) Replication of data between mirrored data sites
CN111338773B (en) Distributed timing task scheduling method, scheduling system and server cluster
CN110224871A (en) A kind of high availability method and device of Redis cluster
US20080244552A1 (en) Upgrading services associated with high availability systems
EP2224341B1 (en) Node system, server switching method, server device, and data transfer method
CN113641511A (en) Message communication method and device
CN102394914A (en) Cluster brain-split processing method and device
CN110099084B (en) Method, system and computer readable medium for ensuring storage service availability
WO2012097588A1 (en) Data storage method, apparatus and system
CN104158707A (en) Method and device of detecting and processing brain split in cluster
CN108958984A (en) Dual-active based on CEPH synchronizes online hot spare method
CN108063813A (en) The method and system of cryptographic service network parallelization under a kind of cluster environment
CN110351122A (en) Disaster recovery method, device, system and electronic equipment
CN110895469A (en) Method and device for upgrading dual-computer hot standby system, electronic equipment and storage medium
CN108512753B (en) Method and device for transmitting messages in cluster file system
CN107357800A (en) A kind of database High Availabitity zero loses solution method
EP2456163B1 (en) Registering an internet protocol phone in a dual-link architecture
JP4806382B2 (en) Redundant system
CN112306755B (en) High-availability implementation method and system based on micro front-end architecture
CN115086203A (en) Data transmission method, data transmission device, electronic equipment and computer-readable storage medium
CN114301763A (en) Distributed cluster fault processing method and system, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant