CN104702693A - Processing method for two-node system partitioning and node - Google Patents


Publication number
CN104702693A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510121396.XA
Other languages
Chinese (zh)
Other versions
CN104702693B (en)
Inventor
佟强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201510121396.XA
Publication of CN104702693A
Application granted
Publication of CN104702693B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services

Abstract

An embodiment of the invention provides a processing method for two-node system partitioning, and a node. Each node of the two-node system comprises a distributed application and a communication agent (called the Correspondent in the description). The method comprises the following steps: when the two-node system fails, determining whether the node is effective; and when the node is effective, sending to the distributed application a correct response message indicating that the distributed application has reached quorum. By adding a communication agent to each node of the two-node system, the embodiment allows the distributed application of one node to reach quorum through the communication agent when the two-node system fails, so that a quorum-based distributed system can be applied to a two-node system and work normally.

Description

Processing method for two-node system partitioning, and node
Technical field
The present invention relates to the field of distributed systems and, more specifically, to a processing method for two-node system partitioning and to a node.
Background
A distributed system is a coupled system formed by multiple computers interconnected through communication lines. It is a set of independent computers, but to its users the whole system appears to be a single computer. With the support of the distributed system, the interconnected computers coordinate with one another to complete a task jointly. In a multi-node high-availability cluster, an arbitration policy decides the operating state of the cluster. The commonly used policy checks whether the number of active nodes in the cluster exceeds half of the total number of cluster nodes. Whether each node is active can be determined from the heartbeat network connections between nodes.
For a distributed system of N nodes, the quorum of the system is N/2+1. The number of nodes in a distributed system is usually odd. The whole system can work normally when the number of nodes in the distributed system exceeds the quorum, so a quorum-based distributed system usually needs at least three nodes for the number of nodes to be greater than the quorum. Such a distributed system can also tolerate the failure of some nodes, provided that the number of effective nodes remains greater than or equal to the quorum.
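The quorum arithmetic above can be checked with a short sketch. It assumes that N/2 is integer division (rounded down), which the text does not state explicitly; the function names `quorum` and `has_quorum` are illustrative only.

```python
def quorum(total_nodes: int) -> int:
    """Quorum of an N-node system, N/2 + 1, with N/2 assumed to round down."""
    return total_nodes // 2 + 1


def has_quorum(effective_nodes: int, total_nodes: int) -> bool:
    """True when the number of effective nodes is at least the quorum."""
    return effective_nodes >= quorum(total_nodes)


# Three nodes: quorum is 2, so one node may fail.
three_node_survives_one_failure = has_quorum(2, 3)

# Two nodes: quorum is also 2, so a single failure already loses quorum.
two_node_survives_one_failure = has_quorum(1, 2)
```

Under this assumption a three-node system keeps quorum after one failure, while a two-node system does not, which is the problem the rest of the document addresses.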
A quorum-based distributed system is generally not used when there are only two nodes. If a distributed system has only two nodes, then as soon as one of them fails, the whole system cannot work normally because quorum cannot be reached, and the two-node system becomes partitioned (split-brain).
Summary of the invention
Embodiments of the present invention provide a processing method for two-node system partitioning, and a node, which allow a quorum-based distributed system to work normally in a two-node system as well.
According to a first aspect, a processing method for two-node system partitioning is provided. The method is used in a quorum-based two-node system in which each node comprises a Correspondent and a distributed application. The method comprises: when the two-node system fails, the Correspondent determines whether the node on which it resides is effective; and when that node is effective, the Correspondent sends to the distributed application of that node a correct response message indicating that the distributed application has reached quorum.
With reference to the first aspect, in one implementation of the first aspect, the method further comprises: when the node on which the Correspondent resides has failed, the Correspondent sends to the distributed application an error response message indicating that the distributed application has not reached quorum, or sends no further messages to the distributed application.
With reference to the first aspect and the foregoing implementation, in another implementation of the first aspect, the method further comprises: the Correspondent sends a coordination message to the Correspondent of the other node in the two-node system; and when, within a first duration starting from the moment the coordination message is sent, the Correspondent does not receive a reply message sent by the Correspondent of the other node in response to the coordination message, the Correspondent determines that the two-node system has failed.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the two-node system further comprises a network device, and the Correspondent determining whether the node on which it resides is effective comprises: the Correspondent sends a test packet to the network device; when, within a second duration starting from the moment the test packet is sent, the Correspondent does not receive a response message sent by the network device in response to the test packet, it determines that the node has failed; and when the Correspondent receives such a response message within the second duration, it determines that the node is effective.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the two-node system further comprises a serial port connecting the node and the other node in the two-node system, and the Correspondent determining whether the node on which it resides is effective comprises: the Correspondent sends a detection message over the serial port to the other node in the two-node system; when, within a third duration starting from the moment the detection message is sent, the Correspondent does not receive a feedback message sent by the other node in response to the detection message, it determines that the node is effective; and when the Correspondent receives such a feedback message within the third duration, it determines whether the node is effective according to priority information on node effectiveness.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the two-node system further comprises a shared disk, and the Correspondent determining whether the node on which it resides is effective comprises: the Correspondent sends a check data packet to the Correspondent of the other node in the two-node system; when, within a fourth duration starting from the moment the check data packet is sent, the Correspondent does not receive a reply packet sent by the Correspondent of the other node, it determines that the node is effective; and when the Correspondent receives such a reply packet within the fourth duration, it determines whether the node is effective according to priority information on node effectiveness.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the node further comprises a shadow process of the distributed application of the other node in the two-node system, and the method further comprises: when the node is effective, the Correspondent starts the shadow process of the distributed application of the other node; the distributed application receives a request message sent by a client requesting that data be processed, and sends the request message through the Correspondent to the shadow process of the distributed application of the other node; and the shadow process receives the request message and processes the data according to the request message.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the method further comprises: when the two-node system has not failed, the Correspondent receives a first data packet sent by the distributed application and forwards the first data packet to the Correspondent of the other node in the two-node system; or, when the two-node system has not failed, the Correspondent receives a second data packet sent by the Correspondent of the other node in the two-node system and forwards the second data packet to the distributed application.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the node is a physical server or a virtual server.
According to a second aspect, a node is provided. The node belongs to a quorum-based two-node system and comprises a distributed application and a Correspondent. The Correspondent is configured to determine, when the two-node system fails, whether the node is effective. The Correspondent is further configured to send to the distributed application, when the node is effective, a correct response message indicating that the distributed application has reached quorum.
With reference to the second aspect, in one implementation of the second aspect, the Correspondent is further configured to, when the node on which it resides has failed, send to the distributed application an error response message indicating that the distributed application has not reached quorum, or send no further messages to the distributed application.
With reference to the second aspect and the foregoing implementation, in another implementation of the second aspect, the Correspondent is further configured to determine that the two-node system has failed when, within a first duration starting from the moment a coordination message is sent to the Correspondent of the other node in the two-node system, it does not receive a reply message sent by the Correspondent of the other node in response to the coordination message.
With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the two-node system further comprises a network device. The Correspondent is configured to determine that the node has failed when, within a second duration starting from the moment a test packet is sent to the network device, it does not receive a response message sent by the network device in response to the test packet. The Correspondent is configured to determine that the node is effective when it receives such a response message within the second duration.
With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the two-node system further comprises a serial port connecting the node and the other node in the two-node system. The Correspondent is configured to send a detection message over the serial port to the other node in the two-node system. The Correspondent is configured to determine that the node is effective when, within a third duration starting from the moment the detection message is sent, it does not receive a feedback message sent by the other node in response to the detection message. The Correspondent is configured to determine whether the node is effective according to priority information on node effectiveness when it receives such a feedback message within the third duration.
With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the two-node system further comprises a shared disk. The Correspondent is configured to send a check data packet to the Correspondent of the other node in the two-node system. The Correspondent is configured to determine that the node is effective when, within a fourth duration starting from the moment the check data packet is sent, it does not receive a reply packet sent by the Correspondent of the other node. The Correspondent is configured to determine whether the node is effective according to priority information on node effectiveness when it receives such a reply packet within the fourth duration.
With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the node further comprises a shadow process of the distributed application of the other node in the two-node system. The Correspondent is configured to start the shadow process of the distributed application of the other node when the node is effective. The distributed application is configured to receive a request message sent by a client requesting that data be processed, and to send the request message through the Correspondent to the shadow process of the distributed application of the other node. The shadow process is configured to receive the request message and process the data according to the request message.
With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the Correspondent is configured to, when the two-node system has not failed, receive a first data packet sent by the distributed application and forward the first data packet to the Correspondent of the other node in the two-node system; or the Correspondent is configured to, when the two-node system has not failed, receive a second data packet sent by the Correspondent of the other node in the two-node system and forward the second data packet to the distributed application.
With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the node is a physical server or a virtual server.
By adding a Correspondent to each node of the two-node system, embodiments of the present invention allow the distributed application of one node to reach quorum through the Correspondent when the two-node system fails, so that a quorum-based distributed system can be used in a two-node system and work normally.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required by the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a communication system scenario to which embodiments of the present invention can be applied.
Fig. 2 is a schematic flowchart of a processing method for two-node system partitioning according to one embodiment of the present invention.
Fig. 3 is a schematic diagram of a processing method for two-node system partitioning according to one embodiment of the present invention.
Fig. 4 is a schematic diagram of a processing method for two-node system partitioning according to another embodiment of the present invention.
Fig. 5 is a block diagram of a node according to one embodiment of the present invention.
Fig. 6 is a block diagram of a node according to another embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a schematic diagram of a communication system scenario to which embodiments of the present invention can be applied. As shown in Fig. 1, the two-node system of the embodiment comprises node 1, node 2 102, and a network device 103.
The two-node system comprises two nodes. When a node is a server, an operating system such as Windows or Linux can run on the server, and the server includes a network card through which it can network with other servers. It should be understood that a node in the embodiments of the present invention may be a physical server or a virtual server. The network device 103 may be a switch, a gateway, or a router. The two nodes are each connected to the switch and coordinate with each other through the switch to complete a task jointly. In the two-node system, when a node loses power or the network between the nodes is disconnected, a node fails and the two-node system fails; that is, the two nodes cannot communicate normally, the system cannot reach quorum, and the two-node system becomes partitioned (split-brain).
Each node in the two-node system comprises a distributed application. When one node of the two-node system fails, the two distributed applications of the two nodes can no longer coordinate with each other, and neither node can receive feedback from the other. The number of effective nodes cannot reach quorum, so the system cannot work normally and fails.
Embodiments of the present invention add a Correspondent to each node. When a node fails, the Correspondent of the effective node can send to its distributed application a correct response message indicating that the distributed application has reached quorum, so that the effective node considers quorum reached and continues to run its distributed application. In this way the two-node system can keep processing external requests and work normally.
Fig. 2 is a schematic flowchart of a processing method for two-node system partitioning according to one embodiment of the present invention.
201. When the two-node system fails, the Correspondent determines whether the node on which it resides is effective.
202. When the node on which the Correspondent resides is effective, the Correspondent sends to the distributed application of that node a correct response message indicating that the distributed application has reached quorum.
By adding a Correspondent to each node of the two-node system, embodiments of the present invention allow the distributed application of one node to reach quorum through the Correspondent when the two-node system fails, so that a quorum-based distributed system can be used in a two-node system and work normally.
When a node is effective, the Correspondent of the effective node can send to its distributed application a correct response message indicating that the distributed application has reached quorum. On receiving the correct response message, the distributed application considers quorum reached and can continue to work normally.
It should be understood that when a node has failed, the Correspondent of the failed node sends to its distributed application an error response message indicating that the distributed application has not reached quorum. Alternatively, the failed node may send its distributed application no message at all about whether quorum has been reached. When the distributed application receives an error response message, or receives no message about whether quorum has been reached within a period of time, the failed node stops its distributed application.
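Steps 201 and 202, together with the handling of a failed node, can be sketched as a pair of small functions. This is a minimal illustration only: the names `QuorumResponse`, `correspondent_response`, and `application_should_continue` are hypothetical, and a missing message is modelled as `None` instead of an actual timeout.

```python
from enum import Enum
from typing import Optional


class QuorumResponse(Enum):
    CORRECT = "quorum reached"
    ERROR = "quorum not reached"


def correspondent_response(node_is_effective: bool,
                           stay_silent: bool = False) -> Optional[QuorumResponse]:
    """Decide what the Correspondent tells its local distributed application."""
    if node_is_effective:
        return QuorumResponse.CORRECT
    # On a failed node the Correspondent either reports an error or sends nothing.
    return None if stay_silent else QuorumResponse.ERROR


def application_should_continue(response: Optional[QuorumResponse]) -> bool:
    """The application keeps running only after a correct response.

    None stands for "no message arrived within the waiting period".
    """
    return response is QuorumResponse.CORRECT
```

In this sketch the distributed application never computes quorum itself; it only reacts to what its local Correspondent reports.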
In a traditional high-availability cluster system, each node of a two-node system has a distributed application but no Correspondent, and the nodes are connected to a voting disk. Because the voting disk holds one vote, when either node fails the effective node can still reach quorum as long as it is connected to the voting disk. However, this way of handling two-node system partitioning requires a voting disk to be configured in the two-node system, which is difficult to implement and adds complexity to the system. Moreover, quorum has to be calculated for every request; checking whether the connection between the voting disk and the node is normal and pinging the gateway reduce the efficiency with which the distributed system processes requests.
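The voting-disk arithmetic can be illustrated as follows. The sketch assumes that each node also holds one vote, giving three votes in total; the text only states that the voting disk holds one vote, so the other figures and the function name `reaches_quorum_with_disk` are assumptions.

```python
def reaches_quorum_with_disk(live_nodes: int, disk_reachable: bool,
                             total_votes: int = 3) -> bool:
    """Count one vote per live node plus one for a reachable voting disk."""
    votes = live_nodes + (1 if disk_reachable else 0)
    return votes >= total_votes // 2 + 1


# One node has failed, but the survivor still reaches the voting disk.
survivor_with_disk = reaches_quorum_with_disk(1, disk_reachable=True)

# One node has failed and the survivor has lost the voting disk as well.
survivor_without_disk = reaches_quorum_with_disk(1, disk_reachable=False)
```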
Embodiments of the present invention add a Correspondent to each node so that, when the two-node system fails, the Correspondent sends a correct response message to its distributed application; quorum is thereby reached and the system can continue to communicate.
When the two-node system works normally, it can receive requests sent by external clients, and distributed application 1 and distributed application 2 coordinate with each other through the switch and the Correspondents to process each request jointly. For example, when distributed application 1 receives a request from an external client to modify data, then to keep the data on the two nodes consistent, distributed application 1, while updating its own copy of the data, also sends a message to distributed application 2 through Correspondent 1, the switch, and Correspondent 2, asking distributed application 2 to update the same data. During the update, distributed application 1 and distributed application 2 coordinate with each other to synchronize the data; both perform the same modification on the data. The Correspondents forward the messages related to coordination and data synchronization. After distributed application 2 updates the data successfully, it sends a response message to distributed application 1 through Correspondent 2 and Correspondent 1 to indicate that it has updated the data successfully. Once distributed application 1 has updated the data successfully and received the response message from distributed application 2, it returns the successful result to the external client that initiated the request.
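The normal-operation flow described above can be sketched in a single process. The switch is replaced by a direct method call, and the class and method names (`DistributedApp`, `Correspondent`, `handle_client_request`, `forward_to_peer`, `deliver`) are hypothetical; this illustrates the message path, not the patented implementation.

```python
class DistributedApp:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.correspondent = None  # set when a Correspondent attaches itself

    def apply_update(self, key, value):
        """Apply the modification locally and report success."""
        self.data[key] = value
        return True

    def handle_client_request(self, key, value):
        """Update locally, ask the peer to do the same, then answer the client."""
        local_ok = self.apply_update(key, value)
        peer_ok = self.correspondent.forward_to_peer(key, value)
        return local_ok and peer_ok


class Correspondent:
    def __init__(self, app):
        self.app = app
        self.peer = None  # the Correspondent on the other node
        app.correspondent = self

    def forward_to_peer(self, key, value):
        """Forward the update; the return value plays the role of the response message."""
        return self.peer.deliver(key, value)

    def deliver(self, key, value):
        """Hand an update received from the peer to the local application."""
        return self.app.apply_update(key, value)


app1, app2 = DistributedApp("app1"), DistributedApp("app2")
c1, c2 = Correspondent(app1), Correspondent(app2)
c1.peer, c2.peer = c2, c1
result = app1.handle_client_request("x", 42)
```

After the call, both applications hold the same data and `result` carries the success indication that would be returned to the client.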
The Correspondent forwards the messages related to coordination and data synchronization. For example, Correspondent 1 can send a coordination message toward distributed application 2, forwarding the coordination and data synchronization messages to distributed application 2 through Correspondent 2. If, within a period of time (for example, a first duration starting from the moment Correspondent 1 sends the coordination message to Correspondent 2 of node 2), distributed application 1 does not receive a reply to the coordination message, or receives a network error returned by the network interface, the sending or receiving of the message is considered to have failed; that is, the two-node system has failed. Likewise, Correspondent 2 can send a coordination message to Correspondent 1; if, within a period of time (for example, the first duration), Correspondent 2 does not receive a reply to the coordination message from Correspondent 1, Correspondent 2 can conclude that network communication in the two-node system has failed.
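The timeout rule above can be sketched with a queue standing in for the channel on which replies arrive. The function name `system_has_failed`, the `NETWORK_ERROR` sentinel, and the use of `queue.Queue` are assumptions made for illustration; a real Correspondent would read from a network socket.

```python
import queue

NETWORK_ERROR = "network-error"  # hypothetical marker for an error from the network interface


def system_has_failed(inbox: queue.Queue, first_duration: float) -> bool:
    """Wait up to first_duration seconds for a reply to the coordination message.

    The system is treated as failed if nothing arrives in time or if the
    network interface reports an error.
    """
    try:
        message = inbox.get(timeout=first_duration)
    except queue.Empty:
        return True
    return message == NETWORK_ERROR


healthy = queue.Queue()
healthy.put("reply")
silent = queue.Queue()

healthy_failed = system_has_failed(healthy, first_duration=0.05)
silent_failed = system_has_failed(silent, first_duration=0.05)
```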
It should be understood that the embodiments of the present invention do not limit how the two-node system fails. Any situation in which the two nodes cannot communicate normally, such as a network link failure, a network card failure, or a node losing power, is regarded as a failure of the two-node system.
When the two-node system fails as described above, the two nodes cannot communicate normally. At that point, once it has been determined which node has failed and which node is effective, the Correspondents are used so that the effective node keeps its distributed application running and the distributed application of the failed node is stopped.
Optionally, as one embodiment of the present invention, the Correspondent can determine which node has failed and which is effective by pinging the Internet Protocol (IP) address of the network device (for example, the IP address of a switch or of a gateway). With the ping command, the Correspondent sends a test packet to the IP address of the network device, checks whether that address responds, and measures the response time; the response message is used to test the connection state of the network.
The network device may be a switch, a router, a gateway, or the like. The following uses a switch as an example. A Correspondent on a node can ping the address of the switch as follows: the Correspondent calls the ping command to generate an Internet Control Message Protocol (ICMP) packet, which is the test packet, and sends it to the switch through the network card of the node. If the node receives a response message from the switch within a certain time, the node can ping the switch and is regarded as effective; if it does not receive a response within that time, the node cannot ping the switch and is regarded as failed. The node failure may be caused by a network card fault or by a fault in the network link between the switch and the node.
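A minimal sketch of the ping-based check follows. It assumes a Linux host whose `ping` accepts `-c` (packet count) and `-W` (reply timeout in seconds), as the iputils version does; other platforms use different flags. The function names and the injectable `probe` parameter are illustrative, and the rule "reply received means effective" is only one of the conventions the document allows.

```python
import subprocess
from typing import Callable


def ping_once(ip: str, timeout_s: int) -> bool:
    """Send one ICMP echo request with the system ping (Linux iputils flags assumed)."""
    completed = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode == 0


def node_is_effective(device_ip: str, second_duration_s: int,
                      probe: Callable[[str, int], bool] = ping_once) -> bool:
    """Treat the node as effective if the network device answers within the second duration."""
    return probe(device_ip, second_duration_s)
```

Passing a stub as `probe` keeps the decision logic testable without network access, for example `node_is_effective("192.0.2.1", 1, probe=lambda ip, t: True)`.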
In embodiments of the present invention, either Correspondent 1 or Correspondent 2 can ping the network device to determine whether a node is effective. For example, Correspondent 1 can send a test packet to the network device. If, within a period of time after sending the test packet (for example, the second duration), Correspondent 1 receives the response message sent by the network device for the test packet, Correspondent 1 can determine node 1 to be the effective node and node 2 to be the failed node. Alternatively, node 1 can be determined to be the failed node and node 2 the effective node. Here, pinging the network device serves only to decide which node is effective and which has failed; it is not the case that only a node able to ping the switch can be effective. This is because the effective node continues to maintain the distributed application while the failed node no longer does, and the work done after the system failure does not use the network device. Correspondent 2 can likewise decide the effective node and the failed node by pinging the network device. For example, Correspondent 2 can send a test packet to the network device. If, within a period of time after sending the test packet (for example, the second duration), Correspondent 2 receives the response message sent by the network device, Correspondent 2 can determine node 2 to be the effective node and node 1 to be the failed node. Likewise, node 2 can instead be determined to be the failed node and node 1 the effective node.
It should be understood that the network device in the embodiments of the present invention may be a switch, a router, a gateway, or the like. The embodiments of the present invention do not limit this.
It should be understood that when the embodiments of the present invention determine whether a node is effective by pinging the IP address of the network device, the node that can ping the network device may be determined to be the effective node and the other node the failed node. Alternatively, the node that cannot ping the network device may be determined to be the effective node and the other node the failed node. The embodiments of the present invention do not limit this, as long as it can be determined which node is effective, so that its distributed application keeps running, and which node has failed, so that its distributed application is stopped.
Alternatively, as one embodiment of the present of invention, can be connected by serial ports between two nodes and carry out communication, Correspondent can determine effective node and failure node by ping serial ports.In highly available cluster system, substantially only process Single Point of Faliure, be exactly that synchronization considers to only have a kind of fault to occur, do not consider that network failure and hardware fault occur simultaneously.Because the continuation that Serial Port Line interrupts affecting distributed system normally runs, that is, do not use serial ports to carry out communication during distributed system work, Serial Port Line interrupts detecting obtaining two node system faults.So, when two node systems break down, do not consider Serial Port Line fault.
When no feedback message is received within a period of time after the Correspondent forwards a coordination message, forwarding of the coordination message is considered to have failed. The Correspondent may then determine, by pinging over the serial port, which node's Distributed Application continues to run and which node's Distributed Application stops. Specifically, priority information on which of the two nodes is to be effective may be selected in advance. The priority information is formulated by the Correspondent and may be used to determine the order of priority in which the nodes continue to execute the Distributed Application. Optionally, the priority information may be preset, or may be determined dynamically according to the performance of the servers or according to their busy/idle state.
For example, Correspondent 1 may send a detection message through the serial port to the other node in the two-node system (for example, node 2). If Correspondent 1 does not receive the feedback message that node 2 sends for the detection message within a period of time after sending it (for example, a third duration), Correspondent 1 determines that node 2 has failed and may accordingly determine that node 1 is effective. If Correspondent 1 does receive the feedback message from node 2 within that period, the effective node and the failed node may be determined according to the priority information.
Similarly, Correspondent 2 may also determine the effective node and the failed node in the two-node system. Specifically, Correspondent 2 sends a detection message to node 1 through the serial port. If Correspondent 2 does not receive the feedback message from node 1 within a period of time after sending the detection message (for example, a third duration), it determines that node 1 has failed and may accordingly determine that node 2 is effective. If Correspondent 2 does receive the feedback message from node 1 within that period, the effective node and the failed node may be determined according to the priority information.
The following describes in detail how to determine which node is effective and which has failed. A Correspondent may send a message (for example, a ping message) to the other node through the serial port. If no feedback message is received within a period of time after the message is sent, the peer node may be considered no longer effective (for example, the peer node has lost power); that is, the peer node has failed and the node that sent the message is effective. In this case, the priority information is not needed to determine the effective node and the failed node. If, however, a feedback message is received within that period, the peer node is considered alive (for example, it has not lost power), and the sending node is alive as well. For example, the Correspondent of the peer node receives the ping message and responds with a feedback message (for example, a pong message). Because the two-node system has failed, one of the two nodes must stop working, and the effective node and the failed node may then be determined according to the priority information. The Correspondent of the higher-priority node may send a message (for example, a stop message) to the lower-priority node, requiring it to stop executing its Distributed Application, while the higher-priority node continues to execute its own Distributed Application. Here, the higher-priority node is regarded as effective and the lower-priority node as failed.
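The ping/pong/stop exchange over the serial port can be sketched as below. This is only an illustrative sketch: `send_ping` and `send_stop` are hypothetical callables standing in for the serial-port I/O, priorities are modelled as integers assumed to differ between the two nodes, and the default value of `third_duration` is arbitrary.

```python
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    EFFECTIVE = "effective"
    FAILED = "failed"


def arbitrate_over_serial(send_ping: Callable[[float], Optional[str]],
                          send_stop: Callable[[], None],
                          local_priority: int,
                          peer_priority: int,
                          third_duration: float = 1.0) -> Verdict:
    """Decide whether the local node is effective.

    send_ping(timeout) is a hypothetical callable that writes "ping" to the
    serial port and returns the peer's feedback ("pong"), or None if nothing
    arrives within the timeout. send_stop() writes "stop" to the peer.
    """
    reply = send_ping(third_duration)
    if reply != "pong":
        # No feedback: the peer is treated as failed (for example, powered
        # off), so the local node is effective and no priority is consulted.
        return Verdict.EFFECTIVE
    # Both nodes are alive, so the priority information breaks the tie.
    # Priorities are assumed to be distinct.
    if local_priority > peer_priority:
        send_stop()  # ask the lower-priority node to stop its application
        return Verdict.EFFECTIVE
    return Verdict.FAILED
```

Because both nodes run the same rule with mirrored priorities, exactly one of them ends up effective when both are alive.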
Optionally, as an embodiment of the present invention, when the two-node system further comprises a shared disk, which node has failed and which node is effective may be determined by pinging the shared disk.
When no feedback message is received within a period of time after the Correspondent forwards a coordination message, forwarding of the coordination message is considered to have failed. The Correspondent may then determine, by pinging the shared disk, which node's Distributed Application continues to run and which node's Distributed Application stops. Specifically, priority information on which of the two nodes is to be effective is selected in advance. The priority information is used to determine the order of priority in which the nodes continue to execute the Distributed Application. Optionally, the priority information may be preset, or may be determined dynamically according to the performance of the servers or according to their busy/idle state.
For example, to determine the effective node and the failed node in the two-node system, Correspondent 1 may first determine the priority information on whether each of the two nodes is effective. Correspondent 1 may then write a check packet to the shared disk. If, within a fourth duration from the moment of writing the check packet to the shared disk, the Correspondent does not receive a reply message sent for the check packet, the Correspondent may determine that node 1 is effective and node 2 has failed. If the Correspondent does receive a reply message for the check packet within the fourth duration, it may determine whether the node is effective according to the priority information. For example, whether a node is effective may be decided according to its busy/idle state: if node 1 is idle, node 1 may be considered effective and node 2 failed.
Correspondent 1 may send a check packet to Correspondent 2. If, after a period of time, Correspondent 1 has not received the reply packet that Correspondent 2 returns for the check packet, node 2 is considered to have failed and node 1 is effective. Otherwise, node 2 may be considered effective and node 1 failed. Specifically, Correspondent 1 sends the check packet to Correspondent 2 by writing it to the shared disk, where the check packet may be the packet that Correspondent 1 uses to ping the shared disk. Correspondent 2 can read this check packet from the shared disk, which means that Correspondent 2 has received it. Correspondent 2 then sends a reply packet to Correspondent 1 in response to the check packet it has read; that is, Correspondent 2 writes the reply packet to the shared disk, and when Correspondent 1 reads this reply packet, Correspondent 1 is considered to have received it. When the system has not failed, Correspondent 1 can receive the reply packet sent by Correspondent 2, that is, it can read the reply packet from the shared disk. When the system has failed and Correspondent 1 can still receive the reply packet sent by Correspondent 2, both node 2 and node 1 are considered alive and neither has lost power, so which node is effective and which has failed must be determined according to the priority information. For example, when the performance of node 1 is higher than that of node 2, node 1 may be selected as effective and node 2 as failed. When the system has failed and Correspondent 1 cannot receive the reply packet sent by Correspondent 2, node 2 is considered to have lost power, so node 2 has failed and node 1 is effective.
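The shared-disk exchange can be pictured as a mailbox made of two files. The sketch below is a simplified illustration under stated assumptions: the shared disk is modelled as a directory visible to both nodes, the file names `check_from_1` and `reply_from_2` are invented for the example, and the polling interval is arbitrary. A practical version would also need sequence numbers so that a stale reply from an earlier round is not mistaken for a fresh one.

```python
import time
from pathlib import Path
from typing import Optional

CHECK = "check_from_1"   # hypothetical file name for the check packet
REPLY = "reply_from_2"   # hypothetical file name for the reply packet


def write_packet(disk: Path, name: str, payload: str) -> None:
    (disk / name).write_text(payload)


def read_packet(disk: Path, name: str) -> Optional[str]:
    path = disk / name
    return path.read_text() if path.exists() else None


def peer_answer(disk: Path) -> bool:
    """Correspondent 2 side: if a check packet is present, write a reply."""
    if read_packet(disk, CHECK) is None:
        return False
    write_packet(disk, REPLY, "reply")
    return True


def ping_shared_disk(disk: Path, fourth_duration: float,
                     poll: float = 0.01) -> bool:
    """Correspondent 1 side: True if a reply packet appears in time."""
    write_packet(disk, CHECK, "check")
    deadline = time.monotonic() + fourth_duration
    while time.monotonic() < deadline:
        if read_packet(disk, REPLY) is not None:
            return True
        time.sleep(poll)
    return False
```

A `False` result corresponds to "node 2 has failed, node 1 is effective"; a `True` result means both nodes are alive and the priority information has to decide.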
It should be understood that, when the system has failed, if Correspondent 1 can still receive the reply packet sent by Correspondent 2 by pinging the shared disk, then, according to the single-point-of-failure principle, the system failure in this case is a network failure.
Similarly, Correspondent 2 may also send a check packet to Correspondent 1 to determine which node is effective and which has failed. The specific method by which Correspondent 2 determines the effective node and the failed node is similar to the method used by Correspondent 1 described above, and is not repeated here.
The durations in the embodiment of the present invention (for example, the first duration, the second duration, the third duration, and the fourth duration) may be preset values or may be set dynamically. The embodiment of the present invention does not limit this.
When the two-node system has not failed, Correspondent 1 may forward a data packet to Distributed Application 2 of node 2 through the network device and Correspondent 2 of node 2. Similarly, Correspondent 2 may forward a data packet to Distributed Application 1 of node 1 through the network device and Correspondent 1 of node 1. In other words, when the two-node system has not failed, both Correspondent 1 and Correspondent 2 can forward data.
The communication principle and the communication process before and after a failure of the two-node system are described in detail below with reference to Fig. 3 and Fig. 4.
Fig. 3 is a schematic diagram of a processing method for two-node system partitioning according to an embodiment of the present invention. The two-node system in Fig. 3 comprises node 1 (301), node 2 (302), and a switch (303). Node 1 comprises Distributed Application 1 (304) and Correspondent 1 (305), and node 2 comprises Distributed Application 2 (306) and Correspondent 2 (307).
In the embodiment of the present invention, a Correspondent is added to each node of the two-node system, so that, when the two-node system fails, the Correspondent can make the Distributed Application of one node have a quorum. A quorum-based distributed system can therefore be used in a two-node system and work normally.
It should be understood that the embodiment of the present invention does not limit the network device; it is described only by way of example with a switch as the network device between the two nodes.
The embodiment of the present invention does not modify the implementation logic of the distributed application program. Instead, by adding Correspondents, it allows the distributed system to be applied to a two-node system and allows the two-node system to work normally and process externally sent requests.
When the two-node system has not failed, the Correspondent receives a first data packet sent by the Distributed Application and forwards it to the Correspondent of the other node in the two-node system. Alternatively, when the two-node system has not failed, the Correspondent receives a second data packet sent by the Correspondent of the other node in the two-node system and forwards it to the Distributed Application. That is, when the two-node system has not failed, the Correspondent forwards data packets between the Distributed Application of its own node and the other Correspondent.
Specifically, when the two-node system has not failed, that is, when it works normally, it can receive requests sent by external clients, and Distributed Application 1 and Distributed Application 2 coordinate with each other through the switch and the Correspondents to jointly complete the processing of a request. For example, when Distributed Application 1 receives a request from an external client to modify data, then, to keep the data in the two nodes consistent, Distributed Application 1 updates its own data and also sends a message to Distributed Application 2 through Correspondent 1, the switch, and Correspondent 2, requesting Distributed Application 2 to update the same data. During the update, Distributed Application 1 and Distributed Application 2 coordinate with each other to synchronize the data, and the Correspondents forward the messages related to the coordination and data synchronization. After Distributed Application 2 has successfully updated the data, it sends a response message to Distributed Application 1 through Correspondent 2 and Correspondent 1 to indicate the success. Once Distributed Application 1 has itself updated the data successfully and has received the response message from Distributed Application 2, it returns the successful result to the external client that initiated the request.
When the two-node system fails and the Correspondent fails to forward a coordination message, that is, fails to forward a message for coordination or data synchronization, the Correspondent can judge whether the nodes are effective. By pinging the IP address of the network device (for example, the address of the switch or of the router), the Correspondent can determine whether its own node is effective and whether the other node is effective. Optionally, when the two-node system fails, the Correspondent may determine the effective node and the failed node by pinging over the serial port. Optionally, when the two-node system fails, the Correspondent may determine the effective node and the failed node by pinging the shared disk. After it has been determined which node is effective and which has failed, the failed node stops working, and the effective node continues to maintain the normal operation of the system, receiving and processing external requests.
A failure of the two-node system may include an interruption of the network medium, a network interface card fault, a node power failure, and the like. The embodiment of the present invention does not limit this.
Specifically, when the Correspondent determines that node 1 is effective and node 2 has failed, Correspondent 1 may simulate Correspondent 2 and respond to Distributed Application 1. On receiving the correct response message indicating that it has a quorum, Distributed Application 1 believes that node 2 is working normally, so the normal quorum is maintained and Distributed Application 1 of node 1 can continue to run. Meanwhile, Distributed Application 2 on node 2 stops working because it does not have a sufficient quorum.
When a Correspondent determines that node 2 has failed, Correspondent 2 may send no response for a period of time, which indicates that node 2 has failed and cannot continue to work normally. Alternatively, Correspondent 2 sends Distributed Application 2 an error response message indicating that a quorum is not constituted. On receiving the error response message from Correspondent 2, Distributed Application 2 learns that it does not have a quorum and that node 2 cannot work normally, and Distributed Application 2 therefore stops working.
While Distributed Application 1 continues to work normally and Distributed Application 2 has stopped, Distributed Application 1 can receive external requests to process data and process the data accordingly. At this point, Correspondent 1 only needs to ensure that Distributed Application 1 believes it has a quorum and can therefore work normally. After Distributed Application 1 finishes processing a request, it returns the result to the client that sent the request.
When the two-node system comprises multiple switches, the Correspondent on a node may ping its own IP address and the IP address of each switch to decide whether the node is effective. The effective node is selected to work normally, and the failed node stops working.
After the two-node system fails, the role of Correspondent 1 is to simulate Distributed Application 2 and send Distributed Application 1 the correct response message indicating that it has a quorum. This requires the Correspondent to fully understand the quorum consistency protocol of the distributed system, so that, when the system receives different requests, Correspondent 1 can simulate Distributed Application 2 and respond correctly to each request.
This embodiment of the invention is suitable when the quorum consistency protocol of the distributed system is fairly simple, for example, when the only consistency protocol message between the Distributed Applications is a data synchronization message. After Distributed Application 1 receives a request from a client to modify data, it first updates its own data and then sends the modification request to Distributed Application 2. After receiving the request, Distributed Application 2 also updates its own data and sends a reply message to Distributed Application 1, indicating that Distributed Application 2 has completed the same modification. On receiving the reply message, Distributed Application 1 considers the modification synchronized on Distributed Application 2 and can return a modification-succeeded response to the client that initiated the request. In this simple scenario, if Correspondent 1 finds that forwarding fails while forwarding data and judges that the peer node has failed, it can itself return an acknowledgement to Distributed Application 1. Distributed Application 1 then considers that Distributed Application 2 has synchronized successfully and can return a modification-succeeded response to the client.
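For the simple synchronization-only protocol described above, the behaviour of Correspondent 1 can be sketched as follows. This is a toy model under stated assumptions, not the patented implementation: `forward` and `peer_has_failed` are hypothetical callables standing in for the network forwarding path and for the outcome of the ping-based arbitration, and the return strings are invented for the example.

```python
from typing import Callable, Dict


class Correspondent:
    """Forwards sync requests; acknowledges on the peer's behalf after failure."""

    def __init__(self, forward: Callable[[str, str], bool],
                 peer_has_failed: Callable[[], bool]) -> None:
        # forward(key, value) -> True if the peer application acknowledged.
        self._forward = forward
        # peer_has_failed() -> True if arbitration judged the peer failed
        # (that is, the local node is the effective one).
        self._peer_has_failed = peer_has_failed

    def sync(self, key: str, value: str) -> bool:
        if self._forward(key, value):
            return True  # genuine acknowledgement from the peer
        # Forwarding failed: simulate the peer's acknowledgement only when
        # the local node has been judged effective.
        return self._peer_has_failed()


class DistributedApplication:
    def __init__(self, correspondent: Correspondent) -> None:
        self.data: Dict[str, str] = {}
        self._correspondent = correspondent

    def handle_update(self, key: str, value: str) -> str:
        self.data[key] = value  # update the local copy first
        if self._correspondent.sync(key, value):
            return "modified"
        return "no quorum"
```

The application code never learns whether the acknowledgement was genuine or simulated, which is what lets the unmodified quorum logic keep running on the effective node.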
It should be understood that the embodiments of the present invention are described using the example in which node 1 is effective and node 2 has failed, but the present invention is not limited thereto. The detection result may also be that node 1 has failed and node 2 is effective; the system handles that case in a manner similar to the case in which node 1 is effective and node 2 has failed, and details are not repeated here.
Fig. 4 is a schematic diagram of a processing method for two-node system partitioning according to another embodiment of the present invention. The two-node system in Fig. 4 comprises node 1 (401), node 2 (402), and a switch (403). Node 1 comprises Distributed Application 1 (404), Correspondent 1 (405), and a shadow process (406) of Distributed Application 2, and node 2 comprises Distributed Application 2 (407), Correspondent 2 (408), and a shadow process (409) of Distributed Application 1.
In the embodiment of the present invention, a Correspondent is added to each node of the two-node system, so that, when the two-node system fails, the Correspondent can make the Distributed Application of one node have a quorum. A quorum-based distributed system can therefore be used in a two-node system and work normally.
When the two-node system works normally, it can receive requests sent by external clients, and Distributed Application 1 and Distributed Application 2 coordinate with each other through the switch and the Correspondents to jointly complete the processing of a request. For example, when Distributed Application 1 receives a request from an external client to modify data, then, to keep the data in the two nodes consistent, Distributed Application 1 updates its own data and also sends a message to Distributed Application 2 through Correspondent 1, the switch, and Correspondent 2, requesting Distributed Application 2 to update the same data. During the update, Distributed Application 1 and Distributed Application 2 coordinate with each other to synchronize the data, and the Correspondents forward the messages related to the coordination and data synchronization. After Distributed Application 2 has successfully updated the data, it sends a response message to Distributed Application 1 through Correspondent 2 and Correspondent 1 to indicate the success. Once Distributed Application 1 has itself updated the data successfully and has received the response message from Distributed Application 2, it returns the successful result to the external client that initiated the request.
When the two-node system fails, which node is effective and which has failed may be determined by pinging the switch, pinging over the serial port, pinging the shared disk, or the like. When node 1 is effective and node 2 has failed, Distributed Application 1 can continue to work normally and Distributed Application 2 stops working. In that case, Correspondent 1 establishes a connection with the shadow process of Distributed Application 2 and starts that shadow process.
Distributed Application 1 can receive external requests to process data and process the data accordingly. At this point, Correspondent 1 needs to ensure that Distributed Application 1 believes it has a quorum, and also needs to establish the connection between Distributed Application 1 and the shadow process of Distributed Application 2 and forward data between them. In this case, Distributed Application 1 and the shadow process of Distributed Application 2 can work in coordination to process requests from external clients. The shadow process of Distributed Application 2 can simulate the work of Distributed Application 2; it cannot receive requests sent by external clients, but it can receive and process the requests or data packets forwarded by Correspondent 1. Distributed Application 1 and the shadow process of Distributed Application 2 thus form a two-node system that maintains the normal operation of the system. After they have cooperated through Correspondent 1 to complete a request from an external client, Distributed Application 1 returns the result to the client that sent the request.
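The routing role of Correspondent 1 in the shadow-process variant can be sketched as below. The sketch rests on several assumptions that are not in the source: the shadow is modelled as an in-process object rather than a separate operating-system process, and the two-step "prepare"/"commit" message set is invented purely to show a protocol in which the shadow runs the peer's real logic instead of the Correspondent faking replies.

```python
from typing import Dict, Optional


class ShadowOfApplication2:
    """Runs the peer's protocol logic locally; never serves external clients."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def on_message(self, msg: Dict[str, str]) -> Dict[str, str]:
        # Hypothetical two-step protocol, used only for illustration.
        if msg["type"] == "prepare":
            return {"type": "prepared", "key": msg["key"]}
        if msg["type"] == "commit":
            self.data[msg["key"]] = msg["value"]
            return {"type": "committed", "key": msg["key"]}
        return {"type": "error"}


class Correspondent1:
    def __init__(self) -> None:
        self.shadow: Optional[ShadowOfApplication2] = None

    def on_local_node_effective(self) -> None:
        # Called once arbitration has judged node 1 effective.
        self.shadow = ShadowOfApplication2()

    def deliver(self, msg: Dict[str, str]) -> Dict[str, str]:
        """Route a message from Distributed Application 1 to the shadow."""
        if self.shadow is None:
            raise RuntimeError("shadow process has not been started")
        return self.shadow.on_message(msg)
```

In this arrangement the Correspondent only needs to route messages; it does not need to understand the consistency protocol itself.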
This embodiment of the invention is suitable when the quorum consistency protocol of the distributed system is more complex, as is the case for some Distributed Applications. For example, a Distributed Application may hold 16 different types of data whose update logic is fairly complicated. When Distributed Application 1 receives a request that requires updating a piece of data, the update operation is divided into 5 steps, and in each step the Distributed Application must confirm with the other nodes whether to proceed, so each step sends a different message to the other nodes. If any step of the update operation goes wrong, the operation cannot continue, and after all steps are completed Distributed Application 1 must also send the updated data to the other nodes for data synchronization. Such a Distributed Application uses many message formats, and the messages within one update operation depend on one another; that is, the message of one step is generated according to the message of the previous step. A Correspondent can hardly simulate this kind of Distributed Application, so in this case it is easier to use a shadow process of the Distributed Application.
Fig. 2 to Fig. 4 above describe in detail, from the perspective of a node, the processing method for two-node system partitioning according to the embodiments of the present invention. The node according to the embodiments of the present invention is described in detail below with reference to Fig. 5 and Fig. 6.
Fig. 5 is a block diagram of a node according to an embodiment of the present invention. The node 50 of Fig. 5 comprises a Distributed Application 51 and a Correspondent 52. The node 50 is a node in a quorum-based two-node system.
The Correspondent 52 is used to determine whether the node is effective when the two-node system fails, and is further used to send the Distributed Application 51, when the node is effective, a correct response message indicating that the Distributed Application has a quorum.
In the embodiment of the present invention, a Correspondent is added to each node of the two-node system, so that, when the two-node system fails, the Correspondent can make the Distributed Application of one node have a quorum. A quorum-based distributed system can therefore be used in a two-node system and work normally.
Optionally, as an embodiment of the present invention, the Correspondent is further used, when the node where the Correspondent is located has failed, to send the Distributed Application an error response message indicating that the Distributed Application does not constitute a quorum, or to send no further messages to the Distributed Application.
Optionally, as an embodiment of the present invention, the Correspondent is further used to determine that the two-node system has failed when, within a first duration from the moment of sending a coordination message to the Correspondent of the other node in the two-node system, it does not receive the reply message that the Correspondent of the other node sends for the coordination message.
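The timeout rule for detecting a failure can be sketched with a queue that stands in for the channel on which replies from the other Correspondent arrive. The queue-based channel and the function name `system_failure_detected` are assumptions made for illustration only.

```python
import queue


def system_failure_detected(replies: "queue.Queue[str]",
                            first_duration: float) -> bool:
    """Return True if no reply to the coordination message arrives in time.

    `replies` is a hypothetical channel on which the local Correspondent
    receives reply messages from the Correspondent of the other node. The
    caller is assumed to have just sent the coordination message.
    """
    try:
        replies.get(timeout=first_duration)
        return False  # reply received: the two-node system is healthy
    except queue.Empty:
        return True   # no reply within the first duration: treat as failed
```

A `True` result is the trigger for the arbitration steps (pinging the network device, the serial port, or the shared disk) described in the text.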
Optionally, as an embodiment of the present invention, the two-node system further comprises a network device. The Correspondent is used to determine that the node has failed when it does not receive, within a second duration from the moment of sending a test packet to the network device, the response message that the network device sends for the test packet. The Correspondent is used to determine that the node is effective when it does receive that response message within the second duration.
Optionally, as an embodiment of the present invention, the two-node system further comprises a serial port connecting the node and the other node in the two-node system. The Correspondent is used to send a detection message through the serial port to the other node. The Correspondent is used to determine that the node is effective when it does not receive, within a third duration from the moment of sending the detection message, the feedback message that the other node sends for the detection message. The Correspondent is used to determine whether the node is effective according to the priority information when it does receive that feedback message within the third duration.
Optionally, as an embodiment of the present invention, the two-node system further comprises a shared disk. The Correspondent is used to write a check packet to the shared disk. The Correspondent is used to determine that the node is effective when it does not receive, within a fourth duration from the moment of sending the check packet to the Correspondent of the other node in the two-node system, the reply packet sent by the Correspondent of the other node. The Correspondent is used to determine whether the node is effective according to the priority information when it does receive that reply packet within the fourth duration.
Optionally, as an embodiment of the present invention, the node further comprises a shadow process of the Distributed Application of the other node in the two-node system. The Correspondent is used to start the shadow process of the Distributed Application of the other node when the node is effective. The Distributed Application is used to receive a request message sent by a client for requesting that data be processed, and to send the request message through the Correspondent to the shadow process of the Distributed Application of the other node. The shadow process of the Distributed Application of the other node is used to receive the request message and to process the data according to the request message.
Optionally, as an embodiment of the present invention, the Correspondent is used, when the two-node system has not failed, to receive a first data packet sent by the Distributed Application and to forward the first data packet to the Correspondent of the other node in the two-node system. Alternatively, the Correspondent is used, when the two-node system has not failed, to receive a second data packet sent by the Correspondent of the other node in the two-node system and to forward the second data packet to the Distributed Application.
The Correspondent forwards a data packet to the Distributed Application of the other node through the network device and the Correspondent of the other node in the two-node system.
Optionally, as an embodiment of the present invention, the node is a physical server or a virtual server.
The node 50 of Fig. 5 can perform each procedure of the methods shown in Fig. 2, Fig. 3, and Fig. 4. To avoid repetition, details are not described here again.
Fig. 6 is a block diagram of a node according to another embodiment of the present invention. The node 60 in Fig. 6 comprises a transmitter 61, a receiver 62, a processor 63, and a memory 64. The components of the node 60 are coupled together by a bus system 65.
The memory 64 is used to store instructions and data, and the processor 63 is used to execute the instructions stored in the memory 64. A part of the memory 64 may further comprise non-volatile random access memory (NVRAM, Non-Volatile Random Access Memory). The components of the device are coupled together by the bus system 65, which, in addition to a data bus, also comprises a power bus, a control bus, and a status signal bus. For clarity of description, however, the various buses are all labeled as the bus system 65 in the figure.
The methods disclosed in the foregoing embodiments of the present invention may be applied to the processor 63 or implemented by the processor 63. During implementation, each step of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 63 or by instructions in the form of software. The processor 63 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field programmable gate array or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or perform the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 64, and the processor 63 reads the information in the memory 64 and completes the steps of the foregoing methods in combination with its hardware.
Specifically, the processor 63 may be used to determine whether its node is effective when the two-node system fails, and to send the corresponding Distributed Application, when its node is effective, a correct response message indicating that the Distributed Application has a quorum.
In the embodiment of the present invention, a Correspondent is added to each node of the two-node system, so that, when the two-node system fails, the Correspondent can make the Distributed Application of one node have a quorum. A quorum-based distributed system can therefore be used in a two-node system and work normally.
Optionally, as an embodiment of the present invention, the transmitter 61 is used, when the node where the Correspondent is located has failed, to send the Distributed Application an error response message indicating that the Distributed Application does not constitute a quorum, or to send no further messages to the Distributed Application.
Optionally, as an embodiment of the present invention, the processor 63 is used to determine that the two-node system has failed when, within a first duration from the moment of sending a coordination message to the Correspondent of the other node in the two-node system, the reply message that the Correspondent of the other node sends for the coordination message is not received.
Optionally, in an embodiment of the present invention, the two-node system further comprises a network device. The processor 63 is configured to determine that the node is ineffective when no response message to a test packet is received from the network device within a second duration, measured from the moment the test packet is sent to the network device. The processor 63 is further configured to determine that the node is effective when a response message to the test packet is received from the network device within the second duration.
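A minimal sketch of this check follows. The probe itself is injected as a callable, because the text does not say what kind of test packet is used; `send_test_packet` and `node_effective_by_network_device` are hypothetical names, and mapping an `OSError` (which includes timeouts) to "no response" is an assumption.

```python
from typing import Callable


def node_effective_by_network_device(
        send_test_packet: Callable[[float], bool],
        second_duration: float) -> bool:
    """Decide effectiveness from a probe of the network device.

    send_test_packet(timeout) is assumed to send one test packet and
    return True only if a response arrives within `timeout` seconds.
    """
    try:
        return bool(send_test_packet(second_duration))
    except OSError:
        # Unreachable device or timeout: the node cannot reach the
        # network, so it is treated as ineffective.
        return False
```

The idea behind the rule is that a node which can still reach the switch, gateway, or router is probably not the side that lost connectivity.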
Optionally, in an embodiment of the present invention, the two-node system comprises a serial port connecting the node and the other node in the two-node system, and the transmitter 61 is configured to send a detection message to the other node through the serial port. The processor 63 is configured to determine that the node is effective when no feedback message for the detection message is received from the other node within a third duration, measured from the moment the detection message is sent to the other node. The processor 63 is further configured to determine, according to the effective-priority information of the node, whether the node is effective when a feedback message for the detection message is received from the other node within the third duration.
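The decision made after the serial-port probe can be sketched as follows. The form of the effective-priority information is not specified here, so the sketch assumes a preconfigured integer priority per node, with the higher value winning; the function and parameter names are hypothetical.

```python
def node_effective_by_serial_port(feedback_received: bool,
                                  own_priority: int,
                                  peer_priority: int) -> bool:
    """Decide effectiveness after probing the peer over the serial port.

    No feedback within the third duration means the peer is considered
    down, so this node is effective. If the peer answered, both nodes
    are alive and the (assumed) integer priorities break the tie.
    """
    if not feedback_received:
        return True
    return own_priority > peer_priority
```

With distinct priorities, exactly one of two live nodes ends up effective, which is what prevents both sides of a partition from claiming a quorum.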
Optionally, in an embodiment of the present invention, the two-node system comprises a shared disk, and the transmitter 61 is configured to send a check data packet to the Correspondent of the other node in the two-node system. The processor 63 is configured to determine that the node is effective when no reply data packet for the check data packet is received from the Correspondent of the other node within a fourth duration, measured from the moment the check data packet is sent to the Correspondent of the other node. The processor 63 is further configured to determine, according to the effective-priority information of the node, whether the node is effective when a reply data packet is received from the Correspondent of the other node within the fourth duration.
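One way such an exchange could look is sketched below, under the assumption that the check data packet and the reply data packet are exchanged as small files in a directory on the shared disk. The file naming, the polling interval, and the boolean `has_priority` flag are all hypothetical; the passage above does not state how the packets travel.

```python
import os
import time


def write_check_packet(shared_dir: str, sender: str) -> None:
    """Sender side: leave a check packet on the shared disk."""
    with open(os.path.join(shared_dir, f"check_from_{sender}"), "w") as f:
        f.write(str(time.time()))


def answer_check_packet(shared_dir: str, sender: str, replier: str) -> bool:
    """Peer side: if a check packet from `sender` exists, write a reply."""
    if not os.path.exists(os.path.join(shared_dir, f"check_from_{sender}")):
        return False
    with open(os.path.join(shared_dir, f"reply_from_{replier}"), "w") as f:
        f.write("alive")
    return True


def peer_replied(shared_dir: str, peer: str,
                 fourth_duration: float, poll: float = 0.01) -> bool:
    """Poll the shared disk for the peer's reply until the deadline."""
    reply_path = os.path.join(shared_dir, f"reply_from_{peer}")
    deadline = time.monotonic() + fourth_duration
    while time.monotonic() < deadline:
        if os.path.exists(reply_path):
            return True
        time.sleep(poll)
    return os.path.exists(reply_path)


def node_effective_by_shared_disk(shared_dir: str, peer: str,
                                  fourth_duration: float,
                                  has_priority: bool) -> bool:
    """No reply: this node is effective. Reply: priority decides."""
    if not peer_replied(shared_dir, peer, fourth_duration):
        return True
    return has_priority
```

A real deployment would also need to clear stale reply files between rounds; that bookkeeping is omitted to keep the sketch short.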
Optionally, in an embodiment of the present invention, the node further comprises a shadow process of the distributed application of the other node in the two-node system, and the processor 63 is configured to start the shadow process of the distributed application of the other node when the node is effective. The receiver 62 is configured to receive a request message that is sent by a client and requests processing of data, and the transmitter 61 is configured to send the request message, through the Correspondent, to the shadow process of the distributed application of the other node. The receiver 62 is further configured to receive the request message, and the processor 63 is further configured to process the data according to the request message.
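The shadow-process flow can be sketched with three small classes. The class and method names are hypothetical, and modelling "processing the data" as storing a key/value pair is purely an assumption made so the example has something observable to do.

```python
from typing import Dict, Optional, Tuple

Request = Tuple[str, str]  # assumed request shape: (key, value)


class ShadowProcess:
    """Local stand-in for the other node's distributed application."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def handle(self, request: Request) -> str:
        key, value = request
        self.store[key] = value  # "process the data" (assumed meaning)
        return "ack"


class Correspondent:
    def __init__(self) -> None:
        self.shadow: Optional[ShadowProcess] = None

    def on_node_effective(self) -> None:
        """Start the shadow process once the node is found effective."""
        if self.shadow is None:
            self.shadow = ShadowProcess()

    def forward_to_shadow(self, request: Request) -> str:
        if self.shadow is None:
            raise RuntimeError("shadow process has not been started")
        return self.shadow.handle(request)


class DistributedApplication:
    def __init__(self, correspondent: Correspondent) -> None:
        self.correspondent = correspondent

    def on_client_request(self, request: Request) -> str:
        """Relay the client's request to the peer's shadow process."""
        return self.correspondent.forward_to_shadow(request)
```

In this sketch the application keeps talking to "the other node" through the Correspondent as before; only the Correspondent knows that the counterpart is now a local shadow process.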
Optionally, in an embodiment of the present invention, the receiver 62 is configured to receive, when the two-node system has not failed, a first data packet sent by the distributed application, and the transmitter 61 is configured to forward the first data packet to the Correspondent of the other node in the two-node system. Alternatively, the receiver 62 is configured to receive, when the two-node system has not failed, a second data packet sent by the Correspondent of the other node in the two-node system, and the transmitter 61 is configured to forward the second data packet to the distributed application.
The data packet is forwarded to the distributed application of the other node via the network device and the Correspondent of the other node in the two-node system.
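In normal operation the Correspondent therefore behaves as a transparent relay between the two distributed applications. A minimal in-memory sketch, with hypothetical class and method names and with the network device reduced to a direct method call, might look like this:

```python
from typing import List, Optional


class DistributedApplication:
    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: List[bytes] = []


class Correspondent:
    def __init__(self, app: DistributedApplication) -> None:
        self.app = app
        self.peer: Optional["Correspondent"] = None

    def from_local_app(self, packet: bytes) -> None:
        """First data packet: local application -> peer Correspondent."""
        if self.peer is None:
            raise RuntimeError("no peer Correspondent is connected")
        self.peer.from_peer(packet)

    def from_peer(self, packet: bytes) -> None:
        """Second data packet: peer Correspondent -> local application."""
        self.app.inbox.append(packet)


def connect(a: Correspondent, b: Correspondent) -> None:
    """Stand-in for the network path between the two Correspondents."""
    a.peer, b.peer = b, a
```

Because every packet passes through the Correspondent even when nothing is wrong, the Correspondent is already in the path when a fault occurs and can start answering quorum queries itself.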
The node 60 in Fig. 6 can perform each procedure of the methods shown in Fig. 2, Fig. 3, and Fig. 4. To avoid repetition, details are not described here again.
It should be understood that the network device in the embodiments of the present invention may be a switch, a gateway, a router, or the like; the present invention does not limit this.
The node in the embodiments of the present invention may be a server. The server may be a physical server or a virtual server. The embodiments of the present invention do not limit this.
It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present invention. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification does not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in one or more embodiments in any suitable manner.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the foregoing processes do not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
It should be understood that, in the embodiments of the present invention, "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood that determining B according to A does not mean that B is determined only according to A; B may also be determined according to A and/or other information.
It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, A and B both exist, and B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
A person of ordinary skill in the art may be aware that the method steps and units described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly describe the interchangeability of hardware and software, the steps and composition of each embodiment have been described above generally in terms of function. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.
The methods or steps described with reference to the embodiments disclosed herein may be implemented by hardware, by a software program executed by a processor, or by a combination of the two. The software program may be placed in a random access memory (RAM), an internal memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
Although the present invention has been described in detail with reference to the accompanying drawings and in combination with preferred embodiments, the present invention is not limited thereto. Without departing from the spirit and essence of the present invention, a person of ordinary skill in the art may make various equivalent modifications or replacements to the embodiments of the present invention, and these modifications or replacements shall all fall within the scope of the present invention.

Claims (18)

1. A processing method for partitioning of a two-node system, wherein the method is applied to a quorum-based two-node system, and a node in the two-node system comprises a Correspondent and a distributed application, characterized in that the method comprises:
when the two-node system fails, determining, by the Correspondent, whether the node where the Correspondent is located is effective; and
when the node where the Correspondent is located is effective, sending, by the Correspondent, to the distributed application of the node where the Correspondent is located, a correct response message indicating that the distributed application has a quorum.
2. The method according to claim 1, characterized in that the method further comprises:
when the node where the Correspondent is located is ineffective, sending, by the Correspondent, to the distributed application an error response message indicating that the distributed application does not have a quorum, or no longer sending messages to the distributed application.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
sending, by the Correspondent, a coordination message to the Correspondent of the other node in the two-node system; and
determining, by the Correspondent, that the two-node system has failed when no reply message to the coordination message is received from the Correspondent of the other node within a first duration, measured from the moment the coordination message is sent to the Correspondent of the other node.
4. The method according to any one of claims 1 to 3, characterized in that the two-node system further comprises a network device, and the determining, by the Correspondent, whether the node where the Correspondent is located is effective comprises:
sending, by the Correspondent, a test packet to the network device;
determining, by the Correspondent, that the node is ineffective when no response message to the test packet is received from the network device within a second duration, measured from the moment the test packet is sent to the network device; and
determining, by the Correspondent, that the node is effective when a response message to the test packet is received from the network device within the second duration, measured from the moment the test packet is sent to the network device.
5. The method according to any one of claims 1 to 3, characterized in that the two-node system further comprises a serial port connecting the node and the other node in the two-node system, and the determining, by the Correspondent, whether the node where the Correspondent is located is effective comprises:
sending, by the Correspondent, a detection message to the other node through the serial port;
determining, by the Correspondent, that the node is effective when no feedback message for the detection message is received from the other node within a third duration, measured from the moment the detection message is sent to the other node; and
determining, by the Correspondent according to the effective-priority information of the node, whether the node is effective when a feedback message for the detection message is received from the other node within the third duration, measured from the moment the detection message is sent to the other node.
6. The method according to any one of claims 1 to 3, characterized in that the two-node system further comprises a shared disk, and the determining, by the Correspondent, whether the node where the Correspondent is located is effective comprises:
sending, by the Correspondent, a check data packet to the Correspondent of the other node in the two-node system;
determining, by the Correspondent, that the node is effective when no reply data packet for the check data packet is received from the Correspondent of the other node within a fourth duration, measured from the moment the check data packet is sent to the Correspondent of the other node; and
determining, by the Correspondent according to the effective-priority information of the node, whether the node is effective when a reply data packet for the check data packet is received from the Correspondent of the other node within the fourth duration, measured from the moment the check data packet is sent to the Correspondent of the other node.
7. The method according to any one of claims 1 to 6, characterized in that the node further comprises a shadow process of the distributed application of the other node in the two-node system, and the method further comprises:
starting, by the Correspondent, the shadow process of the distributed application of the other node when the node is effective;
receiving, by the distributed application, a request message that is sent by a client and requests processing of data, and sending the request message, through the Correspondent, to the shadow process of the distributed application of the other node; and
receiving, by the shadow process of the distributed application of the other node, the request message, and processing the data according to the request message.
8. The method according to any one of claims 1 to 7, characterized in that the method further comprises:
when the two-node system has not failed, receiving, by the Correspondent, a first data packet sent by the distributed application, and forwarding the first data packet to the Correspondent of the other node in the two-node system; or
when the two-node system has not failed, receiving, by the Correspondent, a second data packet sent by the Correspondent of the other node in the two-node system, and forwarding the second data packet to the distributed application.
9. The processing method according to any one of claims 1 to 8, characterized in that the node is a physical server or a virtual server.
10. A node, wherein the node belongs to a quorum-based two-node system, characterized in that:
the node comprises a distributed application and a Correspondent;
the Correspondent is configured to determine, when the two-node system fails, whether the node is effective; and
the Correspondent is further configured to send, when the node is effective, a correct response message to the distributed application indicating that the distributed application has a quorum.
11. The node according to claim 10, characterized in that:
the Correspondent is further configured to, when the node where the Correspondent is located is ineffective, send the distributed application an error response message indicating that the distributed application does not have a quorum, or no longer send messages to the distributed application.
12. The node according to claim 10 or 11, characterized in that:
the Correspondent is further configured to determine that the two-node system has failed when no reply message to a coordination message is received from the Correspondent of the other node within a first duration, measured from the moment the coordination message is sent to the Correspondent of the other node in the two-node system.
13. The node according to any one of claims 10 to 12, characterized in that:
the two-node system further comprises a network device;
the Correspondent is configured to determine that the node is ineffective when no response message to a test packet is received from the network device within a second duration, measured from the moment the test packet is sent to the network device; and
the Correspondent is configured to determine that the node is effective when a response message to the test packet is received from the network device within the second duration, measured from the moment the test packet is sent to the network device.
14. The node according to any one of claims 10 to 12, characterized in that:
the two-node system further comprises a serial port connecting the node and the other node in the two-node system;
the Correspondent is configured to send a detection message to the other node through the serial port;
the Correspondent is configured to determine that the node is effective when no feedback message for the detection message is received from the other node within a third duration, measured from the moment the detection message is sent to the other node; and
the Correspondent is configured to determine, according to the effective-priority information of the node, whether the node is effective when a feedback message for the detection message is received from the other node within the third duration, measured from the moment the detection message is sent to the other node.
15. The node according to any one of claims 10 to 12, characterized in that:
the two-node system further comprises a shared disk;
the Correspondent is configured to send a check data packet to the Correspondent of the other node in the two-node system;
the Correspondent is configured to determine that the node is effective when no reply data packet for the check data packet is received from the Correspondent of the other node within a fourth duration, measured from the moment the check data packet is sent to the Correspondent of the other node; and
the Correspondent is configured to determine, according to the effective-priority information of the node, whether the node is effective when a reply data packet is received from the Correspondent of the other node within the fourth duration, measured from the moment the check data packet is sent to the Correspondent of the other node.
16. The node according to any one of claims 10 to 15, characterized in that:
the node further comprises a shadow process of the distributed application of the other node in the two-node system;
the Correspondent is configured to start the shadow process of the distributed application of the other node when the node is effective;
the distributed application is configured to receive a request message that is sent by a client and requests processing of data, and to send the request message, through the Correspondent, to the shadow process of the distributed application of the other node; and
the shadow process of the distributed application of the other node is configured to receive the request message and process the data according to the request message.
17. The node according to any one of claims 10 to 16, characterized in that:
the Correspondent is configured to, when the two-node system has not failed, receive a first data packet sent by the distributed application and forward the first data packet to the Correspondent of the other node in the two-node system; or
the Correspondent is configured to, when the two-node system has not failed, receive a second data packet sent by the Correspondent of the other node in the two-node system and forward the second data packet to the distributed application.
18. The node according to any one of claims 10 to 17, characterized in that the node is a physical server or a virtual server.
CN201510121396.XA 2015-03-19 2015-03-19 Processing method for two-node system partitioning and node Active CN104702693B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510121396.XA CN104702693B (en) 2015-03-19 2015-03-19 Processing method for two-node system partitioning and node

Publications (2)

Publication Number Publication Date
CN104702693A true CN104702693A (en) 2015-06-10
CN104702693B CN104702693B (en) 2018-01-23

Family

ID=53349451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510121396.XA Active CN104702693B (en) 2015-03-19 2015-03-19 Processing method for two-node system partitioning and node

Country Status (1)

Country Link
CN (1) CN104702693B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1882935A (en) * 2003-12-23 2006-12-20 思科技术公司 Providing location-specific services to a mobile node
WO2010013092A1 (en) * 2008-07-30 2010-02-04 Telefonaktiebolaget Lm Ericsson (Publ) Systems and method for providing trusted system functionalities in a cluster based system
CN103718533A (en) * 2013-06-29 2014-04-09 华为技术有限公司 Zoning balance subtask issuing method, apparatus and system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107171849A (en) * 2017-05-31 2017-09-15 郑州云海信息技术有限公司 The failure monitoring method and device of a kind of cluster virtual machine
CN107171849B (en) * 2017-05-31 2020-03-31 郑州云海信息技术有限公司 Fault monitoring method and device for virtual machine cluster
CN107403003A (en) * 2017-07-21 2017-11-28 南京智网云联信息科技有限公司 A kind of distributed copies file referee method
CN109218141A (en) * 2018-11-20 2019-01-15 郑州云海信息技术有限公司 A kind of malfunctioning node detection method and relevant apparatus

Also Published As

Publication number Publication date
CN104702693B (en) 2018-01-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant