CN101247273B - Maintenance method of service cooperated node organization structure in distributed environment - Google Patents


Info

Publication number
CN101247273B
Authority
CN
China
Prior art keywords
node
model
local structure
neighbor node
announcement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008101009864A
Other languages
Chinese (zh)
Other versions
CN101247273A (en)
Inventor
马殿富
刘敏
赵永望
赵博
韩军
黄永刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN2008101009864A priority Critical patent/CN101247273B/en
Publication of CN101247273A publication Critical patent/CN101247273A/en
Application granted granted Critical
Publication of CN101247273B publication Critical patent/CN101247273B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Computer And Data Communications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A maintenance method for the organization structure of service cooperation nodes in a distributed environment comprises the following steps: (1) initialization, in which the neighborhood range that each node must maintain is determined by computing the maintainability probability; (2) within that neighborhood range, maintenance for a new node joining, a node leaving voluntarily, and a node failing. Step (1) provides three methods for computing the maintainability probability, one for each cooperation structure: linear, ring, and tree. The invention has no single point of failure, and it does not generate excessive communication traffic, so it does not add to network load.

Description

Maintenance method for the organization structure of service cooperation nodes in a distributed environment
Technical field
The present invention relates to the field of dynamic Web service cooperation in open environments, and in particular to a maintenance method for the organization structure of service cooperation nodes in a distributed environment.
Background
The development of computer and information technology has been an evolution from centralized to distributed, from uniform to diverse, and from mutually independent to continuously converging. Yet computing and information still exist as isolated islands. In today's open, dynamic, and complex computing environment, traditional computing techniques struggle to achieve full sharing of computing power and information resources. Service-oriented computing (SOC) and service-oriented architecture (SOA) have been widely accepted as the computing paradigm and architecture for building the next generation of distributed applications. SOA captures the structural and compositional characteristics of distributed software systems. It describes three roles (service provider, service requester, and service broker) and the relationships among them, and it describes the basic interactions among the three through three basic operations: publish, find, and bind. Web services are one concrete realization of service-oriented architecture and are built on a set of open standards, including XML, SOAP, WSDL, and UDDI.
In SOA-based distributed applications, the service is the basic building block, and complex service-oriented applications are realized through interaction and cooperation among services. Because Web services are inherently distributed, coordinating a large number of service nodes in a distributed environment requires organizing the cooperating nodes according to their functional characteristics and establishing association relations between nodes so that they form a definite topology.
In a dynamic network environment, the stability and reliability of cooperating nodes are often unpredictable: a node may join or leave at any time, or may even fail. Such events break the association relations between nodes and the topology that the nodes form. Two distributed structure maintenance methods are in common use.
In the first method, the distributed structure has a master node that maintains the global topology graph. When a new node wants to join the structure, it sends a request to the master node; after accepting the request, the master node places the new node and updates the global topology graph. When a node wants to leave the structure, it notifies the master node by message; the master node removes the relations between the leaving node and its neighbor nodes, establishes new connections among the affected nodes according to the structural characteristics, and updates the global topology graph. When a node discovers a failed node, it sends an error report to the master node; the master node either removes the failed node directly or finds a suitable node to replace it, and updates the global topology graph. In this method, failure of the master node leaves the entire distributed structure unmaintainable, so the method suffers from a single point of failure.
In the second method, the distributed structure has no master node. Every node takes part in structure maintenance on an equal footing and keeps a copy of the global topology graph. When a new node wants to join the structure, it may send a request to any node; the node that receives the request places the new node and broadcasts a message to all nodes in the structure so that each node updates its copy of the global topology graph. When a node wants to leave the structure, it notifies its neighbor nodes by message and removes its relations with them; the neighbor nodes may need to establish new connections among themselves according to the structural characteristics, and a message is broadcast to all nodes in the structure so that each node updates its copy. When a node discovers a failed node, the discovering node is responsible for repairing the structure: it removes the failed node directly or finds a suitable node to replace it, and broadcasts a message to all nodes in the structure so that each node updates its copy. In this method, every change to the structure requires all nodes to update their copies of the global topology graph. This generates a large volume of message traffic and increases network load. In addition, frequent messaging may lead to more message loss, which in turn can leave the copies of the global topology graph held by different nodes inconsistent.
Summary of the invention
Technical problem solved by the invention: to overcome the deficiencies of the prior art by providing a maintenance method for the organization structure of service cooperation nodes in a distributed environment. The method has no single point of failure, does not increase network load, and does not generate excessive traffic.
Technical solution of the invention: the maintenance method for the organization structure of service cooperation nodes in a distributed environment comprises the following steps:
(1) Initialization: according to the computation method for the maintainability probability, determine the neighborhood range that each node needs to maintain, that is, the radius of its known neighborhood, as shown in Fig. 1.
(2) Within that neighborhood range, perform maintenance for a new node joining, a node leaving voluntarily, and a node failing.
In step (1), the maintainability probability of a linear structure is computed as follows:
Maintainability probability of the linear structure:
[Formula image: Figure S2008101009864D00021]
where k is the radius of the known neighborhood, that is, the maximum distance between the current node and any node in its known neighborhood; q is the probability that a single node fails; and n is the total number of nodes in the linear structure.
In step (1), the maintainability probability of a ring structure is computed as follows:
Maintainability probability of the ring structure:
[Formula image: Figure S2008101009864D00022]
where k is the radius of the known neighborhood, that is, the maximum distance between the current node and any node in its known neighborhood; q is the probability that a single node fails; and n is the total number of nodes in the ring structure.
In step (1), the maintainability probability of a tree structure is computed as follows:
[Formula image: Figure S2008101009864D00031]
$$M_T^k(R) = \Pr\left(E_M^k(R, k)\right),$$
where:
$E_M^k(X, d)$ is the event that $|L_F^{\max}(X)| < d$ ($d \le k$) and either $T(X)$ has no subtrees or all subtrees of $T(X)$ are k-maintainable. $L_F(X)$ is a failure chain of node X, that is, a path from X to a node Z, $Z \in T(X)$, denoted
[Formula image: Figure S2008101009864D00034]
that satisfies two conditions: (1) all nodes on the path have failed; (2) Z is a leaf node or the child nodes of Z are normal nodes. In particular, when X is a normal node, X has only one failure chain, the empty chain, denoted $\varphi$. In general, node X has one or more failure chains. The set of failure chains of node X is denoted $S_{LF}(X) = \{L_F(X) \mid L_F(X) \text{ is a failure chain of node } X\}$, and the longest failure chain in this set is denoted $L_F^{\max}(X)$;
$q(X)$ is the probability that node X fails;
$p(X)$ is the probability that node X is valid, $p(X) = 1 - q(X)$;
$M_T^k(X)$ is the probability that the tree $T(X)$ is k-maintainable (reliable). The tree $T(X)$ is k-maintainable if no path from the root node X to a leaf node of $T(X)$ contains k or more consecutive failed nodes;
$T(X)$ is the tree structure whose root node is X;
$Childs(X)$ is the set of all child nodes of node X;
d is the variable in $E_M^k(X, d)$; one of the conditions for the event $E_M^k(X, d)$ to occur is $|L_F^{\max}(X)| < d$ ($d \le k$); Y ranges over the child nodes of X, $Y \in Childs(X)$; k is the radius of the known neighborhood, that is, the maximum distance between the current node and any node in its known neighborhood; and R is the root node of the tree structure.
In step (2), maintenance for a new node joining proceeds as follows:
First step: place the new node. When a node receives a join request from a new node, it arranges a suitable position for the new node according to the new node's function.
Second step: initialize the new node's structure model. The new node obtains the local structure models of its adjacent nodes and uses this information to initialize its own local structure model.
Third step: reconstruct the structure. The new node establishes association relations with the nodes around its insertion position, and, to keep the structural characteristics unchanged, the corresponding original local association relations are removed.
Fourth step: update the local structure models of the affected nodes. The new node sends a join announcement that contains its local structure model information. When a node receives the announcement, it checks the consistency of its own local structure model against the announcement. If the model is inconsistent, the node updates its model, replaces the announcement content with its own local structure model, and continues to forward the announcement.
In step (2), maintenance for a node leaving voluntarily proceeds as follows:
First step: reconstruct the structure. The node that wants to leave is deleted together with its association relations with its neighbor nodes. To keep the structural characteristics unchanged, new association relations are established as needed among the neighbor nodes of the leaving node.
Second step: update the local structure models of the affected nodes. The leaving node sends a leave announcement that contains its local structure model information. When a node receives the announcement, it checks the consistency of its own local structure model against the announcement. If the model is inconsistent, the node updates its model, replaces the announcement content with its own local structure model, and continues to forward the announcement.
In step (2), maintenance for a node failure proceeds as follows:
First step: observe node state. Two modes are available: (1) Probe mode: each node periodically sends an inquiry message to its neighbor nodes and expects a response message from each of them. If no response message is received from a neighbor node within the timeout, the node observes that neighbor to be in the failed state; otherwise, it observes that neighbor to be in the valid state. (2) Gossip mode: each node periodically sends a heartbeat message to its neighbor nodes. If a node receives no heartbeat message from a neighbor node within one period, it observes that neighbor to be in the failed state; otherwise, it observes that neighbor to be in the valid state.
Second step: judge whether a node has failed. Two kinds of method are available: (1) Independent judgment: a node judges whether a neighbor node has failed solely from its own observations of that neighbor's state. (2) Cooperative judgment: each node has a corresponding observation group made up of all or some of its neighbor nodes. The members of the group observe this target node and share some or all of their observation results within the group. A node judges whether the target node has failed from its own observation results together with the results shared by the other nodes in the group.
Third step: reconstruct the structure. After a node judges that one of its neighbor nodes has failed, it uses its known local structure model to try connecting to the other nodes within its known neighborhood range in turn, nearest first. If it successfully connects to a valid node, it proceeds directly to the fourth step. If every node it can try has failed, the structure cannot maintain itself and manual intervention is required.
Fourth step: update the local structure models of the affected nodes. The node that initiates the reconstruction, which is also the node that discovered the failure, sends a reconstruction announcement that contains its local structure model information. When a node receives the announcement, it checks the consistency of its own local structure model against the announcement. If the model is inconsistent, the node updates its model, replaces the announcement content with its own local structure model, and continues to forward the announcement.
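The nearest-first reconnection attempt in the third step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as implemented by the inventors: the function name `pick_repair_target`, the `known_distance` mapping, and the `try_connect` callback are all hypothetical, and ties in distance are broken by node name only to make the order deterministic.

```python
from typing import Callable, Dict, Optional


def pick_repair_target(
    known_distance: Dict[str, int],
    failed: str,
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable node of the known neighborhood, nearest first.

    known_distance maps each node in the local structure model to its distance
    from the current node (hypothetical representation). try_connect stands in
    for a real connection attempt. None means every candidate has failed, so
    the structure cannot repair itself and manual intervention is needed.
    """
    candidates = sorted(
        (name for name in known_distance if name != failed),
        key=lambda name: (known_distance[name], name),
    )
    for name in candidates:
        if try_connect(name):
            return name
    return None
```

With a known neighborhood of radius 2, a node whose parent B has failed would try its grandparent next; a larger radius simply gives the loop more candidates before it gives up.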
Principle of the invention: the invention proposes a structure-based organization mechanism for service cooperation nodes and, to address the dynamic nature of nodes, gives a distributed structure maintenance method that succeeds with a certain probability. To coordinate the cooperative work of service nodes in a distributed environment, the invention organizes the large number of originally independent, autonomous service nodes scattered across the Internet. It establishes association relations between nodes according to their functional characteristics (for example, grouping service nodes with similar functions) so as to form a series of interconnected service organizations with a definite structure (linear, ring, or tree). Because nodes are dynamic (they join, leave, and in particular may fail), the association relations between service nodes can be broken, so the structure of the service organization must be maintained. Since the service organization has no master node, and in order to avoid requiring every node to maintain the global structure model, each node maintains a local view of the structure, namely the information about the nodes within a certain neighborhood range. This information is called the local structure model. When a node joins or leaves voluntarily, only the nodes within the neighborhood are affected and need to modify their local structure models. When nodes fail, a node repairs the structure according to the local structure model it holds. Because a node knows only its local structure model, not every failure can be repaired: for example, when all nodes in the known neighborhood fail at the same time, the node cannot connect to any other valid node in the service organization, and the structure can no longer be maintained. Clearly, the maintainability probability of the structure depends on the range of the local structure model known to each node: the more neighborhood nodes a node knows, the higher the maintainability probability, but also the higher the cost of keeping each node's structure model consistent.
Compared with the prior art, the invention has the following advantages:
(1) The invention needs no master node for structure maintenance, which avoids the problems of centralized control such as a single point of failure and performance bottlenecks.
(2) The invention maintains local structure models and guarantees the maintainability of the structure with a certain probability. Compared with methods in which every node maintains the global structure model, it gives up some maintainability probability but greatly reduces maintenance cost.
Description of drawings
Fig. 1 is a flowchart of the implementation of the invention;
Fig. 2 shows the tree-structured service organization in Embodiment 1;
Fig. 3 shows a new node joining the tree structure in Embodiment 1;
Fig. 4 shows the tree structure after reconstruction by parent-node substitution in Embodiment 1;
Fig. 5 shows the tree structure after reconstruction by child-node substitution in Embodiment 1;
Fig. 6 shows the reconstruction of the tree structure after a node failure in Embodiment 1;
Fig. 7 shows the linear-structured service organization in Embodiment 2;
Fig. 8 shows a new node joining the linear structure in Embodiment 2;
Fig. 9 shows a node leaving the linear structure voluntarily in Embodiment 2;
Fig. 10 shows the reconstruction of the linear structure after a node failure in Embodiment 2;
Fig. 11 shows the ring-structured service organization in Embodiment 3;
Fig. 12 shows a new node joining the ring structure in Embodiment 3;
Fig. 13 shows a node leaving the ring structure voluntarily in Embodiment 3;
Fig. 14 shows the reconstruction of the ring structure after a node failure in Embodiment 3.
Embodiment
Embodiment 1
The technical approach described above is illustrated below using a tree structure as an example. The cooperating nodes form the tree structure shown in Fig. 2. Each node in the tree reports data to its superior node (parent node). Let L(X) denote the level of any node X; then L(B) > L(N) > L(D) ≥ L(C). Let the probability that a single node fails be q = 0.01, and require the maintainability probability of the whole structure to be no less than 0.99.
The maintainability probability of the tree structure rooted at A in Fig. 2 is computed for k = 1, 2, ... as follows:
$$M_T^1(A) = \Pr(E_M^1(A,1)) = 0.99 \cdot \Pr(E_M^1(B,1)) + 0.01 \cdot \Pr(E_M^1(B,0)) = 0.99 \cdot \Pr(E_M^1(B,1)) = 0.9509900499$$
$$\Pr(E_M^1(B,1)) = 0.99 \cdot \Pr(E_M^1(C,1)) \cdot \Pr(E_M^1(D,1)) + 0.01 \cdot \Pr(E_M^1(C,0)) \cdot \Pr(E_M^1(D,0)) = 0.99 \cdot \Pr(E_M^1(C,1)) \cdot \Pr(E_M^1(D,1)) = 0.96059601$$
$$\Pr(E_M^1(C,1)) = 0.99$$
$$\Pr(E_M^1(D,1)) = 0.99 \cdot \Pr(E_M^1(E,1)) + 0.01 \cdot \Pr(E_M^1(E,0)) = 0.99 \times 0.99 = 0.9801$$
$$M_T^2(A) = \Pr(E_M^2(A,2)) = 0.99 \cdot \Pr(E_M^2(B,2)) + 0.01 \cdot \Pr(E_M^2(B,1)) = 0.9992139102$$
$$\Pr(E_M^2(B,1)) = 0.99 \cdot \Pr(E_M^2(C,1)) \cdot \Pr(E_M^2(D,1)) + 0.01 \cdot \Pr(E_M^2(C,0)) \cdot \Pr(E_M^2(D,0)) = 0.99 \cdot \Pr(E_M^2(C,1)) \cdot \Pr(E_M^2(D,1)) = 0.96059601$$
$$\Pr(E_M^2(B,2)) = 0.99 \cdot \Pr(E_M^2(C,2)) \cdot \Pr(E_M^2(D,2)) + 0.01 \cdot \Pr(E_M^2(C,1)) \cdot \Pr(E_M^2(D,1)) = 0.989901 + 0.00970299 = 0.99960399$$
$$\Pr(E_M^2(C,1)) = 0.99$$
$$\Pr(E_M^2(C,2)) = 1$$
$$\Pr(E_M^2(D,1)) = 0.99 \cdot \Pr(E_M^2(E,1)) + 0.01 \cdot \Pr(E_M^2(E,0)) = 0.99 \cdot \Pr(E_M^2(E,1)) = 0.9801$$
$$\Pr(E_M^2(D,2)) = 0.99 \cdot \Pr(E_M^2(E,2)) + 0.01 \cdot \Pr(E_M^2(E,1)) = 0.9999$$
$$\Pr(E_M^2(E,1)) = 0.99$$
$$\Pr(E_M^2(E,2)) = 1$$
Because $M_T^1(A) < 0.99 < M_T^2(A)$, k = 2 is the smallest radius for which the maintainability probability is no less than 0.99. Therefore, in this example, each node's local structure model records information about its grandparent node (if any) in addition to its child nodes and parent node.
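The numbers in this calculation can be reproduced with a short recursive sketch. The recurrence below is inferred from the worked values of this example, not copied from the formula image, so it should be read as an assumption: Pr(E(X, 0)) = 0, and for d ≥ 1, Pr(E(X, d)) = p · ∏ Pr(E(Y, d)) + q · ∏ Pr(E(Y, d − 1)) over the children Y of X, where the failed branch contributes nothing when d − 1 = 0. The names `TREE`, `event_probability`, and `tree_maintainability` are hypothetical, and `TREE` encodes the shape implied by the example (A above B, B above C and D, D above E).

```python
from math import prod
from typing import Dict, List

# Tree implied by the worked example: A -> B -> {C, D}, D -> E.
TREE: Dict[str, List[str]] = {"A": ["B"], "B": ["C", "D"], "C": [], "D": ["E"], "E": []}


def event_probability(tree: Dict[str, List[str]], node: str, d: int, q: float) -> float:
    """Pr(E_M^k(node, d)) under the recurrence inferred from the example."""
    if d == 0:
        return 0.0
    children = tree[node]
    # Node is valid: each child is evaluated with the same d.
    valid = (1.0 - q) * prod(event_probability(tree, c, d, q) for c in children)
    # Node has failed: one unit of the allowed chain length is used up.
    failed = 0.0
    if d > 1:
        failed = q * prod(event_probability(tree, c, d - 1, q) for c in children)
    return valid + failed


def tree_maintainability(tree: Dict[str, List[str]], root: str, k: int, q: float) -> float:
    """M_T^k(root) = Pr(E_M^k(root, k))."""
    return event_probability(tree, root, k, q)
```

Calling `tree_maintainability(TREE, "A", 1, 0.01)` and `tree_maintainability(TREE, "A", 2, 0.01)` gives values that agree with 0.9509900499 and 0.9992139102 up to floating-point rounding, which is what leads to the choice k = 2.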
Three cases are described below to illustrate the maintenance process for this tree structure.
(1) New node N joins
First step: after node B receives the join request from node N (which contains information such as N's address and level), it compares its own level with that of N. Because L(N) < L(B), node B then compares N with all of its child nodes. Because L(N) > L(C) and L(N) > L(D), N is placed below node B and above nodes C and D.
Second step: node N obtains B's local structure model from node B, from which N learns about its grandparent node A. This completes the initialization of N's own local structure model: N.grandparent=A, N.parent=B, N.childs={C, D}.
Third step: node N connects to B as B's child node and to C and D as their parent node, and the parent-child relations between B and C, D are removed, as shown in Fig. 3. At the same time, the corresponding parts of the local structure models of B, C, and D are modified (B.childs={N}, C.parent=N, D.parent=N).
Fourth step: node N sends a join announcement (containing N's local structure model information) to B, C, and D. According to the announcement, node B determines that its local structure model needs no update and takes no action. Node C detects the inconsistency of its local structure model and updates it (C.grandparent=B), but because C has no child nodes it does not forward the announcement. Node D detects the inconsistency of its local structure model, updates it (D.grandparent=B), and forwards the announcement (containing D's local structure model) to E. Node E likewise detects the inconsistency of its local structure model, updates it (E.grandparent=N), and does not forward the announcement further.
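The four steps of this join case can be traced with a small in-memory sketch. It is an illustration under stated assumptions rather than the patented implementation: `TreeNode`, `link`, `join_under`, and `build_example` are hypothetical names, direct object references stand in for network messages, and the numeric levels are made up so that L(B) > L(N) > L(D) ≥ L(C) holds.

```python
class TreeNode:
    def __init__(self, name, level):
        self.name = name
        self.level = level
        self.parent = None
        self.grandparent = None  # extra entry kept because k = 2
        self.children = []


def link(parent, child):
    """Attach child under parent and record the grandparent known to parent."""
    child.parent = parent
    child.grandparent = parent.parent
    parent.children.append(child)


def join_under(host, new):
    """Insert `new` below `host`; return names of nodes that updated their model."""
    # Step 1: children of host that rank below the new node move under it.
    moved = [c for c in host.children if c.level < new.level]
    # Steps 2 and 3: initialise the new node's model and rebuild local links.
    for child in moved:
        host.children.remove(child)
    link(host, new)  # sets new.parent and new.grandparent
    for child in moved:
        child.parent = new
        new.children.append(child)
    # Step 4: the join announcement travels downward; a node forwards it
    # only if its own grandparent entry turned out to be stale.
    updated, pending = [], list(new.children)
    while pending:
        node = pending.pop(0)
        expected = node.parent.parent
        if node.grandparent is not expected:
            node.grandparent = expected
            updated.append(node.name)
            pending.extend(node.children)
    return updated


def build_example():
    """Tree of the example: A -> B -> {C, D}, D -> E (levels are assumptions)."""
    nodes = {name: TreeNode(name, level)
             for name, level in [("A", 5), ("B", 4), ("C", 1), ("D", 2), ("E", 0)]}
    for parent, child in [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]:
        link(nodes[parent], nodes[child])
    return nodes
```

For example, `join_under(nodes["B"], TreeNode("N", 3))` on the tree from `build_example()` leaves N with parent B and grandparent A, and reports C, D, and E as the nodes whose grandparent entry changed, matching the fourth step.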
(2) Node B leaves voluntarily
First step: if L(D) = L(C), B chooses parent-node substitution: nodes C and D are connected under node A, the association relations between B and A, C, D are deleted, and the corresponding parts of the local structure models of A, C, and D are modified (A.childs={C, D, ...}, C.parent=A, D.parent=A), as shown in Fig. 4. Otherwise, L(D) > L(C), and B chooses child-node substitution: node D is connected under node A, node C is connected under node D, the association relations between B and A, C, D are deleted, and the corresponding parts of the local structure models of A, C, and D are modified (A.childs={D, ...}, C.parent=D, D.parent=A), as shown in Fig. 5.
Second step: node B sends a leave announcement to A, C, and D. When L(D) = L(C), according to the announcement, node A determines that its local structure model needs no update and takes no action. Node C detects the inconsistency of its local structure model and updates it (C's grandparent node is set to empty, C.grandparent=NULL), but because C has no child nodes it does not forward the announcement. Node D detects the inconsistency of its local structure model, updates it (D's grandparent node is set to empty, D.grandparent=NULL), and forwards the announcement (containing D's local structure model) to E. Node E likewise detects the inconsistency of its local structure model, updates it (E.grandparent=A), and does not forward the announcement further. When L(D) > L(C), the local structure models of the affected nodes D and E are updated similarly (D.grandparent=NULL, E.grandparent=A).
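The choice between parent-node substitution and child-node substitution can be sketched with a child-to-parent map. The representation and the names `leave` and `grandparent` are hypothetical, and the rule for more than two children (promote the single highest-level child, otherwise fall back to parent-node substitution) is an assumption that generalizes the two-child case described above.

```python
from typing import Dict, Optional


def leave(parent: Dict[str, Optional[str]], level: Dict[str, int],
          node: str) -> Dict[str, Optional[str]]:
    """Return a new child->parent map with `node` removed from the tree."""
    up = parent[node]
    # Children of the leaving node, highest level first.
    kids = sorted((c for c, p in parent.items() if p == node),
                  key=level.get, reverse=True)
    new_parent = {c: p for c, p in parent.items() if c != node}
    if len(kids) > 1 and level[kids[0]] > level[kids[1]]:
        # Child-node substitution: the highest-level child takes the place
        # of the leaving node and adopts its former siblings.
        heir = kids[0]
        new_parent[heir] = up
        for child in kids[1:]:
            new_parent[child] = heir
    else:
        # Parent-node substitution: all children move under the old parent.
        for child in kids:
            new_parent[child] = up
    return new_parent


def grandparent(parent: Dict[str, Optional[str]], node: str) -> Optional[str]:
    """Grandparent entry a node with k = 2 would store after the update."""
    p = parent.get(node)
    return parent.get(p) if p is not None else None
```

With the tree of this example and L(D) = L(C), removing B yields C and D directly under A, so their grandparent entries become empty and E's becomes A; with L(D) > L(C), D moves under A and C moves under D.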
(3) Node B fails
First step: observe node state. In Probe mode, each node sends an inquiry message ("Are you alive?") to its neighbor nodes every T_Period and expects a response message ("Yes, I am") from each of them. If no response message is received from a neighbor node within T_Timeout, the node observes that neighbor to be in the failed state; otherwise, it observes that neighbor to be in the valid state.
Second step: judge whether a node has failed. With cooperative judgment, each node keeps a counter C_Fail for each of its neighbor nodes (hereinafter the target node). The counter records the number of times the target node has been observed in the failed state, and its initial value is 0. When C_Fail = M, the node judges that the target node has failed, cancels the corresponding counter C_Fail, and proceeds to the third step. When a node in the observation group (for example, the observation group of B may consist of nodes A, C, and D) observes the target node to be in the failed state, it notifies all other nodes in the group, which speeds up their judgment of the target node's failure. When a node observes the target node to be in the valid state and its C_Fail is not 0, it notifies all other nodes in the group of its observation. A non-zero C_Fail indicates that some node in the group may have wrongly observed the target node as failed, so the observation that the target node is valid needs to be shared to avoid a wrong judgment. A node judges whether the target node has failed from its own observation results together with the results shared by the other nodes in the group. When a node observes the target node to be in the failed state, or receives a failure notice from another node in the group, it adds 1 to C_Fail. When it observes the target node to be in the valid state, or receives a valid notice from another node in the group, it resets C_Fail to 0.
Third step: reconstruct the structure. After node D finds that B has failed, D tries to connect to its grandparent node A. If A is valid, D applies to become a child node of A and is connected under A, and the corresponding parts of the local structure models of A and D are modified (A.childs={D, ...}, D.parent=A, D.grandparent=A.parent=NULL), as shown in Fig. 6.
Fourth step: update the local structure models of the affected nodes. Node D sends a reconstruction announcement (containing its local structure model information). When node E receives the announcement, it detects the inconsistency of its own local structure model and updates it (E.grandparent=A), but it does not need to forward the announcement further.
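The counter logic of the cooperative judgment in the second step above can be sketched as a small state holder. The class name `FailureJudge` and its method names are hypothetical; the threshold M and the counter C_Fail come from the text. The sketch follows the reading in which a valid observation or a valid notice resets the counter, and a failed observation or failure notice increments it.

```python
from dataclasses import dataclass


@dataclass
class FailureJudge:
    """Per-target counter for cooperative failure judgment (hypothetical helper)."""

    threshold: int   # M in the text
    c_fail: int = 0  # C_Fail in the text, initial value 0

    def on_own_observation(self, alive: bool):
        """Process this node's own probe result.

        Returns a pair (declared_failed, share_with_group).
        """
        # A failure is always shared; a valid result is shared only while
        # C_Fail is non-zero, to correct a possible false observation.
        share = (not alive) or self.c_fail != 0
        return self._update(alive), share

    def on_group_notice(self, alive: bool) -> bool:
        """Process a notice received from another member of the observation group."""
        return self._update(alive)

    def _update(self, alive: bool) -> bool:
        if alive:
            self.c_fail = 0
            return False
        self.c_fail += 1
        if self.c_fail == self.threshold:
            self.c_fail = 0  # the counter is cancelled once the target is declared failed
            return True
        return False
```

A node would keep one such object per neighbor. Because notices from the group feed the same counter as the node's own probes, a target that several members see as failed reaches the threshold M sooner than it would under independent judgment.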
Embodiment 2
The technical approach described above is illustrated below using a linear structure as an example. The cooperating nodes form the linear structure shown in Fig. 7, and each node on the line passes messages to its right neighbor. Let the probability that a single node fails be q = 0.01, and require the maintainability probability of the whole structure to be no less than 0.99.
The neighborhood range k that each node needs to maintain is determined according to the computation method for the maintainability probability.
The maintainability probability of the linear structure in Fig. 7, with a total of n = 4 nodes, is computed for k = 1, 2, ... as follows:
$$M_L^1(4, 0.01) = 0.99 \cdot M_L^1(3, 0.01) = 0.96059601$$
$$M_L^1(1, 0.01) = 0.99 \cdot M_L^1(0, 0.01) = 0.99$$
$$M_L^1(2, 0.01) = 0.99 \cdot M_L^1(1, 0.01) = 0.9801$$
$$M_L^1(3, 0.01) = 0.99 \cdot M_L^1(2, 0.01) = 0.970299$$
$$M_L^2(4, 0.01) = 0.99 \cdot \left(M_L^2(3, 0.01) + 0.01 \cdot M_L^2(2, 0.01)\right) = 0.999702$$
$$M_L^2(1, 0.01) = 1$$
$$M_L^2(2, 0.01) = 0.99 \cdot \left(M_L^2(1, 0.01) + 0.01 \cdot M_L^2(0, 0.01)\right) = 0.9999$$
$$M_L^2(3, 0.01) = 0.99 \cdot \left(M_L^2(2, 0.01) + 0.01 \cdot M_L^2(1, 0.01)\right) = 0.999801$$
Because $M_L^1(4, 0.01) < 0.99 < M_L^2(4, 0.01)$, k = 2 is the smallest radius for which the maintainability probability is no less than 0.99. Therefore, in this example, each node's local structure model records information about the right neighbor of its right neighbor (if any) in addition to its directly connected nodes.
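The values in this calculation are consistent with the recurrence $M_L^k(n, q) = 1$ for $n < k$ and $M_L^k(n, q) = (1 - q) \sum_{i=0}^{k-1} q^i M_L^k(n - 1 - i, q)$ otherwise. That recurrence is inferred from the worked numbers, not copied from the formula image, so the sketch below should be read under that assumption. The function names `linear_maintainability` and `smallest_radius` are hypothetical.

```python
from functools import lru_cache


def linear_maintainability(n: int, k: int, q: float) -> float:
    """M_L^k(n, q) under the recurrence inferred from the worked example."""

    @lru_cache(maxsize=None)
    def m(remaining: int) -> float:
        # Fewer than k nodes cannot contain k consecutive failed nodes.
        if remaining < k:
            return 1.0
        # i failed nodes (i < k) followed by one valid node, then the rest.
        return (1.0 - q) * sum(q ** i * m(remaining - 1 - i) for i in range(k))

    return m(n)


def smallest_radius(n: int, q: float, target: float) -> int:
    """Smallest k whose maintainability probability reaches `target` (target <= 1)."""
    for k in range(1, n + 2):
        if linear_maintainability(n, k, q) >= target:
            return k
    raise ValueError("target probability must not exceed 1")
```

For n = 4 and q = 0.01 the function gives values that agree with 0.96059601 for k = 1 and 0.999702 for k = 2 up to floating-point rounding, so `smallest_radius(4, 0.01, 0.99)` selects k = 2, as in the text.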
Three cases are described below to illustrate the maintenance process for this linear structure.
(1) New node N joins
First step: after node B receives the join request from node N (which contains N's address), node B accepts N as its right neighbor.
Second step: node N obtains B's local structure model from node B, from which N learns about B's original left and right neighbors. This completes the initialization of N's own local structure model: N.left=B, N.right=B.right=C, N.right.right=B.right.right=D.
Third step: node N establishes association relations with B and with B's original right neighbor C, and the association relation between B and C is removed, as shown in Fig. 8. At the same time, the corresponding parts of the local structure models of B and C are modified (B.right=N, C.left=N).
Fourth step: node N sends a join announcement (containing N's local structure model information) to B and C. According to the announcement, node C determines that its local structure model needs no update and takes no action. Node B detects the inconsistency of its local structure model, updates it (B.right.right=N.right=C), and forwards the announcement to its left neighbor A. Node A detects the inconsistency of its local structure model and updates it (A.right.right=N); because A has no left neighbor, it does not forward the announcement further.
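The join steps for the linear structure can be traced with a small in-memory sketch. It is a simplified illustration, not the patented implementation: `LineNode`, `build_line`, and `insert_right_of` are hypothetical names, the attribute `right2` stands for the entry written as `X.right.right` in the text, and direct object references stand in for the announcement messages.

```python
class LineNode:
    def __init__(self, name):
        self.name = name
        self.left = None
        self.right = None
        self.right2 = None  # right neighbor of the right neighbor (k = 2)


def build_line(names):
    """Create a line of nodes with consistent local structure models."""
    nodes = [LineNode(n) for n in names]
    for i, node in enumerate(nodes):
        node.left = nodes[i - 1] if i > 0 else None
        node.right = nodes[i + 1] if i + 1 < len(nodes) else None
        node.right2 = nodes[i + 2] if i + 2 < len(nodes) else None
    return nodes


def insert_right_of(host, new):
    """Insert `new` as right neighbor of `host`; return names of updated nodes."""
    old_right = host.right
    # Step 2: the new node initialises its model from the host's model.
    new.left, new.right, new.right2 = host, old_right, host.right2
    # Step 3: relink the host and its former right neighbor.
    host.right = new
    if old_right is not None:
        old_right.left = new
    # Step 4: the announcement travels leftwards; a node forwards it only
    # if its own two-hop entry turned out to be stale.
    updated, node = [], host
    while node is not None:
        expected = node.right.right if node.right is not None else None
        if node.right2 is expected:
            break
        node.right2 = expected
        updated.append(node.name)
        node = node.left
    return updated
```

On the line A, B, C, D, inserting N to the right of B reports B and then A as updated, with B's two-hop entry becoming C and A's becoming N, which matches the fourth step.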
(2) Node B leaves voluntarily
Step 1: The associations between B and A, C are removed, and part of the information in the local structure models of A and C is modified (A.right=C, C.left=A), as shown in Figure 9.
Step 2: Node B sends a leave notification to A and C. According to the notification, node A detects the inconsistency of its local structure model and updates the model (A.right.right=B.right.right=D); because A has no left neighbor, the notification is not forwarded further. Node C determines that its local structure model does not need updating and performs no operation.
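The leave procedure can be sketched in the same spirit. The code below is an illustrative assumption rather than the patent's implementation: the local models are plain dictionaries with hypothetical keys `left`, `right`, and `right_right`, and the update of the node to the left of A is extrapolated from the forwarding rule, since in the example A has no left neighbor.

```python
def make_line(names):
    """Build consistent k = 2 local models for nodes listed from left to right."""
    def at(i):
        return names[i] if 0 <= i < len(names) else None
    return {name: {"left": at(i - 1), "right": at(i + 1), "right_right": at(i + 2)}
            for i, name in enumerate(names)}


def leave(models, b):
    """Remove node b from the line; return the nodes updated by the leave notification."""
    mb = models.pop(b)
    a, c = mb["left"], mb["right"]
    # Step 1: A and C become direct neighbors.
    if a is not None:
        models[a]["right"] = c
    if c is not None:
        models[c]["left"] = a
    # Step 2: the notification carries B's model.
    updated = []
    if a is not None:
        # A adopts B.right.right as its new two-hop entry.
        if models[a]["right_right"] != mb["right_right"]:
            models[a]["right_right"] = mb["right_right"]
            updated.append(a)
        # A's left neighbor (if any) previously recorded B two hops away.
        z = models[a]["left"]
        if z is not None and models[z]["right_right"] != c:
            models[z]["right_right"] = c
            updated.append(z)
    return updated


models = make_line(["A", "B", "C", "D"])
print(leave(models, "B"))  # -> ['A']
```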
(3) Node B fails
Step 1: Node state observation. In Probe mode, each node sends an inquiry message ("Are you alive?") to its neighbors every $T_{Period}$ and expects a response message ("Yes, I am") from each neighbor. If no response is received from a neighbor within $T_{Timeout}$, that neighbor is observed to be in the failed state; otherwise, it is observed to be in the valid state.
Step 2: Failed node determination. The cooperative determination method is used. Each node keeps a counter $C_{Fail}$ for each of its neighbors (hereinafter the target node), recording the number of times the target node has been observed in the failed state; the initial value is 0. When $C_{Fail} = M$, the target node is judged to have failed, the corresponding counter $C_{Fail}$ is cancelled, and Step 3 is performed. When a node in the observation group (for example, the observation group of B may consist of nodes A and C) observes that the target node has failed, it notifies all other nodes in the group, which speeds up their judgment of the target node's failure. When a node observes that the target node is valid and its $C_{Fail}$ is not 0, it notifies all other nodes in the group of this observation: a nonzero $C_{Fail}$ indicates that some node in the group may have wrongly observed the target node as failed, so the observation that the target node is valid must be shared to avoid a wrong judgment. Each node judges whether the target node has failed based on its own observations together with the observations shared by the other nodes in the group. When a node observes that the target node is in the failed state, or receives a failure notification from another node in the group, it increments $C_{Fail}$ by 1. When a node observes that the target node is in the valid state, or receives a validity notification from another node in the group, it resets $C_{Fail}$ to 0.
Step 3: Structure reconstruction. After node A finds that B has failed, A tries to connect to node C (A.right.right=C). If C is valid, A requests to be connected to C as C's left neighbor, and part of the information in the local structure models of A and C is modified (A.right=C, A.right.right=D, C.left=A), as shown in Figure 10.
Step 4: Updating the local structure models of the affected nodes. Node A sends a reconstruction notification (containing its local structure model information). Node C determines that its local structure model does not need updating.
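The counter logic of the cooperative determination method in Step 2 can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `CooperativeDetector`, the method names, and the notice strings `"failed"` and `"alive"` are hypothetical, and the actual transport of notices between group members is left out.

```python
class CooperativeDetector:
    """Tracks C_Fail for one target node, as seen by one member of its observation group."""

    def __init__(self, m):
        if m < 1:
            raise ValueError("threshold M must be at least 1")
        self.m = m            # threshold M from the text
        self.c_fail = 0       # counter C_Fail, initially 0
        self.failed = False   # becomes True once C_Fail reaches M

    def _count_failure(self):
        self.c_fail += 1
        if self.c_fail >= self.m:
            self.failed = True

    def observe(self, alive):
        """Apply this node's own probe result.

        Returns the notice to share with the observation group, or None.
        """
        if self.failed:
            return None
        if alive:
            # Share the validity observation only if someone suspected a failure.
            notice = "alive" if self.c_fail != 0 else None
            self.c_fail = 0
            return notice
        self._count_failure()
        return "failed"

    def receive(self, notice):
        """Apply a notice shared by another member of the observation group."""
        if self.failed:
            return
        if notice == "failed":
            self._count_failure()
        elif notice == "alive":
            self.c_fail = 0
```

A caller would start structure reconstruction (Step 3) once `failed` becomes true. Counting shared failure notices as well as local timeouts is what lets the group reach the threshold M faster than a single observer would.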
Embodiment 3
An embodiment of the above technical approach is given below, taking the ring structure as an example. The cooperating nodes form the ring structure shown in Figure 11, and each node on the ring passes messages to its clockwise neighbor. Suppose the failure probability of a single node is q=0.01, and the maintainable probability of the whole structure is required to be not less than 0.99.
The neighborhood range k that each node needs to maintain is determined according to the calculation method of the maintainable probability.
For the ring structure shown in Figure 11, with a total of n=4 nodes, the maintainable probabilities for k=1, 2, ... are calculated as follows:
Figure S2008101009864D00101
$$M_C^{1}(4, 0.01) = 0.99 \cdot M_C^{1}(3, 0.01) = 0.96059601$$

with

$$M_C^{1}(1, 0.01) = 0.99 \cdot M_C^{1}(0, 0.01) = 0.99$$
$$M_C^{1}(2, 0.01) = 0.99 \cdot M_C^{1}(1, 0.01) = 0.9801$$
$$M_C^{1}(3, 0.01) = 0.99 \cdot M_C^{1}(2, 0.01) = 0.970299$$

and

$$M_C^{2}(4, 0.01) = 0.99 \cdot \left( M_C^{2}(3, 0.01) + 0.01 \cdot M_C^{2}(2, 0.01) \right) = 0.99960399$$

with

$$M_C^{2}(1, 0.01) = 1 - 0.01 = 0.99$$
$$M_C^{2}(2, 0.01) = 1 - 0.01^{2} = 0.9999$$
$$M_C^{2}(3, 0.01) = 0.99 \cdot \left( M_C^{2}(2, 0.01) + 0.01 \cdot M_C^{2}(1, 0.01) \right) = 0.999702$$
Because $M_C^{1}(4, 0.01) < 0.99 < M_C^{2}(4, 0.01)$, k = 2 is the smallest value for which the maintainable probability is not less than 0.99. Therefore, in the node local structure model of this example, each node records its directly connected nodes and also the clockwise neighbor of its clockwise neighbor.
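The arithmetic of this worked example can be checked with a short script. The recurrences used here are read off the numeric values of the example for k = 1 and k = 2 only; they are an assumption, not the general formula of the method, which is given as an image in the source. In particular, the k = 2 step is taken to use $M_C^{2}$ on both sides, since that is what reproduces 0.99960399. The function names are hypothetical.

```python
def ring_maintainable_probability(k, n, q):
    """Maintainable probability of an n-node ring, following the worked example.

    Only k = 1 and k = 2 are covered, because only those cases are
    spelled out numerically in the example.
    """
    p = 1.0 - q
    if n < 1:
        raise ValueError("n must be at least 1")
    if k == 1:
        # M^1(n) = p * M^1(n - 1), with M^1(0) = 1
        return p ** n
    if k == 2:
        if n == 1:
            return 1.0 - q
        if n == 2:
            return 1.0 - q ** 2
        older, newer = 1.0 - q, 1.0 - q ** 2  # M^2(1), M^2(2)
        for _ in range(3, n + 1):
            # M^2(n) = p * (M^2(n - 1) + q * M^2(n - 2))
            older, newer = newer, p * (newer + q * older)
        return newer
    raise ValueError("only k = 1 and k = 2 are covered by this sketch")


def smallest_k(n, q, target):
    """Return the smallest covered k whose probability reaches target, or None."""
    for k in (1, 2):
        if ring_maintainable_probability(k, n, q) >= target:
            return k
    return None


print(smallest_k(4, 0.01, 0.99))  # -> 2
```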
Three cases are listed below to describe the maintenance process of this ring structure.
(1) New node N joins
Step 1: After node B receives the join request of node N (which contains the address of node N), node B adds N as its clockwise neighbor.
Step 2: Node N obtains B's local structure model from node B and completes the initialization of its own local structure model, i.e., N.untiClockwise=B, N.clockwise=B.clockwise=C, N.clockwise.clockwise=B.clockwise.clockwise=D.
Step 3: Node N establishes associations with B and with B's former clockwise neighbor C, and the association between B and C is removed, as shown in Figure 12. At the same time, part of the information in the local structure models of B and C is modified (B.clockwise=N, C.untiClockwise=N).
Step 4: Node N sends a join notification (containing N's local structure model information) to B and C. Node C determines that its local structure model does not need updating and performs no operation. Node B detects the inconsistency of its local structure model, updates the model (B.clockwise.clockwise=N.clockwise=C), and forwards the notification to its counterclockwise neighbor A. Node A detects the inconsistency of its local structure model, updates the model (A.clockwise.clockwise=N), and forwards the notification to D. Node D determines that its local structure model does not need updating and performs no operation.
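The notification forwarding in Step 4 (check, update, forward counterclockwise, stop at the first consistent node) can be sketched as follows. This is an illustration under assumptions: `RingNode`, `build_ring`, `notify`, and `join` are hypothetical names, `clockwise2` stands for `clockwise.clockwise`, the ring is assumed to have at least three nodes, and messages are modeled as direct function calls.

```python
class RingNode:
    def __init__(self, name):
        self.name = name
        self.clockwise = None
        self.untiClockwise = None  # spelling follows the source text
        self.clockwise2 = None     # stands for clockwise.clockwise


def build_ring(names):
    """Create nodes in clockwise order with consistent k = 2 local models."""
    nodes = [RingNode(n) for n in names]
    size = len(nodes)
    for i, node in enumerate(nodes):
        node.clockwise = nodes[(i + 1) % size]
        node.clockwise2 = nodes[(i + 2) % size]
        node.untiClockwise = nodes[(i - 1) % size]
    return nodes


def notify(sender, receiver, updated):
    """Receiver checks its model against the sender's model.

    On inconsistency it updates and forwards the notification to its
    counterclockwise neighbor; otherwise the notification stops here.
    """
    if receiver.clockwise is sender and receiver.clockwise2 is not sender.clockwise:
        receiver.clockwise2 = sender.clockwise
        updated.append(receiver.name)
        notify(receiver, receiver.untiClockwise, updated)


def join(b, name):
    """Insert a new node clockwise of b; return it and the names of updated nodes."""
    c = b.clockwise
    n = RingNode(name)
    # Step 2: initialize N from B's model.
    n.untiClockwise, n.clockwise, n.clockwise2 = b, c, b.clockwise2
    # Step 3: relink B - N - C.
    b.clockwise = n
    c.untiClockwise = n
    # Step 4: N notifies B and C.
    updated = []
    notify(n, b, updated)
    notify(n, c, updated)
    return n, updated


a, b, c, d = build_ring(["A", "B", "C", "D"])
n, updated = join(b, "N")
print(updated)  # -> ['B', 'A']
```

In this run the notification reaches D as well, but D finds its model consistent and the forwarding stops there, matching the description above.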
(2) Node B leaves voluntarily
Step 1: The associations between B and A, C are removed, and part of the information in the local structure models of A and C is modified (A.clockwise=C, C.untiClockwise=A), as shown in Figure 13.
Step 2: Node B sends a leave notification to A and C. According to the notification, node C determines that its local structure model does not need updating and performs no operation. Node A detects the inconsistency of its local structure model, updates the model (A.clockwise.clockwise=B.clockwise.clockwise=D), and forwards the notification to node D. Node D updates its local structure model (D.clockwise.clockwise=C).
(3) Node B fails
Step 1: Node state observation. In Probe mode, each node sends an inquiry message ("Are you alive?") to its neighbors every $T_{Period}$ and expects a response message ("Yes, I am") from each neighbor. If no response is received from a neighbor within $T_{Timeout}$, that neighbor is observed to be in the failed state; otherwise, it is observed to be in the valid state.
Step 2: Failed node determination. The cooperative determination method is used. Each node keeps a counter $C_{Fail}$ for each of its neighbors (hereinafter the target node), recording the number of times the target node has been observed in the failed state; the initial value is 0. When $C_{Fail} = M$, the target node is judged to have failed, the corresponding counter $C_{Fail}$ is cancelled, and Step 3 is performed. When a node in the observation group (for example, the observation group of B may consist of nodes A and C) observes that the target node has failed, it notifies all other nodes in the group, which speeds up their judgment of the target node's failure. When a node observes that the target node is valid and its $C_{Fail}$ is not 0, it notifies all other nodes in the group of this observation: a nonzero $C_{Fail}$ indicates that some node in the group may have wrongly observed the target node as failed, so the observation that the target node is valid must be shared to avoid a wrong judgment. Each node judges whether the target node has failed based on its own observations together with the observations shared by the other nodes in the group. When a node observes that the target node is in the failed state, or receives a failure notification from another node in the group, it increments $C_{Fail}$ by 1. When a node observes that the target node is in the valid state, or receives a validity notification from another node in the group, it resets $C_{Fail}$ to 0.
Step 3: Structure reconstruction. After node A finds that B has failed, A tries to connect to node C (A.clockwise.clockwise=C). If C is valid, A requests to be connected to C as C's counterclockwise neighbor, and part of the information in the local structure models of A and C is modified (A.clockwise=C, A.clockwise.clockwise=D, C.untiClockwise=A), as shown in Figure 14.
Step 4: Updating the local structure models of the affected nodes. Node A sends a reconstruction notification (containing its local structure model information). Node C determines that its local structure model does not need updating. Node D detects the inconsistency of its local structure model, updates the model (D.clockwise.clockwise=C), and forwards the notification to C; because C's local structure model does not need updating, C performs no operation after receiving the notification.
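Steps 3 and 4 of the failure case can be sketched for the ring with k = 2. This is an illustrative assumption, not the patent's implementation: the function name `rebuild_after_failure`, the dictionary layout with keys `clockwise`, `untiClockwise`, and `clockwise2` (for `clockwise.clockwise`), and the `alive` set standing in for real connection attempts are all hypothetical.

```python
def rebuild_after_failure(ring, a, alive):
    """Repair node a's clockwise link after its clockwise neighbor has failed.

    ring  : dict mapping node name -> local model dict
    a     : name of the node that detected the failure
    alive : set of node names that currently accept connections

    Returns the name of the node a is now connected to, or None when every
    known candidate has failed and manual intervention is needed.
    """
    model = ring[a]
    # Nearest first: try the known clockwise nodes in order of distance.
    for candidate in (model["clockwise"], model["clockwise2"]):
        if candidate in alive:
            break
    else:
        return None
    if candidate == model["clockwise"]:
        return candidate  # direct neighbor is fine, nothing to rebuild
    # Step 3: connect to the candidate and take over its clockwise entry.
    model["clockwise"] = candidate
    model["clockwise2"] = ring[candidate]["clockwise"]
    ring[candidate]["untiClockwise"] = a
    # Step 4: a's counterclockwise neighbor fixes its two-hop entry.
    ring[model["untiClockwise"]]["clockwise2"] = candidate
    return candidate


ring = {
    "A": {"clockwise": "B", "untiClockwise": "D", "clockwise2": "C"},
    "B": {"clockwise": "C", "untiClockwise": "A", "clockwise2": "D"},
    "C": {"clockwise": "D", "untiClockwise": "B", "clockwise2": "A"},
    "D": {"clockwise": "A", "untiClockwise": "C", "clockwise2": "B"},
}
print(rebuild_after_failure(ring, "A", alive={"A", "C", "D"}))  # -> C
```

Returning `None` corresponds to the situation in which all nodes in the known neighborhood have failed, where the structure cannot repair itself.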

Claims (4)

1. A method for maintaining the organizational structure of service cooperation nodes in a distributed environment, characterized by the following steps:
(1) initialization: determining the neighborhood range that each node maintains according to the calculation method of the maintainable probability;
(2) within said neighborhood range, performing maintenance for new node joining, voluntary node leaving, and node failure;
The maintenance method for new node joining in said step (2) is as follows:
Step 1: placing the new node: when a node receives a join request from the new node, a suitable position is arranged for the new node according to the function of the new node;
Step 2: initializing the structure model of the new node: the new node obtains the local structure models of its adjacent nodes and, based on this information, initializes its own local structure model;
Step 3: structure reconstruction: the new node establishes associations with the nodes around the insertion position and, to keep the structural characteristics unchanged, the corresponding original local associations are removed;
Step 4: updating the local structure models of the affected nodes: the new node sends a join notification containing its local structure model information; after receiving the notification, a node checks the consistency of its own local structure model against the notification; if it is inconsistent, the node updates its local structure model, replaces the notification content with its own local structure model, and continues to forward the notification;
The maintenance method for voluntary node leaving in said step (2) is as follows:
Step 1: structure reconstruction: the leaving node and its associations with its neighbors are deleted and, to keep the structural characteristics unchanged, new associations are established as needed among the neighbors of the leaving node;
Step 2: updating the local structure models of the affected nodes: the leaving node sends a leave notification containing its local structure model information; after receiving the notification, a node checks the consistency of its own local structure model against the notification; if it is inconsistent, the node updates its local structure model, replaces the notification content with its own local structure model, and continues to forward the notification;
The maintenance method for node failure in said step (2) is as follows:
Step 1: node state observation, for which there are two optional modes: (1) Probe mode, in which each node periodically sends an inquiry message to its neighbors and expects a response message from each neighbor; if no response message is received from a neighbor within the timeout period, that neighbor is observed to be in the failed state; otherwise, it is observed to be in the valid state; (2) Gossip mode, in which each node periodically sends a heartbeat message to its neighbors; if a node receives no heartbeat message from a neighbor within one period, that neighbor is observed to be in the failed state; otherwise, it is observed to be in the valid state;
Step 2: failed node determination, for which there are two classes of optional methods: (1) the independent determination method, in which a node judges whether a neighbor has failed based only on its own observations of the neighbor's state; (2) the cooperative determination method, in which each node corresponds to an observation group consisting of all or some of its neighbors, the group members observe the target node and share some or all of their observations within the group, and each node judges whether the target node has failed based on its own observations together with the observations shared by the other nodes in the group;
Step 3: structure reconstruction: after a node judges that one of its neighbors has failed, it tries, according to its known local structure model and in nearest-first order, to connect to the other nodes within its known neighborhood range one after another; if it successfully connects to a valid node, Step 4 is performed directly; if all nodes it can try to connect to have failed, self-maintenance is impossible and manual intervention is required;
Step 4: updating the local structure models of the affected nodes: the node that initiates the structure reconstruction, i.e., the node that discovered the failed node, sends a reconstruction notification containing its local structure model information; after receiving the notification, a node checks the consistency of its own local structure model against the notification; if it is inconsistent, the node updates its local structure model, replaces the notification content with its own local structure model, and continues to forward the notification.
2. The method for maintaining the organizational structure of service cooperation nodes in a distributed environment according to claim 1, characterized in that the calculation method of the maintainable probability for nodes of a linear structure in said step (1) is:
The maintainable probability of linear structure nodes:
Figure FSB00000299872900031
where k is the radius of the known neighborhood, i.e., the maximum distance between the current node and a node in the known neighborhood; q is the failure probability of a single node; and n is the total number of nodes in the linear structure.
3. The method for maintaining the organizational structure of service cooperation nodes in a distributed environment according to claim 1, characterized in that the calculation method of the maintainable probability for nodes of a ring structure in said step (1) is:
The maintainable probability of ring structure nodes:
where k is the radius of the known neighborhood, i.e., the maximum distance between the current node and a node in the known neighborhood; q is the failure probability of a single node; and n is the total number of nodes in the ring structure.
4. The method for maintaining the organizational structure of service cooperation nodes in a distributed environment according to claim 1, characterized in that the calculation method of the maintainable probability for nodes of a tree structure in said step (1) is:
Figure FSB00000299872900033
Then the maintainable probability of tree structure nodes is
Figure FSB00000299872900034
Wherein:
Figure FSB00000299872900035
is an event, and the condition for the event to occur is
Figure FSB00000299872900036
where d ≤ k, and T(X) has no subtrees or all of its subtrees are k-maintainable; $L_F(X)$ is a failure chain of node X, that is, a path from X to a node Z, Z ∈ T(X), denoted
Figure FSB00000299872900037
which satisfies: (1) all nodes on the path have failed; (2) Z is a leaf node, or the child nodes of Z are normally working nodes. Node X has one or more failure chains; the set of failure chains of node X is denoted $S_{L_F}(X) = \{ L_F(X) \mid L_F(X) \text{ is a failure chain of node } X \}$, and the longest failure chain among them is denoted
Figure FSB00000299872900041
When X is a normally working node, node X has only one, empty, failure chain, denoted φ;
q(X) is the probability that node X fails;
p(X) is the probability that node X is valid, p(X) = 1 − q(X);
The k-maintainable probability of tree T(X) is the probability that T(X) is k-maintainable, where T(X) being k-maintainable means that no run of k or more consecutive failed nodes occurs on any path from the root node X to a leaf node of T(X);
T(X) is the tree structure with X as its root node;
Childs(X) is the set of all child nodes of node X;
d is a variable in
Figure FSB00000299872900043
Y ∈ Childs(X); k is the radius of the known neighborhood, i.e., the maximum distance between the current node and a node in the known neighborhood; and R is the root node of the tree structure.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101009864A CN101247273B (en) 2008-02-27 2008-02-27 Maintenance method of service cooperated node organization structure in distributed environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101009864A CN101247273B (en) 2008-02-27 2008-02-27 Maintenance method of service cooperated node organization structure in distributed environment

Publications (2)

Publication Number Publication Date
CN101247273A CN101247273A (en) 2008-08-20
CN101247273B true CN101247273B (en) 2011-02-02

Family

ID=39947500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101009864A Expired - Fee Related CN101247273B (en) 2008-02-27 2008-02-27 Maintenance method of service cooperated node organization structure in distributed environment

Country Status (1)

Country Link
CN (1) CN101247273B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2394392B1 (en) * 2009-02-05 2019-04-10 Telefonaktiebolaget LM Ericsson (publ) Topological location discovery in an ethernet network
CN102291250A (en) * 2011-04-25 2011-12-21 程旭 Method and device for maintaining network topology in cloud computing
CN102916930B (en) * 2011-08-02 2018-01-23 中兴通讯股份有限公司 Fused business network and its node, resource request routing iinformation acquisition methods
CN103152420B (en) * 2013-03-11 2016-03-02 汉柏科技有限公司 A kind of method avoiding single-point-of-failofe ofe Ovirt virtual management platform
US9323615B2 (en) * 2014-01-31 2016-04-26 Google Inc. Efficient data reads from distributed storage systems
CN104916127B (en) * 2014-03-13 2017-10-31 深圳市赛格导航科技股份有限公司 A kind of method and system of the distributed traffic of analysis in real time of car networking
CN104506357B (en) * 2014-12-22 2018-05-11 国云科技股份有限公司 A kind of high-availability cluster node administration method
CN104935634B (en) * 2015-04-27 2018-03-30 南京大学 Mobile device data sharing method based on Distributed shared memory
US10437212B2 (en) * 2015-04-28 2019-10-08 Schneider Electric Systems Usa, Inc. Distributed computing in a process control environment
CN107579837A (en) * 2016-07-05 2018-01-12 中兴通讯股份有限公司 The method and device that a kind of damaged business is repaired automatically
CN108462586B (en) * 2017-02-17 2021-05-11 中兴通讯股份有限公司 Method and device for selecting cooperative nodes
CN109491957A (en) * 2018-08-30 2019-03-19 中国船舶重工集团公司第七〇五研究所 A kind of general LINK topology detection method of signal processor
CN110380934B (en) * 2019-07-23 2021-11-02 南京航空航天大学 Distributed redundancy system heartbeat detection method
CN111580912A (en) * 2020-05-09 2020-08-25 北京飞讯数码科技有限公司 Display method and storage medium for multi-level structure resource group
WO2022232994A1 (en) * 2021-05-06 2022-11-10 Huawei Technologies Co., Ltd. Devices and methods for autonomous distributed control of computer networks

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1741489A (en) * 2005-09-01 2006-03-01 西安交通大学 High usable self-healing Logic box fault detecting and tolerating method for constituting multi-machine system
WO2007120127A2 (en) * 2006-04-13 2007-10-25 The Mitre Corporation Reliable neighbor node discovery
JP2008005306A (en) * 2006-06-23 2008-01-10 Kddi Corp Transmission controller, radio equipment and transmission control method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
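Zheng Jing, Lu Xicheng, Cao Jiannong. A probabilistic reliable dissemination protocol based on percolation theory in mobile ad hoc networks. Journal of Software. 2007, 18(04). *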

Also Published As

Publication number Publication date
CN101247273A (en) 2008-08-20

Similar Documents

Publication Publication Date Title
CN101247273B (en) Maintenance method of service cooperated node organization structure in distributed environment
EP3340055A1 (en) Communicating state information in distributed operating systems
US7805407B1 (en) System and method for dynamic configuration of replicated database servers
CN106170782B (en) System and method for creating highly scalable high availability clusters in massively parallel processing clusters of machines in a network
CN108234302B (en) Maintaining consistency in a distributed operating system for network devices
CA2267478C (en) Publish & subscribe data processing apparatus, method and computer program product with use of a stream to distribute local information between neighbors in a broker structure
US6694368B1 (en) Communication apparatus and method between distributed objects
US8095670B2 (en) Protocol for enabling dynamic and scalable federation of enterprise service buses
CN101958805B (en) Terminal access and management method and system in cloud computing
US20060291459A1 (en) Scalable, highly available cluster membership architecture
US20190155261A1 (en) Smart node for a distributed mesh network
CN110351391A (en) A kind of vehicle diagnosis cloud platform system, service implementation method
CN103780497B (en) Extendible distributed coordination service management under a kind of cloud platform
CN105610981A (en) Quick operational information transfer platform
CN101252588A (en) Apparatus, system and method for distributing stream medium content
CN114301972A (en) Block chain link point hierarchical deployment method and system based on cloud edge cooperation
US20100262871A1 (en) Method for implementing highly available data parallel operations on a computational grid
Grace et al. Deep middleware for the divergent grid
CN102916830A (en) Implement system for resource service optimization allocation fault-tolerant management
CN101588388A (en) A kind of based on distributed adaptive service collaboration method and system thereof
Grace et al. A distributed architecture meta-model for self-managed middleware
CN104410529A (en) Minimum cover deployment method for controllers based on software definition data center network
US11061719B2 (en) High availability cluster management of computing nodes
Friedman Fuzzy group membership
CN203951500U (en) A kind of wireless sensor network based on P2P technology

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110202

Termination date: 20120227