CN102722146B - Distributed system control structure with failure protection function, and failure protection method - Google Patents
- Publication number: CN102722146B
- Application number: CN201210162638A
- Authority: CN (China)
- Prior art keywords: node, distributed system, communication, layer, failure
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention provides a distributed system control structure with a failure protection function, and a failure protection method. The method comprises: adding failure protection to the original connections of the distributed system by connecting adjacent nodes within the same layer, starting from the second layer; during communication or management between upper and lower layers, having the upper-layer node broadcast each control command to all lower-layer nodes connected to it and detect, from the information the lower-layer nodes return, whether communication or management has failed; and, if it has failed, having an adjacent lower-layer node take over control to restore the failed communication or management. The method is suited to applications with high safety and reliability requirements, in particular systems such as fire alarm systems and mine safety systems.
Description
Technical field
The present invention relates to the field of distributed system control, and also to a failure protection method for distributed system control.
Background art
Distributed system control is very widely used, so its reliability and safety are particularly important. At present, the reliability of distributed system control derives mainly from the fault tolerance inherent in its structure: even when communication or management is interrupted globally, local stations can keep working, but the interrupted communication or management cannot be restored. An effective protection method is therefore needed so that communication and control in a distributed control system remain protected, and remain usable, after a failure.
As shown in Fig. 1, the reliability of communication or management in the existing distributed system control structure depends on the structure of the distributed control system itself, and once communication or management is interrupted, it cannot be restored. A method is therefore needed to solve this problem.
Summary of the invention
The object of the present invention is to provide a distributed system control structure with failure protection that offers high reliability and safety. A further object is to provide a failure protection method for distributed system control.
The object of the present invention is achieved as follows:
The distributed system control structure with failure protection of the present invention is as follows: adjacent nodes in the second layer of the distributed system are connected in sequence; in the third layer, nodes controlled by the same node of the layer above are connected in sequence; the other layers of the distributed system are connected in the same way as the third layer.
The failure protection method for distributed system control of the present invention comprises:
Adjacent nodes in the second layer of the distributed system are connected in sequence; in the third layer, nodes controlled by the same node of the layer above are connected in sequence; the other layers of the distributed system are connected in the same way as the third layer;
When an upper-layer node controls a particular lower-layer node, it sends the information to all of the lower-layer nodes it controls at the same time; the node being controlled returns an acknowledgment message after executing the control task, and the nodes not being controlled also return an acknowledgment message;
When a lower-layer node is detected to have returned no acknowledgment message, communication or management is deemed to have failed and that node is judged to be a failed node; a node adjacent to the failed node takes control of it and forwards the communication or management signal sent by the upper-layer node to the failed node, restoring the failed communication or management; once the upper-layer node receives the acknowledgment sent by the failed node, it switches back to the original communication connection.
The present invention adds failure protection to the original connections of the distributed system: starting from the second layer, adjacent nodes in the same layer are connected. During communication or management between upper and lower layers, the upper-layer node broadcasts each control command to the lower-layer nodes connected to it and detects, from the information those nodes return, whether communication or management has failed. If a failure is found, an adjacent lower-layer node takes over control to restore the failed communication or management.
By providing failure protection for the nodes of the distributed system, the present invention increases the reliability and safety of distributed system control. After distributed system control fails, communication or control can be restored quickly. The invention is particularly applicable where safety and reliability requirements are high, for example in fire alarm systems and mine safety systems.
Brief description of the drawings
Fig. 1 is a schematic diagram of an existing distributed system structure.
Fig. 2 is a schematic diagram of the failure protection of the distributed system of the present invention.
Embodiment
The present invention is further described below with reference to the accompanying drawings:
With reference to Fig. 2, the distributed system control structure with failure protection is divided into 3 layers: the first layer has 1 master, the second layer has n slaves, and the third layer has m control units under each slave. In the second layer, slave 1 through slave n are connected in sequence. In the third layer, the m control units under each slave are connected in sequence, and control units under different slaves are not connected to each other.
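The three-layer arrangement can be sketched in Python as follows. This is only an illustration of the wiring described above, not the patent's implementation; the `Node` class, the `chain` and `build` helpers, and the node names are hypothetical.

```python
class Node:
    """One station in the hierarchy (hypothetical illustration type)."""

    def __init__(self, name):
        self.name = name
        self.children = []   # original control links to the layer below
        self.neighbors = []  # added same-layer protection links


def chain(nodes):
    """Connect same-layer nodes in sequence: 1-2, 2-3, ..., (k-1)-k."""
    for left, right in zip(nodes, nodes[1:]):
        left.neighbors.append(right)
        right.neighbors.append(left)


def build(n, m):
    """Build 1 master, n slaves, and m control units under each slave."""
    master = Node("master")
    for i in range(1, n + 1):
        slave = Node(f"slave{i}")
        slave.children = [Node(f"unit{i}.{j}") for j in range(1, m + 1)]
        chain(slave.children)  # only units under the same slave are linked
        master.children.append(slave)
    chain(master.children)     # slave 1 through slave n are linked in sequence
    return master
```

With `build(3, 2)`, `slave2` ends up with protection links to `slave1` and `slave3`, while `unit1.1` is linked only to `unit1.2` and never to a unit under another slave.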
Detecting whether communication or management has failed. When an upper-layer node sends a control command to a particular lower-layer node, the command is broadcast to all of the lower-layer nodes that the upper-layer node controls. When a lower-layer node receives the command, it executes the command and returns an acknowledgment message if the command is addressed to it; if the command is not addressed to it, it still returns an acknowledgment message. The upper-layer node judges from the returned acknowledgments whether communication is working: if no acknowledgment is received from a node, communication with that node is considered to have failed; otherwise it is considered working.
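A minimal Python sketch of this detection step is shown below. The function name `broadcast` and the `reachable` set, which stands in for the real links, are assumptions made for illustration; the source does not say how long the upper-layer node waits for an acknowledgment, so no timeout is modeled.

```python
def broadcast(children, target, command, reachable):
    """Send `command` to every child and report which ones did not acknowledge.

    `reachable` is a hypothetical stand-in for the physical links: the set of
    child names whose direct channel to the upper-layer node currently works.
    Returns (failed, executed), where `executed` lists the commands carried out.
    """
    acked = set()
    executed = []
    for child in children:
        if child not in reachable:
            continue                           # nothing comes back from this node
        if child == target:
            executed.append((child, command))  # the addressed node runs the command
        acked.add(child)                       # every reachable node acknowledges
    failed = [c for c in children if c not in acked]
    return failed, executed
```

Because non-addressed nodes acknowledge as well, a command sent to one unit also reveals a broken link to any other unit: calling `broadcast(["u1", "u2", "u3"], "u2", "open_valve", {"u1", "u2"})` reports `u3` as failed even though the command was meant for `u2`.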
Recovering failed communication or management. If the upper-layer node finds that an acknowledgment message has not been returned, it considers communication to have failed and enables failure protection. Because the nodes in the same layer are connected in sequence, a protection channel must be selected after a failure. The selection works as follows: if node i has failed, determine whether node i is the last node of the current layer. If it is the last node, the control information is sent to node i through node i-1 to recover from the failure; if it is not the last node, the control information is sent to node i through node i+1. Once the upper-layer node finds that the failed node has returned to normal, the original channel is reactivated and the adjacent lower-layer node no longer protects the formerly failed node.
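The channel selection rule can be sketched in Python as follows, using 1-based node indices as in the text. The function names `protection_neighbor` and `route` and the `"upper"` label are hypothetical. The source does not say what happens when the relay node has itself failed or when a layer has only one node, so the sketch simply returns `None` where no relay exists.

```python
def protection_neighbor(i, count):
    """Return the 1-based index of the sibling that relays for failed node i."""
    if not 1 <= i <= count:
        raise ValueError("node index out of range")
    if count < 2:
        return None                        # no sibling available (case not covered by the source)
    return i - 1 if i == count else i + 1  # last node uses i-1, all others use i+1


def route(i, count, failed):
    """Path from the upper-layer node to node i, given the set of failed nodes."""
    if i not in failed:
        return ["upper", i]                # original channel, also used again after recovery
    relay = protection_neighbor(i, count)
    if relay is None:
        return None
    return ["upper", relay, i]             # protection channel through the neighbor
```

For a layer of 4 nodes, a failure of node 4 is relayed through node 3 and a failure of node 2 through node 3. Once node 2 is removed from `failed`, `route(2, 4, set())` returns the direct path again, which mirrors the switch back to the original channel.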
Claims (1)
1. A failure protection method for distributed system control, characterized in that:
Adjacent nodes in the second layer of the distributed system are connected in sequence; in the third layer, nodes controlled by the same node of the layer above are connected in sequence; the other layers of the distributed system are connected in the same way as the third layer;
When an upper-layer node controls a particular lower-layer node, it sends the information to all of the lower-layer nodes it controls at the same time; the node being controlled returns an acknowledgment message after executing the control task, and the nodes not being controlled also return an acknowledgment message;
When a lower-layer node is detected to have returned no acknowledgment message, communication or management is deemed to have failed and that node is judged to be a failed node; a node adjacent to the failed node takes control of it and forwards the communication or management signal sent by the upper-layer node to the failed node, restoring the failed communication or management; once the upper-layer node receives the acknowledgment sent by the failed node, it switches back to the original communication connection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201210162638 CN102722146B (en) | 2012-05-24 | 2012-05-24 | Distributed system control structure with failure protection function, and failure protection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102722146A | 2012-10-10 |
CN102722146B | 2013-12-18 |
Family
ID=46947947
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201210162638 (CN102722146B, Expired - Fee Related) | Distributed system control structure with failure protection function, and failure protection method | 2012-05-24 | 2012-05-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102722146B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1581813A (en) * | 2003-08-01 | 2005-02-16 | 光桥科技(中国)有限公司 | Method for conducting data transmission using logic loop network in ethernet |
CN1741489A (en) * | 2005-09-01 | 2006-03-01 | 西安交通大学 | High usable self-healing Logic box fault detecting and tolerating method for constituting multi-machine system |
CN1889496A (en) * | 2006-07-19 | 2007-01-03 | 山东富臣发展有限公司 | Layer control tree-shape network based on CAN bus for supporting plug and use |
WO2008058933A1 (en) * | 2006-11-13 | 2008-05-22 | Siemens Aktiengesellschaft | Method for establishing bidirectional data transmission paths in a wireless meshed communication network |
CN101378327A (en) * | 2007-08-29 | 2009-03-04 | 中国移动通信集团公司 | Communication network system and method for processing communication network business |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SE524863C2 (en) * | 2001-04-23 | 2004-10-12 | Transmode Systems Ab | Optical coarse wavelength division multiplexing system has multiple logical optical rings that form multiplexed ring structure, such that each ring links several nodes of ring structure |
Also Published As
Publication number | Publication date |
---|---|
CN102722146A (en) | 2012-10-10 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20131218; Termination date: 20190524 |