CN104102559B - Dual-controller storage system based on a redundant heartbeat link and a peer restart link - Google Patents
- Publication number
- CN104102559B (application number CN201410337977.2A)
- Authority
- CN
- China
- Prior art keywords
- link
- mainboard
- heartbeat
- peer
- redundancy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Hardware Redundancy (AREA)
- Techniques For Improving Reliability Of Storages (AREA)
Abstract
The present invention provides a dual-controller storage system based on a redundant heartbeat link and a peer restart link, and belongs to the field of computer storage technology. The processors of the two controllers are connected by an NTB link and a serial link, which provides data mirroring and cache mirroring and lets each controller monitor the other's operating state in real time. When one mainboard fails, the other can restart the peer mainboard over the restart link, giving the system a self-repair function. Even if one mainboard suffers a fault that cannot be repaired, the other mainboard can fully take over all of its work, ensuring high reliability and high stability of the storage product.
Description
Technical field
The present invention relates to the field of computer storage technology, and in particular to a dual-controller storage system based on a redundant heartbeat link and a peer restart link.
Background art
Computers store more and more information, and that information is increasingly important. To prevent accidental loss of data, multiple protection techniques are generally used to keep the data safe. Wear of equipment in operation, failure of storage media, the operating environment, and human damage can all affect the reliability and stability of the product, leading to equipment damage and data loss, so that data safety cannot be effectively guaranteed.
With the rapid development of industries such as the Internet and electronic banking, requirements for the security and stability of network data keep rising. Every product has a limited service life and is subject to failures and other sources of instability. How to reduce these sources of instability has become a key consideration for current single-controller designs and early dual-controller designs.
Summary of the invention
The present invention provides a dual-controller storage system based on a redundant heartbeat link and a peer restart link, which ensures high reliability and high stability of the storage product.
A dual controller consists of two identical mainboards paired with the same backplane, which together control all of the hard disks on that backplane.
The redundant heartbeat link comprises an NTB (non-transparent bridge) link and a serial heartbeat link. The CPUs on the two controller mainboards communicate through the NTB, which carries data mirroring and cache mirroring and also serves as the real-time heartbeat link through which each mainboard monitors the other's operating state. The mainboards are also interconnected by a serial link, which serves as the system-level redundant heartbeat link: when the NTB fails, the heartbeat can be switched to the serial link, which then detects the operating state of the peer mainboard.
The peer restart link works as follows: when one controller mainboard finds that the other has not responded for a long time, it considers the peer mainboard to have failed and sends a Reset signal to the failed mainboard to restart it, realizing self-repair of the system. If one mainboard has a fault that cannot be repaired, the other mainboard can fully take over all of its work, preventing data loss and system failure and ensuring high reliability and high stability of the storage product.
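The link-selection behaviour described above can be illustrated with a minimal Python sketch. It assumes that each mainboard records the time at which it last heard from its peer on each link; the class name `HeartbeatMonitor`, the link labels, and the 3-second timeout are hypothetical, since the patent specifies neither an API nor a threshold.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Primary link first, redundant link second (labels are illustrative).
LINK_PRIORITY = ("ntb", "serial")


@dataclass
class HeartbeatMonitor:
    """Tracks when the peer mainboard was last heard on each heartbeat link."""

    timeout_s: float = 3.0  # hypothetical threshold; the patent gives no value
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, link: str, now: float) -> None:
        """Note that a heartbeat arrived on `link` at time `now` (seconds)."""
        if link not in LINK_PRIORITY:
            raise ValueError(f"unknown link: {link}")
        self.last_seen[link] = now

    def link_healthy(self, link: str, now: float) -> bool:
        """A link is healthy if a heartbeat arrived within the timeout."""
        seen = self.last_seen.get(link)
        return seen is not None and (now - seen) <= self.timeout_s

    def active_link(self, now: float) -> Optional[str]:
        """Return the highest-priority healthy link, or None if both are silent."""
        for link in LINK_PRIORITY:
            if self.link_healthy(link, now):
                return link
        return None

    def peer_alive(self, now: float) -> bool:
        """The peer counts as alive while at least one link still carries heartbeats."""
        return self.active_link(now) is not None
```

Passing the current time in explicitly, rather than reading a clock inside the class, keeps the sketch deterministic; a real controller would presumably feed it a monotonic clock.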
The present invention mainly involves the design of the hardware circuit and of the software layer. It is divided into the following aspects:
1. The two controller mainboards are connected by NTB through the backplane. The NTB carries data mirroring and cache mirroring, and serves as the system-level heartbeat link through which the two mainboards monitor each other's operating state. The two mainboards are also connected to the backplane through high-speed connectors in order to access and manage all of the hard disks.
2. The two controller mainboards are connected by a serial port, which serves as the system-level redundant heartbeat link. When the NTB fails, the heartbeat can be switched to the serial link, which then detects the operating state of the peer mainboard.
3. Peer restart link: when a mainboard detects that the peer mainboard has not responded for a long time, it considers the peer to have failed and restarts it through this link, achieving self-repair.
The hardware circuits in the three points above must be implemented together with a corresponding software-layer design.
The dual-controller storage system based on a redundant heartbeat link and a peer restart link proposed by the present invention can greatly improve the stability and self-repair capability of the product. The processors are connected by an NTB link and a serial link, which provides data mirroring and cache mirroring and lets each controller monitor the other's operating state in real time. When one mainboard fails, the other can restart the peer mainboard over the restart link, giving the system a self-repair function. Even if one mainboard suffers a fault that cannot be repaired, the other mainboard can fully take over all of its work, ensuring high reliability and high stability of the storage product.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawing.
Figure 1 is the overall topology diagram of the redundant heartbeat link and the peer restart link.
Detailed description
The hardware-circuit and software-layer design of the present invention is divided into the following three aspects:
1. The two controller mainboards are connected by NTB through the backplane. The NTB carries data mirroring and cache mirroring, and serves as the system-level heartbeat link through which the two mainboards monitor each other's operating state. The two mainboards are also connected to the backplane through high-speed connectors in order to access and manage all of the hard disks.
2. The two controller mainboards are connected by a serial port, which serves as the system-level redundant heartbeat link. When the NTB fails, the heartbeat can be switched to the serial link, which then detects the operating state of the peer mainboard.
3. Peer restart link: when a mainboard detects that the peer mainboard has not responded for a long time, it considers the peer to have failed and restarts it through this link, achieving self-repair.
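The recovery policy implied by the third aspect can be sketched as a small decision function. The patent says only that a peer that stays silent "for a long time" is restarted and that an unrepairable peer is taken over; the 10-second timeout, the limit of three reset attempts, and the names `Action` and `next_action` are assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"              # peer is responding, nothing to do
    RESET_PEER = "reset_peer"  # send Reset over the peer restart link
    TAKE_OVER = "take_over"    # treat the peer as unrepairable


def next_action(silent_s: float, resets_sent: int,
                timeout_s: float = 10.0, max_resets: int = 3) -> Action:
    """Decide what the healthy mainboard should do about its peer.

    silent_s    -- seconds since the peer last answered on any heartbeat link
    resets_sent -- Reset signals already sent during this outage
    timeout_s   -- hypothetical "no response for a long time" threshold
    max_resets  -- hypothetical limit before the fault counts as unrepairable
    """
    if silent_s < timeout_s:
        return Action.NONE
    if resets_sent < max_resets:
        return Action.RESET_PEER
    return Action.TAKE_OVER
```

For example, `next_action(12.0, 0)` yields `Action.RESET_PEER`, while `next_action(12.0, 3)` yields `Action.TAKE_OVER`. Whether the surviving mainboard also serves the peer's workload while a reset is in progress is not stated in the source, so the sketch leaves that out.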
Fig. 1 is the overall topology diagram of the redundant heartbeat link and the peer restart link. The design is described in further detail below with reference to Fig. 1.
First is the redundant structure between the two controllers: each controller is connected to the backplane through a high-speed connector. Each controller mainboard carries an onboard SAS controller and a SAS expander. The SAS expander connects to all SAS hard disks on the backplane through the backplane's SAS bus, and the two controller mainboards manage and read the hard disks redundantly.
Second, the CPUs on the two mainboards are connected by an NTB (non-transparent bridge) bus. The NTB uses a PCIe x8 bus, connects the differing protocol specifications of the two sides, carries data mirroring and cache mirroring, and serves as the heartbeat link that detects the state of the peer mainboard in real time. To compensate for long trace lengths, a redriver at the output of each NTB strengthens the signal, and the drive strength of the redriver can be adjusted through BIOS software.
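The patent does not describe how cache mirroring over the NTB is organised. As one possible illustration, the following Python sketch models two controllers that copy each cached write into the peer's memory, so that the surviving controller can write the failed peer's pending data to disk. The `Controller` class, its fields, and the dictionary standing in for the shared hard disks are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class Controller:
    """Toy model of one controller mainboard with a mirrored write cache."""

    def __init__(self, name: str, shared_disk: Dict[int, bytes]) -> None:
        self.name = name
        self.cache: Dict[int, bytes] = {}   # unwritten blocks owned by this board
        self.mirror: Dict[int, bytes] = {}  # copy of the peer's unwritten blocks
        self.disk = shared_disk             # stand-in for the disks on the backplane
        self.peer: Optional["Controller"] = None

    def write(self, lba: int, data: bytes) -> None:
        """Cache a write locally and mirror it to the peer."""
        self.cache[lba] = data
        if self.peer is not None:
            self.peer.mirror[lba] = data  # stands in for a write through the NTB

    def flush(self) -> None:
        """Write cached blocks to disk and drop the peer's now-stale copies."""
        self.disk.update(self.cache)
        if self.peer is not None:
            for lba in self.cache:
                self.peer.mirror.pop(lba, None)
        self.cache.clear()

    def take_over(self) -> None:
        """After a peer failure, write the peer's mirrored blocks to disk."""
        self.disk.update(self.mirror)
        self.mirror.clear()


def make_pair() -> Tuple[Controller, Controller]:
    """Build two controllers that share one backplane and mirror to each other."""
    shared_disk: Dict[int, bytes] = {}
    a, b = Controller("A", shared_disk), Controller("B", shared_disk)
    a.peer, b.peer = b, a
    return a, b
```

In this model a write that reached only controller A's cache before A failed is still recoverable, because controller B holds a copy and writes it out in `take_over`.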
Third, the two mainboards are connected by a serial link. This is the system-level redundant heartbeat link: when the NTB fails and cannot work, the serial link takes over the heartbeat function of the NTB.
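The patent does not define what is sent over the serial heartbeat link. A minimal sketch of one possible frame format follows, assuming a 2-byte marker, a 4-byte sequence number, a 1-byte status code, and a CRC-32 trailer so that a corrupted frame is not mistaken for a live peer. The layout, the `MAGIC` value, and the function names are hypothetical.

```python
import struct
import zlib
from typing import Optional, Tuple

MAGIC = 0x4842                  # ASCII "HB", hypothetical frame marker
_BODY = struct.Struct(">HIB")   # marker, sequence number, status byte
_CRC = struct.Struct(">I")      # CRC-32 of the body


def encode_heartbeat(seq: int, status: int) -> bytes:
    """Build one heartbeat frame for transmission on the serial link."""
    body = _BODY.pack(MAGIC, seq & 0xFFFFFFFF, status & 0xFF)
    return body + _CRC.pack(zlib.crc32(body))


def decode_heartbeat(frame: bytes) -> Optional[Tuple[int, int]]:
    """Return (seq, status), or None if the frame is malformed or corrupted."""
    if len(frame) != _BODY.size + _CRC.size:
        return None
    body, crc_bytes = frame[:_BODY.size], frame[_BODY.size:]
    (crc,) = _CRC.unpack(crc_bytes)
    if zlib.crc32(body) != crc:
        return None
    magic, seq, status = _BODY.unpack(body)
    if magic != MAGIC:
        return None
    return seq, status
```

A receiver built on this sketch would count only frames for which `decode_heartbeat` returns a value as evidence that the peer mainboard is alive; the increasing sequence number additionally lets it notice a sender that keeps repeating a stale frame.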
Finally, the CPU of each mainboard is connected through a GPIO pin to the CPLD of the other mainboard. On each mainboard, the CPLD controls the power-on sequence and the restart mode of the whole board. When one mainboard detects that the other has failed, the healthy mainboard sends a Reset signal from a GPIO pin on its local CPU to the CPLD on the failed mainboard, and the CPLD restarts its own mainboard on receiving the command. Engineers can define in the CPLD code, according to their needs, whether this Reset is a warm reset or a cold reset.
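Real CPLD logic would be written in a hardware description language; the following Python model only illustrates the configurable choice between a warm and a cold reset when the Reset signal arrives from the peer. The step names in the sequences are hypothetical placeholders, not signals named in the patent.

```python
from enum import Enum
from typing import List


class ResetMode(Enum):
    WARM = "warm"  # reset the CPU without removing power
    COLD = "cold"  # power the board off and run the power-on sequence again


class CpldModel:
    """Toy model of the CPLD that restarts its own mainboard on request."""

    def __init__(self, mode: ResetMode) -> None:
        self.mode = mode          # fixed by the CPLD code, per the description
        self.log: List[str] = []  # record of every step performed so far

    def on_peer_reset_signal(self) -> List[str]:
        """Handle a Reset pulse from the peer CPU's GPIO pin."""
        if self.mode is ResetMode.COLD:
            steps = ["power_off", "power_on_sequence", "release_cpu_reset"]
        else:
            steps = ["assert_cpu_reset", "release_cpu_reset"]
        self.log.extend(steps)
        return steps
```

A cold reset repeats the full power-on sequence and so is more likely to clear a hung device, at the cost of a longer restart; which trade-off suits a given product is left to the engineer, as the description notes.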
The points above realize mutual redundancy between the two controllers, reduce the system failure rate, and ensure high reliability and high stability of the storage product.
Claims (3)
1. A dual-controller storage system based on a redundant heartbeat link and a peer restart link, characterized in that two identical mainboards are paired with the same backplane and control all of the hard disks on that backplane; the two controller mainboards manage and read the hard disks redundantly by means of the redundant heartbeat link and the peer restart link;
the redundant heartbeat link comprises an NTB link and a serial link; the CPUs on the two controller mainboards communicate through the NTB link, which carries data mirroring and cache mirroring and serves as the real-time heartbeat link through which each mainboard monitors the other's operating state;
the mainboards are also interconnected by the serial link, which serves as the system-level redundant link; when the NTB link fails, the heartbeat can be switched to the serial link, which then detects the operating state of the peer mainboard;
the peer restart link works as follows: when one controller mainboard finds that the other has not responded for a long time, it considers the peer mainboard to have failed and sends a Reset signal to the failed mainboard to restart it, realizing self-repair of the system; if one mainboard has a fault that cannot be repaired, the other mainboard can fully take over all of its work.
2. The dual-controller storage system according to claim 1, characterized in that the two mainboards are also connected to the backplane through high-speed connectors in order to access and manage all of the hard disks.
3. The dual-controller storage system according to claim 1, characterized in that, to compensate for long trace lengths, a redriver at the output of each NTB link strengthens the signal, and the drive strength of the redriver can be adjusted through BIOS software.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410337977.2A CN104102559B (en) | 2014-07-16 | 2014-07-16 | Dual-controller storage system based on a redundant heartbeat link and a peer restart link |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410337977.2A CN104102559B (en) | 2014-07-16 | 2014-07-16 | Dual-controller storage system based on a redundant heartbeat link and a peer restart link |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104102559A (en) | 2014-10-15 |
CN104102559B (en) | 2016-07-06 |
Family
ID=51670730
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410337977.2A Active CN104102559B (en) | Dual-controller storage system based on a redundant heartbeat link and a peer restart link | 2014-07-16 | 2014-07-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104102559B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104460611B (en) * | 2014-11-18 | 2017-11-21 | 启辰电子(苏州)有限公司 | A kind of distributed locker control system and its control method |
CN104461386A (en) * | 2014-12-01 | 2015-03-25 | 北京同有飞骥科技股份有限公司 | Double-control disk array based on godson processor |
CN104486128B (en) * | 2014-12-23 | 2018-07-17 | 浪潮(北京)电子信息产业有限公司 | A kind of system and method for realizing redundancy heartbeat between dual controller node |
CN104536853B (en) * | 2015-01-09 | 2016-07-27 | 浪潮电子信息产业股份有限公司 | Device for guaranteeing continuous availability of resources of dual-controller storage equipment |
CN105072029B (en) * | 2015-08-31 | 2018-05-04 | 浪潮(北京)电子信息产业有限公司 | The redundant link design method and system of a kind of dual-active dual control storage system |
CN105426118B (en) * | 2015-10-28 | 2018-06-05 | 浪潮(北京)电子信息产业有限公司 | A kind of method that serial ports backup heartbeat passage is utilized in double-control system |
CN106354594A (en) * | 2016-08-26 | 2017-01-25 | 浪潮(北京)电子信息产业有限公司 | Fault-tolerance method and device of multi-controller communication, and NTB facility |
CN108664361B (en) * | 2017-03-27 | 2021-07-16 | 杭州宏杉科技股份有限公司 | PCIE non-transparent channel repairing method and device |
CN107423167A (en) * | 2017-07-31 | 2017-12-01 | 郑州云海信息技术有限公司 | A kind of ISCSI target redundancy control methods and system based on dual control storage |
CN107766181B (en) * | 2017-09-12 | 2021-04-20 | 中国电子科技集团公司第五十二研究所 | Double-controller storage high-availability subsystem based on PCIe non-transparent bridge |
CN107844440A (en) * | 2017-10-26 | 2018-03-27 | 郑州云海信息技术有限公司 | Single port NVMe SSD access method, device and readable storage medium storing program for executing |
CN107967195A (en) * | 2017-12-07 | 2018-04-27 | 郑州云海信息技术有限公司 | A kind of fault repairing method and system based on dual control storage |
CN109407648B (en) * | 2018-08-30 | 2020-12-01 | 深圳市易成自动驾驶技术有限公司 | Method and system for removing fault in power-on process and computer readable storage medium |
US11194678B2 (en) * | 2020-03-02 | 2021-12-07 | Silicon Motion, Inc. | Method and apparatus for performing node information exchange management of all flash array server |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1599292B (en) * | 2003-09-19 | 2010-04-28 | 中兴通讯股份有限公司 | Single-board backup method and device with line protection |
CN101257405B (en) * | 2008-04-03 | 2010-12-08 | 中兴通讯股份有限公司 | Method for implementing double chain circuits among master-salve equipments |
CN101364137B (en) * | 2008-09-22 | 2010-06-23 | 浪潮电子信息产业股份有限公司 | Synchronization double-control ATX power supply |
CN103019333A (en) * | 2011-09-28 | 2013-04-03 | 英业达股份有限公司 | Servo |
- 2014-07-16 CN CN201410337977.2A patent/CN104102559B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN104102559A (en) | 2014-10-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104102559B (en) | Dual-controller storage system based on a redundant heartbeat link and a peer restart link | |
CN105700969B (en) | server system | |
US9697166B2 (en) | Implementing health check for optical cable attached PCIE enclosure | |
TWI529624B (en) | Method and system of fault tolerance for multiple servers | |
US8745437B2 (en) | Reducing impact of repair actions following a switch failure in a switch fabric | |
CN102622279A (en) | Redundant control system and method and management controllers | |
CN105072029A (en) | Redundant link design method and system of active-active storage system | |
CN105487609A (en) | Server | |
CN105717787A (en) | Dual-redundancy control system and control method for intelligent power distribution device | |
US8099634B2 (en) | Autonomic component service state management for a multiple function component | |
US8421614B2 (en) | Reliable redundant data communication through alternating current power distribution system | |
WO2023061327A1 (en) | Core board reset method and apparatus, device, storage medium and program product | |
CN104484243A (en) | High-reliability system device and method combining virtual machine fault-tolerant technique and high-availability cluster technique | |
US11010086B2 (en) | Data synchronization method and out-of-band management device | |
JP2003330626A (en) | Controller communication over always-on controller interconnect | |
TW201423582A (en) | SAS expanders switching system and method | |
KR100928187B1 (en) | Fault-safe structure of dual processor control unit | |
CN102156669B (en) | Arbitration system of vehicle-mounted train control equipment | |
CN109726055B (en) | Method for detecting PCIe chip abnormity and computer equipment | |
US7293198B2 (en) | Techniques for maintaining operation of data storage system during a failure | |
CN110781111B (en) | But real-time supervision's dual-redundancy USB port extension device | |
CN103257907A (en) | Computer and hard disk data recovery system and method for computer | |
CN115333979B (en) | Link error code processing method and device and computer readable storage medium | |
TW201335747A (en) | Computer, system and method for recovering data of hard disks | |
JP3661665B2 (en) | How to close a package |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |