CN104102559B - Dual-controller storage system based on redundant heartbeat links and a peer reset link - Google Patents

Info

Publication number
CN104102559B
Authority
CN
China
Prior art keywords
link
mainboard
heart beating
opposite end
redundancy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410337977.2A
Other languages
Chinese (zh)
Other versions
CN104102559A (en)
Inventor
唐传贞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410337977.2A
Publication of CN104102559A
Application granted
Publication of CN104102559B
Legal status: Active
Anticipated expiration

Landscapes

  • Hardware Redundancy (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The present invention provides a dual-controller storage system based on redundant heartbeat links and a peer reset link, and belongs to the field of computer storage technology. The processors of the two mainboards are connected by an NTB link and a serial link, which allows data mirroring and cache mirroring and lets each mainboard monitor the other's operating state in real time. When one mainboard fails, the other can restart it through the reset link, giving the system a self-repair capability. Even if one mainboard suffers an unrecoverable fault, the other mainboard can take over all of its work, ensuring high reliability and stability of the storage product.

Description

Dual-controller storage system based on redundant heartbeat links and a peer reset link
Technical field
The present invention relates to the field of computer storage technology, and in particular to a dual-controller storage system based on redundant heartbeat links and a peer reset link.
Background
The volume of information stored on computers keeps growing, and its importance keeps rising. To prevent accidental data loss, multiple protection techniques are generally used to keep data safe. Wear of equipment in operation, failure of storage media, the operating environment, and deliberate damage can all affect the reliability and stability of a product, leading to equipment damage and data loss, so that data safety cannot be effectively guaranteed.
With the rapid development of industries such as the Internet and electronic banking, requirements for the security and stability of network data keep rising. Every product is subject to unstable factors such as limited service life and failures. How to reduce these factors has become a key design concern for current single-controller systems and early dual-controller systems.
Summary of the invention
The present invention provides a dual-controller storage system based on redundant heartbeat links and a peer reset link, which ensures high reliability and stability of the storage product.
A dual controller consists of two identical mainboards paired with the same backplane, controlling all hard disks on that backplane.
The redundant heartbeat links comprise an NTB (non-transparent bridge) link and a serial heartbeat link. The CPUs on the two controller mainboards communicate through the NTB to perform data mirroring and cache mirroring, and the NTB also serves as the real-time heartbeat link by which each mainboard monitors the other's operating state. The mainboards are additionally interconnected by a serial link, which acts as the system-level redundant heartbeat link: when the NTB fails, the heartbeat can be switched to the serial link, which then detects the peer mainboard's operating state.
The peer reset link works as follows: when one controller mainboard finds that the other has not responded for a long time, it treats the peer mainboard as faulty and sends a Reset signal to restart it, so that the system repairs itself. If one mainboard suffers an unrecoverable fault, the other mainboard can take over all of its work, preventing data loss and system failure and ensuring high reliability and stability of the storage product.
The invention mainly concerns design at both the hardware-circuit level and the software level. It comprises the following aspects:
1. The two controller mainboards are connected by NTB through the backplane. The NTB carries data mirroring and cache mirroring and serves as the system-level heartbeat link by which the two mainboards monitor each other's operating state. Both mainboards are also connected to the backplane through high-speed connectors in order to access and manage all hard disks.
2. The two controller mainboards are connected by a serial port, which serves as the system-level redundant heartbeat link. When the NTB fails, the heartbeat can be switched to the serial link, which then detects the peer mainboard's operating state.
3. The peer reset link: when a mainboard detects that the peer mainboard has not responded for a long time, it treats the peer as faulty and restarts it through this link, achieving self-repair.
Implementing the three hardware circuits above also requires corresponding software-level design.
The dual-controller storage system proposed by the present invention, based on redundant heartbeat links and a peer reset link, can greatly improve the stability and self-repair capability of the product. The processors are connected by an NTB link and a serial link, which allows data mirroring and cache mirroring and real-time monitoring of each other's operating state. When one mainboard fails, the other can restart it through the reset link, achieving self-repair. Even if one mainboard suffers an unrecoverable fault, the other mainboard can take over all of its work, ensuring high reliability and stability of the storage product.
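The detect, restart, and take-over sequence described in this summary can be sketched as a small state machine. This is only an illustration under stated assumptions: the patent gives no timeout value or retry count, so `timeout_s`, `max_resets`, the `PeerMonitor` class, and the action names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PeerState(Enum):
    HEALTHY = "healthy"
    RESETTING = "resetting"
    TAKEN_OVER = "taken_over"


@dataclass
class PeerMonitor:
    # Assumed values: the patent only says "does not respond for a long time".
    timeout_s: float = 5.0
    max_resets: int = 3
    last_seen: float = 0.0
    resets_sent: int = 0
    state: PeerState = PeerState.HEALTHY

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the peer over either link."""
        if self.state is PeerState.TAKEN_OVER:
            return  # rejoining after a takeover is outside this sketch
        self.last_seen = now
        self.resets_sent = 0
        self.state = PeerState.HEALTHY

    def tick(self, now: float) -> str:
        """Return the action the local mainboard should take at time `now`."""
        if self.state is PeerState.TAKEN_OVER:
            return "none"
        if now - self.last_seen < self.timeout_s:
            return "none"
        if self.resets_sent < self.max_resets:
            # Peer is silent: try to repair it through the reset link.
            self.resets_sent += 1
            self.last_seen = now  # restart the window so the peer has time to reboot
            self.state = PeerState.RESETTING
            return "reset_peer"
        # Restarts did not help: treat the fault as unrecoverable.
        self.state = PeerState.TAKEN_OVER
        return "take_over"
```

Resetting the silence window after each restart attempt is a design choice of this sketch, so that a rebooting peer is not immediately declared dead again.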
Brief description of the drawings
The invention is further described below with reference to the accompanying drawing.
Figure 1 is the overall topology diagram of the redundant heartbeat links and the peer reset link.
Detailed description
The hardware-circuit and software-level design involved in the present invention is divided into the following three aspects:
1. The two controller mainboards are connected by NTB through the backplane. The NTB carries data mirroring and cache mirroring and serves as the system-level heartbeat link by which the two mainboards monitor each other's operating state. Both mainboards are also connected to the backplane through high-speed connectors in order to access and manage all hard disks.
2. The two controller mainboards are connected by a serial port, which serves as the system-level redundant heartbeat link. When the NTB fails, the heartbeat can be switched to the serial link, which then detects the peer mainboard's operating state.
3. The peer reset link: when a mainboard detects that the peer mainboard has not responded for a long time, it treats the peer as faulty and restarts it through this link, achieving self-repair.
Fig. 1 is the overall topology diagram of the redundant heartbeat links and the peer reset link. The design is described in further detail below with reference to Fig. 1.
First is the redundant structure between the two controllers: each controller is connected to the backplane through a high-speed connector. Each controller mainboard carries an onboard SAS controller and a SAS expander. The SAS expander connects to all SAS hard disks on the backplane through the backplane's SAS bus, and the two controller mainboards manage and read hard disk information redundantly.
Second, the CPUs on the two mainboards are connected through an NTB (non-transparent bridge) bus. This NTB uses a PCIe x8 bus and connects the differing protocol specifications on the two sides. It provides data mirroring and cache mirroring, and it serves as the heartbeat link that monitors the peer mainboard in real time. To compensate for long trace lengths, a redriver at the output of each NTB strengthens the signal, and the redriver's drive strength can be adjusted through the BIOS software.
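To make the purpose of cache mirroring concrete, the following sketch models it with plain dictionaries. It is an assumption-laden illustration, not the patent's implementation: the patent does not describe the write path, so the `Controller` class, the copy-before-acknowledge ordering, and the `take_over` flush are all hypothetical, and the dictionary assignment merely stands in for a write across the NTB.

```python
class Controller:
    """Toy model of one controller mainboard and its write cache."""

    def __init__(self, name):
        self.name = name
        self.own_cache = {}     # dirty blocks accepted by this controller
        self.mirror_cache = {}  # copies of the peer's dirty blocks
        self.peer = None

    def write(self, lba, data):
        # Keep the block locally, then place a copy on the peer
        # (assumed ordering: mirror first, acknowledge afterwards).
        self.own_cache[lba] = data
        if self.peer is not None:
            self.peer.mirror_cache[lba] = data  # stands in for an NTB transfer
        return "ack"

    def take_over(self, disk):
        # After the peer fails, flush its mirrored dirty blocks to disk
        # so that data it had acknowledged is not lost.
        for lba, data in sorted(self.mirror_cache.items()):
            disk[lba] = data
        self.mirror_cache.clear()


def pair(a, b):
    """Connect two controllers as mirror partners."""
    a.peer, b.peer = b, a
```

For example, after `pair(a, b)` and `a.write(7, b"x")`, controller `b` holds a copy of block 7 and can write it out with `b.take_over(disk)` if `a` becomes unavailable.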
Third, the two mainboards are connected by a serial link. It serves as the system-level redundant heartbeat link: when the NTB fails and cannot work, the serial link can take over the NTB's heartbeat function.
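One way to read the role of the serial link is that it lets a mainboard tell an NTB fault apart from a peer fault. The sketch below illustrates that reading. The function `assess_peer`, the `Link` enum, and the three verdict strings are hypothetical names introduced here, not terms from the patent.

```python
from enum import Enum
from typing import Optional, Tuple


class Link(Enum):
    NTB = "ntb"
    SERIAL = "serial"


def assess_peer(ntb_heartbeat: bool, serial_heartbeat: bool) -> Tuple[Optional[Link], str]:
    """Pick the heartbeat link to rely on and give a verdict on the peer.

    Each argument says whether a heartbeat arrived on that link within
    the current monitoring window.
    """
    if ntb_heartbeat:
        return Link.NTB, "peer_ok"
    if serial_heartbeat:
        # The NTB is silent but the peer still answers over serial,
        # so the heartbeat switches to the serial link.
        return Link.SERIAL, "ntb_fault"
    # Neither link carries a heartbeat: the peer itself is suspect.
    return None, "peer_unresponsive"
```

Under this reading, only the `peer_unresponsive` verdict would justify using the peer reset link.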
Finally, the CPU on each mainboard is connected to the CPLD on the other mainboard through a GPIO pin. The CPLD on a mainboard controls the power-on sequencing and the reset mode of the whole board. When one mainboard detects that the other has failed, the healthy mainboard sends a Reset signal from a GPIO pin on its local CPU to the CPLD on the faulty mainboard. On receiving the command, the CPLD restarts the mainboard it resides on. Engineers can define in the CPLD code whether this Reset is a warm reset or a cold reset, according to their needs.
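The cross-connected GPIO and CPLD arrangement can be modeled in software as follows. This is a behavioral sketch only: real CPLD logic would be written in a hardware description language, and the step names recorded in `actions`, as well as the `Cpld` and `Mainboard` classes, are assumptions made for illustration. It takes a warm reset to mean resetting the board without removing power and a cold reset to mean a full power cycle.

```python
from enum import Enum


class ResetMode(Enum):
    WARM = "warm"  # reset without removing power
    COLD = "cold"  # power off, then rerun the power-on sequence


class Cpld:
    """Behavioral model of the CPLD that owns a board's reset and power sequencing."""

    def __init__(self, reset_mode):
        self.reset_mode = reset_mode  # fixed by the CPLD code, per the description
        self.actions = []             # hypothetical step names, recorded for inspection

    def on_reset_pin(self, asserted):
        if not asserted:
            return
        if self.reset_mode is ResetMode.COLD:
            self.actions += ["power_off", "power_on_sequence"]
        else:
            self.actions += ["assert_board_reset", "release_board_reset"]


class Mainboard:
    def __init__(self, name, reset_mode):
        self.name = name
        self.cpld = Cpld(reset_mode)

    def reset_peer(self, peer):
        # The local CPU's GPIO pin is wired to the peer's CPLD, not to its own.
        peer.cpld.on_reset_pin(True)
```

Calling `a.reset_peer(b)` leaves `a` untouched and makes `b`'s CPLD run whichever reset sequence it was configured with.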
The points above together implement mutual redundancy between the two controllers, reducing the system failure rate and ensuring high reliability and stability of the storage product.

Claims (3)

1. A dual-controller storage system based on redundant heartbeat links and a peer reset link, characterized in that two identical mainboards are paired with the same backplane and control all hard disks on that backplane; the two controller mainboards manage and read hard disk information redundantly by means of the redundant heartbeat links and the peer reset link;
the redundant heartbeat links comprise an NTB link and a serial link; the CPUs on the two controller mainboards communicate through the NTB link to perform data mirroring and cache mirroring, and the NTB link also serves as the real-time heartbeat link by which each mainboard monitors the other's operating state;
the mainboards are also interconnected by the serial link, which serves as the system-level redundant link; when the NTB link fails, the heartbeat can be switched to the serial link, which then detects the peer mainboard's operating state;
the peer reset link operates such that when one controller mainboard finds that the other has not responded for a long time, it treats the peer mainboard as faulty and sends a Reset signal to restart it, so that the system repairs itself; if one mainboard suffers an unrecoverable fault, the other mainboard can take over all of its work.
2. The dual-controller storage system according to claim 1, characterized in that both mainboards are connected to the backplane through high-speed connectors in order to access and manage all hard disks.
3. The dual-controller storage system according to claim 1, characterized in that, to compensate for long trace lengths, a redriver at the output of each NTB link strengthens the signal, and the redriver's drive strength can be adjusted through the BIOS software.
CN201410337977.2A 2014-07-16 2014-07-16 Dual-controller storage system based on redundant heartbeat links and a peer reset link Active CN104102559B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410337977.2A CN104102559B (en) 2014-07-16 2014-07-16 Dual-controller storage system based on redundant heartbeat links and a peer reset link

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410337977.2A CN104102559B (en) 2014-07-16 2014-07-16 Dual-controller storage system based on redundant heartbeat links and a peer reset link

Publications (2)

Publication Number Publication Date
CN104102559A CN104102559A (en) 2014-10-15
CN104102559B true CN104102559B (en) 2016-07-06

Family

ID=51670730

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410337977.2A Active CN104102559B (en) 2014-07-16 2014-07-16 Dual-controller storage system based on redundant heartbeat links and a peer reset link

Country Status (1)

Country Link
CN (1) CN104102559B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104460611B (en) * 2014-11-18 2017-11-21 启辰电子(苏州)有限公司 A kind of distributed locker control system and its control method
CN104461386A (en) * 2014-12-01 2015-03-25 北京同有飞骥科技股份有限公司 Double-control disk array based on godson processor
CN104486128B (en) * 2014-12-23 2018-07-17 浪潮(北京)电子信息产业有限公司 A kind of system and method for realizing redundancy heartbeat between dual controller node
CN104536853B (en) * 2015-01-09 2016-07-27 浪潮电子信息产业股份有限公司 Device for guaranteeing continuous availability of resources of dual-controller storage equipment
CN105072029B (en) * 2015-08-31 2018-05-04 浪潮(北京)电子信息产业有限公司 The redundant link design method and system of a kind of dual-active dual control storage system
CN105426118B (en) * 2015-10-28 2018-06-05 浪潮(北京)电子信息产业有限公司 A kind of method that serial ports backup heartbeat passage is utilized in double-control system
CN106354594A (en) * 2016-08-26 2017-01-25 浪潮(北京)电子信息产业有限公司 Fault-tolerance method and device of multi-controller communication, and NTB facility
CN108664361B (en) * 2017-03-27 2021-07-16 杭州宏杉科技股份有限公司 PCIE non-transparent channel repairing method and device
CN107423167A (en) * 2017-07-31 2017-12-01 郑州云海信息技术有限公司 A kind of ISCSI target redundancy control methods and system based on dual control storage
CN107766181B (en) * 2017-09-12 2021-04-20 中国电子科技集团公司第五十二研究所 Double-controller storage high-availability subsystem based on PCIe non-transparent bridge
CN107844440A (en) * 2017-10-26 2018-03-27 郑州云海信息技术有限公司 Single port NVMe SSD access method, device and readable storage medium storing program for executing
CN107967195A (en) * 2017-12-07 2018-04-27 郑州云海信息技术有限公司 A kind of fault repairing method and system based on dual control storage
CN109407648B (en) * 2018-08-30 2020-12-01 深圳市易成自动驾驶技术有限公司 Method and system for removing fault in power-on process and computer readable storage medium
US11194678B2 (en) * 2020-03-02 2021-12-07 Silicon Motion, Inc. Method and apparatus for performing node information exchange management of all flash array server

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1599292B (en) * 2003-09-19 2010-04-28 中兴通讯股份有限公司 Single-board backup method and device with line protection
CN101257405B (en) * 2008-04-03 2010-12-08 中兴通讯股份有限公司 Method for implementing double chain circuits among master-salve equipments
CN101364137B (en) * 2008-09-22 2010-06-23 浪潮电子信息产业股份有限公司 Synchronization double-control ATX power supply
CN103019333A (en) * 2011-09-28 2013-04-03 英业达股份有限公司 Servo

Also Published As

Publication number Publication date
CN104102559A (en) 2014-10-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant