CN109271274B - Dual-computer hot standby method of embedded system - Google Patents

Publication number
CN109271274B
Authority
CN
China
Prior art keywords
dual
core
hot standby
kernel
equipment
Legal status
Active
Application number
CN201811346918.6A
Other languages
Chinese (zh)
Other versions
CN109271274A (en)
Inventor
赵昶宇
Current Assignee
Tianjin Jinhang Computing Technology Research Institute
Original Assignee
Tianjin Jinhang Computing Technology Research Institute
Priority date
2018-11-13
Filing date
2018-11-13
Publication date
2022-02-11

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0796 Safety measures, i.e. ensuring safe condition in the event of error, e.g. for controlling element
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/04 Programme control other than numerical control, i.e. in sequence controllers or logic controllers
    • G05B19/048 Monitoring; Safety

Abstract

The invention relates to a dual-computer hot standby method for an embedded system and belongs to the technical field of embedded systems. Using the asymmetric multiprocessing (AMP) capability of a multi-core processor, the method runs an independent VxWorks operating system on each of the two cores of the mainboard and implements a dual-computer hot standby function on each operating system. Separate host and standby equipment sets are not required: within a single chassis, each board card provides dual redundancy of its hardware resources. In this way, dual redundancy of both software and hardware is achieved.

Description

Dual-computer hot standby method of embedded system
Technical Field
The invention belongs to the technical field of embedded systems, and particularly relates to a dual-computer hot standby method of an embedded system.
Background
At present, two common dual-computer fault-tolerant technologies exist:
(1) a third-party arbitration mechanism performs fault detection and dual-computer switching;
(2) no third-party arbitration mechanism is used; instead, the host and the standby machine are identified as such and exchange heartbeat messages, and fault detection and host/standby switching are performed while the two machines work synchronously.
Both approaches can provide the dual-computer hot standby function, but each has a shortcoming.
In the first approach, if the third-party arbitration mechanism itself fails, dual-computer fault detection is no longer possible. The second approach avoids a third-party arbitration mechanism, but if the host and the standby machine detect different types of faults at the same time, the dual-computer switching function cannot be performed correctly.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to design a dual-computer hot standby method for an embedded system that effectively ensures stable, reliable and continuous operation of the embedded system.
(II) Technical scheme
In order to solve the above technical problem, the present invention provides a dual-computer hot standby method for an embedded system, characterized by comprising the following steps:
S1. Configure the following components in the VxWorks system of the first core of the equipment to form a first dual-computer hot standby function module:
INCLUDE_AMP_CPU
INCLUDE_AMP_CPU_00
INCLUDE_MOB_PLB_0
INCLUDE_MOB_PLB_1
INCLUDE_MCB_SM
INCLUDE_MIPC_SM
INCLUDE_SHELL
INCLUDE_WRLOAD
Configure the following components in the VxWorks system of the second core of the equipment to form a second dual-computer hot standby function module:
INCLUDE_AMP_CPU
INCLUDE_AMP_CPU_01
INCLUDE_MOB_PLB_0
INCLUDE_MOB_PLB_1
INCLUDE_MCB_SM
INCLUDE_MIPC_SM
INCLUDE_SHELL
INCLUDE_WRLOAD_IMAGE_BUILD
S2. After the equipment is powered on, the first dual-computer hot standby function module on the first core and the second dual-computer hot standby function module on the second core of the mainboard first perform a self-test of each board card in the equipment. After the self-test is completed, both cores send heartbeat signals to the external device; each heartbeat signal contains the equipment self-test result.
S3. The external device receives the heartbeat signals sent by the equipment over Ethernet and a serial port. As soon as it receives a heartbeat signal from one of the cores, the external device sends a control command to that core and begins communicating with it. From then on, the external device only receives the heartbeat signal of the other core and does not send control commands to it.
Preferably, in step S3, while the external device is communicating with one of the cores, if the dual-computer hot standby function module on that core detects a fault on a board card inside the equipment, it immediately notifies the external device and, at the same time, switches to the backup hardware resource on the faulty board card.
Preferably, in step S3, if the external device fails to receive the heartbeat signal of the current core while communicating with it, the external device immediately stops communicating with that core and sends a control command to the other core to begin communicating with it.
Preferably, in step S1, the components are configured in the VxWorks image.
Preferably, the IP addresses of the first and second dual-computer hot standby function modules are different.
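The two component lists in step S1 are nearly identical. As a reading aid only, the following Python sketch computes which components are shared and which are specific to each core. It is not a VxWorks build file; the names `FIRST_CORE_COMPONENTS`, `SECOND_CORE_COMPONENTS` and `compare_components` are hypothetical and do not appear in the patent.

```python
# Component lists copied from step S1. The set comparison itself is only a
# reading aid and is not part of any VxWorks build configuration.
FIRST_CORE_COMPONENTS = {
    "INCLUDE_AMP_CPU",
    "INCLUDE_AMP_CPU_00",
    "INCLUDE_MOB_PLB_0",
    "INCLUDE_MOB_PLB_1",
    "INCLUDE_MCB_SM",
    "INCLUDE_MIPC_SM",
    "INCLUDE_SHELL",
    "INCLUDE_WRLOAD",
}
SECOND_CORE_COMPONENTS = {
    "INCLUDE_AMP_CPU",
    "INCLUDE_AMP_CPU_01",
    "INCLUDE_MOB_PLB_0",
    "INCLUDE_MOB_PLB_1",
    "INCLUDE_MCB_SM",
    "INCLUDE_MIPC_SM",
    "INCLUDE_SHELL",
    "INCLUDE_WRLOAD_IMAGE_BUILD",
}


def compare_components(first, second):
    """Split two component sets into shared and core-specific entries."""
    return {
        "shared": sorted(first & second),
        "first_only": sorted(first - second),
        "second_only": sorted(second - first),
    }


if __name__ == "__main__":
    result = compare_components(FIRST_CORE_COMPONENTS, SECOND_CORE_COMPONENTS)
    for key, names in result.items():
        print(key, names)
```

With the lists as given, six components are shared, and each core has two components of its own: the CPU index component and the WRLOAD-related component.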
(III) Advantageous effects
The invention uses the multi-core AMP architecture supported by the VxWorks system to run the dual-computer hot standby function modules on different cores of the mainboard in a single chassis, while the hardware resources on the other board cards in the chassis are backed up redundantly. In this way, one mainboard and several peripheral board cards in one chassis provide dual-computer hot standby for the whole system. No separate third-party arbitration mechanism is needed: the external device acts as the arbiter and holds the initiative for switching, deciding which core to communicate with while it communicates with the equipment.
During operation, when a hardware resource on a board card is found to be faulty, its backup resource takes over. Even if several board cards fail at the same time, the whole system continues to work normally by using the backup resources on those board cards. The method addresses the shortcomings of the existing dual-computer fault-tolerant technologies described above. It is low in cost, easy to implement and highly reliable, and it switches between host and standby quickly, which suits systems with short transactions and hard real-time requirements. It improves the safety, availability and reliability of the embedded system and makes the dual-computer hot standby system easier to maintain in complex and harsh environments.
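The board-level redundancy described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each board card carries exactly one primary and one backup hardware resource; the classes `Resource` and `BoardCard` and the function `system_operational` are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    healthy: bool = True


@dataclass
class BoardCard:
    # Assumption: one primary and one backup resource per board card.
    name: str
    primary: Resource
    backup: Resource

    def active_resource(self) -> Optional[Resource]:
        """Return the primary if healthy, else the backup if healthy, else None."""
        if self.primary.healthy:
            return self.primary
        if self.backup.healthy:
            return self.backup
        return None


def system_operational(boards: List[BoardCard]) -> bool:
    """The system keeps working while every board has a usable resource."""
    return all(board.active_resource() is not None for board in boards)


if __name__ == "__main__":
    boards = [
        BoardCard("io", Resource("io-A"), Resource("io-B")),
        BoardCard("comm", Resource("comm-A"), Resource("comm-B")),
    ]
    # Primary resources on both boards fail at the same time.
    boards[0].primary.healthy = False
    boards[1].primary.healthy = False
    print(system_operational(boards))
```

In this model, simultaneous primary faults on several board cards are tolerated because each card falls back to its own backup; the system stops only when both resources on the same card have failed.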
Drawings
FIG. 1 is a general flow diagram of the method of the present invention;
FIG. 2 is a flow chart of a specific implementation of the method of the present invention.
Detailed Description
To make the objects, contents and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the accompanying drawings and examples.
Referring to FIG. 1 and FIG. 2, the dual-computer hot standby method for an embedded system provided by the present invention uses an AMP (asymmetric multiprocessing) multi-core architecture to implement dual-computer hot standby in a VxWorks system, and includes the following steps:
S1. To use the AMP architecture under the VxWorks system, the relevant components are first configured in the VxWorks image.
The following components are configured in the VxWorks system of the first core of the equipment:
INCLUDE_AMP_CPU
INCLUDE_AMP_CPU_00
INCLUDE_MOB_PLB_0
INCLUDE_MOB_PLB_1
INCLUDE_MCB_SM
INCLUDE_MIPC_SM
INCLUDE_SHELL
INCLUDE_WRLOAD
The following components are configured in the VxWorks system of the second core:
INCLUDE_AMP_CPU
INCLUDE_AMP_CPU_01
INCLUDE_MOB_PLB_0
INCLUDE_MOB_PLB_1
INCLUDE_MCB_SM
INCLUDE_MIPC_SM
INCLUDE_SHELL
INCLUDE_WRLOAD_IMAGE_BUILD
Apart from having different IP addresses, the dual-computer hot standby function modules running on the first core and the second core are functionally identical.
S2. After the equipment is powered on, the dual-computer hot standby function modules on the first core and the second core of the mainboard first perform a self-test of each board card in the equipment. After the self-test is completed, both cores send heartbeat signals to the external device; each heartbeat signal contains the equipment self-test result.
S3. The external device receives the heartbeat signals sent by the equipment over Ethernet and a serial port. As soon as it receives a heartbeat signal from one of the cores, the external device sends a control command to that core and begins communicating with it. From then on, the external device only receives the heartbeat signal of the other core and does not send control commands to it.
While the external device is communicating with one of the cores, if the dual-computer hot standby function module on that core detects a fault on a board card in the equipment, it immediately notifies the external device. At the same time, it switches to the backup hardware resource on the faulty board card so that the working process is not interrupted.
If the external device fails to receive the heartbeat signal of the current core while communicating with it, the external device immediately stops communicating with that core and sends a control command to the other core to begin communicating with it.
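The selection and switching behaviour of the external device in steps S2 and S3 can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the patented implementation: the patent gives no timeout value or message format, so the `timeout` parameter, the `"CONTROL"` command tuple, the class name `ExternalDevice` and its methods are all hypothetical. The sketch also assumes the external device switches only to a core whose heartbeat is still fresh, which the patent does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ExternalDevice:
    timeout: float  # assumed heartbeat timeout in seconds (not given in the source)
    active_core: Optional[int] = None
    last_heartbeat: Dict[int, float] = field(default_factory=dict)
    self_test_results: Dict[int, bool] = field(default_factory=dict)
    sent_commands: List[Tuple[str, int]] = field(default_factory=list)

    def on_heartbeat(self, core: int, now: float, self_test_ok: bool = True) -> None:
        """Record a heartbeat; start talking to the first core that is heard."""
        self.last_heartbeat[core] = now
        self.self_test_results[core] = self_test_ok
        if self.active_core is None:
            self._start_communication(core)

    def check_timeout(self, now: float) -> None:
        """Switch to the other core if the active core's heartbeat is overdue."""
        if self.active_core is None:
            return
        if now - self.last_heartbeat[self.active_core] <= self.timeout:
            return
        lost = self.active_core
        self.active_core = None  # stop communicating with the silent core
        for core, seen in self.last_heartbeat.items():
            # Assumption: only switch to a core whose heartbeat is still fresh.
            if core != lost and now - seen <= self.timeout:
                self._start_communication(core)
                break

    def _start_communication(self, core: int) -> None:
        self.active_core = core
        self.sent_commands.append(("CONTROL", core))


if __name__ == "__main__":
    device = ExternalDevice(timeout=1.0)
    device.on_heartbeat(core=0, now=0.0)
    device.on_heartbeat(core=1, now=0.1)   # monitored only, no command sent
    device.on_heartbeat(core=1, now=1.0)
    device.check_timeout(now=1.5)          # core 0 has gone silent
    print(device.active_core, device.sent_commands)
```

Keeping the switching decision in the external device mirrors the patent's point that the external device acts as the arbiter: neither core needs to agree with the other about which of them is active.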
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make modifications and variations without departing from the technical principle of the present invention, and such modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (3)

1. A dual-computer hot standby method for an embedded system, characterized by comprising the following steps:
S1, configuring the following components in the VxWorks system of a first core of the equipment to form a first dual-computer hot standby function module:
INCLUDE_AMP_CPU
INCLUDE_AMP_CPU_00
INCLUDE_MOB_PLB_0
INCLUDE_MOB_PLB_1
INCLUDE_MCB_SM
INCLUDE_MIPC_SM
INCLUDE_SHELL
INCLUDE_WRLOAD
and configuring the following components in the VxWorks system of a second core of the equipment to form a second dual-computer hot standby function module:
INCLUDE_AMP_CPU
INCLUDE_AMP_CPU_01
INCLUDE_MOB_PLB_0
INCLUDE_MOB_PLB_1
INCLUDE_MCB_SM
INCLUDE_MIPC_SM
INCLUDE_SHELL
INCLUDE_WRLOAD_IMAGE_BUILD
S2, after the equipment is powered on, the first dual-computer hot standby function module on the first core and the second dual-computer hot standby function module on the second core of the mainboard first perform a self-test of each board card in the equipment; after the self-test is completed, both cores send heartbeat signals to an external device, the heartbeat signals containing the equipment self-test result;
S3, the external device receives the heartbeat signals sent by the equipment over Ethernet and a serial port; as soon as it receives a heartbeat signal from one of the cores, the external device sends a control command to that core and begins communicating with it, and from then on it only receives the heartbeat signal of the other core and does not send control commands to the other core;
wherein, in step S3, while the external device is communicating with one of the cores, if the dual-computer hot standby function module on that core detects a fault on a board card inside the equipment, it immediately notifies the external device and, at the same time, uses the backup hardware resource on the faulty board card;
and wherein, in step S3, if the external device fails to receive the heartbeat signal of the current core while communicating with it, the external device immediately stops communicating with the current core and sends a control command to the other core to begin communicating with the other core.
2. The method of claim 1, wherein in step S1 the components are configured in a VxWorks image.
3. The method of claim 1 or 2, wherein the IP address of the first dual-computer hot standby function module is different from the IP address of the second dual-computer hot standby function module.
CN201811346918.6A 2018-11-13 2018-11-13 Dual-computer hot standby method of embedded system Active CN109271274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811346918.6A CN109271274B (en) 2018-11-13 2018-11-13 Dual-computer hot standby method of embedded system

Publications (2)

Publication Number Publication Date
CN109271274A (en) 2019-01-25
CN109271274B (en) 2022-02-11

Family

ID=65192661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811346918.6A Active CN109271274B (en) 2018-11-13 2018-11-13 Dual-computer hot standby method of embedded system

Country Status (1)

Country Link
CN (1) CN109271274B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101043310A (en) * 2007-04-27 2007-09-26 北京佳讯飞鸿电气有限责任公司 Image backup method for dual-core control of core controlled system
CN101493809A (en) * 2009-03-03 2009-07-29 哈尔滨工业大学 Multi-core onboard spacecraft computer based on FPGA
CN108021406A (en) * 2017-11-03 2018-05-11 中国航空工业集团公司西安航空计算技术研究所 A kind of double remaining Hot Spare cpu systems suitable for airborne computer

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761166A (en) * 2014-01-22 2014-04-30 上海交通大学 Hot standby disaster tolerance system for network service under virtualized environment and method thereof

Also Published As

Publication number Publication date
CN109271274A (en) 2019-01-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant