CN108021406B - Dual-redundancy hot backup CPU system suitable for onboard computer - Google Patents

Dual-redundancy hot backup CPU system suitable for onboard computer

Info

Publication number
CN108021406B
CN108021406B CN201711076207.7A CN201711076207A CN108021406B CN 108021406 B CN108021406 B CN 108021406B CN 201711076207 A CN201711076207 A CN 201711076207A CN 108021406 B CN108021406 B CN 108021406B
Authority
CN
China
Prior art keywords
cpu
started
dual
redundancy
enabling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711076207.7A
Other languages
Chinese (zh)
Other versions
CN108021406A (en)
Inventor
吴斌
蔡晓乐
任晓琨
向桂林
刘夏青
车炯晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN201711076207.7A
Publication of CN108021406A
Application granted
Publication of CN108021406B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/4401: Bootstrapping
    • G06F9/4405: Initialisation of multiprocessor systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16: Error detection or correction of the data by redundancy in hardware
    • G06F11/20: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/3003: Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024: Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention belongs to the technical field of computer applications and relates to a dual-redundancy hot-backup CPU system designed to improve the task reliability of an onboard computer. The system comprises dual-redundant CPUs. After the system is started, both CPUs remain in an idle state. After software initialization is completed, the two CPUs enable CPU output control by writing a main control instruction to a specific address; the main control instruction passes through delay-processing control logic that determines the start-up order of the two CPUs. The enable signal of the CPU module that starts first is set valid and that module obtains output control. At the same time, this enable signal is connected to the channel logic fault circuit of the CPU that starts later, so the later-started CPU is clamped by the first-started module's enable signal and held in an invalid state. The result is a highly reliable, cooperatively operating dual-redundancy hot-backup CPU system suitable for onboard computers.

Description

Dual-redundancy hot backup CPU system suitable for onboard computer
Technical Field
The invention belongs to the technical field of computer application, and relates to a dual-redundancy hot backup CPU system designed for improving the task reliability of an onboard computer.
Background
As airborne electronic equipment becomes ever more highly integrated, the task reliability of the onboard computer bears directly on the flight safety of the aircraft; if the onboard computer malfunctions, the loss of life and property can be enormous.
Onboard computers are themselves becoming more highly integrated. For example, the electromechanical management system, an important system on an aircraft, integrates subsystems with demanding real-time requirements such as environmental control, fuel, power supply, and hydraulics, so the electromechanical management computer must offer high real-time processing capability. The CPU is the control core and computation core of the whole electromechanical management computer, and a fault or error during CPU operation can have disastrous consequences. Improving the CPU's data-processing capability therefore also places new requirements on its redundancy design.
Traditional electromechanical management computers usually achieve redundancy with multiple computers and multiple channels. The running state of a computer's local channel can only be monitored by that computer's other channels, and a multi-core CPU processing strategy within a single computer and a single channel cannot be realized.
Disclosure of Invention
The technical problem solved by the invention is to provide a highly reliable, cooperatively operating dual-redundancy hot-backup CPU system suitable for onboard computers.
The technical scheme of the invention is as follows: a dual-redundancy hot-backup CPU system suitable for an onboard computer, comprising dual-redundant CPUs;
after the system is started, both CPUs remain in an idle state;
after software initialization is completed, the two CPUs enable CPU output control by writing a main control instruction to a specific address, wherein the main control instruction passes through delay-processing control logic that determines the start-up order of the two CPUs;
the enable signal of the CPU module that starts first is set valid and that module obtains output control; at the same time, this enable signal is connected to the channel logic fault circuit of the CPU that starts later, so that the later-started CPU is clamped by the enable signal of the first-started CPU module and held in an invalid state.
Preferably, when the two CPU modules actively switch control, the later-started CPU first modifies the data at the address controlling its own enable signal so that its enable signal is valid; the first-started CPU module then modifies its own enable signal to invalid.
Preferably, if the first-started CPU module must hand over control because of a hardware fault, it actively sets its own enable signal invalid, so its clamp on the later-started CPU disappears; the channel logic fault circuit of the later-started CPU then sets that CPU's enable signal valid according to the state information of the first-started CPU, and the later-started CPU takes over output control.
Preferably, when both CPUs relinquish control, upper-layer application software arbitrates the restart of one CPU.
The beneficial effects of the invention are as follows: the invention provides a single-computer dual-redundancy hot-backup CPU processing mechanism. While the master CPU processes system tasks, the slave CPU can monitor the master's running state in real time; when the master CPU fails, the slave CPU can promptly seize control of the system and take over the system tasks, which improves the task reliability of the electromechanical management computer.
Drawings
Fig. 1 is a schematic diagram of the principle of the present invention.
Detailed Description
The invention provides a dual-redundancy hot-backup CPU system for an onboard computer and addresses how output control over external hardware resources is scheduled between the CPUs while the redundant CPUs are running. After the system is started, both CPUs remain in an idle state (they occupy no peripheral resources and output no signals); at this point neither CPU has control of the external devices. After software initialization is completed, the two CPUs enable CPU output control by writing a main control instruction to a specific address. To distinguish the output order of the two CPUs, the control logic delays the main control instruction by a random number within a certain range, so that the two CPUs assert their enable signals to the external devices at different times; seen from outside, the two CPUs start one after the other. The enable signal of the CPU module that starts first (called the master) is therefore set valid and the master obtains output control. Because each CPU's enable signal is connected to the channel logic fault circuit of the other CPU, the CPU that starts later (called the slave) is clamped in an invalid state by the master's enable signal. If control must be transferred during normal operation, the master can actively clear the specific value at the specific address. If the master must hand over control because of a hardware fault, the channel fault logic sets the master's enable signal invalid according to the master's state information, so the clamp on the slave disappears. Because the slave also wrote the specific value to the specific address at start-up to enable its CPU output control, the slave's enable becomes valid at this moment and the slave takes over output control.
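As a rough illustration of the random-delay arbitration described above, the following Python sketch models two CPUs whose enable requests are each delayed by a random amount; the CPU whose delay expires first becomes the master. The function name, the delay range and unit, and the re-draw on a tie are all assumptions made for this sketch. The description states only that a random number within a certain range is used.

```python
import random


def arbitrate_startup(rng, max_delay_us=100):
    """Model one start-up race; return (master, slave, delays)."""
    # Each CPU's enable request is delayed by a random amount
    # (hypothetical unit: microseconds).
    delays = {cpu: rng.randrange(max_delay_us) for cpu in ("A", "B")}
    # Assumption: an exact tie is simply re-drawn. The source does not say
    # how (or whether) the hardware resolves simultaneous requests.
    while delays["A"] == delays["B"]:
        delays = {cpu: rng.randrange(max_delay_us) for cpu in ("A", "B")}
    master = min(delays, key=delays.get)  # shorter delay asserts enable first
    slave = "B" if master == "A" else "A"
    return master, slave, delays


if __name__ == "__main__":
    master, slave, delays = arbitrate_startup(random.Random())
    print(f"delays={delays} master={master} slave={slave}")
```

Passing in the random generator keeps the sketch reproducible under test; in the described system the delay is applied by the control logic itself rather than by software.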
Examples
At power-on, the peripheral enable signals of both CPUs are initially in the closed state. During software initialization, each CPU writes the main control instruction '0' to the specific address via a specific data bit, so the peripheral enable signals of both CPU modules are '0' and neither CPU drives any control output to the external devices. After the system task starts, both CPUs immediately write the main control instruction '1' to the specific address. Because the control logic delays the main control enable instruction by a random number, the two CPUs start one after the other: the one that starts first is called the master and the one that starts later is called the slave. After start-up, the peripheral enable signal of the master CPU module changes to '1' (provided the watchdog signal has first been set high and the hardware has no serious fault), and the master CPU module enables its control output to the external devices. At this point the master CPU module's peripheral enable signal acts on the slave's channel fault logic, clamping the slave CPU's peripheral enable signal to '0', so the slave CPU module cannot enable its control output.
If a hardware fault occurs after the master has operated normally for some time (the watchdog signal goes low), the peripheral enable signal of the master CPU module changes to '0' and its clamp on the slave CPU module's peripheral enable signal disappears. Because the slave software previously wrote the main control instruction '1' to the specific address, the slave CPU module's enable signal now changes to '1', its peripheral enable becomes valid, and it takes over control output to the external devices. At the same time, the master CPU module's enable signal is in turn clamped to '0'. Even if the original master's fault later clears, the slave keeps the control output and continues to act as the master.
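The start-up and fault-takeover behaviour described in this example can be sketched as a small software model. This is only an illustration under stated assumptions: the class `DualChannel`, its method names, and the CPU labels 'A' and 'B' are hypothetical, and the real mechanism is a hardware channel fault logic circuit, not Python code. The model treats each enable output as valid only when the CPU has written '1', its watchdog is high, and the peer's enable is not already valid.

```python
class DualChannel:
    """Toy model of the cross-coupled enable logic of CPUs 'A' and 'B'."""

    CPUS = ("A", "B")

    def __init__(self):
        self.request = {"A": 0, "B": 0}   # bit written to the "specific address"
        self.watchdog = {"A": 1, "B": 1}  # 1 = healthy, 0 = hardware fault
        self.enable = {"A": 0, "B": 0}    # peripheral enable outputs

    def write(self, cpu, value):
        """Software writes main control instruction 0 or 1."""
        self.request[cpu] = 1 if value else 0
        self._settle()

    def set_watchdog(self, cpu, level):
        """Drive the watchdog signal high (1) or low (0)."""
        self.watchdog[cpu] = 1 if level else 0
        self._settle()

    def _settle(self):
        want = {c: self.request[c] and self.watchdog[c] for c in self.CPUS}
        # A channel that no longer requests control, or is faulty, drops enable.
        for c in self.CPUS:
            if not want[c]:
                self.enable[c] = 0
        # A requesting, healthy channel asserts enable unless its peer clamps it.
        for c in self.CPUS:
            peer = "B" if c == "A" else "A"
            if want[c] and not self.enable[peer]:
                self.enable[c] = 1


ch = DualChannel()
ch.write("A", 1)         # A's delayed write lands first: A becomes master
ch.write("B", 1)         # B also writes '1' but is clamped by A's enable
print(ch.enable)         # {'A': 1, 'B': 0}
ch.set_watchdog("A", 0)  # hardware fault on the master
print(ch.enable)         # {'A': 0, 'B': 1}
ch.set_watchdog("A", 1)  # fault clears; B keeps control
print(ch.enable)         # {'A': 0, 'B': 1}
```

The last step shows the non-revertive behaviour: once the slave has taken over, its enable clamps the recovered CPU in turn.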
If, after the master has operated normally for some time, the application software needs to switch CPU control, the master writes the main control instruction '0' to the specific address. The master then gives up control of the external devices (the logic is the same as in the hardware-fault handover described above), and the slave takes over control of the external devices.
If, in some situation, both CPUs give up channel control (for example, each CPU considers itself faulty), the state is the same as after software initialization: both enable signals are '0' and neither CPU drives the control output. Whether to restart one of the CPUs must then be decided by the application software according to the system requirements.
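The source leaves the restart decision to the application software and gives no policy for it. Purely as a hypothetical illustration of what such a decision could look like, the sketch below restarts a preferred CPU if it passes an application-level health check and otherwise falls back to the other one. The function name, the health-check input, and the preference rule are all assumptions, not part of the patent.

```python
def choose_restart(self_test_ok, preferred="A"):
    """Return the CPU to restart ('A' or 'B'), or None if neither qualifies.

    self_test_ok maps each CPU name to the result of a hypothetical
    application-level health check run after both CPUs gave up control.
    """
    candidates = [cpu for cpu in ("A", "B") if self_test_ok.get(cpu, False)]
    if not candidates:
        return None  # no CPU is trusted, so outputs stay disabled
    return preferred if preferred in candidates else candidates[0]


print(choose_restart({"A": False, "B": True}))  # B
```

A real system would derive this choice from its own requirements, as the description notes.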

Claims (1)

1. A dual-redundancy hot-backup CPU system suitable for an onboard computer, characterized in that: the system comprises dual-redundant CPUs;
after the system is started, both CPUs remain in an idle state;
after software initialization is completed, the two CPUs enable CPU output control by writing a main control instruction to a specific address, wherein the main control instruction passes through delay-processing control logic that determines the start-up order of the two CPUs;
the enable signal of the CPU module that starts first is set valid and that module obtains output control; at the same time, this enable signal is connected to the channel logic fault circuit of the CPU that starts later, so that the later-started CPU is clamped by the enable signal of the first-started CPU module and held in an invalid state;
when the two CPU modules actively switch control, the later-started CPU modifies the data at the address controlling its own enable signal so that its enable signal is valid, and the first-started CPU module then modifies its own enable signal to invalid;
if the first-started CPU module must hand over control because of a hardware fault, it actively sets its own enable signal invalid, its clamp on the later-started CPU disappears, the channel logic fault circuit of the later-started CPU sets that CPU's enable signal valid according to the state information of the first-started CPU, and the later-started CPU takes over output control;
when both CPUs relinquish control, upper-layer application software arbitrates the restart of one CPU.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711076207.7A CN108021406B (en) 2017-11-03 2017-11-03 Dual-redundancy hot backup CPU system suitable for onboard computer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711076207.7A CN108021406B (en) 2017-11-03 2017-11-03 Dual-redundancy hot backup CPU system suitable for onboard computer

Publications (2)

Publication Number Publication Date
CN108021406A (en) 2018-05-11
CN108021406B (en) 2021-06-01

Family

ID=62080521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711076207.7A Active CN108021406B (en) 2017-11-03 2017-11-03 Dual-redundancy hot backup CPU system suitable for onboard computer

Country Status (1)

Country Link
CN (1) CN108021406B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062184B (en) * 2018-08-10 2021-05-14 中国船舶重工集团公司第七一九研究所 Double-machine emergency rescue equipment, fault switching method and rescue system
CN109460314B (en) * 2018-11-13 2022-02-11 天津津航计算技术研究所 Dual-computer hot standby device of embedded system
CN109271274B (en) * 2018-11-13 2022-02-11 天津津航计算技术研究所 Dual-computer hot standby method of embedded system
CN109782578A (en) * 2018-12-24 2019-05-21 中国船舶重工集团公司第七一0研究所 A kind of high reliability deep-sea autonomous underwater vehicle control method
CN109976488B (en) * 2019-03-15 2023-04-14 西北工业大学 Unmanned aerial vehicle machine carries computer software automatic re-setting circuit with programming function
CN109976237A (en) * 2019-04-12 2019-07-05 西安爱生技术集团公司 A kind of unmanned aerial vehicle onboard computer remaining control circuit
CN111142945B (en) * 2019-11-28 2023-06-13 中国航空工业集团公司西安航空计算技术研究所 Master and slave channel dynamic switching method for dual-redundancy computer
CN111367706B (en) * 2020-03-31 2023-04-28 西安联飞智能装备研究院有限责任公司 Channel control right switching method and device for redundancy computer
CN114200820A (en) * 2021-11-08 2022-03-18 陕西千山航空电子有限责任公司 Dual-redundancy system based on airborne acquisition and control computer

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271332A (en) * 2008-05-09 2008-09-24 北京方天长久科技有限公司 Compact integrated redundancy controller and control method thereof
CN101833336A (en) * 2010-04-28 2010-09-15 北京航空航天大学 Dual-redundancy attitude control system and debug method of coaxial unmanned helicopter
CN102541697A (en) * 2010-12-31 2012-07-04 中国航空工业集团公司第六三一研究所 Switching method for processing fault of dual-redundancy computer
CN103853622A (en) * 2012-11-28 2014-06-11 中国航空工业集团公司第六三一研究所 Control method of dual redundancies capable of being backed up mutually
CN105471653A (en) * 2015-12-09 2016-04-06 中国航空工业集团公司西安飞机设计研究所 Airborne dual channel seamless switching method and system
CN105550053A (en) * 2015-12-09 2016-05-04 中国航空工业集团公司西安航空计算技术研究所 Redundancy management method for improving availability of monitoring pair based fault tolerant system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101710299A (en) * 2009-12-24 2010-05-19 中国航空工业集团公司第六三一研究所 Double-redundancy fault-tolerant computer system based on self monitoring of SCM
CN105717787A (en) * 2014-11-30 2016-06-29 上海航空电器有限公司 Dual-redundancy control system and control method for intelligent power distribution device
CN105550067B (en) * 2015-12-11 2018-05-08 中国航空工业集团公司西安航空计算技术研究所 A kind of airborne computer binary channels system of selection
CN106649909B (en) * 2016-08-29 2020-04-03 成都飞机工业(集团)有限责任公司 Dual-redundancy compensation type empennage control surface fault state control method
CN106444514B (en) * 2016-10-21 2019-04-30 中国运载火箭技术研究院 A kind of highly reliable double redundancy power controller of logic-based frame interaction
CN107065830A (en) * 2017-05-03 2017-08-18 北京电子工程总体研究所 A kind of dual redundant hot backup system based on arbitration mode

Also Published As

Publication number Publication date
CN108021406A (en) 2018-05-11

Similar Documents

Publication Publication Date Title
CN108021406B (en) Dual-redundancy hot backup CPU system suitable for onboard computer
US20190303255A1 (en) Cluster availability management
CN107347018B (en) Three-redundancy 1553B bus dynamic switching method
US9542320B2 (en) Multi-node cache coherency with input output virtualization
CN106970857A (en) A kind of restructural triple redundance computer system and its reconstruct down method
US9195553B2 (en) Redundant system control method
CN102724083A (en) Degradable triple-modular redundancy computer system based on software synchronization
CN110427283B (en) Dual-redundancy fuel management computer system
CN104050061A (en) Multi-main-control-panel redundant backup system based on PCIe bus
EP3789834A1 (en) Hot-standby redundancy control system, method, control apparatus, and computer readable storage medium
US20080263391A1 (en) Apparatus, System, and Method For Adapter Card Failover
CN112639640A (en) Redundant hot standby control system, control device, redundant hot standby method, and computer-readable storage medium
KR100928187B1 (en) Fault-safe structure of dual processor control unit
US9026838B2 (en) Computer system, host-bus-adaptor control method, and program thereof
CN112506830B (en) Redundancy synchronous communication method for multi-path transmission data bus
KR102053849B1 (en) Airplane system and control method thereof
CN112000286B (en) Four-control full-flash-memory storage system and fault processing method and device thereof
CN117111525A (en) Multi-CPU-based trusted redundant control system and control method
CN109117317A (en) A kind of clustering fault restoration methods and relevant apparatus
JP2007280313A (en) Redundant system
US9002480B2 (en) Method for operation of a control network, and a control network
CN101145955A (en) Hot backup method, network management and network management system of network management software
CN115328706A (en) Comprehensive control method and system for dual-CPU redundant architecture
JP2006114064A (en) Storage subsystem
WO2022141128A1 (en) Safety isolation apparatus and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant