CN113050407A - Method for determining and switching master controller and slave controller of distributed processing system - Google Patents
- Publication number
- CN113050407A (application number CN202110243660.2A)
- Authority
- CN
- China
- Prior art keywords
- controller
- distributed processing
- controllers
- processing system
- switching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B9/00—Safety arrangements
- G05B9/02—Safety arrangements electric
- G05B9/03—Safety arrangements electric with multiple-channel loop, i.e. redundant control systems
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Automation & Control Theory (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention discloses a method for determining and switching the master controller and the slave controller of a distributed processing system, comprising the following steps: providing two system controllers in the distributed processing system, wherein the two system controllers work in a 1+1 hot backup mode, one being the master controller and the other the slave controller; configuring the initial states of the two system controllers, and setting a default master controller and the occupancy flag of the master controller; defining control signals for controlling master/slave controller switching, and configuring a master/slave switching strategy corresponding to the different control signal states; during normal operation in the 1+1 hot backup mode, the two system controllers are governed by the control signals and the switching strategy. In this scheme, the two controllers provide fault tolerance, which improves system reliability. The master and standby identities of the two controllers are determined simply, and the switching mode is efficient.
Description
Technical Field
The invention belongs to the technical field of embedded computer system design.
Background
As embedded systems grow more complex, processing systems are required to deliver higher functional performance, and the distributed processing system has become a complex multifunction, multitasking computer system.
In the prior art, the management of operating resources and fault switching is generally implemented through software-based discrimination. The fault discrimination and decision process is relatively complex and time-consuming, so such methods offer low efficiency and poor real-time performance.
Disclosure of Invention
The invention provides a method for determining and switching the master controller and the slave controller of a distributed processing system, which addresses the complex switching modes and low efficiency of the prior art.
To accomplish this, the invention adopts the following technical scheme:
A method for determining and switching the master and slave controllers of a distributed processing system comprises the following steps:
providing two system controllers in the distributed processing system, wherein the two system controllers work in a 1+1 hot backup mode, one system controller being the master controller and the other the slave controller;
configuring the initial states of the two system controllers, and setting a default master controller and the occupancy flag of the master controller;
defining control signals for controlling master/slave controller switching, and configuring a master/slave switching strategy corresponding to the different control signal states; during normal operation in the 1+1 hot backup mode, the two system controllers are governed by the control signals and the switching strategy.
Further, configuring the initial states of the two system controllers and setting the default master controller and the occupancy flag of the master controller includes:
when the distributed processing system starts, first initializing the system hardware and setting the initial state of both system controllers to the standby working mode;
the system performs a power-on test and records the result; the system reads the slot identification number of each system controller, sets the system controller with slot identification number 1 as the default master controller, and sets the system controller with slot identification number 2 as the slave controller; the master controller sets its occupancy flag to valid.
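The startup sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Mode` and `assign_initial_roles` are hypothetical, the power-on test is omitted, and the patent does not prescribe any particular data structure.

```python
from enum import Enum


class Mode(Enum):
    STANDBY = "standby"
    MASTER = "master"
    SLAVE = "slave"


def assign_initial_roles(slot_ids):
    """Map each slot ID to (mode, occupancy_flag) after power-on.

    Hypothetical helper: the source only states that slot 1 becomes the
    default master with a valid occupancy flag and slot 2 becomes the slave.
    """
    # After hardware initialization, every controller is in standby mode.
    roles = {slot: (Mode.STANDBY, False) for slot in slot_ids}

    # (The power-on test and result recording would happen here.)

    # Roles are then derived purely from the slot identification number.
    for slot in slot_ids:
        if slot == 1:
            roles[slot] = (Mode.MASTER, True)   # occupancy flag valid
        elif slot == 2:
            roles[slot] = (Mode.SLAVE, False)
    return roles


roles = assign_initial_roles([1, 2])
print(roles[1], roles[2])
```

Because the decision depends only on the slot number, no negotiation between the two controllers is needed at startup.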
Further, the control signals for controlling master/slave controller switching are a system fault signal and a network communication fault signal, each of which has distinct signal states.
Further, the system fault signal and the network communication fault signal each have two states: valid and invalid.
Further, when the system controller with slot identification number 1 is acting as the master controller and its system fault signal or network communication fault signal becomes valid, the system controller with slot identification number 2 takes over as the master controller, and the occupancy flag of the system controller with slot identification number 2 becomes valid;
when the network communication fault signals of both system controllers are valid, the distributed processing system enters an emergency working state in which only the two system controllers work;
when the system fault signals of both system controllers are valid, the distributed processing system enters a failure state.
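The three rules above can be read as a small decision function. The sketch below is one possible reading, not the patented implementation. The names `Signals`, `Decision`, `SystemState`, and `decide` are hypothetical. Two points are assumptions the source does not settle: the order in which the rules are checked (failure first, then emergency, then switchover), and the symmetric treatment of a switch from slot 2 back to slot 1. Mixed cases, such as a system fault on one controller together with a network fault on the other, are also not specified by the source.

```python
from enum import Enum
from typing import NamedTuple, Optional


class SystemState(Enum):
    NORMAL = "normal"
    EMERGENCY = "emergency"
    FAILED = "failed"


class Signals(NamedTuple):
    sysfail: bool  # system fault signal is valid
    fcfail: bool   # network communication fault signal is valid


class Decision(NamedTuple):
    state: SystemState
    master_slot: Optional[int]


def decide(master_slot: int, slot1: Signals, slot2: Signals) -> Decision:
    """Apply the switching rules to the current fault signals."""
    signals = {1: slot1, 2: slot2}

    # Both system fault signals valid: the whole system has failed.
    if slot1.sysfail and slot2.sysfail:
        return Decision(SystemState.FAILED, None)

    # Both network fault signals valid: emergency state. Keeping the
    # current master here is an assumption.
    if slot1.fcfail and slot2.fcfail:
        return Decision(SystemState.EMERGENCY, master_slot)

    # Either fault signal valid on the current master: hand over.
    current = signals[master_slot]
    if current.sysfail or current.fcfail:
        other = 2 if master_slot == 1 else 1
        return Decision(SystemState.NORMAL, other)

    return Decision(SystemState.NORMAL, master_slot)


healthy = Signals(sysfail=False, fcfail=False)
print(decide(1, Signals(sysfail=True, fcfail=False), healthy))
```

Keeping the function free of side effects makes the rule set easy to test exhaustively, since there are only sixteen signal combinations per master slot.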
Further, the system fault signal becomes valid when a watchdog timeout fault or a system-management-layer software fault occurs in the distributed processing system; at all other times it is invalid;
the network communication fault signal becomes valid when a network communication fault occurs in the distributed processing system.
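As a rough illustration of how the two signals could be derived, the sketch below models a watchdog with an explicit clock value so that it runs without real timers. The class `FaultMonitor`, its attribute names, and the timeout value are all hypothetical; in the scheme described here the signals are raised by the system itself rather than by Python code.

```python
class FaultMonitor:
    """Derives the two fault signals for one controller (illustrative only)."""

    def __init__(self, watchdog_timeout_s: float):
        self.watchdog_timeout_s = watchdog_timeout_s
        self.last_kick = 0.0          # time of the most recent watchdog kick
        self.software_fault = False   # set by the system management layer
        self.network_fault = False    # set when network communication fails

    def kick(self, now: float) -> None:
        """Called periodically by healthy software to reset the watchdog."""
        self.last_kick = now

    def sysfail(self, now: float) -> bool:
        """System fault signal: watchdog timeout OR software-reported fault."""
        timed_out = (now - self.last_kick) > self.watchdog_timeout_s
        return timed_out or self.software_fault

    def fcfail(self) -> bool:
        """Network communication fault signal."""
        return self.network_fault


monitor = FaultMonitor(watchdog_timeout_s=1.0)
monitor.kick(now=0.0)
print(monitor.sysfail(now=0.5), monitor.sysfail(now=2.0))
```

The system fault signal is valid only while one of its two trigger conditions holds, which matches the statement that it is invalid at all other times.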
Further, the method is stored in the form of a computer program in the memory of a computer; the computer comprises a processor and a memory, and the steps of the method for determining and switching the master controller and the slave controller of the distributed processing system are realized when the processor executes the computer program.
Further, the method is stored in a computer-readable storage medium in the form of a computer program; when the computer program is executed by a processor, the steps of the method for determining and switching the master controller and the slave controller of the distributed processing system are realized.
Compared with the prior art, the invention has the following technical characteristics:
In this scheme, the master controller and the slave controller work in a 1+1 hot backup mode. After a normal startup, system management reads the controller slot identification numbers; by default the controller in slot 1 operates as the master controller and the controller in slot 2 operates as the slave controller, the master controller's occupancy flag is valid, and the master role is switched when a fault occurs. The master controller is fault-tolerant and controls and manages the software and hardware resources of the distributed processing system so that they work cooperatively. With dual controllers driven by the system fault signal and the network communication fault signal, fast switching and task recovery can be achieved; the two controllers provide fault tolerance, which improves system reliability; the master and standby identities of the two controllers are determined simply, and the switching mode is efficient.
Drawings
FIG. 1 is a schematic diagram of the system structure of the method of the present invention.
Detailed Description
The software and hardware resources and system tasks of a distributed processing system must work cooperatively under the unified control and management of a controller with a master/standby fault-tolerance mechanism. The initial determination of the master and slave controllers, and the switching between them when a fault occurs during operation, are key functions of the distributed processing system and bear directly on the reliability of application task execution. They are especially important for restoring operation quickly and efficiently in systems with strict real-time requirements. In this scheme, dual controllers driven by a system fault signal and a network communication fault signal make fast switching and task recovery possible.
The invention discloses a method for determining and switching the master controller and the slave controller of a distributed processing system, comprising the following steps:
providing two system controllers in the distributed processing system, wherein the two system controllers work in a 1+1 hot backup mode, one system controller being the master controller and the other the slave controller; configuring the initial states of the two system controllers, and setting a default master controller and the occupancy flag of the master controller; defining control signals for controlling master/slave controller switching, and configuring a master/slave switching strategy corresponding to the different control signal states; during normal operation in the 1+1 hot backup mode, the two system controllers are governed by the control signals and the switching strategy.
Configuring the initial states of the two system controllers and setting the default master controller and the occupancy flag of the master controller includes:
when the distributed processing system starts, first initializing the system hardware and setting the initial state of both system controllers to the standby working mode; the system performs a power-on test and records the result; the system reads the slot identification number of each system controller, sets the system controller with slot identification number 1 as the default master controller, and sets the system controller with slot identification number 2 as the slave controller; the master controller sets its occupancy flag to valid.
The control signals for controlling master/slave controller switching are a system fault signal and a network communication fault signal, each of which has distinct signal states. Each signal has two states: valid and invalid.
When the system controller with slot identification number 1 is acting as the master controller and its system fault signal or network communication fault signal becomes valid, the system controller with slot identification number 2 takes over as the master controller, and the occupancy flag of the system controller with slot identification number 2 becomes valid; when the network communication fault signals of both system controllers are valid, the distributed processing system enters an emergency working state in which only the two system controllers work; when the system fault signals of both system controllers are valid, the distributed processing system enters a failure state.
The system fault signal becomes valid when a watchdog timeout fault or a system-management-layer software fault occurs in the distributed processing system; at all other times it is invalid;
the network communication fault signal becomes valid when a network communication fault occurs in the distributed processing system.
The method is implemented through the following steps:
The distributed processing system is provided with two system controllers that work in a 1+1 hot backup mode. When the processing system starts, the hardware is initialized, and the initial state of both controllers is the standby working mode. The system performs the power-up built-in test (PUBIT) and records the result. System management then reads the controller slot identification numbers: controller 1 in slot 1 is set to operate as the master controller, controller 2 in slot 2 is set to operate as the slave controller, and controller 1 sets the occupancy flag CONFLAG to valid. During normal operation in the 1+1 hot backup mode, the two controllers are governed by the system fault signal SYSFAIL and the network communication fault signal FCFAIL. When the SYSFAIL or FCFAIL signal of controller 1, acting as the master controller, becomes valid, the master role switches to controller 2, and controller 2 sets the occupancy flag CONFLAG to valid. When the FCFAIL signals of both controllers are valid, network communication has failed; the processing system enters an emergency working state in which only the controllers work. When the SYSFAIL signals of both controllers are valid, both controllers have failed and the processing system enters a failure state. The SYSFAIL signal becomes valid on a watchdog timeout fault (passive: the hardware detects that the software has run away) or on a fault set by software (active). The FCFAIL signal becomes valid on a network communication fault.
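The CONFLAG handover in the steps above can be sketched as follows. The `Controller` dataclass and the `hand_over` function are hypothetical names introduced for illustration. The source states only that controller 2 sets CONFLAG to valid on takeover; clearing the flag on the outgoing master is an assumption made here so that exactly one controller holds the flag.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    slot: int
    conflag: bool = False  # occupancy flag CONFLAG
    sysfail: bool = False  # system fault signal SYSFAIL
    fcfail: bool = False   # network communication fault signal FCFAIL


def hand_over(master: Controller, standby: Controller) -> Controller:
    """Return the controller that holds the master role after this check."""
    if master.sysfail or master.fcfail:
        master.conflag = False   # assumption: the old master releases the flag
        standby.conflag = True   # the new master marks the role as occupied
        return standby
    return master


# Power-on defaults: slot 1 is the master and holds CONFLAG.
c1 = Controller(slot=1, conflag=True)
c2 = Controller(slot=2)

c1.sysfail = True                # e.g. a watchdog timeout on controller 1
active = hand_over(c1, c2)
print(active.slot, c1.conflag, c2.conflag)
```

Because the handover depends only on two signal levels and one flag, it needs no software-level fault diagnosis at the moment of switching, which is the efficiency argument the description makes.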
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equally replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application, and are intended to be included within the scope of the present application.
Claims (8)
1. A method for determining and switching the master and slave controllers of a distributed processing system, characterized by comprising the following steps:
providing two system controllers in the distributed processing system, wherein the two system controllers work in a 1+1 hot backup mode, one system controller being the master controller and the other the slave controller;
configuring the initial states of the two system controllers, and setting a default master controller and the occupancy flag of the master controller;
defining control signals for controlling master/slave controller switching, and configuring a master/slave switching strategy corresponding to the different control signal states; during normal operation in the 1+1 hot backup mode, the two system controllers are governed by the control signals and the switching strategy.
2. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein configuring the initial states of the two system controllers and setting the default master controller and the occupancy flag of the master controller comprises:
when the distributed processing system starts, first initializing the system hardware and setting the initial state of both system controllers to the standby working mode;
the system performs a power-on test and records the result; the system reads the slot identification number of each system controller, sets the system controller with slot identification number 1 as the default master controller, and sets the system controller with slot identification number 2 as the slave controller; the master controller sets its occupancy flag to valid.
3. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein the control signals for controlling master/slave controller switching are a system fault signal and a network communication fault signal, each of which has distinct signal states.
4. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein the system fault signal and the network communication fault signal each have two states: valid and invalid.
5. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein when the system controller with slot identification number 1 is acting as the master controller and its system fault signal or network communication fault signal becomes valid, the system controller with slot identification number 2 takes over as the master controller, and the occupancy flag of the system controller with slot identification number 2 becomes valid;
when the network communication fault signals of both system controllers are valid, the distributed processing system enters an emergency working state in which only the two system controllers work;
when the system fault signals of both system controllers are valid, the distributed processing system enters a failure state.
6. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein:
the system fault signal becomes valid when a watchdog timeout fault or a system-management-layer software fault occurs in the distributed processing system; at all other times it is invalid;
the network communication fault signal becomes valid when a network communication fault occurs in the distributed processing system.
7. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein the method is stored in a memory of a computer in the form of a computer program; the computer comprises a processor, a memory, and a computer program which, when executed by the processor, performs the steps of the method according to any one of claims 1 to 6.
8. The method for determining and switching the master and slave controllers of a distributed processing system according to claim 1, wherein the method is stored in a computer-readable storage medium in the form of a computer program; the computer program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110243660.2A CN113050407B (en) | 2021-03-04 | 2021-03-04 | Method for determining and switching master controller and slave controller of distributed processing system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113050407A (en) | 2021-06-29 |
CN113050407B (en) | 2022-11-22 |
Family
ID=76510225
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202110243660.2A, CN113050407B (en) | Active | 2021-03-04 | 2021-03-04 | Method for determining and switching master controller and slave controller of distributed processing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113050407B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114484766A (en) * | 2021-12-21 | 2022-05-13 | 珠海格力电器股份有限公司 | Method for determining master controller and related equipment |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20020064509A (en) * | 2001-02-02 | 2002-08-09 | 두산중공업 주식회사 | Hot back-up device for double excitation system |
CN1968075A (en) * | 2006-05-23 | 2007-05-23 | 华为技术有限公司 | Distributed hot-standby logic device and primary/standby board setting method |
CN101030073A (en) * | 2007-03-30 | 2007-09-05 | 哈尔滨工程大学 | Switch circuit for engine redundant electrically-controlled system and its controlling method |
CN101430550A (en) * | 2007-03-30 | 2009-05-13 | 哈尔滨工程大学 | Switch control method of engine redundancy electric-control system |
CN102541697A (en) * | 2010-12-31 | 2012-07-04 | 中国航空工业集团公司第六三一研究所 | Switching method for processing fault of dual-redundancy computer |
CN102799104A (en) * | 2012-07-02 | 2012-11-28 | 浙江正泰中自控制工程有限公司 | Safety control redundant system and method for fully-intelligent master control system |
CN106444685A (en) * | 2016-12-06 | 2017-02-22 | 中国船舶重工集团公司第七〇九研究所 | Distributed control system and method of distributed control system for dynamic scheduling resources |
CN107733684A (en) * | 2017-08-31 | 2018-02-23 | 北京宇航系统工程研究所 | A kind of multi-controller computing redundancy cluster based on Loongson processor |
CN108803560A (en) * | 2018-05-03 | 2018-11-13 | 南京航空航天大学 | Synthesization DC solid-state power controller and failure decision diagnostic method |
CN110677282A (en) * | 2019-09-23 | 2020-01-10 | 天津津航计算技术研究所 | Hot backup method of distributed system and distributed system |
CN112130448A (en) * | 2020-09-25 | 2020-12-25 | 北京交大思诺科技股份有限公司 | Method for switching between main and standby machines |
Also Published As
Publication number | Publication date |
---|---|
CN113050407B (en) | 2022-11-22 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20190303255A1 (en) | Cluster availability management | |
US8032786B2 (en) | Information-processing equipment and system therefor with switching control for switchover operation | |
CN110427283B (en) | Dual-redundancy fuel management computer system | |
CN103853622A (en) | Control method of dual redundancies capable of being backed up mutually | |
US20040177242A1 (en) | Dynamic computer system reset architecture | |
CN113050407B (en) | Method for determining and switching master controller and slave controller of distributed processing system | |
CN114337944B (en) | System-level main/standby redundancy general control method | |
CN101557307B (en) | Dispatch automation system application state management method | |
CN110764829B (en) | Multi-path server CPU isolation method and system | |
JP5285045B2 (en) | Failure recovery method, server and program in virtual environment | |
CN101686261A (en) | RAC-based redundant server system | |
CN101770211B (en) | Vehicle integrated data processing method capable of realizing real-time failure switching | |
JP5285044B2 (en) | Cluster system recovery method, server, and program | |
CN103297279A (en) | Switching method of main and backup single disks of software control in multi-software process system | |
JP2008152552A (en) | Computer system and failure information management method | |
CN110677288A (en) | Edge computing system and method generally used for multi-scene deployment | |
JP2014048933A (en) | Plant monitoring system, plant monitoring method, and plant monitoring program | |
CN114138567A (en) | Substrate management control module maintenance method, device, equipment and storage medium | |
CN110752955A (en) | Seat invariant fault migration system and method | |
CN113742165B (en) | Dual master control equipment and master-slave control method | |
JP5913003B2 (en) | Computer control apparatus, method and program | |
KR0168947B1 (en) | Method for booting node without disk in real-time distributing system | |
CN116881053B (en) | Data processing method, exchange board, data processing system and data processing device | |
CN113741248B (en) | Edge calculation controller and control system | |
JPH10133963A (en) | Fault detecting and recovering system for computer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||