CN116232864A

CN116232864A - Multi-machine hot backup method and system for a network system based on event control symbols

Info

Publication number: CN116232864A
Application number: CN202310491209.1A
Authority: CN
Inventors: 朱珂; 陈培岩; 张明伟; 常超; 张波; 肖峰; 闻亮; 毛英杰; 徐涛; 高庆
Original assignee: Jingxin Microelectronics Technology Tianjin Co Ltd
Current assignee: Jingxin Microelectronics Technology Tianjin Co Ltd
Priority date: 2023-05-05
Filing date: 2023-05-05
Publication date: 2023-06-06
Anticipated expiration: 2043-05-05
Also published as: CN116232864B

Abstract

The invention belongs to the technical field of network data processing and digital information transmission, and in particular relates to a multi-machine hot backup method and system for a network system based on event control symbols, comprising a normal working mode and an abnormal working mode.

Description

Multi-machine hot backup method and system for a network system based on event control symbols

Technical Field

The invention belongs to the technical field of network data processing and digital information transmission, and in particular relates to a multi-machine hot backup method and system for a network system based on event control symbols.

Background

RapidIO is a high-performance, low-pin-count, packet-switched interconnect architecture. For high-performance embedded communication systems, the RapidIO protocol offers high bandwidth, low latency, high flexibility and high reliability, which makes it a preferred choice among embedded interconnect technologies. A RapidIO network typically includes endpoint devices (PE, Processing Element), which generate, send and process packets, and switching devices (SWITCH), which receive and forward them. One of the endpoint devices generally serves as the host node, which performs network maintenance work such as initial enumeration, route deployment and fault management of the RapidIO network.

From the reliability point of view, when the host itself or its connection to the RapidIO network fails, a hot backup mechanism is needed to keep the RapidIO network operating normally. The mainstream hot backup system today is the dual-machine hot backup system, which comprises a host and a standby machine: when the host fails, the standby machine takes over the host's role in time, so that the services and management of the RapidIO network do not go out of control. Common implementations rely on a third-party arbitration mechanism, or realize heartbeat communication between the host and the standby machine through external hardware or through RapidIO messages. These methods have obvious drawbacks. A scheme that relies on third-party arbitration depends entirely on the reliability of the arbiter, so the robustness of the system itself is not improved. Heartbeat communication through external hardware increases hardware cost and requires a hardware path between the host and the standby machine, which greatly restricts the form of the network topology. Heartbeat communication through RapidIO messages occupies channel resources in the network, easily causes routing-configuration conflicts, and makes the forwarding priority of data packets hard to guarantee.

Given the characteristics of RapidIO networks and actual application scenarios, a dual-machine hot backup system still cannot provide sufficient reliability in a RapidIO network, especially when the host takes part in service interaction and frequently joins and leaves the network dynamically, or when, under certain extreme conditions, the host and the standby machine fail in succession. The problem can be addressed by increasing the number of standby machines, that is, by pairing one host with several standby machines. With the existing techniques, however, this is likely to bring higher cost and system complexity: either a large amount of external hardware is needed, which makes the RapidIO network topology more rigid, or more channel resources are occupied and the routing configuration becomes more complex.

The prior art therefore lacks a low-cost, reliable hot backup mechanism with multiple standby machines, which results in poor fault tolerance of the network system, high complexity, and disruption of data-service interaction.

Disclosure of Invention

The invention provides a multi-machine hot backup method and system for a network system based on event control symbols, to solve the problem described in the Background: the prior art lacks a low-cost, reliable hot backup mechanism with multiple standby machines, which results in poor fault tolerance of the network system, high complexity, and disruption of data-service interaction.

The technical problem is solved by the following technical scheme. The multi-machine hot backup method for a network system based on event control symbols comprises:

a multi-machine network with one host and multiple standby machines, based on the RapidIO packet-switched interconnect architecture;

normal operation mode: once network enumeration has finished, the current host selects and wakes up a standby machine, which becomes the first working standby machine, and establishes heartbeat communication with it through multicast-event control symbols;

abnormal operation mode: if the host in communication fails, the first working standby machine takes over from the current host and becomes the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols.

Further, the normal operation mode further includes:

in the initial stage of the system, the current host detects and discovers all standby machines in the multi-machine network by initiating a network enumeration operation, and establishes RapidIO paths between the current host and all the standby machines.

Further, the normal operation mode further includes:

the most suitable standby machine is selected and woken up according to the topology of the multi-machine network and the physical performance of each standby machine, considered together, and is designated the first working standby machine.

Further, the normal operation mode further includes:

if the standby machine has been woken up, the complete heartbeat communication path between the current host and the first working standby machine is determined; the switching devices on that path and the first working standby machine are configured one by one through maintenance packets; transmission of multicast-event control symbols is enabled on the port leading to the first working standby machine; and a multicast-event control symbol transmission path between the current host and the first working standby machine is thereby established;

if the standby machine has not been woken up, the current host reselects and wakes up a standby machine.

Further, the normal operation mode further includes:

based on the control transmission period, the current host sends multicast-event control symbols to the first working standby machine;

if the first working standby machine receives the first multicast-event control symbol, it starts the first control transmission timing and periodically checks for multicast-event control symbols;

if the first working standby machine receives the second multicast-event control symbol, it starts the second control transmission timing and periodically checks for multicast-event control symbols;

if the first working standby machine receives the third multicast-event control symbol, it starts the third control transmission timing and periodically checks for multicast-event control symbols;

and so on;

if the first working standby machine receives the Nth multicast-event control symbol, it starts the Nth control transmission timing;

the control transmission timings are averaged to form the average transmission timing, which is taken as the host heartbeat period, namely:

$$Ta = \frac{Ta_1 + Ta_2 + Ta_3 + \cdots + Ta_N}{N}$$

where $Ta_1$ is the first control transmission timing;

$Ta_2$ is the second control transmission timing;

$Ta_3$ is the third control transmission timing;

$Ta_N$ is the Nth control transmission timing;

and N is the number of control transmission timings.
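The averaging step can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: it assumes each control transmission timing is the interval between two consecutive symbol arrivals at the standby machine, and the function name `average_heartbeat_period` is hypothetical.

```python
from typing import Sequence


def average_heartbeat_period(arrival_times: Sequence[float]) -> float:
    """Return Ta, the arithmetic mean of the intervals between arrivals.

    arrival_times holds the times T0, T1, ..., TN at which the standby
    machine received successive multicast-event control symbols.
    Assumption: the i-th control transmission timing is T(i) - T(i-1).
    """
    if len(arrival_times) < 2:
        raise ValueError("at least two arrivals are needed to measure one interval")
    intervals = [
        later - earlier
        for earlier, later in zip(arrival_times, arrival_times[1:])
    ]
    return sum(intervals) / len(intervals)
```

With four arrival times (T0 to T3) the function averages three intervals, which matches the three-sample case used in the embodiment.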

Further, the normal operation mode further includes:

average transmission timing threshold function:

$$Tab_{+} = Ta + Tg, \qquad Tab_{-} = Ta - Tg$$

where $Tab_{+}$ is the upper threshold of the average transmission timing;

$Tab_{-}$ is the lower threshold of the average transmission timing;

Ta is the average transmission timing;

and Tg is the transmission timing error, whose value is determined according to the network transmission rate.
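A small Python sketch of the threshold window follows. It assumes the thresholds are placed symmetrically about the average, Tab₊ = Ta + Tg and Tab₋ = Ta − Tg, and that an interval inside the window counts as a normal heartbeat; both the symmetric form and the function names are assumptions made for illustration.

```python
from typing import Tuple


def timing_window(ta: float, tg: float) -> Tuple[float, float]:
    """Return (Tab-, Tab+), assuming a symmetric error margin Tg around Ta."""
    if tg < 0:
        raise ValueError("the timing error Tg must be non-negative")
    return ta - tg, ta + tg


def interval_in_window(interval: float, ta: float, tg: float) -> bool:
    """True if a measured interval lies within [Tab-, Tab+]."""
    lower, upper = timing_window(ta, tg)
    return lower <= interval <= upper
```

A larger Tg tolerates more jitter at the cost of slower detection, which is why the text ties its value to the network transmission rate.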

Further, the normal operation mode further includes:

heartbeat loss judgment function:

；

Further, the normal operation mode further includes:

if the first working standby machine receives the first multicast-event control symbol, it records the first fault transmission time;

if the first working standby machine receives the second multicast-event control symbol, it records the second fault transmission time;

the difference between the first and second fault transmission times forms the transmission interval timing, which is taken as the fault interval period.

Further, the normal operation mode further includes:

transmission interval timing function:

$$QT = T_2 - T_1$$

where QT is the transmission interval timing;

$T_2$ is the second fault transmission time record;

$T_1$ is the first fault transmission time record;

host fault threshold function:

$$TAB_{+} = TA + TG, \qquad TAB_{-} = TA - TG$$

where $TAB_{+}$ is the upper limit of the fault interval period threshold;

$TAB_{-}$ is the lower limit of the fault interval period threshold;

TA is the average interval timing;

TG is the fault interval error, whose value is determined according to the network transmission rate;

host fault determination function:

；

The invention also provides a multi-machine hot backup system for a network system based on event control symbols, comprising a multi-machine hot backup platform that implements the multi-machine hot backup method; the platform comprises a normal working module and an abnormal working module;

the normal working module is used for: once network enumeration has finished, having the current host select and wake up a standby machine, which becomes the first working standby machine, and establishing heartbeat communication through multicast-event control symbols;

the abnormal working module is used for: if the host in communication fails, having the first working standby machine take over from the current host and become the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols.

The beneficial technical effects are as follows:

The scheme adopts a multi-machine network with one host and multiple standby machines, based on the RapidIO packet-switched interconnect architecture. In the normal operation mode, once network enumeration has finished, the current host selects and wakes up a standby machine, which becomes the first working standby machine, and establishes heartbeat communication through multicast-event control symbols. In the abnormal operation mode, if the host in communication fails, the first working standby machine takes over from the current host and becomes the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols. In other words, when the host fails, the current standby machine takes over the network as the new host and wakes up a dormant standby machine to establish new host/standby heartbeat communication. This realizes a reliable multi-standby hot backup mechanism at low cost and greatly improves the fault tolerance of the RapidIO network system, without increasing system complexity or affecting data-service interaction.

Drawings

To illustrate the embodiments of the invention or the prior-art technical solutions more clearly, the drawings needed in their description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a general flow chart of a multi-machine hot standby method of the present invention;

FIG. 2 is a main flow chart of the multi-machine hot standby method of the present invention;

FIG. 3 is a flow chart of a multi-machine hot standby method of the present invention;

FIG. 4 is a schematic structural diagram of Embodiment one of the present invention.

Description of the embodiments

The invention is further described below with reference to the accompanying drawings:

In the figures:

S101: normal operation mode;

S102: abnormal operation mode;

S1001: a multi-machine network with one host and multiple standby machines, based on the RapidIO packet-switched interconnect architecture;

S1002: once network enumeration has finished, the current host selects and wakes up a standby machine, which becomes the first working standby machine, and establishes heartbeat communication through multicast-event control symbols;

S1003: if the host in communication fails, the first working standby machine takes over from the current host and becomes the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols.

Examples

This embodiment:

the multi-machine hot backup method for a network system based on event control symbols comprises:

a multi-machine network with one host and multiple standby machines, based on the RapidIO packet-switched interconnect architecture (S1001);

normal operation mode S101: once network enumeration has finished, the current host selects and wakes up a standby machine, which becomes the first working standby machine, and establishes heartbeat communication through multicast-event control symbols (S1002);

abnormal operation mode S102: if the host in communication fails, the first working standby machine takes over from the current host and becomes the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols (S1003).

The method adopts a multi-machine network with one host and multiple standby machines, based on the RapidIO packet-switched interconnect architecture. In the normal operation mode, once network enumeration has finished, the current host selects and wakes up a standby machine, which becomes the first working standby machine, and establishes heartbeat communication through multicast-event control symbols. In the abnormal operation mode, if the host in communication fails, the first working standby machine takes over from the current host and becomes the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols. As a result, when the host fails, the current standby machine takes over the network as the new host and wakes up a dormant standby machine to establish new host/standby heartbeat communication. This realizes a reliable multi-standby hot backup mechanism at low cost and greatly improves the fault tolerance of the RapidIO network system, without increasing system complexity or affecting data-service interaction.

The normal operation mode S101 further includes:

The normal operation mode further comprises: in the initial stage of the system, the current host detects and discovers all standby machines in the multi-machine network by initiating a network enumeration operation, and establishes RapidIO paths between the current host and all the standby machines. One of the endpoint devices is selected as the master processing node, which is responsible for the initial enumeration, configuration deployment, fault management and so on of the whole RapidIO network. In the initial stage of the system, the host initiates network enumeration to detect and discover all devices in the RapidIO network, at which point RapidIO paths between the host and all switching devices and endpoint devices are established.

The normal operation mode S101 further includes:

The normal operation mode further comprises: the most suitable standby machine is selected and woken up according to the topology of the multi-machine network and the physical performance of each standby machine, considered together, and is designated the first working standby machine. Regarding selection and wake-up: the selection rule is not specifically limited; any endpoint device with network management capability qualifies. In an actual network the choice can be made by weighing the topology and the physical performance of the endpoint devices. The topology criterion can be the number of switches between the standby machine and the host: the fewer switches, the higher the priority. The physical-performance criterion can be the strength of the endpoint device's network management capability: the stronger the capability, the higher the priority. The host wakes up the standby machine immediately after selection; if the wake-up fails, selection and wake-up are performed again.
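The selection-and-retry rule can be sketched as below. The text leaves the rule open, so the ordering used here (fewer switches first, then stronger management capability as a tie-breaker) is only one possible choice, and `Candidate`, `rank_candidates`, `select_and_wake` and the `try_wake` callback are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    switch_hops: int       # number of switches between host and candidate
    mgmt_capability: int   # higher means stronger network management capability


def rank_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Order candidates: fewer switch hops first, then stronger capability."""
    return sorted(candidates, key=lambda c: (c.switch_hops, -c.mgmt_capability))


def select_and_wake(
    candidates: Iterable[Candidate],
    try_wake: Callable[[Candidate], bool],
) -> Optional[Candidate]:
    """Try candidates in priority order; return the first one that wakes up."""
    for candidate in rank_candidates(candidates):
        if try_wake(candidate):
            return candidate
    return None  # no standby machine could be woken
```

Passing the wake-up operation in as a callback keeps the ranking logic independent of how a real system would signal a dormant endpoint.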

The normal operation mode S101 further includes:

The normal operation mode further comprises: if the standby machine has been woken up, the complete heartbeat communication path between the current host and the first working standby machine is determined; the switching devices on that path and the first working standby machine are configured one by one through maintenance packets; transmission of multicast-event control symbols is enabled on the port leading to the first working standby machine; and a multicast-event control symbol transmission path between the current host and the first working standby machine is thereby established. If the standby machine has not been woken up, the current host reselects and wakes up a standby machine. After the standby machine has been woken up, the complete heartbeat communication path P between the host and the standby machine is established; P consists mainly of the intermediate switching devices on the path and their corresponding ports. Starting from the switch directly connected to the host, the switching devices contained in P are then configured one by one through maintenance packets: on each switch, multicast-event control symbol transmission is enabled on the port connected to the next switch (or to the standby machine). This establishes the multicast-event control symbol transmission path between the host and the standby machine. Owing to the nature of multicast-event control symbols, the host and the standby machine can take part in specific data-service transmission as needed while performing network management, which effectively improves network throughput and the utilization of system resources.
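The hop-by-hop configuration can be sketched as follows. The path is modelled as a list of (switch, ingress port, egress port) entries in host-to-standby order, and multicast-event control symbol forwarding is enabled on each egress port. The callback `enable_mecs_forwarding` is a hypothetical stand-in for the maintenance-packet write; no real RapidIO driver API is implied.

```python
from typing import Callable, List, NamedTuple, Sequence, Tuple


class Hop(NamedTuple):
    switch: str
    ingress_port: int
    egress_port: int


def configure_heartbeat_path(
    path: Sequence[Hop],
    enable_mecs_forwarding: Callable[[str, int], None],
) -> List[Tuple[str, int]]:
    """Enable forwarding on the egress port of every switch along the path.

    Hops are processed in order, starting from the switch directly
    connected to the host. Returns the (switch, port) pairs configured.
    """
    configured = []
    for hop in path:
        # Hypothetical maintenance write: enable multicast-event control
        # symbol forwarding on the port facing the next switch or standby.
        enable_mecs_forwarding(hop.switch, hop.egress_port)
        configured.append((hop.switch, hop.egress_port))
    return configured
```

For a path such as `[Hop("SW1", 0, 5), Hop("SW2", 11, 1)]`, the sketch enables port 5 on SW1 and port 1 on SW2.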

The normal operation mode S101 further includes:

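the control transmission timings are averaged to form the average transmission timing, which is taken as the host heartbeat period:

$$Ta = \frac{Ta_1 + Ta_2 + Ta_3 + \cdots + Ta_N}{N}$$

where $Ta_1$ is the first control transmission timing;

$Ta_2$ is the second control transmission timing;

$Ta_3$ is the third control transmission timing;

$Ta_N$ is the Nth control transmission timing;

and N is the number of control transmission timings.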

The normal operation mode further comprises: based on the control transmission period, the current host sends multicast-event control symbols to the first working standby machine; if the first working standby machine receives the first multicast-event control symbol, it starts the first control transmission timing and periodically checks for multicast-event control symbols; if it receives the second multicast-event control symbol, it starts the second control transmission timing and periodically checks for multicast-event control symbols; if it receives the third multicast-event control symbol, it starts the third control transmission timing and periodically checks for multicast-event control symbols; the first, second and third control transmission timings are averaged to form the average transmission timing, which is taken as the host heartbeat period, and this initializes the heartbeat communication between the host and the standby machine. The host sends multicast-event control symbols to the standby machine with period T. The standby machine records the time T0 at which it receives the first multicast-event control symbol, starts the standby control program, and from then on continuously checks whether the multicast-event control symbols sent by the host arrive periodically. The standby machine records the time T1 at which it receives the second multicast-event control symbol and, likewise, T2 and T3. The arithmetic mean Ta of the time taken by the three host-to-standby transmissions corresponding to T1, T2 and T3 is recorded as the period with which the standby machine checks the host heartbeat, which completes the initial setup of heartbeat communication between the host and the standby machine.

Because network transmission may jitter, the time limit Tl for judging heartbeat loss can be set somewhat wider than Ta; its value can be customized for the specific application scenario. After a host/standby switchover, only the multicast-event control symbol transmission path between the new host and the new standby machine needs to be re-established. The configuration change to the RapidIO network as a whole is almost negligible, data-service interaction is not affected, and the impact on the existing services of the system is minimized.

The normal operation mode S101 further includes:

average transmission timing threshold function:

$$Tab_{+} = Ta + Tg, \qquad Tab_{-} = Ta - Tg$$

where $Tab_{+}$ is the upper threshold of the average transmission timing;

$Tab_{-}$ is the lower threshold of the average transmission timing;

Ta is the average transmission timing;

The normal operation mode S101 further includes:

heartbeat loss judgment function:

；

The normal operation mode S101 further includes:

The normal operation mode further comprises: based on the control transmission period, the current host sends multicast-event control symbols to the first working standby machine; if the first working standby machine receives the first multicast-event control symbol, it records the first fault transmission time; if it receives the second multicast-event control symbol, it records the second fault transmission time; the difference between the first and second fault transmission times forms the transmission interval timing, which is taken as the fault interval period. The standby machine continuously monitors the heartbeat information sent by the host and records the time interval Ti between two adjacent heartbeats. If Ti does not exceed Tl, the host is regarded as normal and the standby machine continues its monitoring loop; otherwise, when Ti is greater than Tl or the time since the last heartbeat exceeds Tl, the host is regarded as failed and the monitoring loop stops. When the host fails, the standby machine stops the monitoring loop and starts the network takeover program; its role switches from standby machine to host, and it takes over the maintenance and management of the whole RapidIO network from the original host. After taking over the network, the new host repeats step 4 to select and wake up a new standby machine and start the subsequent operations.
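The failure test described above can be sketched as a pure function over recorded arrival times, which makes it easy to check without a real clock. This is an illustrative sketch only; `host_failed` and its parameters are hypothetical names, and a real standby machine would call it from its monitoring loop with the current time.

```python
from typing import Sequence


def host_failed(arrivals: Sequence[float], now: float, tl: float) -> bool:
    """Decide whether the host is regarded as failed.

    arrivals: times at which heartbeat symbols were received, in order.
    now:      the current time on the same clock.
    tl:       the heartbeat-loss time limit Tl.

    The host is regarded as failed when an interval Ti between two
    adjacent heartbeats exceeds Tl, or when the time since the last
    heartbeat exceeds Tl.
    """
    if not arrivals:
        return False  # assumption: monitoring starts with the first symbol
    for earlier, later in zip(arrivals, arrivals[1:]):
        if later - earlier > tl:
            return True
    return now - arrivals[-1] > tl
```

Checking the time since the last heartbeat, not only the gap between received heartbeats, is what lets the standby machine detect a host that has gone completely silent.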

The normal operation mode S101 further includes:

transmission interval timing function:

$$QT = T_2 - T_1$$

where QT is the transmission interval timing;

$T_2$ is the second fault transmission time record;

$T_1$ is the first fault transmission time record;

host fault threshold function:

$$TAB_{+} = TA + TG, \qquad TAB_{-} = TA - TG$$

where $TAB_{+}$ is the upper limit of the fault interval period threshold;

$TAB_{-}$ is the lower limit of the fault interval period threshold;

TA is the average interval timing;

host fault determination function:

；

The invention also provides a multi-machine hot backup system for a network system based on event control symbols, comprising a multi-machine hot backup platform that implements the multi-machine hot backup method; the platform comprises a normal working module and an abnormal working module;

the normal working module is used for: once network enumeration has finished, having the current host select and wake up a standby machine, which becomes the first working standby machine, and establishing heartbeat communication through multicast-event control symbols (S1002);

the abnormal working module is used for: if the host in communication fails, having the first working standby machine take over from the current host and become the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through multicast-event control symbols (S1003).

Because the platform provides a normal working module and an abnormal working module that carry out the normal and abnormal operation modes of the method, respectively, the system is shown to be practicable.

Embodiment one:

To highlight the implementation of this scheme, the RapidIO topology shown above does not list all endpoint devices; the endpoint devices with network management capability are the host and endpoints a, d and e. The scheme is implemented as follows:

1) Initiating network enumeration by a host to obtain a network topology comprising all endpoint devices and switching devices, wherein a network path between the host and any device in the network is provided;

2) And selecting the endpoint a as a standby machine based on the network topology obtained in the last step, and waking up the endpoint a. Planning a path between a host and an endpoint a: p= { (

SW

1,0, 5), (SW 2,11, 1) }, where all exchanges of the whole path of host to standby, and corresponding ingress and egress ports;

3) Configuring each exchange in P one by one, respectively starting multicast event controller forwarding enabling of the SW1 port 5 and the SW2 port 1, and completing the establishment of a heartbeat communication transmission path between the main machine and the standby machine;

4) Initial setting of heartbeat communication between the main machine and the standby machine is started to be executed:

a) The host sends the multicast event control symbol according to the fixed interval T cycle

b) The standby machine receives the first multicast event control symbol time T0 and starts the standby machine control program

c) The standby machine receives the second multicast event control symbol time T1, and the like, T2 and T3 are calculated, and an arithmetic average value Ta of time spent by the standby machine for transmitting the multicast event control symbol to the standby machine by the host machine corresponding to three times of T1, T2 and T3 is recorded as a period for detecting the heartbeat of the host machine by the standby machine;

5) The network transmission and service scene conditions and the like are synthesized, and a heartbeat loss judgment time limit TI=2Ta is determined;

6) So far, the host computer has ready all configuration about heartbeat communication, can start to deploy network data service, and the data interaction of the whole rapidIO network application layer is fully developed immediately; as the real-time load and link state of the network may change dynamically, the paths between the host and the network may also fail;

7) The standby machine always detects heartbeat information sent by the host machine, records a time interval Ti adjacent to two times, and if the Ti does not exceed Tl, the standby machine is regarded as normal to the host machine and continues to wait circularly; otherwise, when Ti is greater than Tl or the time from the last heartbeat exceeds Tl, the host computer is regarded as fault, and the cycle detection is stopped;

8) The standby machine starts a network take-over program, and the role finishes the switching from the standby machine to the host machine, so that the original host machine is replaced to maintain and manage the whole rapidIO network;

9) Based on the current network topology, endpoint d is selected as a standby and attempts to wake it up.

10 Endpoint d failed to wake up, reselect endpoint e as a standby, and attempt to wake it up.

11 Successfully awakening endpoint e, planning the path between the current host (endpoint a) and endpoint e: p= { (SW 2,1, 11), (

SW

1,5, 13), (SW 3,4, 15), (

SW

4,11, 4) }. Configuring the exchanges in P one by one, respectively starting the multicast event controller forwarding enabling of the SW2 port 11, the SW1 port 13, the SW3 port 15 and the SW4 port 4, and reestablishing the heartbeat communication transmission path between the main machine and the standby machine;

12) Repeat step 4 and the subsequent operations. This realizes the multi-machine hot backup method for a RapidIO network system based on multicast event control symbols described in this patent.
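The timing logic of steps 4 to 7 can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function names `calibrate` and `host_failed`, the integer microsecond timestamps and the list of arrival times are assumptions introduced here. Only the arithmetic mean Ta, the limit Tl = 2Ta and the two failure conditions come from the text.

```python
from statistics import mean


def calibrate(samples):
    """Return (Ta, Tl) from the measured times T1, T2, T3, ...

    Ta is the arithmetic mean of the samples and Tl = 2 * Ta is the
    heartbeat loss judgment time limit, as in steps 4c and 5.
    """
    if not samples:
        raise ValueError("at least one timing sample is required")
    ta = mean(samples)
    return ta, 2 * ta


def host_failed(arrivals, now, tl):
    """Apply the check of step 7 to a list of heartbeat arrival times.

    The host is regarded as faulty when any interval Ti between two
    adjacent heartbeats exceeds tl, or when the time elapsed since the
    last heartbeat exceeds tl. With no heartbeat recorded yet there is
    nothing to compare, so this sketch reports no fault (an assumption).
    """
    if not arrivals:
        return False
    intervals = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    intervals.append(now - arrivals[-1])
    return any(ti > tl for ti in intervals)


if __name__ == "__main__":
    # Hypothetical measurements in microseconds.
    ta, tl = calibrate([100, 110, 120])
    print("Ta =", ta, "Tl =", tl)
    print("fault:", host_failed([0, 110, 220], now=500, tl=tl))
```

Using a multiple of Ta as the limit lets the standby machine tolerate ordinary jitter in heartbeat arrival while still reacting within about two heartbeat periods.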

Working principle:

The scheme adopts a multi-machine network with one host and multiple standby machines, built on the RapidIO packet-switched interconnect architecture. Normal operation mode: once network enumeration is finished, the current host selects and wakes up a standby machine, which becomes the first working standby machine, and establishes heartbeat communication with it through the multicast event control symbol. Abnormal operation mode: if the host fails during communication, the first working standby machine takes over from the current host and becomes the working host; the working host selects and wakes up another standby machine, which becomes the second working standby machine, and heartbeat communication is re-established through the multicast event control symbol. In short, when the host fails, the current standby machine takes over the network as the new host and wakes up a dormant standby machine to create new host-standby heartbeat communication.
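The takeover sequence of steps 8 to 11 might be sketched as below. The callbacks `wake`, `plan_path` and `enable_port` are hypothetical stand-ins for the real RapidIO maintenance operations, which the text does not specify. Treating the third element of each tuple in P as the port on which forwarding is enabled is an assumption inferred from step 11.

```python
def select_standby(candidates, wake):
    """Return the first candidate endpoint that wakes up, or None."""
    for endpoint in candidates:
        if wake(endpoint):
            return endpoint
    return None


def enable_heartbeat_path(path, enable_port):
    """Enable multicast event control symbol forwarding hop by hop.

    Each hop is a tuple (switch, other_port, forward_port). Only the
    switch and the last element are used here; the meaning of the middle
    element is not stated in this section, so it is ignored.
    """
    for switch, _other_port, forward_port in path:
        enable_port(switch, forward_port)


def take_over(candidates, wake, plan_path, enable_port):
    """New host: pick a standby, then rebuild the heartbeat path to it."""
    standby = select_standby(candidates, wake)
    if standby is None:
        return None  # no standby could be woken; no path is configured
    enable_heartbeat_path(plan_path(standby), enable_port)
    return standby


if __name__ == "__main__":
    # Values taken from steps 9 to 11: endpoint d fails to wake, e succeeds.
    P = [("SW2", 1, 11), ("SW1", 5, 13), ("SW3", 4, 15), ("SW4", 11, 4)]
    configured = []
    chosen = take_over(
        ["d", "e"],
        wake=lambda endpoint: endpoint == "e",
        plan_path=lambda endpoint: P,
        enable_port=lambda switch, port: configured.append((switch, port)),
    )
    print(chosen, configured)
```

Passing the hardware operations in as callbacks keeps the selection and path logic testable without a RapidIO fabric.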

It should be noted herein that any process or method descriptions that are otherwise described may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and that scope of preferred embodiments of the present invention includes additional implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention. The processor performs the various methods and processes described above. For example, method embodiments in the present solution may be implemented as a software program tangibly embodied on a machine-readable medium, such as a memory. In some embodiments, part or all of the software program may be loaded and/or installed via memory and/or a communication interface. One or more of the steps of the methods described above may be performed when a software program is loaded into memory and executed by a processor. Alternatively, in other embodiments, the processor may be configured to perform one of the methods described above in any other suitable manner (e.g., by means of firmware).

The logic and/or steps described elsewhere herein may be embodied in any readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention.

Claims

1. The multi-machine hot backup method of the network system based on the event controller is characterized by comprising the following steps:

2. The multi-machine hot standby method according to claim 1, wherein the normal operation mode further comprises:

3. The multi-machine hot standby method according to claim 1, wherein the normal operation mode further comprises:

4. A multi-machine hot standby method according to claim 3, wherein said normal operating mode further comprises:

5. The multi-machine hot standby method according to claim 4, wherein the normal operation mode further comprises:

and so on;

；

Ta_1 is the timing of the first control transmission;

Ta_2 is the timing of the second control transmission;

Ta_3 is the timing of the third control transmission;

Ta_N is the timing of the Nth control transmission;

and N is the number of timed control transmissions.

6. The multi-machine hot standby method according to claim 5, wherein the normal operation mode further comprises:

average transmit timing threshold function:

；

Tab_+ is the upper threshold of the average transmission timing;

Tab_- is the lower threshold of the average transmission timing;

Ta is the average transmission timing;

7. The multi-machine hot standby method according to claim 6, wherein the normal operation mode further comprises:

heartbeat loss judgment function:

。

8. The multi-machine hot standby method according to claim 4, wherein the normal operation mode further comprises:

9. The multi-machine hot standby method according to claim 8, wherein the normal operation mode further comprises:

transmission interval timing function:

QT = T2 - T1;

QT is the transmission interval timing;

T2 is the second fault sending time record;

T1 is the first fault sending time record;

host fault threshold function:

；

TAB_+ is the upper limit of the fault interval period threshold;

TAB_- is the lower limit of the fault interval period threshold;

TA is the average interval timing;

host fault determination function:

。

10. A multi-machine hot backup system for a network system based on the event controller, characterized by comprising a multi-machine hot backup platform for implementing the multi-machine hot backup method according to any one of claims 1 to 9, wherein the multi-machine hot backup platform comprises a normal working module and an abnormal working module;