CN112887176A - Computer interlocking subsystem master-slave switching system based on heartbeat message - Google Patents

Computer interlocking subsystem master-slave switching system based on heartbeat message

Info

Publication number
CN112887176A
Authority
CN
China
Prior art keywords
channel
interlocking
communication server
state
communication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110463467.XA
Other languages
Chinese (zh)
Other versions
CN112887176B (en)
Inventor
吴正中
郑刚
汪永刚
张涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Urban Construction Intelligent Control Technology Co.,Ltd.
Original Assignee
Beijing Urban Construction Intelligent Control Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Urban Construction Intelligent Control Technology Co ltd
Priority to CN202110463467.XA
Publication of CN112887176A
Application granted
Publication of CN112887176B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention relates to a computer interlocking subsystem master-slave switching system based on heartbeat messages, which comprises: communication servers, comprising a communication server A and a communication server B; channels, comprising an A channel and a B channel, each of which is connected with both the communication server A and the communication server B; and interlocking units, comprising a first interlocking unit and a second interlocking unit belonging to the A channel, and a third interlocking unit and a fourth interlocking unit belonging to the B channel. Device state information is transmitted between the interlocking units and the channels through heartbeat messages exchanged between the devices. By designing a dedicated heartbeat message, the system implements the main-standby decision and the main-standby switching of the interlocking subsystem and locates faults quickly and accurately when a unit of the system suffers a network communication fault, thereby reducing the overall hardware cost of the interlocking system and improving both the switching efficiency and the communication fault location efficiency.

Description

Computer interlocking subsystem master-slave switching system based on heartbeat message
Technical Field
The invention relates to a master-slave switching system of an interlocking subsystem, belongs to the technical field of communication, and particularly relates to a master-slave switching system of a computer interlocking subsystem based on heartbeat messages.
Background
The two-by-two-out-of-two redundancy structure offers high safety, good reliability and low manufacturing cost, and is widely used in rail transit computer interlocking. At present, the initial determination of the main-standby relationship and the main-standby switching mechanism in the interlocking subsystem are implemented in two main ways: purely in hardware, or by a combination of software and hardware.
In the hardware approach, dedicated hardware completes the interlocking main-standby determination and switching. The cost and the technical difficulty are high, and the functions are highly concentrated in a hardware scheduling unit: once that unit fails, the whole system fails.
In the combined software-hardware approach, a network device in the interlocking system serves as the communication carrier for arbitration, and the main-standby decision and switching are completed mainly in software. This approach generally has the following problems: (1) The main-standby determination at first power-on initialization is problematic. The original interlocking system usually determines the main-standby relationship from the power-on sequence, which cannot resolve the conflict that arises when the devices are powered on simultaneously. Powering on manually in sequence avoids simultaneous power-on, but violates the design principle that the main-standby relationship is determined randomly at first power-on. (2) During system operation, main-standby switching usually takes more than 500 ms. (3) During system operation, once a device fails or a communication network port becomes abnormal, the abnormal point cannot be located accurately for lack of the necessary software support.
Improving the prior-art main-standby switching system to solve the above problems is therefore a technical problem in urgent need of a solution.
Disclosure of Invention
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
The invention mainly aims to solve the technical problems in the prior art and provides a computer interlocking subsystem master-slave switching system based on heartbeat messages. By designing a dedicated heartbeat message, the system solves the problems of main-standby decision and main-standby switching of the interlocking subsystem, reduces the overall hardware cost of the interlocking system and improves the system switching efficiency.
In order to solve the problems, the scheme of the invention is as follows:
a computer interlocking subsystem master-slave switching system based on heartbeat messages comprises:
the communication server comprises a communication server A and a communication server B;
the channel comprises an A channel and a B channel; the channel A is connected with a communication server A and a communication server B; the channel B is connected with the communication server A and the communication server B;
the interlocking unit comprises a first interlocking unit and a second interlocking unit belonging to the channel A, and a third interlocking unit and a fourth interlocking unit belonging to the channel B;
the interlocking units and the channels communicate through heartbeat messages between the devices to transmit device state information; each heartbeat message records not only the state information of the sending device but also the state information of the other devices in the system; and each interlocking unit modifies its own device state description field in the heartbeat message according to its own running state, and updates the fields corresponding to other devices in the heartbeat message according to the heartbeat data received from those devices.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, when an interlocking unit or a communication server finds that the heartbeats of all other devices have timed out, it provisionally judges that its own communication is abnormal; when all other devices are still timed out in the next communication time sequence, the judgment is confirmed.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, during power-on the communication server A and the communication server B each send their own power-on heartbeat message while receiving the heartbeat messages of other devices, and decide from the received heartbeat messages whether to enter the host state or the standby state.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, when the communication server A and the communication server B enter the power-on state and the standby state at the same time, each of the two machines takes a random number and waits, for that number of communication cycles, for the other machine to become the host; if the other machine has not set itself as the host within those cycles, the waiting machine sets itself as the host.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, the standby machine among the communication server A and the communication server B checks, in each period, the heartbeat messages sent from the interlocking unit CPUs, extracts the state of the communication server host from them and follows that state; when the heartbeat of the communication server host times out or reports an abnormality, the standby machine starts the host-standby switching process, and it switches to the host state after the second time sequence confirms the observed state of the original communication server host.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, after heartbeat messages begin to be sent and received between each channel and the communication server, the interlocking unit receives a state setting instruction from the communication host; only after the interlocking unit of a channel has entered the following state, the standby state or the host state does it exchange heartbeat messages with the other channel.
Preferably, in the above system for switching between the main and standby computer interlocking subsystems based on heartbeat messages, when only one interlocking channel can normally perform heartbeat transceiving, the channel is determined to be the main channel; when both the two interlocking channels can normally perform heartbeat transceiving, the communication server host randomly selects one channel as a main channel, and the other channel enters a following mode.
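The selection rule in the preceding paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `select_main_channel`, the channel labels "A" and "B", and the `alive` dictionary are hypothetical and do not appear in the patent.

```python
import random


def select_main_channel(alive, rng=random):
    """Pick the main channel on the communication server host.

    alive: dict such as {"A": True, "B": False}, meaning whether each
           interlocking channel currently exchanges heartbeats normally
           (hypothetical representation).
    Returns (main, following); either element may be None.
    """
    ready = [ch for ch in ("A", "B") if alive.get(ch)]
    if not ready:
        return None, None            # no channel is usable yet
    if len(ready) == 1:
        return ready[0], None        # the only live channel becomes main
    main = rng.choice(ready)         # both live: choose one at random
    following = "B" if main == "A" else "A"
    return main, following           # the other channel enters following mode
```

Passing a seeded `random.Random` instance as `rng` makes the random choice reproducible in tests.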
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, when a channel in the standby mode finds its processing-result comparison inconsistent for a preset number of consecutive times, it marks its abnormal state in the heartbeat message, applies to return to the following mode, receives the state information and processing result information sent by the main channel, and follows the main channel again.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, the main-standby switching of the interlocking channels includes a passive abnormal mode. In this mode, in a first time sequence the communication server host finds that the heartbeat messages of the interlocking main channel have timed out, and it can confirm the event after receiving the heartbeat messages of all other devices in the system. In a second time sequence, after confirming that the heartbeat messages of the interlocking main channel have timed out, the communication server host immediately notifies the interlocking standby channel, through the heartbeat message, to enter the main channel mode. If at that point the other interlocking channel has not entered the standby mode, the communication server host waits 2 time sequence periods; if the other channel still does not meet the condition, the abnormality is reported through the log module and the states of the two interlocking channels and of the processing units at that moment are recorded.
Preferably, in the above computer interlocking subsystem master-slave switching system based on heartbeat messages, the main-standby switching of the interlocking channels includes an active abnormal mode. In this mode, in a first time sequence the communication server host finds the abnormal flag bit set in the heartbeat message of the interlocking main channel, and it can confirm the event after receiving the heartbeat messages of all other devices in the system. In a second time sequence, after confirming that the interlocking main channel continues to send abnormal heartbeat messages, the communication server host immediately notifies the interlocking standby channel, through the heartbeat message, to enter the main channel mode. If at that point the other interlocking channel has not entered the standby mode, the communication server host waits 2 time sequence periods; if the other channel still does not meet the condition, the abnormality is reported through the log module and the states of the two interlocking channels and of the processing units at that moment are recorded. When an interlocking unit can detect its own abnormal state, it sets its own state to abnormal and waits for a command from the communication server.
Therefore, compared with the prior art, the invention has the following advantages: by designing a dedicated heartbeat message, the invention solves the problems of main-standby decision and main-standby switching of the interlocking subsystem, reduces the overall hardware cost of the interlocking system and improves the system switching efficiency.
Drawings
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the disclosure.
FIG. 1 illustrates a system framework diagram in an embodiment of the invention;
FIG. 2 illustrates a diagram of a conversion relationship between modes in an embodiment of the present invention;
FIG. 3 illustrates a power-up diagram of a communication server in an embodiment of the invention;
FIG. 4 illustrates a power-up diagram of an interlock processing unit in an embodiment of the present invention;
FIG. 5 illustrates a network communication architecture diagram in an embodiment of the present invention;
FIG. 6 illustrates the initialization and the sending and receiving process of a device's heartbeat message in an embodiment of the present invention;
FIG. 7 illustrates the determination of the main-standby relationship between communication servers in an embodiment of the present invention;
FIG. 8 illustrates a schematic flow chart of main-standby relationship switching between communication servers in an embodiment of the present invention.
Embodiments of the present invention will be described with reference to the accompanying drawings.
Detailed Description
Examples
Two-out-of-two system architecture
As shown in Fig. 1, the interlocking machine subsystem mainly includes a two-out-of-two A channel comprising processing units S1 and S2, and a two-out-of-two B channel comprising processing units S3 and S4. The two-out-of-two A channel, the two-out-of-two B channel, the communication server A and the communication server B are connected via a redundant Ethernet intranet (switch).
Wherein, the two-out-of-two A channel and the two-out-of-two B channel have a main-standby relationship; the communication server A and the communication server B have a master-slave relationship; the main-standby relation of the communication server A and the communication server B is determined through competition; the main-standby relation and the abnormal reset of the two-out-of-two A channel and the two-out-of-two B channel are judged by the communication server.
Working mode of two-out-of-two system
The operation mode of the above two-out-of-two system is described in detail below.
The two-out-of-two channel modes are divided into normal modes and abnormal modes. The normal modes comprise the main mode and the standby mode; the abnormal modes comprise the power-on mode, the following mode, the reset mode and the power-off mode. The basic functional requirements of each mode are as follows.
(1) Main mode: the two-out-of-two channel in the main mode performs data input, application processing and data output.
(2) Standby mode: the two-out-of-two channel in the standby mode performs all functions of the main mode except outputting the application result. In each running period it also compares its running state with that of the main channel; if the two are inconsistent, it switches to the reset mode.
(3) Power-on mode: the two-out-of-two channel enters the power-on mode after initial power-on and completes its self-check in this mode. The self-check verifies that the database version number is correct, that the protocol configuration was initialized successfully, and that the application was initialized successfully.
(4) Following mode: in the following mode, the two-out-of-two channel updates its local data according to the historical state information sent by the main channel; it can enter a normal working mode only after following succeeds.
(5) Reset mode: if a fault occurs in a normally working channel during operation, the channel enters the reset mode.
(6) Power-off mode: the power supply of the interlocking safety computer platform channel is not switched on; when the dynamic signal is abnormal or a board fails, the channel is automatically powered off and then powered on again. The conversion relationship between the modes is shown in Fig. 2.
After a two-out-of-two channel of the two-by-two-out-of-two system completes the power-on mode functions and needs to change mode, it enters the following mode if the other two-out-of-two channel is already in the main mode; otherwise it enters the main mode.
If a fault occurs in any of these modes, the channel enters the reset mode, resets its processing units and re-enters the power-on mode.
When a channel board fails or the dynamic signal is abnormal, the channel enters the power-off mode and is powered on again after a period of time. A channel in the power-off state enters the power-on mode once it is powered on again.
In the following mode, the channel switches automatically to the standby mode once state following succeeds. When one two-out-of-two channel is in the main mode and the other is in the standby mode, the main channel can be switched manually to the standby mode, and the standby channel can be switched manually to the main mode; if the main channel fails, the standby channel becomes the main channel according to the arbitration information sent by the communication server.
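The mode conversions described above can be summarized as a small transition function. The Python sketch below is an illustration only: the event names (`self_check_ok`, `follow_ok`, `board_fault` and so on) are hypothetical labels chosen for this example, and Fig. 2 of the patent remains the authoritative description.

```python
from enum import Enum


class Mode(Enum):
    POWER_OFF = "power_off"
    POWER_ON = "power_on"
    FOLLOWING = "following"
    STANDBY = "standby"
    MAIN = "main"
    RESET = "reset"


def next_mode(mode, event, peer_is_main=False):
    """Return the next mode of one two-out-of-two channel (event names are assumed)."""
    if event == "board_fault":              # board failure or abnormal dynamic signal
        return Mode.POWER_OFF
    if mode is Mode.POWER_OFF:
        return Mode.POWER_ON if event == "repowered" else mode
    if event == "fault":                    # a fault in any running mode
        return Mode.RESET
    if mode is Mode.RESET and event == "reset_done":
        return Mode.POWER_ON
    if mode is Mode.POWER_ON and event == "self_check_ok":
        # Follow the peer if it is already main; otherwise become main.
        return Mode.FOLLOWING if peer_is_main else Mode.MAIN
    if mode is Mode.FOLLOWING and event == "follow_ok":
        return Mode.STANDBY
    if mode is Mode.STANDBY and event in ("manual_switch", "arbitration_to_main"):
        return Mode.MAIN
    if mode is Mode.MAIN and event == "manual_switch":
        return Mode.STANDBY
    return mode                             # any other event leaves the mode unchanged
```

For example, `next_mode(Mode.POWER_ON, "self_check_ok", peer_is_main=True)` yields `Mode.FOLLOWING`, matching the rule that a channel follows when the other channel is already main.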
Control flow of two-out-of-two system
Fig. 3 is a schematic diagram illustrating an internal control flow of the communication processing unit.
Fig. 4 is a schematic diagram showing the internal control flow of the interlock processing unit.
Network communication architecture
Fig. 5 shows the network communication architecture designed for this embodiment. The whole network is divided into 4 layers according to the communication hierarchy:
(1) external network user layer; (2) communication server layer; (3) interlocking equipment layer; (4) controlled and signal device layer.
The devices mainly involved in main/standby switching in the interlocking system are the communication server system and the interlocking device system, that is, the devices of layers 2 and 3 above, which are the main subject of this section.
The communication hosts face the external network user operation devices and the interlocking system faces the controlled or signal devices, that is, the devices of layers 1 and 4. The communication modes and messages used with those devices can follow common schemes in the field and are outside the scope of the embodiments of the invention.
To keep the communication servers fully independent of each other, they do not establish a direct TCP network connection, have no inherent primary or secondary, and do not control each other.
Network message design
In an interlock system, messages can be divided into two categories:
(1) Normal service messages: these conform to the RSSP-I railway signal safety communication protocol and are designed to be compatible with the existing interlocking system application and safety layer protocol.
(2) Heartbeat messages between devices: these maintain the network connection while also carrying the information of each module in the network.
In the interlocking system, normal service messages are processed according to conventional designs in the field.
To set and switch the main/standby mode of each device in the interlocking system, a custom heartbeat message is used to monitor the state of each TCP link and of each device. The design follows these principles:
Messages are sent and received in units of communication time sequences, each a 100 ms time slice.
In the heartbeat message format, each device is responsible for modifying its own device state description field according to its own running state, and it fills the heartbeat data received from other devices into the corresponding fields of the heartbeat message that it sends.
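A minimal Python sketch of such a message layout follows. The patent does not specify the field encoding, so the names `Heartbeat`, `Node`, `DEVICES` and the string-valued states are assumptions made purely for illustration; the sketch also assumes that a node copies only the sender's own field from each received message.

```python
from dataclasses import dataclass, field
from typing import Dict

UNKNOWN = "unknown"

# Hypothetical identifiers: two communication servers and four interlocking units.
DEVICES = ("CS_A", "CS_B", "S1", "S2", "S3", "S4")


@dataclass
class Heartbeat:
    sender: str
    seq: int                                   # communication sequence number (100 ms slices)
    states: Dict[str, str] = field(default_factory=dict)


class Node:
    def __init__(self, name: str):
        self.name = name
        self.own_state = "power_on"
        # Last state heard from every other device, as seen by this node.
        self.peer_states = {d: UNKNOWN for d in DEVICES if d != name}

    def on_receive(self, hb: Heartbeat) -> None:
        # Assumption: only the sender's own field is copied from a received message.
        if hb.sender in self.peer_states:
            self.peer_states[hb.sender] = hb.states[hb.sender]

    def build(self, seq: int) -> Heartbeat:
        # The outgoing message carries this node's state plus its view of all peers.
        states = dict(self.peer_states)
        states[self.name] = self.own_state
        return Heartbeat(self.name, seq, states)
```

With this layout, a message built by S1 after hearing from S2 reports both S1's own state and S2's last known state, which is what lets third parties learn about a device they cannot reach directly.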
When a processing unit or a communication server finds that the heartbeats of all other devices have timed out, it can provisionally judge that its own communication is abnormal. It must check in the next communication time sequence whether the timeout persists; if all other devices are still timed out, the judgment is confirmed, and the recorded states of the other devices are then cleared and set to the unknown mode.
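The two-sequence confirmation can be sketched as below. The class name `TimeoutMonitor` and the convention that `received` holds only the peers heard in the current sequence are assumptions of this example, as is clearing the suspicion as soon as any peer is heard again.

```python
class TimeoutMonitor:
    """Two-sequence confirmation of a local communication fault (illustrative)."""

    def __init__(self, peers):
        self.states = {p: "unknown" for p in peers}
        self.suspected = False    # all peers timed out in the previous sequence
        self.confirmed = False    # local communication fault confirmed

    def end_of_sequence(self, received: dict) -> bool:
        """received maps each peer heard in this sequence to its reported state."""
        self.states.update(received)
        if received:
            # At least one peer was heard, so the local link is working.
            self.suspected = False
            self.confirmed = False
        elif self.suspected:
            # Second consecutive sequence with no peer heard: confirm the fault
            # and clear every recorded peer state to "unknown".
            self.confirmed = True
            self.states = {p: "unknown" for p in self.states}
        else:
            # First sequence with no peer heard: only suspect the fault.
            self.suspected = True
        return self.confirmed
```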
The initialization and the sending and receiving process of a device's heartbeat message are shown in Fig. 6. When the time sequences of the devices are not perfectly synchronized, updating a device's recorded state after receiving its heartbeat message may not coincide with the moment the local machine sends its own heartbeat message, so the state updates for other devices carried in the local machine's heartbeat message may lag by one time sequence period.
Compared with a message definition in which each device sends only its own information, this heartbeat message format has the advantage that, when a software or hardware fault occurs in the network, each device can independently determine the fault point and its basic cause. For example, when a communication server fails and stops sending heartbeat messages on schedule, every node device in the system can detect the event within 2 periods. From the perspective of communication server A: it receives the heartbeat messages of the 4 interlocking CPUs, all of which report that the heartbeat messages of communication server B have timed out on both the A and B networks, so the fault can be judged to be a network outlet fault of communication server B.
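The localization reasoning in the example above can be sketched in Python. The function `locate_fault` and the nested `reports` dictionary are hypothetical; the sketch assumes a device is flagged only when every observer that reports on it sees its heartbeat as timed out, which is one possible reading of the text rather than a rule stated in the patent.

```python
def locate_fault(reports: dict) -> list:
    """Find devices that every reporting observer sees as timed out.

    reports[observer][target] is True when the observer's heartbeat message
    marks the target's heartbeat as timed out (hypothetical representation).
    """
    targets = set()
    for view in reports.values():
        targets.update(view)
    faulty = []
    for target in sorted(targets):
        # Collect the opinion of every observer other than the target itself.
        views = [view[target] for obs, view in reports.items()
                 if obs != target and target in view]
        if views and all(views):
            faulty.append(target)
    return faulty
```

In the scenario from the text, communication server A and the four interlocking CPUs all mark communication server B as timed out while seeing each other normally, so the function returns only the identifier used for communication server B.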
Design principle of main-standby relation of communication server
The main-standby relationship between the communication hosts must satisfy the following principles:
(1) The two machines are equal in status, with no inherent primary or secondary.
(2) The two machines do not exchange data directly, i.e. there is no TCP connection between them. Logically, TCP distinguishes a server from a client, and if the two machines have no default main-standby relationship there is no basis for deciding which of them acts as the server.
(3) Each machine runs the main-standby setting and switching processes independently, without being controlled by the other.
(4) The two machines must complete switching within 2 message periods (for example, within 200 ms when one message period is 100 ms).
Determination of master-slave relationship between communication servers
During power-on of the communication hosts, the main-standby relationship must be confirmed on a first-come, first-served basis: whichever machine powers on first obtains the host status.
When the two machines start within less than one communication time sequence of each other, i.e. both enter the power-on state and the standby state simultaneously, a "dice-rolling" backoff algorithm is used: each machine takes a random number (the candidate values can be set to 1, 3, 5, 7, 9 and 11) and waits for that number of communication periods for the other machine to become the host; if the other machine has not set itself as the host within that time, the waiting machine sets itself as the host. If both machines happen to select the same random number and therefore apply to become the host simultaneously, the dice-rolling operation is repeated.
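The dice-rolling backoff can be simulated with a few lines of Python. This is a simplified sketch: it models both servers in one function and assumes each sees the other's host claim as soon as the shorter wait expires, whereas in the real system each server decides independently from the heartbeat messages it receives. The names `elect_host` and `BACKOFF_CHOICES` are hypothetical.

```python
import random

BACKOFF_CHOICES = (1, 3, 5, 7, 9, 11)    # candidate values quoted in the text


def elect_host(rng, max_rounds=100):
    """Return (winner, rounds) for two servers powered on in the same sequence."""
    for rounds in range(1, max_rounds + 1):
        wait_a = rng.choice(BACKOFF_CHOICES)
        wait_b = rng.choice(BACKOFF_CHOICES)
        if wait_a != wait_b:
            # The server with the shorter wait sees no host and claims the role;
            # the other sees that claim before its own wait ends and stays standby.
            return ("A" if wait_a < wait_b else "B"), rounds
        # Same number: both would claim the host role at once, so roll again.
    raise RuntimeError("no host elected within max_rounds")
```

A call such as `elect_host(random.Random())` returns the winning server and the number of rounds needed; with six candidate values, each round ends in a tie with probability 1/6.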
In either case, obtaining the host state takes two communication time sequences:
(1) First period: send the state.
(2) Second period: confirm the peer's state and carry out subsequent operations accordingly.
The above communication flow is shown in fig. 7.
Main-standby relation switching process between communication servers
The following describes the main/standby relationship switching process between communication servers in detail.
After the communication servers have confirmed the main-standby relationship, they enter the normal operation flow. During normal operation, host-standby switching is required in the following situations:
Active switching is required when the host detects a software exception, a hardware exception or a similar condition.
Passive switching is required when the host can no longer send heartbeat messages, for example because its network cable is unplugged or a serious hardware fault such as a power failure occurs.
Whatever the switching requirement, the host and the standby machine must follow these rules:
Active switching: when the fault that triggers active switching occurs, the host sends the fault information in the first period and, after learning the state of the other machine in the second period, performs follow-up operations such as degrading, resetting and restarting, or powering off for troubleshooting.
Passive switching: in each period the standby machine checks the heartbeat messages sent by the 4 interlocking processing unit CPUs and extracts the state of the communication service host from them. When the host's heartbeat times out or reports an abnormality, the standby machine starts the host-standby switching flow, and it switches to the host state after the second time sequence confirms the observed state. If the host's state is normal in the second time sequence, the switching flow is not entered.
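The passive switching rule can be sketched from the standby machine's point of view. In this Python illustration, `StandbyServer` and the state strings are hypothetical, and the sketch assumes the host counts as faulty only when all relayed views agree that it has timed out or is abnormal.

```python
class StandbyServer:
    """Standby communication server deciding when to take over (illustrative)."""

    def __init__(self):
        self.role = "standby"
        self.suspect = False     # host looked faulty in the previous sequence

    def end_of_sequence(self, host_views):
        """host_views: the host state relayed by each interlocking CPU in this
        sequence, e.g. ["timeout", "timeout", "timeout", "timeout"]."""
        host_bad = bool(host_views) and all(
            v in ("timeout", "abnormal") for v in host_views)
        if self.role == "standby":
            if host_bad and self.suspect:
                self.role = "host"       # second sequence confirms: take over
            self.suspect = host_bad      # a normal host clears the suspicion
        return self.role
```

A single faulty sequence followed by a normal one leaves the role unchanged, which corresponds to the rule that the switching flow is not entered if the host is normal in the second time sequence.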
Before the host actively reports a switching request, it must extract the peer machine's state from the heartbeat messages of the 4 interlocking processing units (CPUs) and confirm that the peer has entered the standby state; only then may it send the abnormal state. Otherwise it waits, for at most 20 time sequence periods (i.e. 2 s); if the peer still has not entered the standby state, the host records the exception and then forces the subsequent exception handling.
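The bounded wait before an active switch can be sketched as follows. The function name `active_switch`, the returned action strings and the use of an iterable of observed peer states are assumptions of this example; only the limit of 20 time sequence periods comes from the text.

```python
MAX_WAIT_SEQUENCES = 20    # 20 x 100 ms = 2 s, as stated in the text


def active_switch(peer_states):
    """Decide what a faulty host does before handing over (illustrative).

    peer_states: iterable of the peer server's state as observed in successive
    time sequences. Returns (action, sequences_waited). If the observations
    run out before the limit, the forced path is taken as well.
    """
    waited = 0
    for state in peer_states:
        if state == "standby":
            # The peer is ready to take over: the abnormal state may be sent.
            return "send_abnormal_state", waited
        waited += 1
        if waited >= MAX_WAIT_SEQUENCES:
            break
    # The peer never reached standby: record the exception and force handling.
    return "log_and_force_exception_handling", waited
```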
Interlocking dual-channel main-standby switching design principle
The design principle of the interlocking dual-channel main/standby switching is described in detail as follows:
To ensure uniform control of the whole system and to avoid the state-maintenance complexity that autonomous state switching by each module would cause, the interlocking main/standby determination and switching flows are designed to be strictly controlled by the communication server.
Each interlocking processing unit CPU is responsible only for reporting its state and carries out subsequent operations according to the feedback returned by the communication server host; it has no authority to switch state autonomously.
Method for determining interlocked main and standby channels
The method for determining the interlocked main/standby channels is described in detail below.
The interlocking main/standby confirmation flow follows these principles:
(1) The state of the interlocking main/standby channels can be determined only after the main and standby machines of the communication server have been confirmed.
(2) After power-on, each interlocking channel enters the power-on completion mode by default; only after receiving the first heartbeat message from the communication host can it begin exchanging heartbeat messages with the communication host.
(3) After a channel has started exchanging heartbeat messages with the communication host, it receives a state setting instruction from the communication host; only after the local machine has entered the following state, the standby state, or the host state can it exchange heartbeat messages with the other channel in the interlocking system.
(4) When one of the interlocking A/B channels has become the main channel and is operating normally, the other system must strictly observe the admission and exit mechanisms. In principle the state is judged by the host, but every state switch requires an application to, and confirmation from, the communication service host. That is, the channel passes in order through the power-on completion, following, and standby states. After each stage is completed, the state is submitted to the communication server in a heartbeat message, and the next stage can be entered only after confirmation is obtained in the corresponding field of the communication server's heartbeat message.
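The staged admission in principle (4) can be pictured as a small state machine. The following is only a sketch under stated assumptions: the enum names and the single boolean `confirmed_by_server` are hypothetical simplifications of the heartbeat confirmation field, which the disclosure does not specify in detail.

```python
from enum import Enum


class ChannelState(Enum):
    POWER_ON_COMPLETE = 1
    FOLLOWING = 2
    STANDBY = 3


# Order in which a non-main channel is admitted, as described in principle (4).
ADMISSION_ORDER = [
    ChannelState.POWER_ON_COMPLETE,
    ChannelState.FOLLOWING,
    ChannelState.STANDBY,
]


def next_state(current: ChannelState, confirmed_by_server: bool) -> ChannelState:
    """Advance one stage only when the communication server has confirmed
    the current stage in its heartbeat message; otherwise stay put."""
    idx = ADMISSION_ORDER.index(current)
    if confirmed_by_server and idx + 1 < len(ADMISSION_ORDER):
        return ADMISSION_ORDER[idx + 1]
    return current
```

The channel never skips a stage and never advances without confirmation, which reflects the requirement that it has no authority to switch state on its own.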
The communication server confirms the main/standby relationship using a first-come-first-served mode and a random selection mode:
First-come-first-served mode: when only one interlocking channel can exchange heartbeats normally, that channel is confirmed as the main channel.
Random selection mode: when both interlocking channels can exchange heartbeats normally, the communication server host randomly selects one channel (using a random number modulo 2) as the main channel, and the other channel enters the following mode.
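The two selection modes can be illustrated by the sketch below. The function name, the channel labels "A" and "B" as strings, and the optional `rng` parameter are assumptions made for illustration; the disclosure states only that a random number modulo 2 is used when both channels are available.

```python
import random
from typing import Optional, Tuple


def select_main_channel(a_alive: bool, b_alive: bool,
                        rng: Optional[random.Random] = None
                        ) -> Tuple[Optional[str], Optional[str]]:
    """Return (main_channel, following_channel).

    a_alive / b_alive indicate whether each interlocking channel is
    exchanging heartbeats normally with the communication server.
    """
    if rng is None:
        rng = random.Random()
    if a_alive and b_alive:
        # Random selection mode: random number modulo 2.
        main = "A" if rng.randrange(1 << 16) % 2 == 0 else "B"
        following = "B" if main == "A" else "A"
        return main, following
    if a_alive:
        return "A", None   # first-come-first-served mode
    if b_alive:
        return "B", None   # first-come-first-served mode
    return None, None      # no channel available yet
```

Passing a seeded `random.Random` makes the choice reproducible in tests while leaving the default behavior random.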
Interlocking channel state switching process
After the communication host computer informs the master-slave relationship of the two interlocking channels through the heartbeat message,
Related exception handling: if the interlocking B channel still cannot be matched correctly after following for a long time, it can flag the anomaly in its heartbeat message and apply to the communication service host for a restart.
When the interlocking B channel is in standby mode and the comparison of processing result data is inconsistent N consecutive times (N may be 3), it can flag its own abnormal state in the heartbeat message, apply to enter the following mode, receive the entering state information and processing result information sent by the main channel, and follow the main channel again.
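The consecutive-mismatch rule above can be sketched as a simple counter. The class and method names are hypothetical, and comparing results with `==` is an assumption; the disclosure does not say how processing results are compared.

```python
class StandbyComparator:
    """Counts consecutive mismatches between the standby channel's
    processing result and the main channel's result."""

    def __init__(self, limit: int = 3):
        self.limit = limit                  # N in the text (N may be 3)
        self.consecutive_mismatches = 0

    def record(self, local_result, main_result) -> bool:
        """Return True when the channel should flag its abnormal state
        and apply to re-enter the following mode."""
        if local_result == main_result:
            self.consecutive_mismatches = 0   # a match resets the run
            return False
        self.consecutive_mismatches += 1
        return self.consecutive_mismatches >= self.limit
```

Because a matching comparison resets the counter, isolated mismatches do not trigger a return to the following mode; only N in a row do.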
Interlocking channel main/standby switching process
Because the communication server knows the state of both interlocking channels, and taking a message timing interval of 100 ms as an example, switching of the interlocking dual channels can be confirmed as complete within 100 ms at the fastest and within 200 ms at the slowest.
When the interlocking main channel is abnormal and the standby channel does not meet the switching condition (for example, a network disconnection leaves it without heartbeat messages, or it has remained for a long time in a non-standby state such as following or power-on completion), the communication server applies a timeout mechanism with a limit of 20 message timings. If the standby channel meets the condition within this limit, the switching heartbeat message is sent; if it does not, a system failure anomaly is recorded.
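The timing figures above can be checked with a short calculation. This is a sketch only: the function names are hypothetical, and reading the fastest and slowest cases as one and two message timings respectively is an interpretation of the 100 ms and 200 ms figures, not something the text states explicitly.

```python
from typing import Dict, Optional


def switching_bounds_ms(interval_ms: int, timeout_timings: int = 20) -> Dict[str, int]:
    """Derive the switching bounds from the message timing interval."""
    return {
        "fastest_ms": 1 * interval_ms,            # assumed: one timing
        "slowest_ms": 2 * interval_ms,            # assumed: two timings
        "timeout_ms": timeout_timings * interval_ms,
    }


def server_timeout_decision(ready_at_timing: Optional[int], limit: int = 20) -> str:
    """Decide what the communication server does under the timeout mechanism.

    ready_at_timing is the message timing (counted from 1) at which the
    standby channel first meets the switching condition, or None if never.
    """
    if ready_at_timing is not None and ready_at_timing <= limit:
        return "send_switch_heartbeat"
    return "record_system_failure"
```

With a 100 ms interval this gives 100 ms, 200 ms, and a 2000 ms timeout, consistent with the 2 s limit mentioned for the communication server switching.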
Interlocking main/standby channel switching mainly covers two cases:
1. Passive anomaly: the main channel becomes abnormal, for example through an abnormal hardware restart or a heartbeat timeout (at least 2 times) caused by a physical network disconnection.
In this mode, the communication server host finds in the first timing sequence that the heartbeat message of the interlocking main channel has timed out, and can confirm the event after receiving the heartbeat messages of all the other devices in the system.
In the second timing sequence, once the communication service host has confirmed that the heartbeat message of the interlocking main channel has timed out, it can immediately notify the interlocking standby channel device, through the heartbeat message, to enter the main channel mode.
In this case, if the other interlocking channel has not entered the standby mode, a wait of 2 timing cycles (200 ms) follows; if the other channel still does not meet the condition, the anomaly is reported through the log module, and the state of both interlocking channels and of processing units CPU1, CPU2, CPU3, and CPU4 at that moment is recorded.
When the interlocking channel can sense its own condition (such as a network disconnection), it must set its own state to abnormal and wait for a command from the communication server.
2. Active anomaly: the main channel software finds that it has a problem and requires a system restart or a software restart.
In this mode, the communication server host finds the anomaly flag bit in the heartbeat message of the interlocking main channel in the first timing sequence, and can confirm the event after receiving the heartbeat messages of all the other devices in the system.
In the second timing sequence, once the communication service host has confirmed that the interlocking main channel is still sending abnormal heartbeat messages, it can immediately notify the interlocking standby channel device, through the heartbeat message, to enter the main channel mode.
In this case, if the other interlocking channel has not entered the standby mode, a wait of 2 timing cycles (200 ms) follows; if the other channel still does not meet the condition, the anomaly is reported through the log module, and the state of both interlocking channels and of processing units CPU1, CPU2, CPU3, and CPU4 at that moment is recorded.
When the interlocking channel can sense its own condition, it must set its own state to abnormal and wait for a command from the communication server.
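Both cases share the same two-timing confirmation on the communication server side, which the following sketch illustrates. The class, the `Fault` enum, and the returned action strings are hypothetical names introduced for illustration; the cross-check against the heartbeat messages of the other devices is omitted for brevity.

```python
from enum import Enum


class Fault(Enum):
    NONE = 0
    TIMEOUT = 1   # passive anomaly: main channel heartbeat timed out
    FLAGGED = 2   # active anomaly: anomaly flag bit set in the heartbeat


class MainChannelMonitor:
    """Server-side view of the interlocking main channel, one call per timing."""

    MAX_WAIT_CYCLES = 2   # from the text: wait 2 timing cycles (200 ms)

    def __init__(self):
        self.suspected = False
        self.waited = 0

    def observe(self, fault: Fault, standby_ready: bool) -> str:
        if fault is Fault.NONE:
            self.suspected = False
            self.waited = 0
            return "no_action"
        if not self.suspected:
            self.suspected = True          # first timing: fault found
            return "suspect"
        if standby_ready:                  # second timing: fault confirmed
            return "promote_standby"
        if self.waited < self.MAX_WAIT_CYCLES:
            self.waited += 1
            return "wait"
        return "log_anomaly"               # standby still not eligible
```

A single faulty timing followed by a healthy one resets the monitor, so promotion requires the fault to persist into the second timing sequence.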
As can be seen from the above description, this embodiment of the present invention uses a custom heartbeat message to switch between the main and standby systems of the interlocking subsystem by means of combined software and hardware arbitration, thereby improving the reliability of the system.
In this embodiment, while, for purposes of simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance with one or more embodiments, occur in different orders and/or concurrently with other acts from that shown and described herein or not shown and described herein, as may be understood by those of ordinary skill in the art.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is noted that references in the specification to "one embodiment," "an example embodiment," "some embodiments," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A computer interlocking subsystem master-slave switching system based on heartbeat messages is characterized by comprising:
the communication server comprises a communication server A and a communication server B;
the channel comprises an A channel and a B channel; the channel A is connected with a communication server A and a communication server B; the channel B is connected with the communication server A and the communication server B;
the interlocking unit comprises a first interlocking unit and a second interlocking unit belonging to the channel A, and a third interlocking unit and a fourth interlocking unit belonging to the channel B;
the state of the equipment is transmitted between each interlocking unit and each channel through heartbeat messages between the equipment, and the heartbeat messages not only comprise the self state, but also comprise the state information of each interlocking unit and each channel in the system; and the interlocking unit modifies the corresponding self equipment state description field in the heartbeat message according to the self running state, and updates the corresponding field of other equipment in the heartbeat message according to the received heartbeat data of other equipment.
2. The system according to claim 1, wherein when the interlock unit or the communication server finds that all the heartbeats of other devices are overtime, it is determined that the device is abnormal in communication, and when it is determined that the other devices are still overtime in the next communication timing sequence, it is determined that the device is abnormal in communication.
3. The system according to claim 1, wherein the contents of the heartbeat messages sent by any device in the interlock system include the sensed communication states of all other devices in the interlock system, and when a heartbeat timeout of a device is found by a certain interlock unit or a communication server in the interlock system, the device is determined to be in the heartbeat timeout state for the other devices by combining with a corresponding field of the device state in the contents of the heartbeat messages sent by the other devices to the device, and further determined that a communication anomaly occurs in the device, and when the next communication sequence is performed, whether the device has the communication anomaly is determined in the same determination mode.
4. The computer interlocking subsystem master-slave switching system based on heartbeat messages according to claim 1, wherein in the process of powering on, the communication server A and the communication server B send the power-on heartbeat messages of the local computer and receive the heartbeat messages of other equipment at the first time sequence, and whether the state of the host computer or the standby computer is entered is judged according to the received heartbeat messages.
5. The system according to claim 3, wherein when the communication server A and the communication server B enter the power-on state and the standby state at the same time, the two machines each take a random number, each waits for the other to become the master during the communication cycle of the random number, and sets itself as the master if the other is not set as the master during the random number cycle.
6. The system according to claim 1, wherein the standby machines in the communication server a and the communication server B check the heartbeat messages sent from the interlock unit CPU at each cycle, extract the state of the host machine of the communication server from the heartbeat messages, follow the state, start to enter the switching flow of the standby machines when the heartbeat of the host machine of the communication server is over time or abnormal in reporting, and switch to enter the host machine state after the second timing sequence confirms that the states of the host machine of the communication server and the original host machine of the communication server are correct.
7. The system according to claim 1, wherein the interlocking unit receives the state setting command from the communication host after the operation of sending and receiving heartbeat messages between each channel and the communication server is started, and only when the interlocking unit of the channel enters the following state, the standby state, or the host state, the interlocking unit of the channel sends and receives heartbeat messages to and from another channel.
8. The system according to claim 1, wherein when only one interlock channel can normally perform heartbeat transceiving, the channel is determined to be a main channel; when both the two interlocking channels can normally perform heartbeat transceiving, the communication server host randomly selects one channel as a main channel, and the other channel enters a following mode.
9. The system according to claim 1, wherein when a channel is in standby mode and the preset secondary processing result data contrast is continuously inconsistent, the channel is identified to have an abnormal state in the heartbeat message, applies for returning to the following mode, receives the entering state information and the processing result information sent by the main channel, and follows the main channel again.
10. The system according to claim 1, wherein the active-standby switching of the interlocking channel comprises a passive abnormal mode or an active abnormal mode;
in the passive abnormal mode, the communication server host finds that the heartbeat message of the interlocking main channel is overtime in a first sequence, and can confirm that the heartbeat message of the interlocking main channel is overtime after receiving the heartbeat messages of all devices in other systems; in the second time sequence, after the communication service host confirms that the heartbeat message of the interlocking main channel is overtime, the communication service host can immediately inform the interlocking standby channel device to enter the main channel mode through the heartbeat message; if the other channel does not have the condition, reporting the abnormity through a log module, and recording the conditions of the two interlocked channels and the processing unit at the moment;
in the active abnormal mode, the communication server host finds the abnormal flag bit of the heartbeat message of the interlocking main channel in a first time sequence, and can confirm the abnormal flag bit of the heartbeat message of the interlocking main channel after receiving the heartbeat messages of all devices in other systems; in a second time sequence, after the communication service host confirms that the interlocking main channel continues to send abnormal heartbeat messages, the communication service host can immediately inform the interlocking standby channel equipment to enter a main channel mode through the heartbeat messages; if the other channel in the interlocking double channels has not entered the standby mode, 2 time sequence periods are waited, and if the other channel still does not meet the condition, the abnormality is reported through a log module, and the conditions of the interlocking double channels and the processing unit at the moment are recorded; when the interlock can sense the self state, the self state needs to be set to be an abnormal state, and a command of the communication server is waited.
CN202110463467.XA 2021-04-28 2021-04-28 Computer interlocking subsystem master-slave switching system based on heartbeat message Active CN112887176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110463467.XA CN112887176B (en) 2021-04-28 2021-04-28 Computer interlocking subsystem master-slave switching system based on heartbeat message

Publications (2)

Publication Number Publication Date
CN112887176A true CN112887176A (en) 2021-06-01
CN112887176B CN112887176B (en) 2021-07-16

Family

ID=76040758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110463467.XA Active CN112887176B (en) 2021-04-28 2021-04-28 Computer interlocking subsystem master-slave switching system based on heartbeat message

Country Status (1)

Country Link
CN (1) CN112887176B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101102177A (en) * 2007-08-20 2008-01-09 杭州华三通信技术有限公司 An implementation method and device for switching master and slave controller
CN101659271A (en) * 2009-09-01 2010-03-03 卡斯柯信号有限公司 Method for connecting station ATS with interlocking subsystem
CN102064962A (en) * 2010-12-06 2011-05-18 南京恩瑞特实业有限公司 Method for implementing input and output assemblies of ATS (Automatic Train Supervision) system based on named pipeline communication
US20150256436A1 (en) * 2014-03-04 2015-09-10 Connectem Inc. Method and system for seamless sctp failover between sctp servers running on different machines
CN106741000A (en) * 2016-12-26 2017-05-31 合肥工大高科信息科技股份有限公司 Liaison circuit communication means between computer interlock system
CN108279597A (en) * 2018-01-23 2018-07-13 上海亨钧科技股份有限公司 A kind of computer interlocking platform courses method based on finite state machine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
冯浩楠等: "CBTC系统计轴接口优化方案研究", 《铁道标准设计》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742165A (en) * 2021-07-23 2021-12-03 文华学院 Double-master control equipment and master-slave control method
CN113742165B (en) * 2021-07-23 2024-05-24 文华学院 Dual master control equipment and master-slave control method
CN113917834A (en) * 2021-10-13 2022-01-11 上海华建电力设备股份有限公司 Redundant switching and data sharing of dual-network dual-communication server
CN114237990A (en) * 2021-11-18 2022-03-25 通号万全信号设备有限公司 FPGA chip-based two-multiplication redundancy switching method and device
CN114237990B (en) * 2021-11-18 2024-04-26 通号万全信号设备有限公司 Method and device for switching square redundancy based on FPGA chip
CN115903451A (en) * 2023-03-08 2023-04-04 北京全路通信信号研究设计院集团有限公司 Component working mode switching method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112887176B (en) 2021-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 101300 Room 203, building 5, 13 Fuqian 1st Street, Tianzhu District, Shunyi District, Beijing

Patentee after: Beijing Urban Construction Intelligent Control Technology Co.,Ltd.

Address before: 101300 Room 203, building 5, 13 Fuqian 1st Street, Tianzhu District, Shunyi District, Beijing

Patentee before: BEIJING URBAN CONSTRUCTION INTELLIGENT CONTROL TECHNOLOGY Co.,Ltd.