CN113794507A

CN113794507A - Autonomous fault handling method and system based on on-orbit route switching

Info

Publication number: CN113794507A
Application number: CN202111062710.3A
Authority: CN
Inventors: 张海; 邹亿; 张旭光; 王献忠; 褚英志; 查宏瑞
Original assignee: Shanghai Institute of Satellite Engineering
Current assignee: Shanghai Institute of Satellite Engineering
Priority date: 2021-09-10
Filing date: 2021-09-10
Publication date: 2021-12-14
Anticipated expiration: 2041-09-10
Also published as: CN113794507B

Abstract

The invention provides an autonomous fault handling method and system based on on-orbit route switching. An integrated electronic computer serves as the core and communicates over 1553B and RS422 buses with the units or modules of the TT&C (measurement and control), attitude and orbit control, solar array, payload and other subsystems. Each subsystem is backed up at the unit or module level, in either a dual-hot configuration or a one-hot, one-cold configuration. In environments with strict real-time, safety and reliability requirements, such as deep space exploration, the invention improves the fault tolerance of the system through autonomous on-orbit route switching.

Description

Autonomous fault handling method and system based on on-orbit route switching

Technical Field

The invention relates to the technical field of deep space exploration, and in particular to an autonomous fault handling method and system based on on-orbit route switching.

Background

During the solar conjunction period of a Mars deep space exploration mission, the TT&C link is interrupted and no ground TT&C support is available for a long time. The Earth-Mars transfer phase lasts more than 200 days, and the longest delay of a single communication exceeds 45 minutes. Several critical actions, such as midcourse corrections and deep space maneuvers, are carried out during this period, and at certain stages the critical programmed command actions have only one chance of execution. The traditional flight control approach, in which telemetry and commands are compared in a large satellite-ground loop, therefore cannot meet the requirements of real-time deep space flight control. As a result, the spacecraft's requirements for fault diagnosis and autonomous handling are far greater than those of a near-Earth satellite. Whether faults can be diagnosed in real time, and whether redundant modules or units can be switched autonomously so that commands are forwarded and executed normally, is the key to mission success.

At present there is no precedent in the aerospace field for automatic fault diagnosis and handling through route switching across two different types of networks. Among patents related to route switching, a node satellite communication method based on sequential decision route switching (CN105979557A) uses the service signal strength and spot beam service orientation that the node satellites of a low-orbit constellation provide to a mobile terminal, and applies sequential decision route switching to optimize satellite handover within the constellation. A route switching method, system and routing device (CN106856453A) has an access node routing device issue a bit error rate measurement message downstream, so that it can determine the transmission path from the measurements of the downstream measurement node routing devices and thus switch routes according to the bit error rate of the traffic. A real-time automatic switching device and method applied to route interruption (CN106209619A) switches automatically to a standby router when the router on the main link fails. None of these route switching patents can be applied to the dedicated electronic system architecture used in aerospace, and none considers switching between two heterogeneous networks or the cold or hot state of the terminal equipment.

Patent document CN109219105B (application number: CN201710518731.9) discloses a route switching method and a route switching system, relating to the field of mobile communication. The method comprises the following steps: routing heartbeat servers are respectively arranged in a network node and a plurality of client systems; the routing heartbeat server positioned at the network node is respectively communicated with the routing heartbeat servers positioned at the plurality of client systems through heartbeat messages, and the fault service route is determined according to the heartbeat response message of the heartbeat messages; the backup traffic route is used instead of the failed traffic route. However, the patent cannot be applied to a special electronic system architecture in the aerospace field, does not consider application to switching between two different heterogeneous networks, and does not consider the cold and hot states of terminal equipment.

Disclosure of Invention

In view of the deficiencies of the prior art, the object of the invention is to provide an autonomous fault handling method and system based on on-orbit route switching.

The autonomous fault handling method based on on-orbit route switching provided by the invention comprises the following steps:

step 1: look up the RT address on the 1553B bus or the terminal address on the RS422 serial port according to the APID in the CCSDS command packet, and forward the packet to the corresponding 1553B or RS422 terminal according to the routing address; if a terminal is mounted on both the RS422 and 1553B buses, preferentially send the command or injected data over the 1553B bus;

step 2: start 1553B communication and query the telemetry transmission request of each RT by sending a 1553B bus vector word command to each RT according to its RT address;

step 3: send an RS422 telemetry acquisition command to each RS422 terminal every 0.5 s;

step 4: organize the telemetry transmission of the RTs according to the vector word requests of all RTs on the 1553B bus;

step 5: read the terminal telemetry of each RS422 port;

step 6: initialize the RT index on the 1553B bus with initial value i = 0;

step 7: determine whether the telemetry of the i-th RT on the 1553B bus has been updated; if not, increment the fault count of the i-th RT, M_Err(i) = M_Err(i) + 1; otherwise, go on to check the next RT;

step 8: if the fault count of the i-th RT satisfies M_Err(i) > 5, determine whether the backup of that 1553B bus RT is a cold backup or a hot backup; if it is a hot backup, change the address of the i-th RT to the hot backup address; if it is a cold backup, send the cold backup power-on command for the i-th RT and change the address of the i-th RT to the cold backup address; if the RT is mounted on both the 1553B bus and the RS422 bus and communication fails on both the primary and backup interfaces on the 1553B bus, switch to RS422 communication;

step 9: repeat steps 7 to 8 to complete telemetry state monitoring and routing address switching for all RTs on the 1553B bus;

step 10: if the telemetry of none of the RTs has been updated, reset the FPGA controlling the 1553B chip at the BC end of the integrated electronic computer, so as to clear a fault in the integrated electronic computer's own 1553B communication;

step 11: initialize the RS422 terminal index with initial value j = 0;

step 12: determine whether the telemetry of the j-th RS422 port has been updated; if not, increment the fault count of the j-th RS422 terminal, R_Err(j) = R_Err(j) + 1; otherwise, go on to check the next RS422 port;

step 13: if the fault count of the j-th RS422 serial port terminal satisfies R_Err(j) > 5, determine whether the backup of that terminal is a cold backup or a hot backup; if it is a hot backup, switch the j-th RS422 terminal to the backup routing address; if it is a cold backup, send the cold backup power-on command for the j-th RS422 terminal and switch the j-th RS422 routing address to the cold backup address;

step 14: repeat steps 12 to 13 to complete telemetry state monitoring and routing address switching for all terminals on the RS422 serial ports;

step 15: if the telemetry of none of the serial port terminals has been updated, reset the FPGA that controls the RS422 chip of the integrated electronic computer, so as to clear a fault in the integrated electronic computer's own RS422 communication;

step 16: if the telemetry of none of the 1553B RTs and none of the RS422 ports has been updated, switch the integrated electronic computer to its backup machine;

step 17: repeat steps 1 to 16 to realize the autonomous fault handling flow based on on-orbit route switching.
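The per-terminal monitoring in steps 6 to 9 (and, identically, steps 11 to 14) can be sketched as follows. This is a minimal illustration rather than the flight software: the `Terminal` class, the `monitor_cycle` function and the `send_power_on` callback are hypothetical names, and clearing the counter after a switch is an assumption, since the method only states that the count is incremented and compared with 5.

```python
from dataclasses import dataclass

FAULT_THRESHOLD = 5  # the method switches once the fault count exceeds 5


@dataclass
class Terminal:
    name: str
    primary_addr: int
    backup_addr: int
    backup_is_cold: bool      # True: cold backup, must be powered on before use
    active_addr: int = -1     # routing address currently in use
    err_count: int = 0        # M_Err(i) or R_Err(j) in the method

    def __post_init__(self):
        if self.active_addr < 0:
            self.active_addr = self.primary_addr


def monitor_cycle(terminals, telemetry_updated, send_power_on):
    """Run one monitoring pass over all terminals of one bus.

    telemetry_updated maps a terminal name to True if its telemetry was
    refreshed in this cycle. send_power_on is called with the terminal
    before its routing address is switched to a cold backup.
    Returns the names of the terminals whose routing address was switched.
    """
    switched = []
    for t in terminals:
        if telemetry_updated.get(t.name, False):
            continue                      # telemetry updated: check the next terminal
        t.err_count += 1                  # M_Err(i) = M_Err(i) + 1
        if t.err_count > FAULT_THRESHOLD:
            if t.backup_is_cold:
                send_power_on(t)          # cold backup: power it on first
            t.active_addr = t.backup_addr
            t.err_count = 0               # assumption: restart counting after a switch
            switched.append(t.name)
    return switched
```

With a 0.5 s acquisition period, a terminal in this sketch is switched on the sixth consecutive cycle without a telemetry update, because the comparison is strictly greater than 5.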

Preferably, the units or modules on the 1553B bus and the RS422 bus are backed up, in either a dual-hot configuration or a one-hot, one-cold configuration;

when a unit or module fails because of space radiation or an internal fault, it is isolated from the bus network and operation is switched to the backup unit, so that the service function is not interrupted.

Preferably, the routing addresses of the 1553B bus and the RS422 bus are defined uniformly, and injected data and commands are addressed in a unified way according to bus type, subsystem unit identifier and module identifier.
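One way to picture such a unified address is a record that carries the bus type, unit identifier and module identifier, looked up by APID. The sketch below is only an illustration: the `RouteAddress` fields, the `ROUTING_TABLE` contents and the APID values are hypothetical, since the actual allocation is not reproduced here. The preference for 1553B when a terminal is mounted on both buses follows step 1 of the method.

```python
from dataclasses import dataclass
from enum import Enum


class Bus(Enum):
    B1553 = "1553B"
    RS422 = "RS422"


@dataclass(frozen=True)
class RouteAddress:
    bus: Bus          # bus type
    unit_id: int      # subsystem unit identifier
    module_id: int    # module identifier inside the unit
    terminal: int     # RT address on 1553B, or port index on RS422


# Hypothetical allocation: APID -> every address at which the destination is reachable.
ROUTING_TABLE = {
    0x101: [RouteAddress(Bus.RS422, unit_id=1, module_id=0, terminal=2),
            RouteAddress(Bus.B1553, unit_id=1, module_id=0, terminal=5)],
    0x102: [RouteAddress(Bus.RS422, unit_id=2, module_id=1, terminal=0)],
}


def resolve(apid: int) -> RouteAddress:
    """Return the address to forward to, preferring 1553B when both buses are available."""
    candidates = ROUTING_TABLE[apid]
    for addr in candidates:
        if addr.bus is Bus.B1553:
            return addr
    return candidates[0]
```

Because the table is keyed by APID, switching a unit or module to its backup only requires replacing the entry for that APID; the forwarding code itself does not change.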

Preferably, the working state of each terminal is monitored in real time by traversing all terminals in the 1553B bus and RS422 bus networks;

fault handling is staged: when a terminal fault is detected, the faulty module is switched first and the faulty unit second; if the fault still cannot be cleared and the whole 1553B bus or RS422 bus has failed, the FPGA controlling 1553B or RS422 in the integrated electronic computer is hardware-reset; if the fault still cannot be handled, the processor module of the integrated electronic computer is switched between primary and standby.
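The staged escalation can be sketched as a function that picks the next recovery action from the actions already attempted. The `Action` names and the `next_action` signature are hypothetical, and returning `NONE` once the terminal-level options are exhausted without a whole-bus failure is an assumption; the text does not say what happens in that case.

```python
from enum import Enum, auto


class Action(Enum):
    NONE = auto()
    SWITCH_MODULE = auto()      # stage 1: switch the faulty module
    SWITCH_UNIT = auto()        # stage 2: switch the faulty unit
    RESET_BUS_FPGA = auto()     # stage 3: hardware-reset the 1553B or RS422 control FPGA
    SWITCH_PROCESSOR = auto()   # stage 4: primary/standby switch of the processor module


def next_action(fault_present: bool, whole_bus_faulty: bool, attempted: set) -> Action:
    """Pick the next recovery action, escalating one stage at a time."""
    if not fault_present:
        return Action.NONE
    for action in (Action.SWITCH_MODULE, Action.SWITCH_UNIT):
        if action not in attempted:
            return action
    if not whole_bus_faulty:
        return Action.NONE      # assumption: no bus-level action for a single terminal
    for action in (Action.RESET_BUS_FPGA, Action.SWITCH_PROCESSOR):
        if action not in attempted:
            return action
    return Action.NONE
```

Keeping the stages in one ordered function makes the escalation order explicit and easy to review against the text.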

Preferably, when a unit or module is switched in cold backup mode, a power-on command is sent and a waiting time is added;

when a unit or module is switched, the network routing address is switched accordingly, which provides the basis for forwarding subsequent injected data and commands.
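The difference between hot and cold backup switching can be sketched as below. The function name, the `warmup_s` default and the callback signatures are hypothetical; the text only states that a power-on command and a waiting time are added in cold backup mode, without giving a duration.

```python
import time


def switch_to_backup(routing_table, key, backup_addr, is_cold,
                     send_power_on, warmup_s=2.0, sleep=time.sleep):
    """Switch one routing entry to its backup address.

    For a cold backup, a power-on command is sent and the caller waits
    warmup_s seconds (an assumed value) before the route is changed.
    For a hot backup the route is changed immediately.
    """
    if is_cold:
        send_power_on(backup_addr)
        sleep(warmup_s)
    routing_table[key] = backup_addr   # later commands are forwarded to the backup
    return routing_table[key]
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; on-board software would more likely use a timer in the scheduling cycle than a blocking wait.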

The invention also provides an autonomous fault handling system based on on-orbit route switching, which comprises:

module M1: looks up the RT address on the 1553B bus or the terminal address on the RS422 serial port according to the APID in the CCSDS command packet, and forwards the packet to the corresponding 1553B or RS422 terminal according to the routing address; if a terminal is mounted on both the RS422 and 1553B buses, the command or injected data is preferentially sent over the 1553B bus;

module M2: starts 1553B communication and queries the telemetry transmission request of each RT by sending a 1553B bus vector word command to each RT according to its RT address;

module M3: sends an RS422 telemetry acquisition command to each RS422 terminal every 0.5 s;

module M4: organizes the telemetry transmission of the RTs according to the vector word requests of all RTs on the 1553B bus;

module M5: reads the terminal telemetry of each RS422 port;

module M6: initializes the RT index on the 1553B bus with initial value i = 0;

module M7: determines whether the telemetry of the i-th RT on the 1553B bus has been updated; if not, increments the fault count of the i-th RT, M_Err(i) = M_Err(i) + 1; otherwise, goes on to check the next RT;

module M8: if the fault count of the i-th RT satisfies M_Err(i) > 5, determines whether the backup of that 1553B bus RT is a cold backup or a hot backup; if it is a hot backup, changes the address of the i-th RT to the hot backup address; if it is a cold backup, sends the cold backup power-on command for the i-th RT and changes the address of the i-th RT to the cold backup address; if the RT is mounted on both the 1553B bus and the RS422 bus and communication fails on both the primary and backup interfaces on the 1553B bus, switches to RS422 communication;

module M9: repeatedly calls modules M7 to M8 to complete telemetry state monitoring and routing address switching for all RTs on the 1553B bus;

module M10: if the telemetry of none of the RTs has been updated, resets the FPGA controlling the 1553B chip at the BC end of the integrated electronic computer, so as to clear a fault in the integrated electronic computer's own 1553B communication;

module M11: initializes the RS422 terminal index with initial value j = 0;

module M12: determines whether the telemetry of the j-th RS422 port has been updated; if not, increments the fault count of the j-th RS422 terminal, R_Err(j) = R_Err(j) + 1; otherwise, goes on to check the next RS422 port;

module M13: if the fault count of the j-th RS422 serial port terminal satisfies R_Err(j) > 5, determines whether the backup of that terminal is a cold backup or a hot backup; if it is a hot backup, switches the j-th RS422 terminal to the backup routing address; if it is a cold backup, sends the cold backup power-on command for the j-th RS422 terminal and switches the j-th RS422 routing address to the cold backup address;

module M14: repeatedly calls modules M12 to M13 to complete telemetry state monitoring and routing address switching for all terminals on the RS422 serial ports;

module M15: if the telemetry of none of the serial port terminals has been updated, resets the FPGA that controls the RS422 chip of the integrated electronic computer, so as to clear a fault in the integrated electronic computer's own RS422 communication;

module M16: if the telemetry of none of the 1553B RTs and none of the RS422 ports has been updated, switches the integrated electronic computer to its backup machine;

module M17: repeatedly calls modules M1 to M16 to realize the autonomous fault handling flow based on on-orbit route switching.

Compared with the prior art, the invention has the following beneficial effects:

The invention improves the fault tolerance of the system through autonomous on-orbit route switching in environments with strict real-time, safety and reliability requirements, such as deep space exploration.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a block diagram of the autonomous fault handling system based on on-orbit route switching for deep space exploration;

FIG. 2 is a flow diagram of autonomous fault handling based on on-orbit route switching;

FIG. 3 is a diagram of the CCSDS format of the telecommand packet.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention. All such changes and modifications fall within the scope of protection of the present invention.

Example:

The data management system of a typical deep space probe is built around an integrated electronic computer, which communicates over 1553B and RS422 buses with the units or modules of the TT&C, attitude and orbit control, solar array, payload and other subsystems, as shown in FIG. 1. To provide fault tolerance and redundancy, each subsystem is generally backed up at the unit or module level, in either a dual-hot configuration or a one-hot, one-cold configuration. When a unit or module fails because of space radiation or an internal fault, it must be isolated from the bus network and switched to the backup unit in time, so that the service function is not interrupted.

For this mesh communication architecture with cold and hot backup of redundant satellite equipment, the invention designs an autonomous fault handling method based on on-orbit route switching to improve the fault tolerance of the system. Specifically, injected data and commands are addressed in a unified way according to the bus type (1553B or RS422), the subsystem unit identifier, the module identifier and so on. The working state of each terminal is monitored in real time and fault handling is staged: when a terminal fault occurs, the faulty module is switched first and the faulty unit second; if the fault still cannot be cleared and the whole 1553B bus or RS422 bus has failed, the FPGA controlling 1553B or RS422 in the integrated electronic computer is hardware-reset; if the fault still cannot be handled, the processor module of the integrated electronic computer is switched between primary and standby. When a unit or module is switched, the cold or hot backup mode is taken into account: in cold backup mode a power-on command must be sent and a waiting time observed. When a unit or module is switched, the corresponding network routing address is switched as well, which provides the basis for forwarding subsequent injected data and commands. With this autonomous fault handling method, on-orbit equipment can be switched in time when it fails, and commands and injected data continue to be forwarded and executed normally.

To describe the invention in further detail, the autonomous fault handling system based on on-orbit route switching for deep space exploration designed by the invention is taken as an example. The system relies on the integrated electronics, GNC, payload and other subsystems, and mainly comprises the integrated electronic computer, the payloads mounted on the 1553B bus, and the units of platform subsystems such as TT&C, solar array and GNC mounted on the RS422 bus.

The processor board of the integrated electronic computer normally operates in dual-machine hot backup mode, in which machine A is the on-duty (authorized) machine and machine B is the off-duty (unauthorized) machine. If machine B is operating as a cold machine, the hardware autonomously issues the management unit machine B power-on command when machine A fails, and restores the dual-hot mode, only if autonomous power-on of machine B has been enabled; in that case machine B holds no ground-injected data such as delayed commands. When machine A fails, machine B in hot standby mode autonomously takes over the control authority, the machine that has acquired control continues the control task, and external sending of delayed commands is prohibited.

The standard data format of each type of telecommand packet is shown in FIG. 3.

The primary header of the telecommand packet has a total length of 6 bytes and comprises the following fields:

a) b0-b2: version number, fixed to "000";

b) b3: type, fixed to "1", indicating a telecommand packet;

c) b4: secondary header flag, fixed to "0", indicating that there is no secondary header;

d) b5-b15: application process identifier (APID), detailed in Table 2;

e) b16-b17: packet sequence flags;

f) b18-b31: packet name or sequence count; for a standalone packet it represents the packet name;

g) b32-b47: packet length, equal to the length of the telecommand packet data field minus 1;

Telecommand packet data field: an even number of bytes; the last 2 bytes of the data field are a checksum, computed as the word-wise XOR of the data field (excluding the checksum).
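The header layout and checksum rule above can be illustrated with a short decoder. This is a sketch under stated assumptions: bit 0 is taken to be the most significant bit of the first byte and fields are big-endian, as is usual for CCSDS packets, and a "word" is taken to be 16 bits; the function names and dictionary keys are hypothetical.

```python
import struct


def parse_primary_header(header: bytes) -> dict:
    """Decode the 6-byte primary header of a telecommand packet."""
    if len(header) != 6:
        raise ValueError("primary header must be exactly 6 bytes")
    word0, word1, length = struct.unpack(">HHH", header)
    return {
        "version": word0 >> 13,              # b0-b2
        "type": (word0 >> 12) & 0x1,         # b3, 1 = telecommand
        "sec_hdr_flag": (word0 >> 11) & 0x1, # b4
        "apid": word0 & 0x7FF,               # b5-b15, 11 bits
        "seq_flags": word1 >> 14,            # b16-b17
        "seq_count": word1 & 0x3FFF,         # b18-b31, packet name or sequence count
        "data_len": length + 1,              # field stores data field length minus 1
    }


def xor_checksum(payload: bytes) -> int:
    """Word-wise XOR over the data field, excluding the 2 checksum bytes."""
    if len(payload) % 2:
        raise ValueError("data field must contain an even number of bytes")
    checksum = 0
    for (word,) in struct.iter_unpack(">H", payload):
        checksum ^= word
    return checksum
```

For example, the header bytes `11 23 C0 05 00 05` decode to type 1, APID 0x123, sequence flags 3, count 5 and a 6-byte data field, and the payload `12 34 56 78` has checksum 0x444C.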

The APID routing address allocation convention of each single machine and module is as follows:

The autonomous fault handling flow based on on-orbit route switching is shown in FIG. 2, and the specific steps are as follows:

1) the integrated electronic computer queries the queue of commands and injected data waiting to be sent, looks up the RT address on the 1553B bus or the terminal address on the RS422 serial port according to the APID in the CCSDS command packet, and forwards the packet to the corresponding 1553B or RS422 terminal according to the routing address; if a terminal is mounted on both the RS422 and 1553B buses, the command or injected data is preferentially sent over the 1553B bus;

2) the integrated electronic computer starts 1553B communication and queries the telemetry transmission request of each RT, mainly by sending a 1553B bus vector word command to each RT according to its RT address;

3) the integrated electronic computer sends an RS422 telemetry acquisition command to each RS422 terminal every 0.5 s;

4) the integrated electronic computer organizes the telemetry transmission of the RTs according to the vector word requests of all RTs on the 1553B bus;

5) the integrated electronic computer reads the terminal telemetry of each RS422 port;

6) initialize the RT index on the 1553B bus with initial value i = 0;

7) determine whether the telemetry of the i-th RT on the 1553B bus has been updated; if not, increment the fault count of the i-th RT, M_Err(i) = M_Err(i) + 1; otherwise, go on to check the next RT;

8) if the fault count of the i-th RT satisfies M_Err(i) > 5, determine whether the backup of that 1553B bus RT is a cold backup or a hot backup; if it is a hot backup, change the address of the i-th RT to the hot backup address; if it is a cold backup, send the cold backup power-on command for the i-th RT and change the address of the i-th RT to the cold backup address; if the RT is mounted on both the 1553B bus and the RS422 bus and communication fails on both the primary and backup interfaces on the 1553B bus, switch to RS422 communication;

9) repeat steps 7) to 8) to complete telemetry state monitoring and routing address switching for all RTs on the 1553B bus;

10) if the telemetry of none of the RTs has been updated, reset the FPGA controlling the 1553B chip at the BC end of the integrated electronic computer, so as to clear a fault in the integrated electronic computer's own 1553B communication;

11) initialize the RS422 terminal index with initial value j = 0;

12) determine whether the telemetry of the j-th RS422 port has been updated; if not, increment the fault count of the j-th RS422 terminal, R_Err(j) = R_Err(j) + 1; otherwise, go on to check the next RS422 port;

13) if the fault count of the j-th RS422 serial port terminal satisfies R_Err(j) > 5, determine whether the backup of that terminal is a cold backup or a hot backup; if it is a hot backup, switch the j-th RS422 terminal to the backup routing address; if it is a cold backup, send the cold backup power-on command for the j-th RS422 terminal and switch the j-th RS422 routing address to the cold backup address;

14) repeat steps 12) to 13) to complete telemetry state monitoring and routing address switching for all terminals on the RS422 serial ports;

15) if the telemetry of none of the serial port terminals has been updated, reset the FPGA that controls the RS422 chip of the integrated electronic computer, so as to clear a fault in the integrated electronic computer's own RS422 communication;

16) if the telemetry of none of the 1553B RTs and none of the RS422 ports has been updated, switch the integrated electronic computer to its backup machine;

17) repeat the above steps to realize the autonomous fault handling flow based on on-orbit route switching.
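For a terminal mounted on both buses, the last clause of step 8) implies an order of preference among its interfaces: the 1553B primary interface, then the 1553B backup interface, then RS422. A minimal sketch of that selection follows; the interface labels and the `choose_interface` function are hypothetical names, and representing interface health as a dictionary of booleans is an assumption.

```python
# Order of preference for a terminal mounted on both buses.
PREFERENCE = ("1553B_primary", "1553B_backup", "RS422")


def choose_interface(health: dict):
    """Return the first healthy interface in order of preference.

    health maps an interface label to True if communication on it works.
    Returns None when no interface is usable.
    """
    for name in PREFERENCE:
        if health.get(name, False):
            return name
    return None
```

RS422 is selected only when both 1553B interfaces have failed, which matches the rule that 1553B is preferred whenever it is available.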

Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims

1. An autonomous fault handling method based on on-orbit route switching, characterized by comprising the following steps:

step 5: reading terminal telemetry of each RS422 port;

2. The autonomous fault handling method based on on-orbit route switching of claim 1, wherein the units or modules on the 1553B bus and the RS422 bus are backed up, and the backup forms comprise dual-hot backup and one-hot, one-cold backup;

3. The autonomous fault handling method based on on-orbit route switching of claim 1, wherein the routing addresses of the 1553B bus and the RS422 bus are defined uniformly, and injected data and commands are addressed in a unified way according to bus type, subsystem unit identifier and module identifier.

4. The autonomous fault handling method based on on-orbit route switching of claim 1, wherein the working state of each terminal is monitored in real time by traversing all terminals in the 1553B bus and RS422 bus networks;

5. The autonomous fault handling method based on on-orbit route switching of claim 1, wherein when a unit or module is switched in cold backup mode, a power-on command is sent and a waiting time is added;

6. An autonomous fault handling system based on on-orbit route switching, comprising:

module M5: reading terminal telemetry of each RS422 port;

7. The autonomous fault handling system based on on-orbit route switching of claim 6, wherein the units or modules on the 1553B bus and the RS422 bus are backed up, and the backup forms comprise dual-hot backup and one-hot, one-cold backup;

8. The autonomous fault handling system based on on-orbit route switching of claim 6, wherein the routing addresses of the 1553B bus and the RS422 bus are defined uniformly, and injected data and commands are addressed in a unified way according to bus type, subsystem unit identifier and module identifier.

9. The autonomous fault handling system based on on-orbit route switching of claim 6, wherein the working state of each terminal is monitored in real time by traversing all terminals in the 1553B bus and RS422 bus networks;

10. The autonomous fault handling system based on on-orbit route switching of claim 6, wherein when a unit or module is switched in cold backup mode, a power-on command is sent and a waiting time is added;