CN112511394A - Management and maintenance method of RapidIO bus system - Google Patents
Management and maintenance method of RapidIO bus system
- Publication number
- CN112511394A CN202011227054.3A
- Authority
- CN
- China
- Prior art keywords
- rapidio
- processing unit
- main
- host node
- management
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/28—Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
- H04L12/40—Bus networks
- H04L12/40006—Architecture of a communication node
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
- H04L41/0663—Performing the actions predefined by failover planning, e.g. switching to standby network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0803—Configuration setting
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Maintenance And Management Of Digital Transmission (AREA)
Abstract
The invention discloses a management and maintenance method for a RapidIO bus system. The method adopts a two-level management strategy that combines board-level centralization with system-level distribution, uses two management and maintenance modes (RapidIO out-of-band operation and RapidIO in-band operation), and monitors the Host nodes in a hot-backup mode. It provides reliable online management and maintenance of the RapidIO bus system and implements functions such as real-time monitoring, fault isolation, and fault recovery. Embodiments of the invention address the problems of a RapidIO bus network in an embedded signal processing system: difficult configuration management, heavy burst traffic, frequent data transmission conflicts, and strict real-time requirements.
Description
Technical Field
The present invention relates to, but is not limited to, the field of embedded signal processing technologies, and in particular, to a method for managing and maintaining a RapidIO bus system.
Background
The RapidIO bus (a standard bus) is widely used in embedded signal processing, and the management and maintenance of a complex RapidIO bus network is an important part of embedded signal processing system design.
However, a RapidIO bus network in an embedded signal processing system is difficult to configure and manage, carries heavy burst traffic, suffers frequent data transmission conflicts, and has strict real-time requirements.
The embodiment of the invention provides a reliable, online, real-time management and maintenance method for a RapidIO bus system, which can effectively address the stability problems of multitask, hard real-time, mass data transmission in an embedded signal processing system.
Disclosure of Invention
The purpose of the invention is to provide a management and maintenance method for a RapidIO bus system that addresses the problems of a RapidIO bus network in an embedded signal processing system: difficult configuration and management, heavy burst traffic, frequent data transmission conflicts, and strict real-time requirements.
The technical scheme of the invention is as follows:
An embodiment of the invention provides a management and maintenance method for a RapidIO bus system, wherein the RapidIO bus system comprises a plurality of RapidIO processing units connected through a RapidIO bus to form the RapidIO bus system; each RapidIO processing unit comprises a main Host node, a backup Host node, a RapidIO switch, and other processing nodes with RapidIO interfaces; one RapidIO processing unit serves as the main processing unit, and the other RapidIO processing units serve as slave processing units; in each processing unit, the main Host node, the backup Host node, and the RapidIO switch are each configured with a management maintenance interface and a RapidIO interface, and the other processing nodes are configured with RapidIO interfaces; the main Host node and the backup Host node are each interconnected with the RapidIO switch through the management maintenance interface; the RapidIO interfaces of the main Host node, the backup Host node, and the other processing nodes are each connected to a RapidIO interface of the RapidIO switch; and the external RapidIO interface of the RapidIO switch of each processing unit is interconnected with the other processing units to form the RapidIO bus system; the management and maintenance method executed by the RapidIO bus system comprises the following steps:
step 1, in each processing unit, the main Host node performs the initial configuration of the RapidIO network of the processing unit, the configuration items comprising: RapidIO device IDs, communication rate, and link width; the main Host node in the main processing unit additionally configures the communication routes of the whole RapidIO bus system;
step 2, in each processing unit, the main Host node configures the RapidIO switch of the processing unit through software, so that after configuration, when an important fault event occurs at a port of the RapidIO switch, the RapidIO switch reports the fault state to the main Host node through an interrupt signal or a Port-Write maintenance packet;
step 3, in each processing unit, the main Host node monitors events in the processing unit, including conventional events and important events, wherein the important events comprise fault events, and the fault events comprise conventional fault events and important fault events;
step 4, processing conventional fault events, including: when the RapidIO interface of a node other than the main Host node, or of the switch, fails, the main Host node maintains and configures the registers of the RapidIO switch through RapidIO maintenance packets; when the RapidIO interface of the main Host node itself fails, the failed interface is recovered through the management maintenance interface of the main Host node;
step 5, processing important fault events, including: the main Host node reports the fault to the main Host node of the main processing unit through a RapidIO maintenance packet, and the main Host node of the main processing unit performs system-level fault processing through RapidIO maintenance packets, namely fault transaction broadcasting, interruption of blocked links, fault packet discarding, or communication route reconfiguration.
Optionally, in the above management and maintenance method for a RapidIO bus system, in step 3,
the conventional events are monitored as follows: the main Host node periodically queries the registers of the RapidIO switch over the management maintenance interface inside the processing unit and thereby obtains the RapidIO network state of the processing unit in real time, where the RapidIO network state includes link state, traffic, and error rate.
Optionally, in the above management and maintenance method for a RapidIO bus system, in step 3,
the important events are monitored as follows: the main Host node receives an interrupt signal or a Port-Write maintenance packet from the RapidIO switch, parses the Port-Write maintenance packet, and determines the fault type.
Optionally, in the above method for managing and maintaining a RapidIO bus system, after step 5, the method further includes:
step 6, recording events, including: the main Host node records the events of its processing unit; the main Host node reports events that affect other processing units to the main Host node of the main processing unit through a RapidIO maintenance packet, and the main Host node of the main processing unit decides, according to the system running state, whether to discard, record, or broadcast each event.
Optionally, in the above management and maintenance method for a RapidIO bus system, the method further includes:
step 7, in each processing unit, the main Host node periodically reports a heartbeat to the backup Host node; when the main Host node stops reporting heartbeats, the backup Host node becomes the main Host node and takes over the management right of the processing unit in the RapidIO bus system; when the management right within a slave processing unit changes, the change is reported to the main Host node of the main processing unit through a maintenance packet, and when the management right within the main processing unit changes, the change is broadcast to the slave processing units in the RapidIO bus system through a maintenance packet.
Optionally, in the above method for managing and maintaining a RapidIO bus system, before step 1, the method further includes:
setting, according to a configuration file, one RapidIO processing unit in the RapidIO bus system as the main processing unit and the other RapidIO processing units as slave processing units;
and determining the main Host node and the backup Host node in each processing unit by preemption of the management right.
Optionally, in the above management and maintenance method for a RapidIO bus system, the RapidIO bus system is managed and maintained in two modes, namely out-of-band operation over the management maintenance interface and in-band maintenance operation over the RapidIO interface; a two-level management method combining board-level centralization with system-level distribution is adopted; and both the board-level and the system-level Host nodes are implemented in a "main-backup" hot-backup mode.
Optionally, in the management maintenance method of the RapidIO bus system as described above, the management maintenance interface includes one of PCIe, I2C, and JTAG;
the RapidIO switch in the RapidIO processing unit may be implemented by cascading a plurality of RapidIO switch chips.
The invention has the advantages that:
The management and maintenance method of the RapidIO bus system provided by the embodiment of the invention covers the following aspects: (1) a method for real-time state monitoring, rapid fault recovery, and isolation of unrecoverable faults in a RapidIO bus system; (2) a hierarchical management and maintenance architecture for a complex RapidIO bus system; (3) a management and maintenance strategy combining RapidIO in-band and out-of-band operations; and (4) a reliable "main-backup" design for the Host management nodes. The robustness of the RapidIO bus network in the embedded signal processing system is thereby enhanced, and stable transmission of multitask, hard real-time, heavy burst traffic is achieved. The method has the following advantages:
(1) the two-level management and maintenance strategy of board-level centralization and system-level distribution provides hierarchical, classified, ownership-based management of RapidIO bus faults, which reduces the management and maintenance cost of the RapidIO bus system and improves management and maintenance efficiency;
(2) the combination of RapidIO in-band and out-of-band operations provides periodic real-time monitoring of ordinary transactions and rapid processing of emergency transactions in the RapidIO network, effectively balancing management and maintenance overhead against real-time fault response;
(3) hot-backup monitoring of the main and backup Host nodes effectively improves the reliability of management and monitoring of the RapidIO bus system.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a schematic structural diagram of a RapidIO bus system for executing a management maintenance method in the embodiment of the present invention;
FIG. 2 is a block diagram of a specific embodiment of a RapidIO bus system according to an embodiment of the present invention;
FIG. 3 is a functional operation block diagram of a specific embodiment of the RapidIO bus system according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The steps illustrated in the flow charts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Also, while a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in a different order.
The invention provides a management and maintenance method for a RapidIO bus system. The method adopts a two-level management strategy that combines board-level centralization with system-level distribution, uses two management and maintenance modes (RapidIO out-of-band operation and RapidIO in-band operation), and monitors the Host nodes in a hot-backup mode. It provides reliable online management and maintenance of the RapidIO bus system and implements real-time monitoring, fault isolation, fault recovery, and similar functions.
The following specific embodiments of the present invention may be combined, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 1 is a schematic structural diagram of a RapidIO bus system that executes the management and maintenance method in the embodiment of the present invention. The RapidIO bus system comprises a plurality of RapidIO processing units interconnected through RapidIO buses, and software configures each RapidIO processing unit as either the main processing unit or a slave processing unit. Each RapidIO processing unit comprises a software-configurable main Host node, a backup Host node, a RapidIO switch, and other processing nodes with RapidIO interfaces. The main Host node and the backup Host node in the main processing unit act as the system main Host node and the system backup Host node and are responsible for managing and maintaining the RapidIO network of the whole system. The main Host node and the backup Host node in each slave processing unit act as the board-level main Host node and the board-level backup Host node and are responsible for managing and maintaining the RapidIO sub-network within that processing unit.
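The structure described for Fig. 1 can be pictured with a small data model. The following is a minimal sketch only, not part of the disclosed method: the class names, field names, and node naming scheme are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class UnitRole(Enum):
    MAIN = "main"
    SLAVE = "slave"


@dataclass
class ProcessingUnit:
    """One RapidIO processing unit (hypothetical model of Fig. 1)."""
    unit_id: int
    role: UnitRole
    main_host: str
    backup_host: str
    switch: str
    other_nodes: list = field(default_factory=list)

    def host_level(self) -> str:
        # Hosts on the main unit manage the whole system;
        # hosts on a slave unit manage only their own board.
        return "system" if self.role is UnitRole.MAIN else "board"


def build_system(unit_ids, main_id):
    """Mark exactly one unit as main and all others as slaves."""
    if main_id not in unit_ids:
        raise ValueError("main_id must be one of unit_ids")
    return [
        ProcessingUnit(
            unit_id=u,
            role=UnitRole.MAIN if u == main_id else UnitRole.SLAVE,
            main_host=f"unit{u}-hostA",      # illustrative names only
            backup_host=f"unit{u}-hostB",
            switch=f"unit{u}-switch",
        )
        for u in unit_ids
    ]


system = build_system([0, 1, 2], main_id=1)
```

In this sketch the choice of main unit is a single parameter, which mirrors the statement that the main and slave roles are assigned by software rather than fixed in hardware.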
The main Host node, the backup Host node, and the RapidIO switch on a RapidIO processing unit each have a management maintenance interface (for example PCIe, I2C, or JTAG, without being limited to these) and a RapidIO interface. The main Host node and the backup Host node are interconnected with the RapidIO switch through the management maintenance interface. The RapidIO interfaces of all RapidIO processing nodes on the processing unit are connected to the RapidIO switch. The external RapidIO interfaces of the RapidIO switch interconnect the processing unit with the other RapidIO processing units in the system to form the whole RapidIO network. Based on this hardware structure and the function of each part, the management and maintenance method executed by the RapidIO bus system comprises the following steps:
step 1, in each processing unit, the main Host node performs the initial configuration of the RapidIO network of the processing unit, the configuration items comprising: RapidIO device IDs, communication rate, and link width; the main Host node in the main processing unit additionally configures the communication routes of the whole RapidIO bus system;
step 2, in each processing unit, the main Host node configures the RapidIO switch of the processing unit through software, so that after configuration, when an important fault event occurs at a port of the RapidIO switch, the RapidIO switch reports the fault state to the main Host node through an interrupt signal or a Port-Write maintenance packet; important fault events may include link disconnection, congestion, retransmission stop, and the like;
step 3, in each processing unit, the main Host node monitors events in the processing unit, including conventional events and important events, wherein the important events comprise fault events, and the fault events comprise conventional fault events and important fault events;
step 4, processing conventional fault events, including: when the RapidIO interface of a node other than the main Host node, or of the switch, fails, the main Host node maintains and configures the registers of the RapidIO switch through RapidIO maintenance packets, where the maintenance operations may include port reset and port shutdown and restart, in an attempt to recover the failed interface; when the RapidIO interface of the main Host node itself fails, the failed interface is recovered through the management maintenance interface of the main Host node;
step 5, processing important fault events, including: the main Host node reports the fault to the main Host node of the main processing unit through a RapidIO maintenance packet, and the main Host node of the main processing unit performs system-level fault processing through RapidIO maintenance packets, namely fault transaction broadcasting, interruption of blocked links, fault packet discarding, or communication route reconfiguration.
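The choice a board-level main Host node makes between steps 4 and 5 can be sketched as a small decision function. This is an illustrative sketch under stated assumptions: the `Severity` enum, the function name, and the returned action strings are hypothetical and only summarize the paths described in the steps above.

```python
from enum import Enum, auto


class Severity(Enum):
    CONVENTIONAL = auto()  # handled inside the processing unit (step 4)
    IMPORTANT = auto()     # escalated for system-level handling (step 5)


def board_host_action(severity, fault_on_own_interface):
    """Return the path a board-level main Host node would take for a fault."""
    if severity is Severity.IMPORTANT:
        # Step 5: hand the fault to the main Host node of the main processing unit.
        return "report to system main Host via RapidIO maintenance packet"
    if fault_on_own_interface:
        # Step 4, second case: the Host's own RapidIO port is down, so the
        # in-band path is unavailable and the out-of-band interface is used.
        return "recover via management maintenance interface"
    # Step 4, first case: another node's port or a switch port failed.
    return "configure switch registers via RapidIO maintenance packet"


actions = [
    board_host_action(Severity.CONVENTIONAL, fault_on_own_interface=False),
    board_host_action(Severity.CONVENTIONAL, fault_on_own_interface=True),
    board_host_action(Severity.IMPORTANT, fault_on_own_interface=False),
]
```

The out-of-band branch exists because a Host whose own RapidIO interface has failed cannot send maintenance packets in-band, which is the reason the method keeps both paths available.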
In one implementation of the embodiment of the present invention, the conventional events in step 3 are monitored as follows: the main Host node periodically queries the registers of the RapidIO switch over the management maintenance interface inside the processing unit and obtains the RapidIO network state of the processing unit in real time, where the state includes link state, traffic, and error rate. In a specific implementation, periodic management of the RapidIO sub-network on a RapidIO processing unit works by polling: the main Host node accesses the registers of the RapidIO switch through the internal management maintenance interface and reads the link state, traffic, error rate, and similar state of the RapidIO network in the processing unit, thereby monitoring the RapidIO network state in real time.
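The periodic query described above might look like the following polling loop. It is a sketch under assumptions: the register offsets, the `read_reg(port, offset)` callback standing in for the out-of-band (for example I2C) access, and the fake register source are all hypothetical; real offsets come from the switch chip's documentation.

```python
import time

# Hypothetical per-port register offsets, for illustration only.
REG_LINK_STATUS = 0x00
REG_TRAFFIC = 0x04
REG_ERROR_COUNT = 0x08


def poll_switch(read_reg, ports):
    """Read one status snapshot per port through the out-of-band interface."""
    snapshot = {}
    for port in ports:
        snapshot[port] = {
            "link_up": bool(read_reg(port, REG_LINK_STATUS) & 0x1),
            "traffic": read_reg(port, REG_TRAFFIC),
            "errors": read_reg(port, REG_ERROR_COUNT),
        }
    return snapshot


def monitor(read_reg, ports, period_s, cycles, sleep=time.sleep):
    """Poll the switch a fixed number of times, one snapshot per period."""
    history = []
    for _ in range(cycles):
        history.append(poll_switch(read_reg, ports))
        sleep(period_s)
    return history


def fake_read_reg(port, offset):
    # Stand-in for real hardware access: port 1 reports its link as down.
    table = {
        REG_LINK_STATUS: 0 if port == 1 else 1,
        REG_TRAFFIC: 100 * port,
        REG_ERROR_COUNT: port,
    }
    return table[offset]


history = monitor(fake_read_reg, ports=[0, 1], period_s=0.0, cycles=2)
```

Passing the register reader as a callback keeps the polling logic independent of whether the management maintenance interface is PCIe, I2C, or JTAG.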
In one implementation of the embodiment of the present invention, the important events in step 3 are monitored as follows: the main Host node receives an interrupt signal or a Port-Write maintenance packet from the RapidIO switch, parses the Port-Write maintenance packet, and determines the fault type. In a specific implementation, emergency transactions of the RapidIO sub-network are managed through interrupts or Port-Write maintenance messages: the RapidIO switch is configured to notify the board-level main Host node of an emergency transaction by an interrupt or a Port-Write maintenance message, and the board-level main Host node then configures the RapidIO switch registers through RapidIO maintenance operations to handle the fault according to its type, using isolation and recovery operations such as route reconfiguration, port reset, port shutdown, and port restart.
When a fault cannot be recovered, the board-level main Host node notifies the system main Host node through a RapidIO maintenance operation, and the system main Host node performs system-level fault processing through RapidIO maintenance operations, including system fault notification, fault packet discarding, and route reconfiguration, so that the fault is announced and isolated within the system.
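The handling of an important event, from parsing the Port-Write payload to escalating an unrecoverable fault, can be sketched as follows. The payload layout used here (four big-endian 32-bit words, with the port number in the low byte of the third word) and the fault bit assignments are assumptions made for illustration; the actual layout and error bits must be taken from the RapidIO error management specification and the switch chip's documentation.

```python
import struct

# Hypothetical fault bits in the port error word, for illustration only.
FAULT_BITS = {
    0x1: "link_down",
    0x2: "congestion",
    0x4: "retransmit_stopped",
}


def parse_port_write(payload):
    """Decode an assumed 16-byte Port-Write payload into a fault event."""
    if len(payload) != 16:
        raise ValueError("expected a 16-byte payload")
    component_tag, port_errors, port_word, logical_errors = struct.unpack(">4I", payload)
    return {
        "component_tag": component_tag,
        "port": port_word & 0xFF,
        "faults": [name for bit, name in FAULT_BITS.items() if port_errors & bit],
        "logical_errors": logical_errors,
    }


def handle_port_write(payload, try_recover, notify_system_host):
    """Try board-level recovery first; escalate only if it fails."""
    event = parse_port_write(payload)
    if try_recover(event):
        return "recovered"
    notify_system_host(event)
    return "escalated"


escalated = []
payload = struct.pack(">4I", 0x1234, 0x1 | 0x4, 5, 0)
result = handle_port_write(payload, lambda event: False, escalated.append)
```

Here `try_recover` stands for the board-level operations named above (port reset, shutdown, restart, route reconfiguration) and `notify_system_host` stands for the RapidIO maintenance operation that informs the system main Host node.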
On a RapidIO processing unit, the software-designated board-level main Host node handles periodic state management of the RapidIO sub-network on the processing unit through the management maintenance interface, handles emergency transactions of that sub-network, such as fault isolation and fault recovery, through RapidIO maintenance operations, and reports the board-level RapidIO sub-network state and fault information to the system Host through RapidIO maintenance operations.
On the main processing unit, the system main Host and the system backup Host receive and parse the RapidIO sub-network state and fault messages reported by each board-level Host node, and process RapidIO events through RapidIO maintenance operations, including system fault notification, fault packet discarding, and route reconfiguration.
After step 5, the embodiment of the present invention further includes:
step 6, recording events, including: the main Host node records the events of its processing unit, such as traffic state, packet loss state, and retransmission state; the main Host node reports events that affect other processing units, such as link disconnection, link connection, and link congestion, to the main Host node of the main processing unit through a RapidIO maintenance packet, and the main Host node of the main processing unit decides, according to the system running state, whether to discard, record, or broadcast each event.
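Step 6 separates events that stay in the local log from events that are also reported upward, and leaves the discard, record, or broadcast decision to the system main Host. The sketch below illustrates one possible arrangement; the event names, the system state names, and in particular the decision policy in `system_host_decide` are hypothetical, since the text does not specify the policy.

```python
# Events that affect other processing units and are therefore reported upward.
CROSS_UNIT_EVENTS = {"link_down", "link_up", "link_congestion"}


def board_host_record(event_kind, local_log, report):
    """Record every event locally; also report those affecting other units."""
    local_log.append(event_kind)
    if event_kind in CROSS_UNIT_EVENTS:
        report(event_kind)


def system_host_decide(event_kind, system_state):
    """Hypothetical policy for the system main Host's decision."""
    if system_state == "initializing":
        return "discard"      # assumed: link changes during bring-up are expected
    if event_kind in ("link_down", "link_up"):
        return "broadcast"    # assumed: topology changes concern every unit
    return "record"


local_log, reported = [], []
for kind in ("traffic", "packet_loss", "link_down", "link_congestion"):
    board_host_record(kind, local_log, reported.append)

decisions = [system_host_decide(kind, "running") for kind in reported]
```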
Further, the embodiment of the present invention further includes:
step 7, in each processing unit, the main Host node periodically reports a heartbeat to the backup Host node; when the main Host node stops reporting heartbeats, the backup Host node becomes the main Host node and takes over the management right of the processing unit in the RapidIO bus system; when the management right within a slave processing unit changes, the change is reported to the main Host node of the main processing unit through a maintenance packet, and when the management right within the main processing unit changes, the change is broadcast to the slave processing units in the RapidIO bus system through a maintenance packet.
In a specific implementation, the main Host node and the backup Host node on each RapidIO processing unit operate in a hot-backup mode: the backup Host node monitors the main Host node through heartbeats, and when the main Host node fails, the backup Host node takes over its management authority.
In the slave RapidIO processing units, the board-level backup Host monitors the heartbeat of the board-level main Host node, takes over the management right of the board-level RapidIO sub-network when that heartbeat stops, and notifies the system main Host of the RapidIO bus system of the change. In the main RapidIO processing unit, the system backup Host monitors the heartbeat of the system main Host node, takes over the management right of the RapidIO bus system when that heartbeat stops, and notifies the board-level main Hosts in the RapidIO bus system of the change.
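The heartbeat takeover can be sketched as a timeout check on the backup Host. This is a minimal illustration under assumptions: the class name, the timeout value, and the use of explicit timestamps instead of a hardware timer are hypothetical, and the `notify` callback stands for the maintenance packet that announces the change of management right.

```python
class BackupHost:
    """Backup Host that takes over when the main Host's heartbeat stops."""

    def __init__(self, timeout_s, notify, now):
        self.timeout_s = timeout_s
        self.notify = notify      # stands in for sending a maintenance packet
        self.last_beat = now
        self.is_main = False

    def on_heartbeat(self, now):
        # Called whenever a heartbeat from the main Host is observed.
        self.last_beat = now

    def check(self, now):
        # Take over once, the first time the heartbeat is overdue.
        if not self.is_main and now - self.last_beat > self.timeout_s:
            self.is_main = True
            self.notify("management right changed")
        return self.is_main


messages = []
backup = BackupHost(timeout_s=3.0, notify=messages.append, now=0.0)
backup.on_heartbeat(1.0)
before = backup.check(2.0)   # heartbeat still fresh
after = backup.check(4.5)    # 3.5 s since the last heartbeat: take over
again = backup.check(10.0)   # already main: no second notification
```

For a board-level backup Host the notification would go to the system main Host; for the system backup Host it would be broadcast to the board-level main Hosts, as described above.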
In practical application, software can configure any RapidIO processing unit in the system as the main processing unit, and the main Host node and backup Host node of that unit then serve as the system main Host node and the system backup Host node. When a RapidIO processing unit is the main processing unit, its main Host node and backup Host node are responsible both for managing and maintaining the board-level RapidIO bus and for managing and maintaining the system RapidIO bus. In this case, the system main Host and the board-level main Host may be the same processor node, and the system backup Host and the board-level backup Host may be the same processor node. In addition, the main Host node and the backup Host node in each processing unit may be determined by preemption of the management right.
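The text does not describe how preemption of the management right is arbitrated. One conceivable arrangement, shown here purely as an assumption, is that the first Host node to claim a shared token becomes the main Host and any later claimant becomes the backup. The `ManagementRight` class and the use of a software lock in place of whatever shared resource a real board provides are hypothetical.

```python
import threading


class ManagementRight:
    """Shared token: the first claimant becomes the main Host (assumed scheme)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None

    def try_claim(self, node):
        # Non-blocking acquire: succeeds only for the first claimant.
        if self._lock.acquire(blocking=False):
            self.owner = node
            return True
        return False


right = ManagementRight()
roles = {
    node: ("main" if right.try_claim(node) else "backup")
    for node in ("hostA", "hostB")
}
```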
It should be noted that, in the embodiment of the present invention, the RapidIO bus system is managed and maintained in two modes, namely out-of-band operation over the management maintenance interface and in-band maintenance operation over the RapidIO interface; a two-level management method combining board-level centralization with system-level distribution is adopted; and both the board-level and the system-level Host nodes are implemented in a "main-backup" hot-backup mode, so as to provide reliable management and maintenance of the RapidIO bus system.
Further, the management maintenance interface in the embodiment of the present invention includes one of PCIe, I2C, and JTAG; the RapidIO switch in the RapidIO processing unit may be implemented by cascading a plurality of RapidIO switch chips.
The management and maintenance method of the RapidIO bus system provided by the embodiment of the invention covers the following aspects: (1) a method for real-time state monitoring, rapid fault recovery, and isolation of unrecoverable faults in a RapidIO bus system; (2) a hierarchical management and maintenance architecture for a complex RapidIO bus system; (3) a management and maintenance strategy combining RapidIO in-band and out-of-band operations; and (4) a reliable "main-backup" design for the Host management nodes. The robustness of the RapidIO bus network in the embedded signal processing system is thereby enhanced, and stable transmission of multitask, hard real-time, heavy burst traffic is achieved.
In summary, the method adopts a two-level management strategy that combines board-level centralization with system-level distribution, uses both RapidIO out-of-band and RapidIO in-band operation, and monitors the Host nodes in a hot-backup mode, thereby providing reliable online management and maintenance of the RapidIO bus system and implementing real-time monitoring, fault isolation, fault recovery, and similar functions.
Fig. 2 is a block diagram of a specific embodiment of the RapidIO bus system according to an embodiment of the present invention; the invention is further described below with reference to this embodiment.
The RapidIO bus system is built from a plurality of RapidIO processing units. The main Host node and the backup Host node in each RapidIO processing unit are implemented with TI TMS320C6678 processors, and the RapidIO switch is implemented with an IDT 80HCPS1848 switch chip.
The management maintenance interface of the TMS320C6678 processor is implemented over I2C. The heartbeat between the main Host node and the backup Host node is implemented over a GPIO interface of the processor, with the main Host node periodically reporting the heartbeat to the backup Host node. Each TMS320C6678 processor connects to the switch chip through one 4x, 5 Gbps RapidIO link.
The 80HCPS1848 switch chip provides a RapidIO physical interface with 18 ports and 48 lanes and supports the RapidIO V2.1 specification. In this embodiment its ports are configured as 4x ports, which connect to the TMS320C6678 processors inside the RapidIO processing unit and provide two external 4x RapidIO interfaces for system interconnection. The 80HCPS1848 switch chip provides I2C as its management maintenance interface, connects to the TMS320C6678 processors over it, and thereby supports management and maintenance of RapidIO.
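The port and bandwidth figures in this embodiment can be checked with a little arithmetic. The calculation below rests on two assumptions that the text does not state: that the 5 Gbps figure is the per-lane signalling rate, and that the link uses 8b/10b line coding, so that 8 of every 10 transmitted bits carry data. The port count is only an upper bound derived from the lane count; the switch chip's documentation governs which port configurations are actually available.

```python
LANES_TOTAL = 48        # lanes provided by the switch chip (from the text)
LANES_PER_PORT = 4      # 4x port configuration (from the text)
EXTERNAL_PORTS = 2      # 4x ports used for system interconnection (from the text)
LANE_RATE_GBAUD = 5.0   # assumed per-lane signalling rate

# Upper bound on the number of 4x ports that 48 lanes can form.
max_4x_ports = LANES_TOTAL // LANES_PER_PORT

# 4x ports left for processing nodes inside the unit.
internal_4x_ports = max_4x_ports - EXTERNAL_PORTS

# Raw and post-coding bit rate of one 4x link (8b/10b assumed).
raw_gbps = LANES_PER_PORT * LANE_RATE_GBAUD
data_gbps = raw_gbps * 8 / 10

print(max_4x_ports, internal_4x_ports, raw_gbps, data_gbps)
```

Under these assumptions, a processing unit has at most ten 4x ports for its own nodes, and each 4x link carries 16 Gbit/s after line coding, before any packet overhead.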
Fig. 3 is a functional operation block diagram of a specific embodiment of the RapidIO bus system provided in the embodiment of the present invention. In the management and maintenance design of the RapidIO bus system, the RapidIO switch chip is configured by software so that, when a state such as disconnection or retransmission occurs at any RapidIO port, the fault state is reported to the TMS320C6678 processor by an interrupt or a Port-Write maintenance packet.
During operation, the board-level main Host node (a TMS320C6678) mainly performs the following functions: it periodically reads the state of the RapidIO switch chip through the I2C interface, including link state, traffic, and error rate; it responds to and processes the interrupts or Port-Write transactions reported by the RapidIO switch chip and performs fault recovery; it periodically reports the state of the local RapidIO sub-network to the system main Host node and reports unrecoverable faults in real time; and it completes the configuration management of local routes during system initialization or fault recovery. Meanwhile, when the board-level backup Host node detects that the main Host node has no heartbeat, it takes over the management right of the board-level RapidIO sub-network and reports the takeover to the system main Host.
During operation, the system main Host node (a TMS320C6678) mainly performs the following functions: it periodically acquires and analyzes the state information reported by the RapidIO sub-networks, and reports the system state to an upper-level host system or records key states; it responds to unrecoverable faults reported by the RapidIO sub-networks, announces each unrecoverable fault and the affected nodes to the whole system, and sends route change information to isolate the fault; and it completes the configuration management of system routes during system initialization or fault isolation. Meanwhile, when the system backup Host node detects that the system main Host node has no heartbeat, it takes over the management right of the system RapidIO network and reports the takeover to each board-level main Host.
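Fault isolation through route changes can be illustrated with a simple routing-table update. This is a sketch under assumptions: the table shape (destination device ID mapped to a switch output port), the use of `None` to mean that packets for a destination are dropped, and the function name are hypothetical and do not reflect the register format of any particular switch chip.

```python
DROP = None  # assumed marker: packets for this destination are discarded


def isolate_port(route_table, failed_port, alternates=None):
    """Return a new table in which no destination uses the failed port.

    Destinations with a known alternate port are rerouted; all others
    that used the failed port are marked as dropped.
    """
    alternates = alternates or {}
    new_table = {}
    for dest_id, port in route_table.items():
        if port == failed_port:
            new_table[dest_id] = alternates.get(dest_id, DROP)
        else:
            new_table[dest_id] = port
    return new_table


routes = {0x10: 1, 0x11: 2, 0x12: 2}
isolated = isolate_port(routes, failed_port=2, alternates={0x11: 3})
```

Returning a new table instead of modifying the old one in place lets the system main Host compute the change first and then distribute it as route change information.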
Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (8)
1. A management and maintenance method for a RapidIO bus system, characterized in that the RapidIO bus system comprises: RapidIO processing units connected through a RapidIO bus to form the RapidIO bus system; each RapidIO processing unit comprises: a main Host node, a backup Host node, a RapidIO switch and other processing nodes with RapidIO interfaces, wherein one RapidIO processing unit serves as the main processing unit and the other RapidIO processing units serve as slave processing units; the main Host node, the backup Host node and the RapidIO switch in each processing unit are each configured with a management maintenance interface and a RapidIO interface, and the other processing nodes are configured with RapidIO interfaces; the main Host node and the backup Host node are each interconnected with the RapidIO switch through the management maintenance interface; the RapidIO interfaces of the main Host node, the backup Host node and the other processing nodes are each connected to RapidIO interfaces of the RapidIO switch; and the externally facing RapidIO interface of the RapidIO switch of each processing unit is interconnected with the other processing units to form the RapidIO bus system; the management and maintenance method for the RapidIO bus system comprises the following steps:
step 1, in each processing unit, a main Host node executes the initial configuration operation of the RapidIO network of the processing unit, and the configuration items comprise: the ID of the RapidIO equipment, the communication rate and the link line width; the main Host node in the main processing unit also executes the communication route configuration of the whole RapidIO bus system;
step 2, in each processing unit, the main Host node configures the RapidIO switch of the processing unit through software; the working state after the configuration is as follows: when an important fault event occurs at a Port of the RapidIO switch, the RapidIO switch reports a fault state to a main Host through an interrupt signal or a Port-Write maintenance packet;
step 3, in each processing unit, the main Host node monitors events in the processing unit, including conventional events and important events, wherein the important events comprise fault events, and the fault events comprise conventional fault events and important fault events;
step 4, processing the conventional fault event, including: when a RapidIO interface of a non-main Host node or a switch fails, the main Host node maintains and configures a register of the RapidIO switch through a RapidIO maintenance packet; when the RapidIO interface of the main Host node fails, recovering the failed interface through the management maintenance interface of the main Host node;
step 5, processing important fault events, including: the main Host reports the fault to the main Host node of the main processing unit through a RapidIO maintenance packet, and the main Host of the main processing unit performs fault transaction broadcasting, blocking link interruption, fault packet discarding or communication route reconfiguration operation through the RapidIO maintenance packet to perform system-level fault processing.
2. The method for managing and maintaining a RapidIO bus system according to claim 1, wherein in the step 2,
the monitoring mode of the conventional event is as follows: the main Host node accesses a register of the RapidIO switch through interconnection of the internal management and maintenance interface of the processing unit in a periodic query mode, and obtains RapidIO network states of the processing unit in real time, wherein the RapidIO network states include link states, flow and error rates.
3. The method for managing and maintaining a RapidIO bus system according to claim 1, wherein in the step 2,
the monitoring mode of the important events is as follows: the main Host node receives an interrupt signal or a Port-Write maintenance packet from the RapidIO switch, parses the Port-Write maintenance packet and determines the fault type.
4. The method for managing and maintaining the RapidIO bus system according to claim 1, further comprising, after the step 5:
step 6, recording the event, comprising: the main Host records the events of its own processing unit; and the main Host node reports events affecting other processing units to the main Host of the main processing unit through a RapidIO maintenance packet, and the main Host of the main processing unit decides whether to discard, record or broadcast the events according to the system running state.
5. The method for managing and maintaining the RapidIO bus system according to claim 4, further comprising:
step 7, in each processing unit, the main Host periodically reports heartbeats to the backup Host, and when the main Host stops reporting heartbeats, the backup Host acts as the main Host and takes over the management right of the processing unit in the RapidIO bus system; when the management right of a main Host changes, the change is reported to the main Host of the main processing unit through a maintenance packet, and when the management right of the main Host of the main processing unit changes, the change is broadcast to the slave processing units in the RapidIO bus system through a maintenance packet.
6. The method for managing and maintaining the RapidIO bus system according to claim 1, wherein the step 1 is preceded by the steps of:
setting one RapidIO processing unit in a RapidIO bus system as a main processing unit and other RapidIO processing units as slave processing units according to the configuration file;
and determining a main Host node and a standby Host node in each processing unit in a right preempting mode.
7. The method for managing and maintaining the RapidIO bus system according to any one of claims 1 to 6, characterized in that the RapidIO bus system is managed and maintained by using two modes of out-of-band operation of a management and maintenance interface and in-band maintenance operation of the RapidIO interface, a two-stage management method combining board level concentration and system distribution is adopted, and both board level and system Host nodes are realized by adopting a 'main-standby' hot backup mode.
8. The method for managing and maintaining the RapidIO bus system according to any of claims 1-6, wherein the management and maintenance interface comprises one of PCIe, I2C and JTAG;
the RapidIO switch in the RapidIO processing unit is configured to be realized through cascade connection of a plurality of RapidIO switch chips.
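The two-level fault handling of steps 4 and 5 in claim 1 can be sketched as a small dispatch routine. This is an illustrative reading of the claim, not the claimed method itself: the function `handle_fault`, the dictionary keys, and the string labels are hypothetical, and the two callbacks stand in for the actual recovery and reporting mechanisms.

```python
def handle_fault(fault, local_recover, escalate):
    """Handle a fault at a board-level main Host and return the path taken.

    fault is a dict with hypothetical keys:
      'severity': 'conventional' or 'important'
      'location': 'host_interface' (the main Host's own RapidIO interface)
                  or anything else (another node or the switch)
    """
    if fault["severity"] == "conventional":
        # Step 4: conventional faults are repaired inside the processing unit.
        if fault["location"] == "host_interface":
            # The Host's own RapidIO interface is down, so use the out-of-band
            # management maintenance interface.
            local_recover("management_interface", fault)
        else:
            # Otherwise reconfigure the switch in-band with a maintenance packet.
            local_recover("maintenance_packet", fault)
        return "local"
    # Step 5: important faults are reported to the main Host of the main unit.
    escalate(fault)
    return "escalated"


log = []
recover = lambda method, fault: log.append((method, fault["location"]))
notify = lambda fault: log.append(("escalate", fault["location"]))

print(handle_fault({"severity": "conventional", "location": "switch_port"}, recover, notify))
print(handle_fault({"severity": "conventional", "location": "host_interface"}, recover, notify))
print(handle_fault({"severity": "important", "location": "switch_port"}, recover, notify))
print(log)
```

Keeping conventional faults local and escalating only important ones reflects the board-level and system-level split described in claim 7.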
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011227054.3A CN112511394B (en) | 2020-11-05 | 2020-11-05 | Management and maintenance method of RapidIO bus system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112511394A (en) | 2021-03-16 |
CN112511394B (en) | 2022-02-11 |
Family
ID=74955347
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011227054.3A Active CN112511394B (en) | 2020-11-05 | 2020-11-05 | Management and maintenance method of RapidIO bus system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112511394B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||