CN116248488A

CN116248488A - RapidIO network management method for a signal processing system

Info

Publication number: CN116248488A
Application number: CN202211617310.9A
Authority: CN
Inventors: 王亮; 郝玉锴; 栗阳阳; 杨玻; 马超; 段秉环
Original assignee: Xian Aeronautics Computing Technique Research Institute of AVIC
Current assignee: Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date: 2022-12-15
Filing date: 2022-12-15
Publication date: 2023-06-09

Abstract

The invention provides a RapidIO network management method that aims to reduce software operation abnormalities caused by network congestion. The technical scheme is as follows: all processor nodes on each signal processing module are interconnected through a RapidIO switch to form a RapidIO subnet, and the RapidIO subnets on all modules are interconnected to form a RapidIO main network. The main node of each subnet maintains the state of every node in that subnet, executes fault handling measures when a fault occurs, and periodically sends the subnet state to the main node of the main module. The main node of the main module maintains the state of every subnet, and when the main node of the main module fails, the main node of the standby module maintains the main network state.

Description

RapidIO network management method for a signal processing system

Technical Field

The invention belongs to the technical field of network communication, and particularly relates to a RapidIO network management method for a signal processing system.

Background

As the signal processing field moves toward faster and more efficient operation, the number of computing resource nodes in a signal processing system grows exponentially with the scale of computation, and interconnection between large-scale computing resources becomes a key factor affecting system performance. The RapidIO bus is an interconnection standard for embedded systems; it suits a tightly coupled working environment with multiple computing resource nodes and supports hot plugging of each node. Where a signal processing system divides its functions into field replaceable unit modules, the RapidIO bus delivers higher system-level performance and supports dynamic networking of the unit modules. In such a system, a large number of DSP, CPU and FPGA devices are interconnected by RapidIO so that data transmission completes quickly and efficiently. In practice, however, when the network management node goes offline the network has no management node, so the nodes cannot obtain the state of their opposite nodes; data transmission fails, data congestion follows, and the processor nodes can no longer operate normally. For example, if a receiving node goes offline and no notification is issued, the transmitting end keeps sending, which blocks processor communication and reduces the efficiency of data interaction.

Disclosure of Invention

In view of this, the embodiments of the present disclosure provide a RapidIO network management method for a signal processing system. The method improves system scalability and adaptability, and solves the following problem: when the network management node goes offline, the network has no management node, so the nodes cannot obtain the state of their opposite nodes, data transmission fails, data congestion follows, and the processor nodes cannot operate normally.

A RapidIO network management method for a signal processing system, in which a signal processing main module, a signal processing standby module and a plurality of signal processing slave modules are arranged in a dual-redundancy configuration; the signal processing standby module is a hot backup of the signal processing main module, and when the signal processing main module goes offline, the signal processing standby module comes online;

the signal processing main module comprises a first switch, a first main processing node CPU and a plurality of first sub-processing nodes, wherein the first main processing node CPU acquires state information of the plurality of first sub-processing nodes through the first switch and generates a first information table;

the signal processing standby module comprises a second switch, a second main processing node CPU and a plurality of second sub-processing nodes, wherein the second main processing node CPU acquires state information of the plurality of second sub-processing nodes through the second switch and generates a second information table;

each signal processing slave module comprises a third switch, a third main processing node CPU and a plurality of third sub-processing nodes, wherein the third main processing node CPU obtains state information of the plurality of third sub-processing nodes through the third switch and generates a subnet information table;

the first main processing node CPU and the second main processing node CPU each acquire the subnet information tables generated by all the signal processing slave modules, and each aggregates them with its own information table (the first information table or the second information table, respectively) into a main network information table; all the signal processing slave modules read the main network information table in real time, thereby learn the working state of all other signal processing slave modules and of all nodes of the signal processing main module, and determine whether to perform data interaction;

when the first main processing node CPU of the signal processing main module fails, the second main processing node CPU of the signal processing standby module manages the main network, which solves the problem of communication congestion caused by a node going offline.

The beneficial effects are that:

The signal processing system is divided into RapidIO subnets by signal processing module, and the RapidIO subnets together form a RapidIO main network. The state of each subnet is maintained by the main node of its slave module, and the main network is maintained by the main node of the main module. When the link between a subnet and the main network is abnormal, the subnet can continue to work normally under the management of the main node of its slave module without affecting the operation of the main network. Meanwhile, the main node of the slave module obtains the state of the subnet cascade port by attempting access, which avoids the blocking of processor communication that data accesses would cause after the subnet cascade port fails. This prevents the situation in which a receiving node goes offline without notification and the transmitting end keeps sending, blocking processor communication.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.

FIG. 1 is a schematic diagram of a network architecture;

FIG. 2 is a schematic structural diagram of the signal processing main module;

FIG. 3 is a schematic structural diagram of the signal processing standby module;

FIG. 4 is a schematic structural diagram of a signal processing slave module.

Detailed Description

Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.

Other advantages and effects of the present disclosure will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the present disclosure by way of specific examples. It will be apparent that the described embodiments are merely some, but not all embodiments of the present disclosure. The disclosure may be embodied or practiced in other different specific embodiments, and details within the subject specification may be modified or changed from various points of view and applications without departing from the spirit of the disclosure. It should be noted that the following embodiments and features in the embodiments may be combined with each other without conflict. All other embodiments, which can be made by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure are intended to be within the scope of this disclosure.

It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.

As shown in FIG. 1, in the RapidIO network management method of the signal processing system, a signal processing main module, a signal processing standby module and a plurality of signal processing slave modules are arranged in a dual-redundancy configuration. The signal processing standby module is a hot backup of the signal processing main module, and when the signal processing main module goes offline, the signal processing standby module comes online.

As shown in FIG. 2, the signal processing main module includes a first switch, a first main processing node CPU and a plurality of its own first sub-processing nodes. The first main processing node CPU obtains the state information of its first sub-processing nodes through the first switch and generates a first information table.

As shown in FIG. 3, the signal processing standby module includes a second switch, a second main processing node CPU and a plurality of its own second sub-processing nodes. The second main processing node CPU obtains the state information of its second sub-processing nodes through the second switch and generates a second information table.

As shown in FIG. 4, each signal processing slave module includes a third switch, a third main processing node CPU and a plurality of its own third sub-processing nodes. The third main processing node CPU obtains the state information of its third sub-processing nodes through the third switch and generates a subnet information table.

The first main processing node CPU and the second main processing node CPU each acquire the subnet information tables generated by all the signal processing slave modules, and each aggregates them with its own information table (the first information table or the second information table, respectively) into a main network information table. All the signal processing slave modules read the main network information table in real time, thereby learn the working state of all other signal processing slave modules and of all nodes of the signal processing main module, and determine whether to perform data interaction. When the first main processing node CPU of the signal processing main module fails, the second main processing node CPU of the signal processing standby module manages the main network, which solves the problem of communication congestion caused by a node going offline.
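The aggregation and the send decision described above can be sketched as follows. This is a minimal illustration in Python, not the patented implementation: the names `build_main_table` and `may_send`, the module and node labels, and the dictionary layout of the tables are all assumptions introduced here.

```python
from typing import Dict

# Hypothetical layout: one table per module, mapping node name -> online flag.
NodeStates = Dict[str, bool]


def build_main_table(own_module: str, own_table: NodeStates,
                     subnet_tables: Dict[str, NodeStates]) -> Dict[str, NodeStates]:
    """Merge the manager's own information table with all reported subnet tables."""
    main_table = {own_module: dict(own_table)}
    for module, table in subnet_tables.items():
        main_table[module] = dict(table)
    return main_table


def may_send(main_table: Dict[str, NodeStates], module: str, node: str) -> bool:
    """A sender performs data interaction only if the target is listed as online."""
    return main_table.get(module, {}).get(node, False)


# Example: DSP1 of slave module 1 is offline, so nobody sends to it.
main_table = build_main_table(
    "main", {"CPU": True, "DSP1": True},
    {"slave1": {"CPU": True, "DSP1": False}},
)
print(may_send(main_table, "main", "DSP1"))    # True
print(may_send(main_table, "slave1", "DSP1"))  # False
```

A node that is missing from the table is treated like an offline node, so a sender never transmits to a target whose state is unknown.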

Here, the sub-processing nodes are DSPs (digital signal processors, each connected to external equipment) and FPGAs (field-programmable gate arrays, each connected to external equipment), and the first main processing node CPU serves as the main node. Each main processing node acquires the online state of each of its sub-processing nodes in real time and generates the subnet information table from it. When a sub-processing node of a signal processing slave module is blocked, for example when DSP1 is blocked or offline, the nodes that use it learn that the DSP is blocked and do not send it messages, while the other nodes continue normal data interaction.

The main/standby signal processing module design solves the problem of abnormal communication that arises when a failure of the signal processing main module leaves the nodes of the RapidIO network unable to obtain the state of their opposite nodes. The attempt-access design solves the problem of network congestion caused by a large volume of communication data when a node is disconnected. The RapidIO subnet design solves the problem that the nodes inside a RapidIO subnet cannot communicate when both the main and standby signal processing modules fail. In general, the RapidIO network management method of the signal processing system can be widely adopted as an implementation method.

As a specific implementation provided by this scheme, when the main processing node of the signal processing main module fails, the main processing node of the signal processing standby module manages the main network, and communication among nodes is carried out by attempting access at preset time intervals, which solves the problem of communication congestion caused by a node going offline.

As the specific implementation mode provided by the scheme, the method further comprises the following steps:

step 1: establishing a subnet information table in each third main processing node CPU, wherein the subnet information table records a network initialization state, a subnet effective mark, a subnet node on-line state and a subnet node physical link state;

step 2: the third main processing node CPU (the main node) of each signal processing slave module acquires the state information of all of its sub-nodes, generates a subnet information table, and sends the subnet information table to the first main processing node CPU of the signal processing main module and the second main processing node CPU of the signal processing standby module;

step 3: after the first main processing node CPU and the second main processing node CPU have acquired all the subnet information tables, each integrates them, within a preset period, with its own information table (the first information table or the second information table, respectively) to form a main network information table, and places the main network information table in a local shared memory area for all the signal processing slave modules to access;

step 4: at the beginning of each period, the third main processing node CPU of each signal processing slave module reads the main network information table from the first main processing node CPU into local memory and distributes it to each sub-node of that signal processing slave module.
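Steps 1 to 4 can be sketched as one reporting cycle. The sketch below is illustrative only: the class names `SubnetInfoTable`, `ManagerNode` and `SlaveModule`, the in-process method calls standing in for RapidIO transfers, and the plain dictionary standing in for the shared memory area are assumptions, not details given in the disclosure.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SubnetInfoTable:
    """The four items recorded in step 1."""
    initialized: bool = False                                    # initialization state
    valid: bool = False                                          # subnet valid flag
    node_online: Dict[str, bool] = field(default_factory=dict)   # node online state
    link_up: Dict[str, bool] = field(default_factory=dict)       # physical link state


class ManagerNode:
    """Stands in for the first or second main processing node CPU."""

    def __init__(self, name: str, own_table: SubnetInfoTable):
        self.name = name
        self.own_table = own_table
        self.reported: Dict[str, SubnetInfoTable] = {}
        self.shared_main_table: Dict[str, SubnetInfoTable] = {}  # "shared memory area"

    def receive(self, module: str, table: SubnetInfoTable) -> None:
        self.reported[module] = copy.deepcopy(table)

    def integrate(self) -> None:
        """Step 3: merge own table and reported tables into the main network table."""
        self.shared_main_table = {self.name: self.own_table, **self.reported}


class SlaveModule:
    """Stands in for a slave module and its third main processing node CPU."""

    def __init__(self, name: str, sub_nodes: List[str]):
        self.name = name
        self.sub_node_views: Dict[str, dict] = {n: {} for n in sub_nodes}

    def report(self, states: Dict[str, Tuple[bool, bool]],
               managers: List[ManagerNode]) -> None:
        """Steps 1-2: build the subnet table and send it to both managers."""
        table = SubnetInfoTable(
            initialized=True, valid=True,
            node_online={n: online for n, (online, _) in states.items()},
            link_up={n: link for n, (_, link) in states.items()},
        )
        for manager in managers:
            manager.receive(self.name, table)

    def start_of_period(self, manager: ManagerNode) -> None:
        """Step 4: read the main network table locally and hand it to each sub-node."""
        local = copy.deepcopy(manager.shared_main_table)
        for node in self.sub_node_views:
            self.sub_node_views[node] = local
```

In use, a slave module calls `report` with the probed state of each sub-node, both managers call `integrate`, and every slave module calls `start_of_period` against the active manager before it communicates in that period.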

When the first main processing node CPU goes offline, the second main processing node CPU comes online. If the second main processing node CPU is also offline, the main network information table accessed by the signal processing slave modules is marked invalid, and the RapidIO network currently in use enters a degraded working mode without main network management capability. In this mode, the subnet information table of each signal processing slave module is used only for communication among its own sub-processing nodes, or the third main processing node CPU re-establishes its connection with the first main processing node CPU or the second main processing node CPU by using the attempt-access strategy.
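The fallback order described above can be sketched as a small mode selection. The `Mode` enumeration and the functions `select_mode` and `reachable_targets` are hypothetical names used only for illustration, and the tables are simplified to node-name-to-online-flag dictionaries.

```python
from enum import Enum
from typing import Dict, Set, Tuple


class Mode(Enum):
    MAIN_MANAGED = "main module manages the main network"
    STANDBY_MANAGED = "standby module manages the main network"
    DEGRADED = "no main network management; subnet-only communication"


def select_mode(main_online: bool, standby_online: bool) -> Mode:
    """Prefer the main module, fall back to the standby, else degrade."""
    if main_online:
        return Mode.MAIN_MANAGED
    if standby_online:
        return Mode.STANDBY_MANAGED
    return Mode.DEGRADED


def reachable_targets(mode: Mode, own_module: str,
                      subnet_table: Dict[str, bool],
                      main_table: Dict[str, Dict[str, bool]]) -> Set[Tuple[str, str]]:
    """Return the (module, node) pairs a slave module may exchange data with."""
    if mode is Mode.DEGRADED:
        # Main network table is treated as invalid: use only the local subnet table.
        return {(own_module, n) for n, online in subnet_table.items() if online}
    return {(m, n) for m, nodes in main_table.items()
            for n, online in nodes.items() if online}
```

In the degraded mode the main network information table is ignored entirely, which matches the text marking it invalid; re-establishing the connection to a manager is left to the attempt-access strategy.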

As a specific implementation provided by this scheme, a node in the RapidIO network currently in use resets the port connected to the opposite node and then reads the initialization completion flag of the opposite node once. If the flag can be read, the access attempt succeeds; otherwise the processor generates a data access exception and the opposite node is not accessed during the current period. This solves the problem of data congestion caused by continuously accessing an offline node.
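The attempt-access strategy can be sketched as follows. The exception class `AccessError` stands in for the processor's data access exception, and the callables `reset_port`, `read_init_flag` and `send` are hypothetical placeholders for hardware operations that the disclosure does not name. Treating a flag that reads as zero like a failed attempt is also an assumption.

```python
from typing import Callable, Dict, List, Tuple


class AccessError(Exception):
    """Stands in for the processor's data access exception (assumed name)."""


def attempt_access(reset_port: Callable[[], None],
                   read_init_flag: Callable[[], bool]) -> bool:
    """Reset the port, then read the peer's initialization flag exactly once."""
    reset_port()
    try:
        return bool(read_init_flag())  # single read, no retry within the period
    except AccessError:
        return False


def run_period(peers: Dict[str, Tuple[Callable[[], None], Callable[[], bool]]],
               send: Callable[[str], None]) -> List[str]:
    """Send only to peers whose attempt succeeded; return the peers skipped."""
    skipped = []
    for name, (reset_port, read_init_flag) in peers.items():
        if attempt_access(reset_port, read_init_flag):
            send(name)
        else:
            skipped.append(name)  # not accessed again until the next period
    return skipped
```

Because each peer is probed with a single read per period, an offline peer costs at most one failed access per period instead of a stream of blocked transfers.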

As a specific implementation provided by this scheme, the first main processing node CPU or the second main processing node CPU maintains all the reported subnet information tables, which includes:

detecting an initialization completion flag, the presence of which indicates that the corresponding subnet information table exists; inspecting the subnet information table once and detecting its heartbeat signal; and setting the subnet information table to invalid when no heartbeat is present. The heartbeat signal takes the values 1, 2, 3, 4, ..., n, each value representing the update state of the subnet information table in a different period.
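The heartbeat check can be sketched as a comparison between the heartbeat value seen in this period and the one seen in the previous period. The `SubnetEntry` fields and the `maintain` function are hypothetical, and invalidating the table after a single period without a new heartbeat value is an assumption; the disclosure only states that the table is set to invalid when no heartbeat is present.

```python
from dataclasses import dataclass


@dataclass
class SubnetEntry:
    init_done: bool = False  # initialization completion flag
    heartbeat: int = 0       # value written by the slave module (1, 2, 3, ...)
    last_seen: int = -1      # value the manager observed in the previous period
    valid: bool = False      # validity of the subnet information table


def maintain(entry: SubnetEntry) -> bool:
    """Run once per period on the managing node; returns the table's validity."""
    if not entry.init_done:
        # No initialization flag: the subnet information table does not exist.
        entry.valid = False
        return False
    # A changed heartbeat value means the table was updated in this period.
    entry.valid = entry.heartbeat != entry.last_seen
    entry.last_seen = entry.heartbeat
    return entry.valid


entry = SubnetEntry(init_done=True, heartbeat=1)
print(maintain(entry))  # True: first heartbeat observed
entry.heartbeat = 2
print(maintain(entry))  # True: heartbeat advanced
print(maintain(entry))  # False: no new heartbeat, table set to invalid
```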

The foregoing is merely specific embodiments of the disclosure, but the protection scope of the disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the disclosure are intended to be covered by the protection scope of the disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A RapidIO network management method for a signal processing system, characterized in that a signal processing main module, a signal processing standby module and a plurality of signal processing slave modules are arranged in a dual-redundancy configuration, wherein the signal processing standby module is a hot backup of the signal processing main module, and when the signal processing main module goes offline, the signal processing standby module comes online;

2. The method of claim 1, wherein when the main processing node of the signal processing main module fails, the main processing node of the signal processing standby module manages the main network, and communication between nodes is performed by attempting access at preset time intervals, so as to solve the communication congestion problem caused by a node going offline.

3. The method of claim 1, further comprising the step of:

step 1: establishing a subnet information table in each third main processing node CPU, wherein the subnet information table records a subnet initialization state, a subnet effective mark, a subnet node on-line state and a subnet node physical link state;

step 2: the third main processing node CPU of each signal processing slave module acquires the state information of all of its sub-nodes, generates a subnet information table, and sends the subnet information table to the first main processing node CPU of the signal processing main module and the second main processing node CPU of the signal processing standby module;

step 3: after the first main processing node CPU and the second main processing node CPU have acquired all the subnet information tables, each integrates them, within a preset period, with its own information table (the first information table or the second information table, respectively) to form a main network information table, and places the main network information table in a local shared memory area for all the signal processing slave modules to access;

4. A method according to claim 3, further comprising the step of:

when the first main processing node CPU goes offline, the second main processing node CPU comes online; if the second main processing node CPU is also offline, the main network information table accessed by the signal processing slave modules is marked invalid, and the RapidIO network currently in use enters a degraded working mode without main network management capability; in this mode, the subnet information table of each signal processing slave module is used only for communication among its own sub-processing nodes, or the third main processing node CPU re-establishes its connection with the first main processing node CPU or the second main processing node CPU by using the attempt-access strategy.

5. The method of claim 4, wherein a node in the RapidIO network currently in use resets the port connected to the opposite node and reads the initialization completion flag of the opposite node once; if the flag can be read, the access attempt succeeds; otherwise the processor generates a data access exception and the opposite node is not accessed during the current period, thereby solving the problem of data congestion caused by continuously accessing an offline node.

6. The method of claim 5, wherein the first main processing node CPU or the second main processing node CPU maintains all reported subnet information tables, comprising:

detecting an initialization completion flag, the presence of which indicates that the corresponding subnet information table exists; inspecting the subnet information table once and detecting its heartbeat signal; and setting the subnet information table to invalid when no heartbeat is present.