CN113079093B - Routing method based on hierarchical Q-routing planning - Google Patents
Routing method based on hierarchical Q-routing planning
- Publication number
- CN113079093B (application number CN202110389260.2A)
- Authority
- CN
- China
- Prior art keywords
- sub
- learning module
- network structure
- layer
- path information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/12—Discovery or management of network topologies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/14—Routing performance; Theoretical aspects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/24—Multipath
Abstract
The invention discloses a routing method based on hierarchical Q-routing planning, which obtains an efficient data transmission link by sensing network congestion and the usage of the interconnection links and by performing global, hierarchical, parallel planning. The algorithm is a lookup-table-based routing algorithm: the planned direction is stored in the routing table in the learning module of each router node, and a data packet obtains its path information by accessing that routing table. The invention builds a hierarchical design on top of split Q-routing and greatly reduces the convergence time of the algorithm by using multilayer congestion sensors and multilayer parallel learning, thereby improving network-on-chip data transmission efficiency, compressing the routing table and reducing hardware resource consumption.
Description
Technical Field
The invention belongs to the technical field of network-on-chip communication for integrated circuits, and particularly relates to a network-on-chip routing method based on hierarchical Q-routing planning.
Background
As Moore's law gradually breaks down, the development of semiconductor processes is slowing, and the operating frequency of single-core processors has hit a bottleneck and is difficult to raise quickly. A System on Chip (SoC) with a traditional bus structure suffers from poor scalability and low parallelism, so a new approach beyond the traditional bus, namely Network on Chip (NoC) communication, is needed to improve the operating frequency of the whole chip. A NoC scales well, can process data from multiple IP cores in a chip in parallel, and effectively addresses problems of power consumption, performance and area.
A NoC involves a topology, a routing algorithm, a switching technique and other aspects; this patent studies the routing algorithm. The routing algorithm provides the transmission direction for packets in the NoC and is a critically important part of it. A good routing algorithm improves transmission efficiency and increases throughput through fast, reasonable path planning.
Split Q-routing is a reinforcement-learning-based network-on-chip routing algorithm that searches for the shortest routing path between the source router node and the target router node. It copes well with the data delay, increased power consumption and router temperature rise caused by a NoC transmitting large amounts of data. However, as network-on-chip scale keeps growing, network congestion becomes more and more severe, and split Q-routing takes too long to plan paths, so its results are no longer timely and it struggles to meet requirements.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a routing method based on hierarchical Q-routing planning, which aims to remedy the shortcomings of traditional Q-routing, further improve the transmission performance of the NoC, reduce power consumption and increase throughput; at the same time, the hardware circuit area can be reduced by compressing the routing table.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a routing method based on hierarchical Q-routing planning, applied to a network on chip consisting of w router nodes, w resource nodes and a plurality of interconnection channels, wherein each router node comprises input ports, output ports, a congestion sensor, a multi-way gate and an access routing table. The method is characterized in that a learning module is arranged in the router node; the learning module includes a learning mode arbitrator, a hierarchical control module, a routing table selection module, 3 sub-learning modules and 3 routing tables. The routing method comprises the following steps:
step 1: dividing all router nodes into three-layer network structures according to the following rules, thereby forming a pyramid structure; the rule is as follows:
in the layer 1 network structure, the w router nodes are divided into groups of $x^2$ nodes each, thereby forming a layer 1 network structure consisting of $w/x^2$ virtual router groups;
in the layer 2 network structure, the $w/x^2$ virtual router groups are divided into groups of $y^2$ each, thereby forming a layer 2 network structure consisting of $w/(x^2 y^2)$ virtual router groups;
in the layer 3 network structure, the $w/(x^2 y^2)$ virtual router groups are divided into one group of $z^2$, thereby forming a layer 3 network structure consisting of 1 virtual router group;
the 3 sub-learning modules and the 3 routing tables correspond to the network structures of the respective layers; each sub-learning module includes an R matrix and a Q-value comparator;
let $L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, where $i < x^2$ and $h = 1, 2, 3$;
let $L^j_i$ denote the i-th virtual router group in the layer j network structure, where $j \neq 1$;
let $L^3_i:L^2_i$ denote the i-th virtual router group of the layer 2 network structure within the i-th virtual router group of the layer 3 network structure;
let $L^3_i:L^2_i:L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node of the layer 1 network structure, within the i-th virtual router group of the layer 2 network structure, within the i-th virtual router group of the layer 3 network structure;
step 2: each router node senses the congestion degree and learns in parallel in each layer of network structure;
step 2.1: take the router node of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ as the current router node; for the current router node in the layer 1 network structure, its congestion sensor counts the occupancy of its input ports and the traffic on its output ports, thereby obtaining the layer 1 congestion level on each path, which is stored in the R matrix of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ in the layer 1 network structure;
for the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure, average pooling is performed on the congestion levels of all nodes in the i-th virtual router group of the layer 2 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure;
for the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the layer 3 network structure, average pooling is performed on the congestion levels of all nodes in the i-th virtual router group of the layer 3 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the topmost network structure;
step 2.2: initializing i to 0;
step 2.3: initialize the reward values in the sub-learning modules denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2, 3$, to the target reward value; initialize the reward values in the sub-learning modules of the other router nodes to 0;
step 2.4: in the layer 1 network structure, the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ reads the maximum reward values of the sub-learning modules on the adjacent router nodes, weights the read maximum reward values according to its own R matrix to obtain weighted reward values, and selects the largest weighted reward value to transmit to all adjacent router nodes;
in the layer 2 network structure, the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$ reads the maximum reward value of the 2nd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group denoted by $L^3_i:L^2_i$; it weights the read maximum reward value according to its own R matrix to obtain a weighted reward value, and selects the largest weighted reward value to transmit to the 2nd sub-learning modules at the same position in the adjacent virtual router groups;
in the layer 3 network structure, the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$ reads the maximum reward value of the 3rd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i$ of the layer 3 network structure; it weights the read maximum reward value according to its own R matrix to obtain a weighted reward value, and selects the largest weighted reward value to transmit to the 3rd sub-learning modules at the same position in the adjacent virtual router groups;
step 2.5: after the next-hop path information of each sub-learning module of each router node is obtained, it is transmitted to the hierarchical control module in parallel;
step 3: correct the path information in the hierarchical control module, namely: correct the path information from the higher-layer network structure to the lower-layer network structure, and then reverse-transmit the corrected path information from the lower-layer network structure to the higher-layer network structure, so that each sub-learning module of the router node obtains the corrected path information and stores it in the routing table of the corresponding layer network structure;
step 4: assign i + 1 to i and return to step 2.3 until i = max - 1, thereby completing the path planning with each router node as the destination node, where max denotes the maximum number of router nodes in a virtual router group in each layer of the network structure;
step 5: transmission of network-on-chip data packets:
to access the routing table of each layer network structure, a data packet in the network on chip passes through the input port and then the multi-way gate, and the multi-way gate performs the access operation on the routing table selection module in the learning module;
the routing table selection module reads the position information of the destination router node in the data packet, accesses the routing table according to the access rule and takes out the path information stored in the routing table;
if the path information taken out of the routing table is return information, the destination router node delivers the data packet to its own packet receiver; otherwise, the data packet is forwarded to the output port indicated by the path information, thereby completing the transmission of the data packet.
The routing algorithm based on the hierarchical Q-routing plan is also characterized in that the step 3 is carried out according to the following steps:
step 3.1, the learning mode arbitrator judges the source of the path information sent to the hierarchical control module; if the source is the 1st, 2nd or 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,h}$, $h = 1, 2, 3$, the path information comes from the sub-learning module at the same position in an adjacent virtual router group; otherwise, the path information comes from the h-th sub-learning module corresponding to the 0th router node in the same virtual router group;
step 3.2, correcting the path information from the layer 3 network structure to the layer 1 network structure in sequence according to the correction rule;
step 3.3, judging whether the reverse transmission rule is met, if so, sequentially performing reverse transmission from the 1 st network structure to the 3 rd network structure according to the reverse transmission rule, and otherwise, sequentially performing reverse transmission from the 1 st network structure to the 3 rd network structure according to a strategy 1 or a strategy 2; wherein, the strategy 1 is to directly obtain the uncorrected path information of the current layer; strategy 2 is to obtain the uncorrected path information of the lower layer;
step 3.4, sending the corrected and reverse-transmitted path information into the routing table of the corresponding layer network structure.
The correction rule is as follows:
if the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is located at a position bordering an adjacent virtual router group in the layer 3 network structure, and the path information in the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ needs to be transmitted across to that adjacent virtual router group, then the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is changed to the path information of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$, and a correction signal is generated; otherwise, the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is retained;
if the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is located at a position bordering an adjacent virtual router group in the layer 2 network structure, and the path information in the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ needs to be transmitted across to that adjacent virtual router group, then the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is changed to the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$, and a correction signal is generated; otherwise, the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is retained;
the reverse transmission rule is as follows:
if the path information of the 1st or 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2$, has been corrected, that path information is reverse-transmitted to the 2nd or 3rd sub-learning module, respectively, denoted by $L^3_i:L^2_i:L^1_{i,(h+1)}$, $h = 1, 2$; otherwise, no reverse transmission is performed.
The target reward value is $(p - 1) \times q$, where p represents the number of router nodes in a virtual router group in the corresponding layer network structure, and q represents the weighting value corresponding to the maximum congestion level.
The rule of the weighting processing is as follows:
if the next hop is a clear path, $y$ is subtracted from the maximum reward value;
if the next hop is a first-level blockage, $3y + 1$ is subtracted from the maximum reward value;
if the next hop is a second-level blockage, $3(3y + 1) + 1$ is subtracted from the maximum reward value;
if the next hop is a device edge or is temporarily deactivated, the maximum reward value is set to zero; where y represents a positive integer.
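The weighting rule and the target reward value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Hop`, `penalty`, `weighted_reward` and `target_reward` are hypothetical, and clamping the weighted value at zero is an assumption, since the text does not say what happens if the subtraction would go negative.

```python
from enum import Enum


class Hop(Enum):
    """Condition of the next hop (names are illustrative, not from the patent)."""
    CLEAR = 0          # clear path
    BLOCK_1 = 1        # first-level blockage
    BLOCK_2 = 2        # second-level blockage
    UNREACHABLE = 3    # device edge or temporarily deactivated


def penalty(hop: Hop, y: int = 1) -> int:
    """Amount subtracted from the maximum reward value for a reachable hop."""
    if y < 1:
        raise ValueError("y must be a positive integer")
    if hop is Hop.CLEAR:
        return y
    if hop is Hop.BLOCK_1:
        return 3 * y + 1
    if hop is Hop.BLOCK_2:
        return 3 * (3 * y + 1) + 1
    raise ValueError("unreachable hops are zeroed rather than penalised")


def weighted_reward(max_reward: int, hop: Hop, y: int = 1) -> int:
    """Apply the weighting rule to a maximum reward value read from a neighbour."""
    if hop is Hop.UNREACHABLE:
        return 0
    # Assumption: the result is clamped at zero; the text does not specify this.
    return max(0, max_reward - penalty(hop, y))


def target_reward(p: int, q: int) -> int:
    """Target reward (p - 1) * q for a group of p nodes and maximum weight q."""
    return (p - 1) * q
```

With y = 1 the three penalties are 1, 4 and 13, and a group of p = 4 nodes with q = 13 gives a target reward of 39.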
The access rule is as follows:
step a, initializing i to 3;
step b, starting from the layer i network structure, compare according to the position information of the destination router node: if the destination router node of the data packet is in the group corresponding to the layer i network structure, assign i - 1 to i and return to step b until i = 1; otherwise, the data packet accesses the routing table of the layer i network structure.
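One way to read the access rule is as a top-down comparison of hierarchical addresses. The sketch below assumes, purely for illustration, that each router position is a tuple ordered from the top layer down (index within the layer 3 group, index within the layer 2 group, index within the layer 1 group) and that "in the group corresponding to layer i" means the destination's index at that layer matches the current node's. The function name `select_routing_table` and the address format are hypothetical.

```python
from typing import Sequence


def select_routing_table(current: Sequence[int], destination: Sequence[int]) -> int:
    """Return the layer (3, 2 or 1) whose routing table the packet should read.

    Both addresses are (layer-3 index, layer-2 index, layer-1 index) tuples;
    this format is an assumption made for the sketch.
    """
    layer = 3                                # step a: initialise i to 3
    while layer > 1:                         # step b: compare, descend while matching
        idx = 3 - layer                      # tuple position holding this layer's index
        if current[idx] != destination[idx]:
            return layer                     # destination lies in another group here
        layer -= 1                           # same group: assign i - 1 to i
    return 1                                 # i reached 1: use the layer 1 table


# A destination in a different layer 2 group is routed with the layer 3 table.
print(select_routing_table((0, 0, 0), (3, 1, 2)))
```

Descending only while the indices match means a packet consults the coarsest table that still distinguishes its current position from the destination, which is what lets each table hold only one entry per group member.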
Compared with the prior art, the invention has the beneficial effects that:
1. The network-on-chip routing method based on hierarchical Q-routing planning of the invention builds NoC systems of various scales by reusing the learning module; each router node performs hierarchical parallel learning, which greatly reduces path planning time and therefore adapts better to real-time changes in NoC congestion.
2. The invention partitions and compresses the routing table and integrates the several routing tables into one complete routing table through mapping, which greatly reduces the resources occupied by the routing table; the saving grows as the NoC scale increases. Taking an 8 × 8 routing network as an example, 8 × 8 × 4 = 256 bits are needed before layering is introduced (4 bits represent the four directions), whereas with three-layer Q-routing only 3 × 4 × 4 = 48 bits are needed (4 nodes in each group), a reduction in area resources of roughly 80%.
3. By reducing the learning levels of some nodes, that is, having a small portion of nodes perform multi-layer learning while most nodes do not, the invention reduces the circuit area consumed by Q-routing and also eases the layout and routing of the whole system circuit.
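The routing table figures in the 8 × 8 example can be checked with a short calculation. The helper names below are hypothetical; the 4 bits per entry, the 3 layers and the 4 nodes per group are the values stated in the example.

```python
def flat_table_bits(rows: int, cols: int, bits_per_entry: int = 4) -> int:
    """Routing table size without layering: one entry per destination node."""
    return rows * cols * bits_per_entry


def layered_table_bits(layers: int, nodes_per_group: int, bits_per_entry: int = 4) -> int:
    """Routing table size with layering: one entry per group member per layer."""
    return layers * nodes_per_group * bits_per_entry


flat = flat_table_bits(8, 8)          # 8 x 8 x 4 = 256 bits
layered = layered_table_bits(3, 4)    # 3 x 4 x 4 = 48 bits
saving = 1 - layered / flat           # 0.8125, i.e. roughly 80%
```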
Drawings
FIG. 1 is a diagram of a router node structure according to the present invention;
FIG. 2 is a schematic diagram of a learning module configuration according to the present invention;
FIG. 3 is a hierarchical diagram of a router node according to the present invention;
FIG. 4 is a block diagram of a hierarchical Q-routing implementation of the present invention;
FIG. 5 is an exemplary diagram of a correction rule and a reverse transmission rule according to the present invention;
FIG. 6 is a flow chart of a routing table read in accordance with the present invention;
FIG. 7 is a flow chart of an example NoC system of the present invention.
Detailed Description
In this embodiment, the routing method based on hierarchical Q-routing planning is applied to a network on chip comprising 64 router nodes, 64 resource nodes and a plurality of interconnection channels, configured with reference to the learning module configuration shown in FIG. 1 and FIG. 2, wherein each router node includes an input port, an output port, a congestion sensor, a multi-way gate and an access routing table. A learning module is arranged in the router node; referring to FIG. 4, the learning module includes a learning mode arbitrator, a hierarchical control module, a routing table selection module, 3 sub-learning modules and 3 routing tables. The numbers of network-structure layers, sub-learning modules and routing tables can be increased to accommodate networks with more router nodes. Referring to FIG. 7, the routing method proceeds as follows:
step 1: referring to FIG. 3, all router nodes are divided into a three-layer network structure according to the following rule, forming a pyramid structure; the purpose of this layering is to enable parallel learning and obtain a transmission path with low latency and a high transmission rate. The rule is:
in the layer 1 network structure, the 64 router nodes are divided into groups of 4 nodes each, thereby forming a layer 1 network structure consisting of 16 virtual router groups;
in the layer 2 network structure, the 16 virtual router groups are divided into groups of 4 each, thereby forming a layer 2 network structure consisting of 4 virtual router groups;
in the layer 3 network structure, the 4 virtual router groups are divided into one group ($z^2 = 4$), thereby forming a layer 3 network structure consisting of 1 virtual router group;
the 3 sub-learning modules and the 3 routing tables correspond to the network structures of the respective layers; each sub-learning module includes an R matrix and a Q-value comparator;
let $L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, where $i < x^2$ and $h = 1, 2, 3$;
let $L^j_i$ denote the i-th virtual router group in the layer j network structure, where $j \neq 1$;
let $L^3_i:L^2_i$ denote the i-th virtual router group of the layer 2 network structure within the i-th virtual router group of the layer 3 network structure;
let $L^3_i:L^2_i:L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node of the layer 1 network structure, within the i-th virtual router group of the layer 2 network structure, within the i-th virtual router group of the layer 3 network structure;
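The three-layer grouping of Step 1 can be illustrated for the 64-node embodiment. The sketch assumes the nodes sit on an 8 × 8 mesh and that each group is a square tile (2 × 2 nodes, then 2 × 2 groups, and so on); the text only gives the group sizes, so the tiling, the row-major index order and the function names are assumptions.

```python
def group_counts(w: int, x: int, y: int, z: int) -> tuple[int, int, int]:
    """Number of virtual router groups in layers 1, 2 and 3."""
    n1 = w // (x * x)
    n2 = n1 // (y * y)
    n3 = n2 // (z * z)
    return n1, n2, n3


def hierarchical_address(row: int, col: int, sides=(2, 2, 2)) -> tuple[int, int, int]:
    """Map mesh coordinates to (layer-3 index, layer-2 index, layer-1 index).

    sides = (x, y, z) are the tile side lengths at layers 1, 2 and 3.
    Square tiling and row-major numbering are assumptions of this sketch.
    """
    x, y, z = sides
    size = x * y * z
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError("coordinates outside the mesh")
    l1 = (row % x) * x + (col % x)       # position inside the layer 1 group
    r1, c1 = row // x, col // x          # coordinates of that layer 1 group
    l2 = (r1 % y) * y + (c1 % y)         # position inside the layer 2 group
    r2, c2 = r1 // y, c1 // y            # coordinates of that layer 2 group
    l3 = (r2 % z) * z + (c2 % z)         # position inside the layer 3 group
    return l3, l2, l1


counts = group_counts(64, 2, 2, 2)       # (16, 4, 1), matching the embodiment
corner = hierarchical_address(7, 7)      # (3, 3, 3)
```

Every one of the 64 nodes receives a distinct three-part address, each part in the range 0 to 3, which is why a table of 4 entries per layer is enough to address the whole network.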
step 2: each router node senses the congestion degree and learns in parallel in each layer of network structure;
step 2.1: take the router node of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ as the current router node; for the current router node in the layer 1 network structure, its congestion sensor counts the occupancy of its input ports and the traffic on its output ports, thereby obtaining the layer 1 congestion level on each path, which is stored in the R matrix of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ in the layer 1 network structure. The congestion level R value 3'b000 represents a clear path, 3'b001 a first-level blockage, 3'b010 a second-level blockage, and 3'b111 an unreachable path (fully blocked or temporarily deactivated).
For the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure, average pooling is performed on the congestion levels of all nodes in the i-th virtual router group of the layer 2 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure;
for the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the layer 3 network structure, average pooling is performed on the congestion levels of all nodes in the i-th virtual router group of the layer 3 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the topmost network structure;
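The average pooling of Step 2.1 can be sketched as a tile-wise mean over a grid of congestion levels. The example assumes square 2 × 2 tiles and returns unrounded means; how the hardware quantises the pooled value back to a 3-bit level, and how an unreachable level (3'b111) enters the average, are not stated in the text and are left out. The function name `average_pool` is hypothetical.

```python
def average_pool(grid: list[list[float]], side: int = 2) -> list[list[float]]:
    """Average each side x side tile of a square grid of congestion levels."""
    n = len(grid)
    if n % side or any(len(row) != n for row in grid):
        raise ValueError("grid must be square with a size divisible by side")
    pooled = []
    for r in range(0, n, side):
        pooled_row = []
        for c in range(0, n, side):
            tile = [grid[r + dr][c + dc] for dr in range(side) for dc in range(side)]
            pooled_row.append(sum(tile) / len(tile))
        pooled.append(pooled_row)
    return pooled


# Layer 1 congestion levels for a 4 x 4 corner of the mesh (0 = clear, 1, 2 = blocked).
layer1 = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 1],
    [2, 2, 1, 2],
]
layer2 = average_pool(layer1)   # one pooled level per layer 1 group
layer3 = average_pool(layer2)   # one pooled level per layer 2 group
```

Pooling twice mirrors the pyramid: the layer 2 sub-learning modules see one value per group of 4 nodes, and the layer 3 modules see one value per group of 16.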
step 2.2: initializing i to 0;
step 2.3: initialize the reward values in the sub-learning modules denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2, 3$, to the target reward value; in this embodiment the target reward value is 39, since the number of router nodes in a virtual router group is 4 and the weighting value corresponding to the maximum congestion level is 13, giving (4 - 1) × 13 = 39. Initialize the reward values in the sub-learning modules of the other router nodes to 0;
step 2.4: in the layer 1 network structure, the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ reads the maximum reward values of the sub-learning modules on the adjacent router nodes, weights the read maximum reward values according to its own R matrix to obtain weighted reward values, and selects the largest weighted reward value to transmit to all adjacent router nodes;
in the layer 2 network structure, the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$ reads the maximum reward value of the 2nd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group denoted by $L^3_i:L^2_i$; it weights the read maximum reward value according to its own R matrix to obtain a weighted reward value, and selects the largest weighted reward value to transmit to the 2nd sub-learning modules at the same position in the adjacent virtual router groups;
in the layer 3 network structure, the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$ reads the maximum reward value of the 3rd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i$ of the layer 3 network structure; it weights the read maximum reward value according to its own R matrix to obtain a weighted reward value, and selects the largest weighted reward value to transmit to the 3rd sub-learning modules at the same position in the adjacent virtual router groups;
in this embodiment, the weighting rule (corresponding to y = 1) is as follows:
if the next hop is a clear path, 1 is subtracted from the maximum reward value;
if the next hop is a first-level blockage, 4 is subtracted from the maximum reward value;
if the next hop is a second-level blockage, 13 is subtracted from the maximum reward value;
if the next hop is a device edge or is temporarily deactivated, the maximum reward value is set to zero;
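The layer 1 learning of Steps 2.3 and 2.4 behaves like a reward propagation from the destination outward, which the following sketch imitates on a small mesh. Several details are assumptions made for illustration only: rewards are updated in synchronous sweeps until nothing changes, the congestion level is taken per neighbouring node rather than per link, ties keep the first direction found, and the names `plan`, `PENALTY` and `"LOCAL"` (standing in for the return information) are hypothetical.

```python
PENALTY = {0: 1, 1: 4, 2: 13}      # R value -> amount subtracted (embodiment values)
UNREACHABLE = 7                    # R value 3'b111
DIRS = {"N": (-1, 0), "S": (1, 0), "W": (0, -1), "E": (0, 1)}


def plan(congestion: list[list[int]], dest: tuple[int, int], target: int = 39):
    """Propagate rewards from dest and record the best next hop for every node."""
    rows, cols = len(congestion), len(congestion[0])
    reward = [[0] * cols for _ in range(rows)]
    route = [[None] * cols for _ in range(rows)]
    reward[dest[0]][dest[1]] = target            # step 2.3: destination gets the target reward
    route[dest[0]][dest[1]] = "LOCAL"            # placeholder for the return information
    changed = True
    while changed:                               # repeat sweeps until the rewards settle
        changed = False
        new = [row[:] for row in reward]
        for r in range(rows):
            for c in range(cols):
                if (r, c) == dest:
                    continue
                for name, (dr, dc) in DIRS.items():
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue                 # device edge: weighted reward is zero
                    level = congestion[nr][nc]
                    if level == UNREACHABLE:
                        continue                 # deactivated hop: weighted reward is zero
                    candidate = reward[nr][nc] - PENALTY[level]
                    if candidate > new[r][c]:    # keep the largest weighted reward
                        new[r][c] = candidate
                        route[r][c] = name
                        changed = True
        reward = new
    return reward, route


# 2 x 2 group with a second-level blockage at (0, 1); the destination is (0, 0).
reward, route = plan([[0, 2], [0, 0]], dest=(0, 0))
```

In this example node (1, 1) ends up routing west rather than north, because the northern neighbour carries the 13-point penalty while the western one costs only 1.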
step 2.5: after the path information of the next hop of each sub-learning module of each router node is obtained, the path information of the next hop of each sub-learning module of each router node is transmitted to the hierarchical control module in parallel;
step 3: referring to FIG. 5, the path information is corrected in the hierarchical control module, namely: the path information is corrected from the higher-layer network structure to the lower-layer network structure, and the corrected path information is then reverse-transmitted from the lower-layer network structure to the higher-layer network structure, so that each sub-learning module of the router node obtains the corrected path information and stores it in the routing table of the corresponding layer network structure;
step 3.1: the learning mode arbitrator judges the source of the path information sent to the hierarchical control module; if the source is the 1st, 2nd or 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,h}$, $h = 1, 2, 3$, the path information comes from the sub-learning module at the same position in an adjacent virtual router group; otherwise, the path information comes from the h-th sub-learning module corresponding to the 0th router node in the same virtual router group;
step 3.2: according to the correction rule, the path information is corrected in sequence from the layer 3 network structure to the layer 1 network structure;
step 3.3: judge whether the reverse transmission rule is met; if so, perform reverse transmission in sequence from the layer 1 network structure to the layer 3 network structure according to the reverse transmission rule; otherwise, perform reverse transmission in sequence from the layer 1 network structure to the layer 3 network structure according to strategy 1 or strategy 2, where strategy 1 is to take the uncorrected path information of the current layer directly and strategy 2 is to take the uncorrected path information of the lower layer;
step 3.4: the corrected and reverse-transmitted path information is sent into the routing table of the corresponding layer network structure.
The correction rule is:
if the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is located at a position bordering an adjacent virtual router group in the layer 3 network structure, and the path information in the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ needs to be transmitted across to that adjacent virtual router group, then the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is changed to the path information of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$, and a correction signal is generated; otherwise, the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is retained;
if the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is located at a position bordering an adjacent virtual router group in the layer 2 network structure, and the path information in the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ needs to be transmitted across to that adjacent virtual router group, then the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is changed to the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$, and a correction signal is generated; otherwise, the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is retained;
the reverse transmission rule is as follows:
if the path information of the 1st or 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2$, has been corrected, that path information is reverse-transmitted to the 2nd or 3rd sub-learning module, respectively, denoted by $L^3_i:L^2_i:L^1_{i,(h+1)}$, $h = 1, 2$; otherwise, no reverse transmission is performed;
step 4: assign i + 1 to i and return to step 2.3 until i = 3, thereby completing the path planning with each router node as the destination node.
Step 5: referring to FIG. 5, the transmission of network-on-chip data packets is as follows:
the data packet in the network-on-chip accesses the routing table in each layer network structure, sequentially passes through the input port and the multi-way gate, and the multi-way gate performs access operation on the routing table selection module in the learning module;
the routing table selection module reads the position information of the destination router node in the data packet, including the information of the nodes in the layer 3, layer 2 and layer 1, and accesses the routing table and takes out the path information stored in the routing table according to the following access rules:
step a, initializing i to 3;
and b, comparing the data packet from the i-th layer network structure according to the position information of the destination router node, if the destination router node of the data packet is in the group corresponding to the i-th layer network structure, assigning i-1 to i, and returning to the step b until i is equal to 1, otherwise, accessing the routing table in the i-th layer network structure by the data packet.
And according to the taken-out path information in the routing table, if the taken-out path information is return information, the destination router node transmits the data packet to a packet receiver of the destination router node, otherwise, the data packet is transmitted to a corresponding output port according to the taken-out path information, so that the transmission of the data packet is completed.
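The access rule of step 5 can be sketched in Python as follows. This is a minimal sketch under assumptions that are not stated in the patent: a node position is modeled as a hypothetical tuple `(a3, a2, a1)` giving its index at layer 3, layer 2, and layer 1, each layer's routing table is modeled as a dictionary keyed by the destination index at that layer, and the string `"LOCAL"` stands in for the return information.

```python
from typing import Dict, Tuple

Address = Tuple[int, int, int]  # hypothetical (a3, a2, a1) position


def select_layer(current: Address, dest: Address) -> int:
    """Steps a and b: start at layer 3 and descend while the destination
    lies in the same group as the current node at that layer."""
    i = 3
    while i > 1 and current[3 - i] == dest[3 - i]:
        i -= 1
    return i


def route(current: Address, dest: Address,
          tables: Dict[int, Dict[int, str]]) -> str:
    """Read the selected layer's routing table and return either the
    output port or the packet receiver."""
    layer = select_layer(current, dest)
    info = tables[layer][dest[3 - layer]]
    # Return information means the packet has reached its destination.
    return "packet_receiver" if info == "LOCAL" else info


# Hypothetical routing tables of one router node located at (0, 1, 0).
tables = {
    3: {0: "E", 1: "W"},
    2: {0: "N", 1: "S"},
    1: {0: "LOCAL", 1: "E"},
}
```

With these tables, a packet for `(1, 0, 0)` is resolved by the layer 3 table, a packet for `(0, 0, 1)` by the layer 2 table, and a packet for `(0, 1, 0)` is handed to the packet receiver. Because each table is indexed only by the destination's index at its own layer, this arrangement is consistent with the routing-table compression the abstract describes.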
Claims (6)
1. A routing method based on hierarchical Q-routing planning, applied to a network-on-chip consisting of w router nodes, w resource nodes, and a plurality of interconnection channels, wherein each router node comprises input ports, output ports, congestion sensors, multi-way gates, and access routing tables; the method is characterized in that a learning module is arranged in each router node; the learning module comprises a learning mode arbitrator, a hierarchical control module, a routing table selection module, 3 sub-learning modules, and 3 routing tables; the routing method comprises the following steps:
Step 1: Divide all router nodes into a three-layer network structure according to the following rule, thereby forming a pyramid structure:
In the layer 1 network structure, the w router nodes are divided into groups of $x^2$ nodes each, thereby forming a layer 1 network structure composed of $w/x^2$ virtual router groups;
In the layer 2 network structure, the $w/x^2$ virtual router groups are divided into groups of $y^2$ each, thereby forming a layer 2 network structure composed of $w/(x^2 y^2)$ virtual router groups;
In the layer 3 network structure, the $w/(x^2 y^2)$ virtual router groups are divided into one group of $z^2$, thereby forming a layer 3 network structure composed of 1 virtual router group;
The 3 sub-learning modules and the 3 routing tables correspond respectively to the layers of the network structure; each sub-learning module comprises an R matrix and a Q-value comparator;
Let $L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, where $i < x^2$ and $h = 1, 2, 3$;
Let $L^j_i$ denote the i-th virtual router group in the layer j network structure, where $j \neq 1$;
Let $L^3_i:L^2_i$ denote the i-th virtual router group of the layer 2 network structure within the i-th virtual router group of the layer 3 network structure;
Let $L^3_i:L^2_i:L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node of the layer 1 network structure, within the i-th virtual router group of the layer 2 network structure, within the i-th virtual router group of the layer 3 network structure;
Step 2: Each router node senses the congestion level and learns in parallel in each layer of the network structure;
Step 2.1: Take the router node of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ as the current router node. For the current router node in the layer 1 network structure, its congestion sensor counts the occupancy of its input ports and the traffic on its output ports, thereby obtaining the layer 1 congestion level on each path, which is stored in the R matrix of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ in the layer 1 network structure;
For the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure, average pooling is performed over the congestion levels of all nodes in the i-th virtual router group of the layer 2 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure;
For the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the layer 3 network structure, average pooling is performed over the congestion levels of all nodes in the i-th virtual router group of the layer 3 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the topmost network structure;
Step 2.2: Initialize i to 0;
Step 2.3: Initialize the reward values in the sub-learning modules denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2, 3$, to the target reward value, and initialize the reward values in the sub-learning modules of all other router nodes to 0;
Step 2.4: For the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ in the layer 1 network structure, read the maximum reward values of the sub-learning modules on the adjacent router nodes, weight the read maximum reward values according to the R matrix in the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ to obtain weighted reward values, and select the largest weighted reward value to transmit to all adjacent router nodes;
For the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$ in the layer 2 network structure, read the maximum reward value of the 2nd sub-learning module located at the same position in each virtual router group adjacent to it, and take that value as the maximum reward value of all nodes in the i-th virtual router group denoted by $L^3_i:L^2_i$; weight the read maximum reward values according to the R matrix in the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$ to obtain weighted reward values, and select the largest weighted reward value to transmit to the 2nd sub-learning module located at the same position in each adjacent virtual router group;
For the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$ in the layer 3 network structure, read the maximum reward value of the 3rd sub-learning module located at the same position in each virtual router group adjacent to it, and take that value as the maximum reward value of all nodes in the i-th virtual router group $L^3_i$ of the layer 3 network structure; weight the read maximum reward values according to the R matrix in the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$ to obtain weighted reward values, and select the largest weighted reward value to transmit to the 3rd sub-learning module located at the same position in each adjacent virtual router group;
Step 2.5: After the next-hop path information of each sub-learning module of each router node is obtained, it is transmitted in parallel to the hierarchical control module;
Step 3: Correct the path information in the hierarchical control module, namely: correct the path information from the higher-layer network structure to the lower-layer network structure, then transmit the corrected path information in reverse from the lower-layer network structure to the higher-layer network structure, so that each sub-learning module of the router node obtains the corrected path information and stores it in the routing table of the corresponding layer network structure;
Step 4: Assign i + 1 to i and return to step 2.3 until i = max - 1, thereby completing the path planning with each router node as the destination node, where max denotes the maximum number of router nodes in a virtual router group over all layers of the network structure;
Step 5: Transmission of network-on-chip data packets:
A data packet in the network-on-chip accesses the routing table in each layer network structure: it passes in sequence through the input port and the multi-way gate, and the multi-way gate performs the access operation on the routing table selection module in the learning module;
The routing table selection module reads the position information of the destination router node carried in the data packet, accesses the routing table according to the access rule, and retrieves the path information stored in the routing table;
Based on the path information retrieved from the routing table: if it is return information, the destination router node delivers the data packet to its packet receiver; otherwise, the data packet is forwarded to the output port indicated by the retrieved path information, thereby completing the transmission of the data packet.
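The layered grouping of step 1 and the average pooling of step 2.1 in claim 1 can be illustrated with a short Python sketch. It rests on assumptions that the claim does not state: the network is taken to be a square 2D mesh of side x·y·z, each virtual router group is taken to be a square tile, and indices inside a group are row-major. The function names are hypothetical.

```python
from typing import List, Tuple


def group_counts(w: int, x: int, y: int, z: int) -> Tuple[int, int, int]:
    """Number of virtual router groups forming layers 1, 2 and 3."""
    assert w == (x * y * z) ** 2, "sketch assumes w = x^2 * y^2 * z^2"
    return w // x**2, w // (x**2 * y**2), 1


def hierarchical_address(row: int, col: int,
                         x: int, y: int, z: int) -> Tuple[int, int, int]:
    """Map mesh coordinates to a hypothetical (a3, a2, a1) position."""
    a1 = (row % x) * x + (col % x)   # node index inside its layer 1 group
    br, bc = row // x, col // x      # coordinates of the layer 1 group
    a2 = (br % y) * y + (bc % y)     # group index inside its layer 2 group
    sr, sc = br // y, bc // y        # coordinates of the layer 2 group
    a3 = sr * z + sc                 # group index inside the layer 3 group
    return a3, a2, a1


def pool_congestion(levels: List[List[float]], side: int) -> List[List[float]]:
    """Average-pool a square grid of congestion levels over side x side tiles."""
    n = len(levels)
    m = n // side
    sums = [[0.0] * m for _ in range(m)]
    for r in range(n):
        for c in range(n):
            sums[r // side][c // side] += levels[r][c]
    area = side * side
    return [[s / area for s in row] for row in sums]
```

For x = y = z = 2 the mesh has 64 router nodes, which this sketch groups into 16 layer 1 groups and 4 layer 2 groups under a single layer 3 group. Pooling per-node congestion levels with `side = x` gives one value per layer 1 group, and pooling that result again with `side = y` gives one value per layer 2 group.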
2. The routing method based on hierarchical Q-routing planning of claim 1, wherein the step 3 is performed as follows:
Step 3.1: The learning mode arbitrator determines the source of the path information sent to the hierarchical control module: if the source is the 1st, 2nd, or 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,h}$, $h = 1, 2, 3$, the path information comes from the sub-learning module located at the same position in an adjacent virtual router group; otherwise, the path information comes from the h-th sub-learning module corresponding to the 0th router node in the same virtual router group;
Step 3.2: Correct the path information in sequence from the layer 3 network structure to the layer 1 network structure according to the correction rule;
Step 3.3: Determine whether the reverse-transmission rule is satisfied; if so, perform reverse transmission in sequence from the layer 1 network structure to the layer 3 network structure according to the reverse-transmission rule; otherwise, perform reverse transmission in sequence from the layer 1 network structure to the layer 3 network structure according to strategy 1 or strategy 2, where strategy 1 directly takes the uncorrected path information of the current layer and strategy 2 takes the uncorrected path information of the lower layer;
Step 3.4: Send the corrected and reverse-transmitted path information to the routing table of the corresponding layer network structure.
3. The routing method based on hierarchical Q-routing planning of claim 2, wherein
the correction rule is as follows:
If the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is located at a position adjacent to a neighboring virtual router group in the layer 3 grid structure, and the path information in the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ needs to be transmitted across to that adjacent virtual router group, then the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is changed to the path information of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$, and a correction signal is generated; otherwise, the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is retained;
If the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is located at a position adjacent to a neighboring virtual router group in the layer 2 grid structure, and the path information in the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ needs to be transmitted across to that adjacent virtual router group, then the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is changed to the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$, and a correction signal is generated; otherwise, the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is retained;
the reverse-transmission rule is as follows:
If the path information of the 1st or 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2$, has been corrected, then that path information is transmitted in reverse to the 2nd or 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,(h+1)}$, $h = 1, 2$; otherwise, no reverse transmission is performed.
4. The routing method based on hierarchical Q-routing planning of claim 1, wherein
the target reward value is (p - 1) × q, where p denotes the number of router nodes in a virtual router group of the corresponding layer network structure, and q denotes the weighting value corresponding to the maximum congestion level.
5. The routing method based on hierarchical Q-routing planning of claim 1, wherein
the rule of the weighting processing is as follows:
If the next hop is clear, y is subtracted from the maximum reward value;
If the next hop has first-level congestion, 3 × y + 1 is subtracted from the maximum reward value;
If the next hop has second-level congestion, 3 × (3 × y + 1) + 1 is subtracted from the maximum reward value;
If the next hop is a device edge or is temporarily disabled, the maximum reward value is set to zero; where y denotes a positive integer.
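The weighting rule of claim 5, together with the target reward value (p - 1) × q of claim 4, can be sketched in Python as follows. The state labels `"clear"`, `"level1"`, `"level2"`, `"edge"`, and `"disabled"` are hypothetical names introduced for this sketch, and taking q equal to the second-level congestion penalty 3 × (3 × y + 1) + 1 is an assumption, since the claims only say that q is the weighting value for the maximum congestion level.

```python
from typing import Dict, Tuple


def penalty(state: str, y: int = 1) -> int:
    """Amount subtracted from the maximum reward value for a next-hop state."""
    if state == "clear":
        return y
    if state == "level1":
        return 3 * y + 1
    if state == "level2":
        return 3 * (3 * y + 1) + 1
    raise ValueError(f"no penalty defined for state {state!r}")


def weighted_reward(max_reward: int, state: str, y: int = 1) -> int:
    """Weight a maximum reward value read from an adjacent node."""
    if state in ("edge", "disabled"):
        return 0  # device edge or temporarily disabled: reward set to zero
    return max_reward - penalty(state, y)


def target_reward(p: int, y: int = 1) -> int:
    """(p - 1) * q, assuming q is the second-level congestion penalty."""
    return (p - 1) * penalty("level2", y)


def best_next_hop(neighbors: Dict[str, Tuple[int, str]],
                  y: int = 1) -> Tuple[str, int]:
    """Pick the direction with the largest weighted reward.
    neighbors maps direction -> (maximum reward read, next-hop state)."""
    weighted = {d: weighted_reward(r, s, y) for d, (r, s) in neighbors.items()}
    direction = max(weighted, key=weighted.get)
    return direction, weighted[direction]
```

With y = 1 and p = 4, this sketch gives a target reward of 39. A node that reads 39 over a second-level congested hop and 35 over a clear hop obtains weighted values 26 and 34, so the clear hop is selected even though its unweighted reward is lower.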
6. The routing method based on hierarchical Q-routing planning of claim 1, wherein
the access rule is as follows:
Step a: Initialize i to 3;
Step b: Starting from the layer i network structure, compare against the position information of the destination router node. If the destination router node of the data packet is in the group corresponding to the layer i network structure, assign i - 1 to i and return to step b until i = 1; otherwise, the data packet accesses the routing table of the layer i network structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110389260.2A CN113079093B (en) | 2021-04-12 | 2021-04-12 | Routing method based on hierarchical Q-routing planning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113079093A (en) | 2021-07-06 |
CN113079093B (en) | 2022-03-15 |
Family
ID=76617277
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110389260.2A Active CN113079093B (en) | 2021-04-12 | 2021-04-12 | Routing method based on hierarchical Q-routing planning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113079093B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111522775A (en) * | 2020-04-22 | 2020-08-11 | 合肥工业大学 | Network-on-chip routing device and control method thereof |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9787571B2 (en) * | 2014-12-22 | 2017-10-10 | Intel Corporation | Link delay based routing apparatus for a network-on-chip |
CN111770019B (en) * | 2020-05-13 | 2021-06-15 | 西安电子科技大学 | Q-learning optical network-on-chip self-adaptive route planning method based on Dijkstra algorithm |
Non-Patent Citations (2)
Title |
---|
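A polar code decoding algorithm based on a pre-judgment mechanism and its VLSI architecture; Du Gaoming et al.; Microelectronics & Computer (《微电子学与计算机》); 20191231; full text * |
Research on network-on-chip hierarchical partitioning and multi-objective mapping techniques; Zhang Zeqi; China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》); 20090415 (No. 02); full text * |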
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||