CN113079093B - Routing method based on hierarchical Q-routing planning - Google Patents

Publication number
CN113079093B
Authority
CN
China
Prior art keywords
sub
learning module
network structure
layer
path information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110389260.2A
Other languages
Chinese (zh)
Other versions
CN113079093A (en)
Inventor
李桢旻
翁晓峰
王镜涵
李天瑜
马宇晴
杜高明
宋宇鲲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202110389260.2A
Publication of CN113079093A
Application granted
Publication of CN113079093B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/12 Shortest path evaluation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/14 Routing performance; Theoretical aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/24 Multipath

Abstract

The invention discloses a routing method based on hierarchical Q-routing planning. The method senses network congestion and interconnection-link usage and performs global, hierarchical, parallel planning to obtain efficient data transmission paths. It is a lookup-table-based routing algorithm: the planned direction is stored in the routing table of the learning module in each router node, and a data packet obtains its path information by accessing that routing table. The invention builds a hierarchical design on top of split Q-routing and uses multi-layer congestion sensors and multi-layer parallel learning to greatly reduce the convergence time of the algorithm, thereby improving network-on-chip data transmission efficiency, compressing the routing table and reducing hardware resource consumption.

Description

Routing method based on hierarchical Q-routing planning
Technical Field
The invention belongs to the technical field of communication of integrated circuit network-on-chip, and particularly relates to a network-on-chip routing method based on hierarchical Q-routing planning.
Background
With the gradual failure of Moore's law, progress in semiconductor process technology has slowed, and the operating frequency of single-core processors has hit a bottleneck and can no longer rise quickly. A System on Chip (SoC) built on a traditional bus structure suffers from poor scalability and low parallelism, so a new communication method other than the traditional bus, namely the Network on Chip (NoC), is needed to raise the operating frequency of the whole chip. The NoC scales well, can process data from multiple IP cores on a chip in parallel, and effectively addresses problems of power consumption, performance and area.
A NoC involves topology, routing algorithms, switching techniques and other aspects; this patent studies the routing algorithm. The routing algorithm gives packets their direction of transmission in the NoC and is a critically important part of it. A good routing algorithm improves transmission efficiency and increases throughput through fast, reasonable path planning.
Split Q-routing is a reinforcement-learning-based network-on-chip routing algorithm that finds the shortest routing path between a source router node and a target router node. It handles well the problems caused when the NoC transmits large amounts of data, such as data delay, increased power consumption and rising router temperature. However, as networks on chip keep growing in scale, network congestion becomes more and more severe, and split Q-routing takes too long to plan paths, so its plans lose timeliness and it struggles to meet requirements.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a routing method based on hierarchical Q-routing planning. It aims to remedy the shortcomings of traditional Q-routing, further improve the transmission performance of the NoC, reduce power consumption and increase throughput; compressing the routing table also reduces the hardware circuit area.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a routing method based on hierarchical Q-routing planning, applied to a network on chip consisting of w router nodes, w resource nodes and a plurality of interconnection channels, wherein each router node comprises input ports, output ports, a congestion sensor, a multi-way gate and an access routing table. The method is characterized in that a learning module is arranged in each router node; the learning module includes: a learning mode arbitrator, a hierarchical control module, a routing table selection module, 3 sub-learning modules and 3 routing tables. The routing method comprises the following steps:
step 1: dividing all router nodes into three-layer network structures according to the following rules, thereby forming a pyramid structure; the rule is as follows:
in the layer 1 network structure, the w router nodes are divided into groups of $x^2$ nodes each, thereby forming a layer 1 network structure consisting of $w/x^2$ virtual router groups;
in the layer 2 network structure, the $w/x^2$ virtual router groups are divided into groups of $y^2$ each, thereby forming a layer 2 network structure consisting of $w/(x^2 y^2)$ virtual router groups;
in the layer 3 network structure, the $w/(x^2 y^2)$ virtual router groups are divided into a single group of $z^2$, thereby forming a layer 3 network structure consisting of 1 virtual router group;
the 3 sub-learning modules and the 3 routing tables correspond to the three layers of the network structure; each sub-learning module includes an R matrix and a Q-value comparator;
let $L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, where $i < x^2$ and $h = 1, 2, 3$;
let $L^j_i$ denote the i-th virtual router group in the layer j network structure, where $j \neq 1$;
let $L^3_i:L^2_i$ denote the i-th virtual router group of the layer 2 network structure inside the i-th virtual router group of the layer 3 network structure;
let $L^3_i:L^2_i:L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node of the layer 1 network structure, inside the i-th virtual router group of the layer 2 network structure, inside the i-th virtual router group of the layer 3 network structure;
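The three-layer grouping of step 1 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that router nodes carry a flat index in which consecutive indices fall into the same group, which the text does not specify.

```python
def group_counts(w, x2, y2, z2):
    """Number of virtual router groups in layers 1, 2 and 3.

    x2, y2 and z2 stand for the group sizes x^2, y^2 and z^2.
    """
    layer1 = w // x2          # w / x^2 groups of router nodes
    layer2 = layer1 // y2     # w / (x^2 * y^2) groups of layer-1 groups
    layer3 = layer2 // z2     # a single group when z^2 covers all of layer 2
    return layer1, layer2, layer3


def hierarchical_address(node, x2, y2, z2):
    """Split a flat node index into (layer-3, layer-2, layer-1) positions.

    Assumption: the flat index is laid out group by group.
    """
    a1 = node % x2            # position of the node inside its layer-1 group
    g1 = node // x2           # index of that layer-1 group
    a2 = g1 % y2              # position of the layer-1 group inside its layer-2 group
    g2 = g1 // y2             # index of that layer-2 group
    a3 = g2 % z2              # position of the layer-2 group inside the layer-3 group
    return a3, a2, a1


# 64 nodes grouped 4 at a time at every layer, as in the embodiment.
print(group_counts(64, 4, 4, 4))
print(hierarchical_address(5, 4, 4, 4))
```

With the embodiment's sizes (64 nodes, groups of 4) the counts come out as 16, 4 and 1 groups, matching the pyramid described in the detailed description.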
step 2: each router node senses the congestion degree and learns in parallel in each layer of network structure;
step 2.1: take the router node of the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ as the current router node; in the layer 1 network structure, the congestion sensor of the current router node counts the occupancy of its input ports and the traffic on its output ports, thereby obtaining the layer 1 congestion level on each path, which is stored in the R matrix of the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$;
in the layer 2 network structure, the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ average-pools the congestion levels of all nodes in the i-th virtual router group of the layer 2 network structure to obtain a pooled congestion level, and stores it in its own R matrix;
in the layer 3 network structure, the 3rd sub-learning module $L^3_i:L^2_i:L^1_{i,3}$ average-pools the congestion levels of all nodes in the i-th virtual router group of the layer 3 network structure to obtain a pooled congestion level, and stores it in its own R matrix in the topmost network structure;
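The average pooling of step 2.1 can be illustrated as follows. This is a simplified sketch under stated assumptions: each node is reduced to a single scalar congestion level (the R matrix in the text holds one level per path), the pooled value is rounded half up to an integer level, and only levels 0 to 2 appear. None of these details are fixed by the text, and the function name is hypothetical.

```python
def average_pool(levels):
    """Average integer congestion levels, rounding half up (assumed quantisation)."""
    n = len(levels)
    return (sum(levels) + n // 2) // n


# Congestion levels of the 4 members of each of 4 groups (illustrative values).
layer1_levels = [
    [0, 1, 2, 1],
    [0, 0, 0, 0],
    [2, 2, 1, 2],
    [1, 0, 0, 0],
]

# Layer 2 sees one pooled level per group; layer 3 pools those again.
layer2_levels = [average_pool(group) for group in layer1_levels]
layer3_level = average_pool(layer2_levels)
print(layer2_levels, layer3_level)
```

Pooling means the higher layers learn on a much smaller, coarser congestion map, which is what allows the three layers to plan in parallel.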
step 2.2: initializing i to 0;
step 2.3: initialize the reward values in the sub-learning modules $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2, 3$, to the target reward value; initialize the reward values in the sub-learning modules of all other router nodes to 0;
step 2.4: in the layer 1 network structure, the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ reads the maximum reward values of the sub-learning modules on the adjacent router nodes, weights each value read according to its own R matrix to obtain weighted reward values, and selects the maximum weighted reward value to transmit to all adjacent router nodes;
in the layer 2 network structure, the 2nd sub-learning module $L^3_i:L^2_i:L^1_{0,2}$ reads the maximum reward value of the 2nd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i:L^2_i$; it weights the value read according to its own R matrix to obtain weighted reward values, and selects the maximum weighted reward value to transmit to the 2nd sub-learning modules at the same position in the adjacent virtual router groups;
in the layer 3 network structure, the 3rd sub-learning module $L^3_i:L^2_i:L^1_{0,3}$ reads the maximum reward value of the 3rd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i$ of the layer 3 network structure; it weights the value read according to its own R matrix to obtain weighted reward values, and selects the maximum weighted reward value to transmit to the 3rd sub-learning modules at the same position in the adjacent virtual router groups;
step 2.5: after the path information of the next hop of each sub-learning module of each router node is obtained, the path information of the next hop of each sub-learning module of each router node is transmitted to the hierarchical control module in parallel;
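The reward propagation of steps 2.3 to 2.5 can be sketched for a single layer as a small software simulation. This is an assumption-laden illustration, not the hardware design: it sweeps the nodes sequentially rather than in parallel, uses the embodiment's penalties (1, 4 and 13) and target reward (39), treats off-mesh neighbours as device edges, and the function and variable names are hypothetical.

```python
# Congestion level of a hop -> amount subtracted from the neighbour's reward.
PENALTY = {0: 1, 1: 4, 2: 13}


def plan(n, congestion, dest, target=39):
    """Propagate reward values on an n x n mesh until they stop changing.

    congestion[(a, b)] is the congestion level of the hop from node a to
    node b; hops that are not listed are treated as free (level 0).
    Returns (reward, next_hop), both keyed by (row, col).
    """
    nodes = [(r, c) for r in range(n) for c in range(n)]
    reward = {v: 0 for v in nodes}
    reward[dest] = target                  # step 2.3: destination gets the target reward
    next_hop = {dest: None}
    changed = True
    while changed:
        changed = False
        for v in nodes:
            if v == dest:
                continue
            r, c = v
            for u in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if u not in reward:
                    continue               # device edge: no neighbour there
                # step 2.4: weight the neighbour's reward by the hop's congestion
                candidate = reward[u] - PENALTY[congestion.get((v, u), 0)]
                if candidate > reward[v]:
                    reward[v] = candidate  # keep the maximum weighted reward
                    next_hop[v] = u        # step 2.5: remember the chosen direction
                    changed = True
    return reward, next_hop


# 2 x 2 group with a secondary blockage on the hop (0, 0) -> (0, 1).
reward, hop = plan(2, {((0, 0), (0, 1)): 2}, dest=(1, 1))
print(reward[(0, 0)], hop[(0, 0)])
```

In this example node (0, 0) avoids the blocked hop and routes through (1, 0), because two free hops cost less than one free hop plus a secondary blockage. Rewards only ever increase and are bounded by the target value, so the loop terminates.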
step 3: correct the path information in the hierarchical control module, namely: correct the path information from the higher-layer network structure down to the lower-layer network structure, then transmit the corrected path information in reverse from the lower-layer network structure up to the higher-layer network structure, so that each sub-learning module of the router node obtains the corrected path information and stores it in the routing table of the corresponding layer of the network structure;
step 4: assign i + 1 to i and return to step 2.3 until i = max - 1, thereby completing the path planning with each router node as a destination node, where max represents the maximum number of router nodes in a virtual router group in each layer of the network structure;
step 5: transmission of network-on-chip data packets:
to access the routing tables of the layered network structure, a data packet in the network-on-chip passes through the input port and then the multi-way gate, and the multi-way gate accesses the routing table selection module in the learning module;
the routing table selection module reads the position information of the destination router node in the data packet, accesses the routing table according to the access rule and retrieves the path information stored there;
if the retrieved path information is return information, the destination router node passes the data packet to its packet receiver; otherwise, the data packet is sent to the output port indicated by the retrieved path information, which completes the transmission of the data packet.
The routing algorithm based on the hierarchical Q-routing plan is also characterized in that the step 3 is carried out according to the following steps:
step 3.1: the learning mode arbitrator determines the source of the path information sent to the hierarchical control module; if the source is one of the sub-learning modules $L^3_i:L^2_i:L^1_{0,h}$, $h = 1, 2, 3$ (the 1st, 2nd or 3rd sub-learning module), the path information comes from the sub-learning module at the same position in an adjacent virtual router group; otherwise, the path information comes from the h-th sub-learning module corresponding to the 0th router node in the same virtual router group;
step 3.2: correct the path information from the layer 3 network structure to the layer 1 network structure in sequence according to the correction rule;
step 3.3: judge whether the reverse transmission rule is met; if so, perform reverse transmission from the layer 1 network structure to the layer 3 network structure in sequence according to the reverse transmission rule; otherwise, perform it from the layer 1 network structure to the layer 3 network structure in sequence according to strategy 1 or strategy 2, where strategy 1 is to take the uncorrected path information of the current layer directly, and strategy 2 is to take the uncorrected path information of the lower layer;
step 3.4: send the corrected and reverse-transmitted path information to the routing table of the corresponding layer of the network structure.
The correction rule is as follows:
if the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ is located at the position adjoining the adjacent virtual router group in the layer 3 network structure, and the path information in the 3rd sub-learning module $L^3_i:L^2_i:L^1_{i,3}$ requires transmission across to that adjacent virtual router group, then the path information of the 2nd sub-learning module is replaced by the path information of the 3rd sub-learning module and a correction signal is generated; otherwise, the path information of the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ is retained;
if the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ is located at the position adjoining the adjacent virtual router group in the layer 2 network structure, and the path information in the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ requires transmission across to that adjacent virtual router group, then the path information of the 1st sub-learning module is replaced by the path information of the 2nd sub-learning module and a correction signal is generated; otherwise, the path information of the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ is retained;
the reverse transmission rule is as follows:
if the path information of the sub-learning module $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2$ (the 1st or 2nd sub-learning module), has been corrected, that path information is transmitted in reverse to the sub-learning module $L^3_i:L^2_i:L^1_{i,h+1}$ (the 2nd or 3rd sub-learning module, respectively); otherwise, no reverse transmission is performed.
The target reward value is $(p-1) \times q$, where p represents the number of router nodes in a virtual router group in the corresponding layer of the network structure, and q represents the weighting value corresponding to the maximum congestion level.
The rule of the weighting processing is as follows:
if the next hop is a free path, y is subtracted from the maximum reward value;
if the next hop is a primary blockage, 3 × y + 1 is subtracted from the maximum reward value;
if the next hop is a secondary blockage, 3 × (3 × y + 1) + 1 is subtracted from the maximum reward value;
if the next hop is a device edge or is temporarily deactivated, the maximum reward value is set to zero; where y represents a positive integer.
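The weighting rule and the target reward formula above can be checked with a short sketch. The function names and state labels are hypothetical, and the code assumes that q, the weighting value for the maximum congestion level, equals the secondary-blockage penalty; this is consistent with the embodiment's value of 13 for y = 1 but is not stated outright.

```python
def penalties(y):
    """Amounts subtracted for a free path, a primary and a secondary blockage."""
    free = y
    primary = 3 * y + 1
    secondary = 3 * (3 * y + 1) + 1
    return free, primary, secondary


def weighted_reward(max_reward, state, y=1):
    """Apply the weighting rule to a reward read from a neighbour.

    state is one of 'free', 'primary', 'secondary' or 'unreachable'
    (device edge or temporarily deactivated); the labels are illustrative.
    """
    if state == "unreachable":
        return 0
    free, primary, secondary = penalties(y)
    cost = {"free": free, "primary": primary, "secondary": secondary}[state]
    return max_reward - cost


def target_reward(p, y=1):
    """(p - 1) * q, assuming q is the secondary-blockage penalty."""
    q = penalties(y)[2]
    return (p - 1) * q


print(penalties(1))        # penalties for y = 1
print(target_reward(4))    # groups of 4 router nodes
```

For y = 1 the penalties are 1, 4 and 13, and a group of 4 nodes gives a target reward of (4 - 1) × 13 = 39, the values used in the embodiment. Each penalty is three times the previous one plus one, so one more congested hop always costs more than three hops at the level below.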
The access rule is as follows:
step a, initializing i to 3;
step b, starting from the layer i network structure, compare the position information of the destination router node of the data packet: if the destination router node is in the group corresponding to the layer i network structure, assign i - 1 to i and return to step b until i = 1; otherwise, the data packet accesses the routing table of the layer i network structure.
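The access rule can be sketched as a short loop. This is one possible reading, stated as an assumption: each node and each destination carries a three-part address (layer-3 position, layer-2 position, layer-1 position), and "the destination is in the group corresponding to layer i" is taken to mean that the current node and the destination share the same position at layer i. The function name is hypothetical.

```python
def select_layer(current, dest):
    """Return the layer (3, 2 or 1) whose routing table the packet reads.

    current and dest are (a3, a2, a1) tuples: positions at layer 3,
    layer 2 and layer 1 respectively (assumed address format).
    """
    i = 3                                   # step a: start at the top layer
    # step b: descend while the destination lies in the same layer-i group
    while i > 1 and current[3 - i] == dest[3 - i]:
        i -= 1
    return i


here = (0, 1, 2)
print(select_layer(here, (3, 0, 0)))   # different layer-3 position
print(select_layer(here, (0, 2, 0)))   # same layer-3, different layer-2 position
print(select_layer(here, (0, 1, 3)))   # same layer-1 group
```

Far-away destinations are thus routed with the coarse layer 3 table, and the packet switches to finer tables only once it is inside the destination's group, which is why each table needs only a handful of entries.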
Compared with the prior art, the invention has the beneficial effects that:
1. The network-on-chip routing method based on hierarchical Q-routing planning of the invention forms NoC systems of various scales by reusing the learning module; each router node performs hierarchical parallel learning, which greatly reduces the time needed for path planning and so adapts better to real-time changes in NoC congestion.
2. The invention divides and compresses the routing table and combines several routing tables into one complete routing table through mapping, thereby greatly reducing the resources occupied by the routing table; the reduction grows as the NoC scales up. Taking an 8 × 8 routing network as an example, 8 × 8 × 4 bits are needed before layering is introduced (4 bits represent the four directions), whereas three-layer Q-routing needs only 3 × 4 × 4 bits (4 nodes in each area), a reduction in area resources of about 80%.
3. By reducing the number of learning layers for some nodes, so that a small number of nodes perform multi-layer learning while most nodes do not, the invention reduces the circuit area consumed by Q-routing and also eases the placement and routing of the whole system circuit.
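The routing-table arithmetic quoted in benefit 2 can be checked directly. The variable names are illustrative; the figures are the ones given in the text for an 8 × 8 network.

```python
# Flat routing table: 8 x 8 destinations, 4 bits each (one bit per direction).
flat_bits = 8 * 8 * 4

# Three layers, 4 entries per layer, 4 bits each.
layered_bits = 3 * 4 * 4

reduction = 1 - layered_bits / flat_bits
print(flat_bits, layered_bits, f"{reduction:.2%}")
```

The flat table needs 256 bits and the layered tables 48 bits, a reduction of 81.25%, in line with the roughly 80% stated above.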
Drawings
FIG. 1 is a diagram of a router node structure according to the present invention;
FIG. 2 is a schematic diagram of a learning module configuration according to the present invention;
FIG. 3 is a hierarchical diagram of a router node according to the present invention;
FIG. 4 is a block diagram of a hierarchical Q-routing implementation of the present invention;
FIG. 5 is an exemplary diagram of a correction rule and a reverse transmission rule according to the present invention;
FIG. 6 is a flow chart of a routing table read in accordance with the present invention;
FIG. 7 is a flow chart of an example NoC system of the present invention.
Detailed Description
In this embodiment, the routing method based on hierarchical Q-routing planning is applied to a network on chip comprising 64 router nodes, 64 resource nodes and a plurality of interconnection channels, configured with reference to the learning module arrangement shown in FIG. 1 and FIG. 2. Each router node includes an input port, an output port, a congestion sensor, a multi-way gate and an access routing table, and a learning module is arranged in each router node. Referring to FIG. 4, the learning module includes: a learning mode arbitrator, a hierarchical control module, a routing table selection module, 3 sub-learning modules and 3 routing tables. The number of network layers, sub-learning modules and routing tables can be increased to accommodate networks with more router nodes. Referring to FIG. 7, the routing method is performed as follows:
step 1: referring to fig. 3, all router nodes are divided into three-layer network structures according to the following rules, so as to form a pyramid structure; the purpose of such layering is to enable parallel learning and obtain a transmission path with short time consumption and high transmission rate. The rule is:
in the layer 1 network structure, the 64 router nodes are divided into groups of 4 nodes each, thereby forming a layer 1 network structure consisting of 16 virtual router groups;
in the layer 2 network structure, the 16 virtual router groups are divided into groups of 4 each, thereby forming a layer 2 network structure consisting of 4 virtual router groups;
in the layer 3 network structure, the 4 virtual router groups are divided into a single group ($z^2 = 4$), thereby forming a layer 3 network structure consisting of 1 virtual router group;
the 3 sub-learning modules and the 3 routing tables correspond to the three layers of the network structure; each sub-learning module includes an R matrix and a Q-value comparator;
let $L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, where $i < x^2$ and $h = 1, 2, 3$;
let $L^j_i$ denote the i-th virtual router group in the layer j network structure, where $j \neq 1$;
let $L^3_i:L^2_i$ denote the i-th virtual router group of the layer 2 network structure inside the i-th virtual router group of the layer 3 network structure;
let $L^3_i:L^2_i:L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node of the layer 1 network structure, inside the i-th virtual router group of the layer 2 network structure, inside the i-th virtual router group of the layer 3 network structure;
step 2: each router node senses the congestion degree and learns in parallel in each layer of network structure;
step 2.1: take the router node of the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ as the current router node; in the layer 1 network structure, the congestion sensor of the current router node counts the occupancy of its input ports and the traffic on its output ports, thereby obtaining the layer 1 congestion level on each path, which is stored in the R matrix of the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$. For the congestion level R value, 3'b000 represents a free channel, 3'b001 a primary blockage, 3'b010 a secondary blockage, and 3'b111 an unreachable channel (fully blocked or temporarily deactivated).
In the layer 2 network structure, the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ average-pools the congestion levels of all nodes in the i-th virtual router group of the layer 2 network structure to obtain a pooled congestion level, and stores it in its own R matrix;
in the layer 3 network structure, the 3rd sub-learning module $L^3_i:L^2_i:L^1_{i,3}$ average-pools the congestion levels of all nodes in the i-th virtual router group of the layer 3 network structure to obtain a pooled congestion level, and stores it in its own R matrix in the topmost network structure;
step 2.2: initializing i to 0;
step 2.3: initialize the reward values in the sub-learning modules $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2, 3$, to the target reward value. In this embodiment the target reward value is 39: the number of router nodes in a virtual router group is 4 and the weighting value corresponding to the maximum congestion level is 13, so (4 - 1) × 13 = 39. Initialize the reward values in the sub-learning modules of all other router nodes to 0;
step 2.4: in the layer 1 network structure, the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ reads the maximum reward values of the sub-learning modules on the adjacent router nodes, weights each value read according to its own R matrix to obtain weighted reward values, and selects the maximum weighted reward value to transmit to all adjacent router nodes;
in the layer 2 network structure, the 2nd sub-learning module $L^3_i:L^2_i:L^1_{0,2}$ reads the maximum reward value of the 2nd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i:L^2_i$; it weights the value read according to its own R matrix to obtain weighted reward values, and selects the maximum weighted reward value to transmit to the 2nd sub-learning modules at the same position in the adjacent virtual router groups;
in the layer 3 network structure, the 3rd sub-learning module $L^3_i:L^2_i:L^1_{0,3}$ reads the maximum reward value of the 3rd sub-learning module at the same position in each adjacent virtual router group and takes it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i$ of the layer 3 network structure; it weights the value read according to its own R matrix to obtain weighted reward values, and selects the maximum weighted reward value to transmit to the 3rd sub-learning modules at the same position in the adjacent virtual router groups;
in specific implementation, the rule of the weighting process is as follows:
if the next hop is a free path, 1 is subtracted from the maximum reward value;
if the next hop is a primary blockage, 4 is subtracted from the maximum reward value;
if the next hop is a secondary blockage, 13 is subtracted from the maximum reward value;
if the next hop is a device edge or is temporarily deactivated, the maximum reward value is set to zero;
step 2.5: after the path information of the next hop of each sub-learning module of each router node is obtained, the path information of the next hop of each sub-learning module of each router node is transmitted to the hierarchical control module in parallel;
step 3: referring to FIG. 5, correct the path information in the hierarchical control module, namely: correct the path information from the higher-layer network structure down to the lower-layer network structure, then transmit the corrected path information in reverse from the lower-layer network structure up to the higher-layer network structure, so that each sub-learning module of the router node obtains the corrected path information and stores it in the routing table of the corresponding layer of the network structure;
step 3.1: the learning mode arbitrator determines the source of the path information sent to the hierarchical control module; if the source is one of the sub-learning modules $L^3_i:L^2_i:L^1_{0,h}$, $h = 1, 2, 3$ (the 1st, 2nd or 3rd sub-learning module), the path information comes from the sub-learning module at the same position in an adjacent virtual router group; otherwise, the path information comes from the h-th sub-learning module corresponding to the 0th router node in the same virtual router group;
step 3.2: correct the path information from the layer 3 network structure to the layer 1 network structure in sequence according to the correction rule;
step 3.3: judge whether the reverse transmission rule is met; if so, perform reverse transmission from the layer 1 network structure to the layer 3 network structure in sequence according to the reverse transmission rule; otherwise, perform it from the layer 1 network structure to the layer 3 network structure in sequence according to strategy 1 or strategy 2, where strategy 1 is to take the uncorrected path information of the current layer directly, and strategy 2 is to take the uncorrected path information of the lower layer;
step 3.4: send the corrected and reverse-transmitted path information to the routing table of the corresponding layer of the network structure.
The correction rule is:
if the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ is located at the position adjoining the adjacent virtual router group in the layer 3 network structure, and the path information in the 3rd sub-learning module $L^3_i:L^2_i:L^1_{i,3}$ requires transmission across to that adjacent virtual router group, then the path information of the 2nd sub-learning module is replaced by the path information of the 3rd sub-learning module and a correction signal is generated; otherwise, the path information of the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ is retained;
if the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ is located at the position adjoining the adjacent virtual router group in the layer 2 network structure, and the path information in the 2nd sub-learning module $L^3_i:L^2_i:L^1_{i,2}$ requires transmission across to that adjacent virtual router group, then the path information of the 1st sub-learning module is replaced by the path information of the 2nd sub-learning module and a correction signal is generated; otherwise, the path information of the 1st sub-learning module $L^3_i:L^2_i:L^1_{i,1}$ is retained;
the reverse transmission rule is as follows:
if the path information of the sub-learning module $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2$ (the 1st or 2nd sub-learning module), has been corrected, that path information is transmitted in reverse to the sub-learning module $L^3_i:L^2_i:L^1_{i,h+1}$ (the 2nd or 3rd sub-learning module, respectively); otherwise, no reverse transmission is performed;
step 4: assigning i + 1 to i and returning to step 2.3 until i = 3, thereby completing the path planning with each router node as the destination node.
step 5: referring to Fig. 5, network-on-chip data packets are transmitted as follows:
to access the routing tables of the layer network structures, a data packet in the network-on-chip passes through the input port and then the multi-way gate, and the multi-way gate accesses the routing table selection module in the learning module;
the routing table selection module reads the position information of the destination router node carried in the data packet, including its node information in layer 3, layer 2 and layer 1, then accesses the routing table according to the following access rule and retrieves the path information stored in it:
step a: initializing i = 3;
step b: starting from the i-th layer network structure, the data packet is compared according to the position information of the destination router node; if the destination router node of the data packet is in the group corresponding to the i-th layer network structure, i - 1 is assigned to i and step b is repeated until i = 1; otherwise, the data packet accesses the routing table in the i-th layer network structure.
According to the path information retrieved from the routing table: if the retrieved path information is return information, the destination router node delivers the data packet to its own packet receiver; otherwise, the data packet is forwarded to the output port indicated by the retrieved path information, which completes the transmission of the data packet.
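The access rule of steps a and b can be sketched in Python as follows. This is only an illustrative sketch, not the hardware described in the patent: the `Address` tuple, the functions `select_layer` and `forward`, the dictionary-based routing tables, and the `RETURN` marker are hypothetical names introduced here. It assumes that a node position is given as (layer 3, layer 2, layer 1) indices and that each layer's routing table is keyed by the destination index at that layer.

```python
from typing import Dict, NamedTuple


class Address(NamedTuple):
    """Hypothetical position info carried by a packet (not from the patent)."""
    layer3: int  # index compared at layer 3
    layer2: int  # index compared at layer 2
    layer1: int  # index of the router node inside its layer 1 group


RETURN = "RETURN"  # hypothetical marker for "return information"


def select_layer(current: Address, dest: Address) -> int:
    """Steps a and b: start at i = 3 and descend while the destination
    lies in the same group as the current node at layer i."""
    i = 3
    while i > 1 and current[3 - i] == dest[3 - i]:
        i -= 1
    return i


def forward(current: Address, dest: Address,
            tables: Dict[int, Dict[int, str]]) -> str:
    """Look up the path information in the selected layer's routing table."""
    i = select_layer(current, dest)
    path_info = tables[i][dest[3 - i]]
    if path_info == RETURN:
        # The packet has arrived: hand it to the local packet receiver.
        return "packet_receiver"
    # Otherwise forward through the output port named by the path info.
    return f"output_port_{path_info}"
```

With `current = Address(0, 1, 2)`, a destination `Address(2, 0, 0)` differs at layer 3, so the layer 3 table is consulted; a destination `Address(0, 1, 3)` matches at layers 3 and 2, so the layer 1 table is consulted.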

Claims (6)

1. A routing method based on hierarchical Q-routing planning, applied to a network on chip consisting of w router nodes, w resource nodes and a plurality of interconnecting channels, wherein the router nodes comprise input ports, output ports, congestion sensors, multi-way gates and access routing tables; characterized in that a learning module is arranged in the router node; the learning module comprises: a learning mode arbitrator, a hierarchical control module, a routing table selection module, 3 sub-learning modules and 3 routing tables; the routing method comprises the following steps:
step 1: dividing all router nodes into a three-layer network structure according to the following rule, thereby forming a pyramid structure; the rule is as follows:
in the layer 1 network structure, the w router nodes are divided into groups of $x^2$ nodes each, thereby forming a layer 1 network structure consisting of $\frac{w}{x^2}$ virtual router groups;
in the layer 2 network structure, the $\frac{w}{x^2}$ virtual router groups are divided into groups of $y^2$ each, thereby forming a layer 2 network structure consisting of $\frac{w}{x^2 y^2}$ virtual router groups;
in the layer 3 network structure, the $\frac{w}{x^2 y^2}$ virtual router groups are divided into a group of $z^2$, thereby forming a layer 3 network structure consisting of 1 virtual router group;
the 3 sub-learning modules and the 3 routing tables correspond to the respective layers of the network structure; each sub-learning module includes an R matrix and a Q-value comparator;
let $L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, where $i < x^2$ and $h = 1, 2, 3$;
let $L^j_i$ denote the i-th virtual router group in the j-th layer network structure, where $j \neq 1$;
let $L^3_i:L^2_i$ denote the i-th virtual router group in the layer 2 network structure within the i-th virtual router group in the layer 3 network structure;
let $L^3_i:L^2_i:L^1_{i,h}$ denote the h-th sub-learning module corresponding to the i-th router node in the layer 1 network structure, within the i-th virtual router group in the layer 2 network structure, within the i-th virtual router group in the layer 3 network structure;
step 2: each router node senses the degree of congestion and learns in parallel in each layer of the network structure;
step 2.1: taking the router node of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ as the current router node; for the current router node in the layer 1 network structure, the congestion sensor of the current router node counts the occupancy of its input ports and the traffic of its output ports, thereby obtaining the layer 1 congestion level on each path, which is stored in the R matrix of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ in the layer 1 network structure;
for the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure, average pooling is performed on the congestion levels of all nodes in the i-th virtual router group in the layer 2 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ in the layer 2 network structure;
for the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the layer 3 network structure, average pooling is performed on the congestion levels of all nodes in the i-th virtual router group in the layer 3 network structure to obtain a pooled congestion level, which is stored in the R matrix of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ in the topmost network structure;
step 2.2: initializing i to 0;
step 2.3: initializing the reward values in the sub-learning modules denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2, 3$, to the target reward value; initializing the reward values in the sub-learning modules of the other router nodes to 0;
step 2.4: for the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ in the layer 1 network structure, reading the maximum reward value of the sub-learning modules on the adjacent router nodes, weighting the read maximum reward value according to the R matrix in the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ to obtain weighted reward values, and selecting the maximum weighted reward value for transmission to all adjacent router nodes;
for the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$ in the layer 2 network structure, reading the maximum reward value of the 2nd sub-learning module located at the same position in the virtual router group adjacent to the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$, and taking it as the maximum reward value of all nodes in the i-th virtual router group denoted by $L^3_i:L^2_i$; weighting the read maximum reward value according to the R matrix in the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$ to obtain weighted reward values, and selecting the maximum weighted reward value for transmission to the 2nd sub-learning module located at the same position in the virtual router group adjacent to the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,2}$;
for the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$ in the layer 3 network structure, reading the maximum reward value of the 3rd sub-learning module located at the same position in the virtual router group adjacent to the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$, and taking it as the maximum reward value of all nodes in the i-th virtual router group $L^3_i$ in the layer 3 network structure; weighting the read maximum reward value according to the R matrix in the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$ in the layer 3 network structure to obtain weighted reward values, and selecting the maximum weighted reward value for transmission to the 3rd sub-learning module located at the same position in the virtual router group adjacent to the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,3}$;
step 2.5: after the path information of the next hop of each sub-learning module of each router node is obtained, the path information of the next hop of each sub-learning module of each router node is transmitted to the hierarchical control module in parallel;
step 3: correcting the path information in the hierarchical control module, namely: correcting the path information from the higher-layer network structure to the lower-layer network structure, and then reverse-transmitting the corrected path information from the lower-layer network structure to the higher-layer network structure, so that each sub-learning module of the router node obtains the corrected path information and stores it in the routing table of the corresponding layer network structure;
step 4: assigning i + 1 to i and returning to step 2.3 until i = max - 1, thereby completing the path planning with each router node as the destination node, wherein max denotes the maximum number of router nodes in a virtual router group in each layer of the network structure;
step 5: transmission of network-on-chip data packets:
to access the routing tables of the layer network structures, a data packet in the network-on-chip passes through the input port and then the multi-way gate, and the multi-way gate accesses the routing table selection module in the learning module;
the routing table selection module reads the position information of the destination router node carried in the data packet, accesses the routing table according to the access rule, and retrieves the path information stored in the routing table;
according to the path information retrieved from the routing table: if the retrieved path information is return information, the destination router node delivers the data packet to its own packet receiver; otherwise, the data packet is forwarded to the output port indicated by the retrieved path information, which completes the transmission of the data packet.
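The grouping of step 1 and the average pooling of step 2.1 can be illustrated with a short Python sketch. It rests on assumptions that the claim does not state: the mesh is square with side x·y·z, each group of $x^2$, $y^2$ or $z^2$ members is a square tile, indices are row-major, and each node has a single scalar congestion level (the claim stores a level per path). The function names `group_counts`, `hierarchical_address` and `average_pool` are hypothetical.

```python
from typing import List, Tuple


def group_counts(w: int, x: int, y: int, z: int) -> Tuple[int, int, int]:
    """Number of virtual router groups in layers 1, 2 and 3."""
    if w != (x * y * z) ** 2:
        raise ValueError("this sketch assumes w = x^2 * y^2 * z^2")
    return w // x**2, w // (x**2 * y**2), w // (x * y * z) ** 2


def hierarchical_address(row: int, col: int,
                         x: int, y: int, z: int) -> Tuple[int, int, int]:
    """Map mesh coordinates to (layer 3, layer 2, layer 1) indices."""
    # Layer 1 index: position of the node inside its x-by-x tile.
    n1 = (row % x) * x + (col % x)
    # Layer 2 index: position of that tile inside its y-by-y block of tiles.
    t_row, t_col = row // x, col // x
    n2 = (t_row % y) * y + (t_col % y)
    # Layer 3 index: position of that block inside the z-by-z top level.
    b_row, b_col = t_row // y, t_col // y
    n3 = b_row * z + b_col
    return n3, n2, n1


def average_pool(levels: List[List[float]], tile: int) -> List[List[float]]:
    """Average congestion levels over non-overlapping tile-by-tile windows."""
    n = len(levels)
    if n % tile != 0:
        raise ValueError("mesh side must be a multiple of the tile size")
    pooled = []
    for r in range(0, n, tile):
        pooled_row = []
        for c in range(0, n, tile):
            window = [levels[rr][cc]
                      for rr in range(r, r + tile)
                      for cc in range(c, c + tile)]
            pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled
```

For example, with w = 64 and x = y = z = 2, `group_counts(64, 2, 2, 2)` gives 16, 4 and 1 groups for layers 1, 2 and 3, and the node at row 5, column 6 maps to the address (3, 1, 2). Which tile size feeds which layer's R matrix is left to the caller in this sketch.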
2. The routing method based on hierarchical Q-routing planning of claim 1, wherein the step 3 is performed as follows:
step 3.1, the learning mode arbitrator determines the source of the path information sent to the hierarchical control module; if the source is the 1st, 2nd or 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{0,h}$, $h = 1, 2, 3$, the path information comes from the sub-learning module located at the same position in the adjacent virtual router group; otherwise, the path information comes from the h-th sub-learning module corresponding to the 0th router node in the same virtual router group;
step 3.2, correcting the path information sequentially from the layer 3 network structure to the layer 1 network structure according to the correction rule;
step 3.3, determining whether the reverse transmission rule is satisfied; if so, performing reverse transmission sequentially from the layer 1 network structure to the layer 3 network structure according to the reverse transmission rule; otherwise, performing reverse transmission sequentially from the layer 1 network structure to the layer 3 network structure according to strategy 1 or strategy 2; wherein strategy 1 is to directly take the uncorrected path information of the current layer, and strategy 2 is to take the uncorrected path information of the lower layer;
step 3.4, sending the corrected and reverse-transmitted path information to the routing table of the corresponding layer network structure.
3. The routing method based on hierarchical Q-routing planning of claim 2, wherein
the correction rule is as follows:
if the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is located at the position of the adjacent virtual router group in the layer 3 grid structure, and the path information in the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$ needs to be transmitted across the adjacent virtual router group, then the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is changed to the path information of the 3rd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,3}$, and a correction signal is generated; otherwise, the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ is retained;
if the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is located at the position of the adjacent virtual router group in the layer 2 grid structure, and the path information in the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$ needs to be transmitted across the adjacent virtual router group, then the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is changed to the path information of the 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,2}$, and a correction signal is generated; otherwise, the path information of the 1st sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,1}$ is retained;
the reverse transmission rule is as follows:
if the path information of the 1st or 2nd sub-learning module denoted by $L^3_i:L^2_i:L^1_{i,h}$, $h = 1, 2$, has been corrected, that path information is reverse-transmitted to the 2nd or 3rd sub-learning module, respectively, denoted by $L^3_i:L^2_i:L^1_{i,(h+1)}$; otherwise, no reverse transmission is performed.
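The top-down correction pass of claim 3 can be sketched in Python as follows. The sketch covers only the correction rule and the generation of correction signals; reverse transmission and strategies 1 and 2 are not modelled. The function name `correct_paths` and the two flag dictionaries are hypothetical: `at_group_edge[h]` is assumed to say whether the h-th sub-learning module sits at the position of the adjacent virtual router group in the layer h+1 grid, and `must_cross[h + 1]` whether the path information of the (h+1)-th sub-learning module has to cross into that adjacent group. Both flags are assumed to be evaluated by the caller.

```python
from typing import Dict, Tuple


def correct_paths(
    paths: Dict[int, str],
    at_group_edge: Dict[int, bool],
    must_cross: Dict[int, bool],
) -> Tuple[Dict[int, str], Dict[int, bool]]:
    """Apply the correction rule from layer 3 down to layer 1.

    paths maps the sub-learning module index h (1, 2, 3) to its
    next-hop path information. Returns the corrected paths and a
    correction signal per lower module (h = 1, 2).
    """
    corrected = dict(paths)
    signal = {1: False, 2: False}
    for h in (2, 1):  # module 2 is corrected first, then module 1
        if at_group_edge[h] and must_cross[h + 1]:
            # Replace the lower layer's path with the higher layer's path.
            corrected[h] = corrected[h + 1]
            signal[h] = True
        # Otherwise the module keeps its own path information.
    return corrected, signal
```

For instance, with paths `{1: "N", 2: "E", 3: "S"}`, module 2 on the group edge and the layer 3 path crossing that edge, module 2 takes over "S" and raises its correction signal, while module 1 keeps "N".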
4. The routing method based on hierarchical Q-routing planning of claim 1, wherein
the target reward value is $(p-1) \times q$, where p denotes the number of router nodes in the virtual router group in the corresponding layer network structure, and q denotes the weighted value corresponding to the maximum congestion level.
5. The routing method based on hierarchical Q-routing planning of claim 1, wherein
the rule of the weighting processing is as follows:
if the next hop is an unobstructed path, y is subtracted from the maximum reward value;
if the next hop has first-level congestion, $3 \times y + 1$ is subtracted from the maximum reward value;
if the next hop has second-level congestion, $3 \times (3 \times y + 1) + 1$ is subtracted from the maximum reward value;
if the next hop is a device edge or is temporarily deactivated, the maximum reward value is set to zero; wherein y denotes a positive integer.
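The weighting rule of claim 5, together with the target reward value of claim 4 and the selection in step 2.4, can be sketched in Python as follows. The names `HopState`, `target_reward`, `weighted_reward` and `best_next_hop` are hypothetical, as is the representation of neighbours as a dictionary keyed by port name. The sketch does not clamp negative results, since the claims do not say whether a floor applies.

```python
from enum import Enum
from typing import Dict, Tuple


class HopState(Enum):
    CLEAR = "unobstructed path"
    LEVEL1 = "first-level congestion"
    LEVEL2 = "second-level congestion"
    UNAVAILABLE = "device edge or temporarily deactivated"


def target_reward(p: int, q: int) -> int:
    """Target reward value (p - 1) * q from claim 4."""
    return (p - 1) * q


def weighted_reward(max_reward: int, state: HopState, y: int) -> int:
    """Apply the weighting rule of claim 5 to a neighbour's maximum reward."""
    if y < 1:
        raise ValueError("y must be a positive integer")
    if state is HopState.UNAVAILABLE:
        return 0
    penalty = {
        HopState.CLEAR: y,
        HopState.LEVEL1: 3 * y + 1,
        HopState.LEVEL2: 3 * (3 * y + 1) + 1,
    }[state]
    return max_reward - penalty


def best_next_hop(neighbours: Dict[str, Tuple[int, HopState]],
                  y: int) -> Tuple[str, int]:
    """Weight every neighbour's maximum reward and keep the largest one."""
    weighted = {port: weighted_reward(reward, state, y)
                for port, (reward, state) in neighbours.items()}
    port = max(weighted, key=weighted.get)
    return port, weighted[port]
```

With y = 1 the three penalties are 1, 4 and 13. If q is taken to be the largest of these (an assumption about how claims 4 and 5 fit together), a group of p = 4 nodes has a target reward of 39, and a neighbour offering 30 over an unobstructed path (weighted 29) is preferred to one offering 39 over a path with second-level congestion (weighted 26).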
6. The routing method based on hierarchical Q-routing planning of claim 1, wherein
the access rule is as follows:
step a, initializing i = 3;
step b, starting from the i-th layer network structure, the data packet is compared according to the position information of the destination router node; if the destination router node of the data packet is in the group corresponding to the i-th layer network structure, i - 1 is assigned to i and step b is repeated until i = 1; otherwise, the data packet accesses the routing table in the i-th layer network structure.
CN202110389260.2A 2021-04-12 2021-04-12 Routing method based on hierarchical Q-routing planning Active CN113079093B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110389260.2A CN113079093B (en) 2021-04-12 2021-04-12 Routing method based on hierarchical Q-routing planning


Publications (2)

Publication Number Publication Date
CN113079093A CN113079093A (en) 2021-07-06
CN113079093B true CN113079093B (en) 2022-03-15

Family

ID=76617277


Country Status (1)

Country Link
CN (1) CN113079093B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111522775A (en) * 2020-04-22 2020-08-11 Hefei University of Technology Network-on-chip routing device and control method thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9787571B2 (en) * 2014-12-22 2017-10-10 Intel Corporation Link delay based routing apparatus for a network-on-chip
CN111770019B (en) * 2020-05-13 2021-06-15 Xidian University Q-learning optical network-on-chip self-adaptive route planning method based on Dijkstra algorithm


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
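A polar code decoding algorithm based on a pre-judgment mechanism and its VLSI architecture; Du Gaoming et al.; Microelectronics & Computer; 2019-12-31; full text *
Research on network-on-chip hierarchy partitioning and multi-objective mapping techniques; Zhang Zeqi; China Master's Theses Full-text Database; 2009-04-15 (No. 02); full text *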



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant