WO2011120567A1 - Efficient routing across regions of varying bit width

Info

Publication number: WO2011120567A1
Application number: PCT/EP2010/054260
Authority: WO
Inventors: Nigel Pearson; Dominic Nancekievill
Original assignee: Panasonic Corporation
Priority date: 2010-03-30
Filing date: 2010-03-30
Publication date: 2011-10-06

Abstract

A router and a routing method for routing a plurality of sources to at least one destination across a first region of a first bit width and a second region of a second bit width, the first bit width being smaller than the second bit width and the first and second regions being connected to each other by a plurality of merging means which merge a plurality of connections of a first bit width into a single connection of a second bit width. The router comprises first calculating means arranged to calculate, using a known costing algorithm, the cost of each route between each source and each merging means, and first selecting means arranged to select the lowest cost route for each combination of source and merging means. The router also comprises second calculating means arranged to calculate, using a known costing algorithm, the cost of each route between each of the at least one destination and each merging means, and second selecting means arranged to select the lowest cost route for each route between each of the at least one destination and each merging means. The router also comprises third calculating means arranged to calculate, for each merging means, the total cost of the routes found by the first and second selecting means, determining means for determining the merging means with the lowest total cost, and third selecting means arranged to select the lowest cost route through the merging means determined by the determining means.

Description

EFFICIENT ROUTING ACROSS REGIONS OF VARYING BIT WIDTH

The present invention relates to the field of user programmable logic devices (PLDs). More specifically, the present invention is directed to a method of routing in a routing network comprising regions of varying bit width.

Reconfigurable devices/fabrics, such as D-Fabrix (disclosed in, for example, US6353841 and US2002/0157066) are commonly made up of a plurality of interconnected user programmable logic blocks or tiles, the fundamental building blocks of the system.

The tiles can be combined and configured to produce large and complex functional units having various bit widths. In order to efficiently connect these functional units of different width, it is necessary to have a routing network which supports these different widths. These varying widths pose a significant problem to known routing algorithms.

Existing routing algorithms (e.g. Pathfinder®) use an iterative approach in which routing resources such as wires and switches are assigned a cost related to delay and desirability. Routes are selected for all point-to-point links while observing the costs of the resources and a measure of the importance of the point-to-point link being created. On any given iteration, a resource may be used more than once, in such a way that electrical contention would occur and the signals would not be transmitted across the routing network as desired. This is known as congestion. The cost of congested resources is increased to encourage the least important of the links to find an alternative route. This process is repeated until a set of routes that exhibits no congestion is found.
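The cost-raising step of such an iterative scheme can be sketched as follows. This is a heavily simplified illustration and not the Pathfinder algorithm itself: the function names `find_overused` and `penalise`, the capacity of one link per resource and the fixed penalty of 1.0 are all assumptions made for this example.

```python
from collections import Counter

def find_overused(routes, capacity=1):
    """Return the set of resources used by more links than the capacity allows."""
    usage = Counter(res for route in routes.values() for res in set(route))
    return {res for res, count in usage.items() if count > capacity}

def penalise(costs, overused, penalty=1.0):
    """Return a new cost table in which every congested resource costs more."""
    return {res: cost + (penalty if res in overused else 0.0)
            for res, cost in costs.items()}

# Hypothetical example: two links both claim wire "w2".
costs = {"w1": 1.0, "w2": 1.0, "w3": 1.0}
routes = {"link_a": ["w1", "w2"], "link_b": ["w2", "w3"]}

overused = find_overused(routes)
new_costs = penalise(costs, overused)
```

In a full router, the links would be re-routed against `new_costs` and the check repeated until `find_overused` returns an empty set or an iteration limit is reached.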

In routing from a narrow region to a wider region a problem occurs in identifying the architecture resource to use that will merge the bits of a signal from the narrow region onto a single routing resource in the wider region. Typically, a narrow region can be connected to a wider region by way of merging means, which will normally consist of hardwired connections.

If two signals are to be routed from two separate start points in a narrow routing region to a single end point in a wider region, a typical prior art routing algorithm would find the least costly route from the first start point to the end point, and then find the least costly route from the second start point to the end point. These two routes may not pass through the same merging device. If so, the signals sent along those routes will not properly merge and, at the very least, it will be necessary to use more routing and processing resources in the wider region to combine the two signals.

In a worst case scenario, a prior art router continues to increase the cost of the congested resources until the routing algorithm reaches a fixed number of attempts and fails, thereby providing no routing solution whatsoever.

Accordingly, in order to effectively route signals across a boundary between regions of varying bit width, it will be necessary to have an improved routing algorithm.

In order to solve the problems associated with the prior art, the present invention provides a routing method for routing a plurality of sources to at least one destination across a first region of a first bit width and a second region of a second bit width, the first bit width being smaller than the second bit width and the first and second regions being connected to each other by a plurality of merging means which merge a plurality of connections of a first bit width into a single connection of a second bit width, the method comprises the steps of: calculating, using a known costing algorithm, the cost of each route between each source and each merging means;

selecting the lowest cost route for each combination of source and merging means;

calculating, using a known costing algorithm, the cost of each route between each of the at least one destination and each merging means;

selecting the lowest cost route for each route between each of the at least one destination and each merging means;

calculating, for each merging means, the total cost of the routes found in the above selecting steps;

determining the merging means with the lowest total cost; and

selecting the lowest cost route through the merging means determined in the determining step.

Preferably, the at least one destination is a single destination. Preferably, the costing algorithm is based on associating a delay value to each routing resource.

Preferably, the costing algorithm is based on associating a criticality value to each routing resource.

The present invention also provides a router for routing a plurality of sources to at least one destination across a first region of a first bit width and a second region of a second bit width, the first bit width being smaller than the second bit width and the first and second regions being connected to each other by a plurality of merging means which merge a plurality of connections of a first bit width into a single connection of a second bit width, the router comprises: first calculating means arranged to calculate, using a known costing algorithm, the cost of each route between each source and each merging means;

first selecting means, arranged to select the lowest cost route for each combination of source and merging means;

second calculating means arranged to calculate, using a known costing algorithm, the cost of each route between each of the at least one destination and each merging means;

second selecting means, arranged to select the lowest cost route for each route between each of the at least one destination and each merging means;

third calculating means arranged to calculate, for each merging means, the total cost of the routes found by the first and second selecting means;

determining means, for determining the merging means with the lowest total cost; and

third selecting means arranged to select the lowest cost route through the merging means determined by the determining means.

Preferably, the costing algorithm is based on associating a delay value to each routing resource.

Preferably, the costing algorithm is based on associating a criticality value to each routing resource. The present invention also provides a computer-readable medium, having computer executable code written thereon, the computer executable code being arranged to cause a computer to perform the above method. The present invention also provides a system for reconfiguring a reconfigurable device, the reconfigurable device having a plurality of routing regions having varying bit widths, the system comprising the above router.

As will be appreciated, the present invention provides several advantages over the prior art. For example, because the method of routing in accordance with the present invention takes into consideration the merge points, it is possible to efficiently route signals across a boundary between regions of varying bit width. Moreover, by using the present invention, it is possible to route optimal paths through a routing network of varying bit widths using a minimal number of routing resources. As will be appreciated, this will result in a much more efficient router.

Specific embodiments of the present invention will now be described with reference to the accompanying drawings, in which:

Figure 1 represents a typical software tool chain for use in reconfigurable logic architectures;

Figure 2 represents a simplified example of a routing network which can be used in accordance with the present invention;

Figure 3 represents a route found using a prior art routing method;

Figure 4 is a flow chart representing a method of one embodiment in accordance with the present invention;

Figure 5 is a table of routing costs associated with the routing solution of Figure 6; and

Figure 6 represents a route found using the method of one embodiment in accordance with the present invention.

To use any reconfigurable logic architecture a set of software tools must be written to convert the description of a design written in a Hardware Description Language (HDL) such as Verilog into a configuration stream which causes the architecture to perform as specified by the design.

Firstly, an HDL such as Verilog or Very High Speed Integrated Circuit Hardware Description Language (VHDL) is used to create a design file representing the design.

Secondly, a Synthesis tool reads the files, builds a data structure in memory representing the logic functions in the design and the connections between them, and performs a sequence of transformation steps on this data structure. For example, optimizations such as replacing an AND gate where one input is a logical "HIGH" with a wire will be performed, and technology mapping steps will find optimal ways to implement the logic functions using the logic resources available on the array.

Thirdly, a Placer tool selects a specific logic resource on the array for each logic resource in the design as output by the Synthesis tool. For example, if the design contains (among other things) an adder and the array provides (among other things) two hundred adders, then the Placer will select which of the two hundred adders on the array to use for the adder in the design.

Fourthly, a Router tool selects specific routing resources on the array to connect each logic resource as described by the design as output by the Synthesis tool. For example, if the adder mentioned above is driven by a register in the user design, and the Placer has placed that register onto one of the registers in the array, then the Router will need to select routing resources (for example wires and multiplexers) to make a connection on the array between the chosen register and adder.

Finally, a Configuration Generation tool calculates a stream of bits which, when fed into the configuration interface of the device, causes it to behave as selected by the Synthesis Tool, the Placer tool and the Router tool.

As described above, the Router tool provides routing whereby the placed elements of the design must be connected together by selecting appropriate architecture resources to achieve the required interconnection.

For the design of a semi-custom reconfigurable logic architecture it may be advantageous to have regions where the bit width of the available routing resources is wider or narrower than in other regions, to suit the logic resources present in those regions. This is particularly useful where a reconfigurable logic architecture has been customised for the characteristics of a particular design space.

Figure 2 shows a routing network 10 having an n-bit wide routing region 11 and an m-bit wide routing region 12, where n and m are positive integers, and n is smaller than m. Figure 2 shows part of a border between the two regions. It will be appreciated that such a routing network could comprise an n-bit wide region 11 completely surrounded by an m-bit wide region 12 or, alternatively, an m-bit wide region 12 completely surrounded by an n-bit wide region 11. Each routing region 11, 12 comprises a plurality of switches 13, 14 and a plurality of connections C which are effectively hardwired connections between switches. As will be appreciated, Figure 2 is a simplified representation of a routing network 10 which is used to describe an embodiment of the present invention. Switch 13 is an n-bit switch which connects an incoming n-bit connection C to an outgoing n-bit connection C. Similarly, switch 14 is an m-bit switch which connects an incoming m-bit connection C to an outgoing m-bit connection C. In practice, switches 13 and 14 can be implemented using multiplexers, as described in, for example, US patent number US6107824.

For the purposes of illustrating the invention, the figures only show the routing network. As will be appreciated however, the routing network will be connected to logic units (not shown) and other elements (not shown) via other connections (not shown) to switches 13 and 14.

The routing regions 11, 12 are interconnected via merging devices M1, M2, M3, M4. Each merging device M1, M2, M3, M4 is arranged to merge two n-bit wide connections into a single m-bit wide connection. As will be appreciated, this can mean that n is half the value of m, or, alternatively, n is less than half the value of m. Each merging device M1, M2, M3, M4 can be implemented using hardwired connections.
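As a rough illustration of what such a merging device does, the sketch below packs two n-bit values into one m-bit word. The function name `merge_connections` and the bit ordering (first input in the low bits, second input directly above it, any remaining high bits zero) are assumptions made for this example; the description does not specify how the bits are arranged.

```python
def merge_connections(a, b, n, m):
    """Pack two n-bit values into a single m-bit word (hypothetical bit layout)."""
    if 2 * n > m:
        raise ValueError("two n-bit connections do not fit into m bits")
    mask = (1 << n) - 1
    if a & ~mask or b & ~mask:
        raise ValueError("inputs must fit in n bits")
    # Assumed layout: a occupies bits [0, n), b occupies bits [n, 2n).
    return (b << n) | a

# Two 4-bit signals merged onto one 8-bit connection (n is half of m).
word = merge_connections(0b1010, 0b0011, n=4, m=8)
```

The check `2 * n > m` mirrors the statement that n is at most half the value of m.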

With reference to Figure 3, an example of how a prior art router would find a route through the architecture shown in Figure 2 will now be described. In this example, the n-bit signals at switches S1 and S2 are to be combined to arrive at switch D. As will be appreciated, the signals to be routed will depart from logic units which are connected to the routing network by way of switches S1 and S2. Similarly, switch D will be connected to a logic unit outside the routing network, which logic unit will be the ultimate destination of the signal.

A prior art router would first associate a timing cost to each of the connections C in the routing network 10. For the purpose of this example, each vertical connection C in routing region 11 is attributed a cost of 2 units and each horizontal connection C in routing region 11 is attributed a cost of 1 unit. Similarly, each vertical connection C in routing region 12 is attributed a cost of 1 unit and each horizontal connection C in routing region 12 is attributed a cost of 2 units. Finally, connections C being input into the merging devices M1, M2, M3, M4 are associated with a cost of 3 units.

Once each connection C is associated with a cost, a prior art router would find the shortest (i.e. least costly) route between switch S1 and D, using, for example, a Dijkstra expansion algorithm. In the example of Figure 3, the path shown by a broken line is one of the shortest paths from S1 to D, having a total cost of 12.

Then, a prior art router would calculate the cost of each route between switch S2 and D, using the same algorithm. Accordingly, a prior art router would choose the path (or one of the paths) with the lowest cost between S2 and D. In the example of Figure 3, the path shown by a broken line is one of the shortest paths from S2 to D, having a total cost of 12.
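The independent, per-source search described above can be sketched with a small Dijkstra expansion. The graph below is a hypothetical toy network and not the network of Figure 3: the node names `MA` and `MB` and all edge costs are invented for illustration, and each edge stands for a whole sub-route.

```python
import heapq

def dijkstra(graph, start, goal):
    """Return (cost, path) of a least-cost route from start to goal."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []

# Hypothetical network: two sources, two merge points, one destination.
graph = {
    "S1": {"MA": 3, "MB": 6},
    "S2": {"MA": 5, "MB": 3},
    "MA": {"D": 4},
    "MB": {"D": 4},
}

cost1, path1 = dijkstra(graph, "S1", "D")  # cheapest route for S1 alone
cost2, path2 = dijkstra(graph, "S2", "D")  # cheapest route for S2 alone
```

In this toy network each source, routed on its own, picks a different merge point (`MA` for S1 and `MB` for S2), even though both routes have the same cost.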

In this situation, a problem arises because the routes have entered the m-bit routing region 12 through different merging devices. The route from S1 to D passes through merging device M2 and the route from S2 to D passes through merging device M4. As will be appreciated, the n bits of the signal coming from S1 will be merged with n other bits taken from the other input of merging device M2, and the n bits of the signal coming from S2 will be merged with n other bits taken from the other input of merging device M4. Thus, each of the n-bit signals coming from S1 and S2 will use an m-bit route once they are merged into the m-bit routing region 12. In this example, m-bit connection R1 will be used for the route between S1 and D and m-bit connection R2 will be used for the route between S2 and D. Because both connections R1 and R2 are connected to connection R3 via the same switch, only one of the signals on connections R1 and R2 will be able to reach switch D at any given time. Thus, an n-bit signal coming from S1 and an n-bit signal coming from S2 will not have been merged into a single m-bit signal at D, but rather into two m-bit signals comprising conflicting information. Thus, when switch D sends the signals on to the logic unit (not shown) to which it is connected, only one of two signals can be sent at any one time: either an m-bit signal comprising the signal from S1 and m-n random bits, or an m-bit signal comprising the signal from S2 and m-n random bits.

Accordingly, the route between S1 and D and the route between S2 and D will congest with each other. This congestion can only be avoided if both routes are chosen in such a way that they use the same merge point M1, M2, M3, M4. Prior art devices however do not provide means to guide this choice. Instead, prior art devices simply keep detecting the inevitable congestion that happens in the wide region, increasing the cost of whatever resources that congestion happens on and performing another routing iteration.

Whilst it is possible that the method of increasing the cost of congested resources in the wide region and performing successive routing iterations may stumble upon a solution in which both routes pass through the same merging device, it is unlikely that this will actually happen. Instead, it is more likely that prior art methods would simply stop at a fixed number of iterations and indicate to a user that no solution can be found.

To overcome this problem, a prior art system would need to increase the number of routing resources (and thereby increase the size of the array and the number of merge points) in order to find a solution. This approach of increasing the number of routing resources in response to the number and position of logic resources in a design, whilst technically possible, is inefficient, particularly when considered in the context of reconfigurable devices.

The present invention, an embodiment of which is shown in Figure 4, overcomes this problem by providing a routing algorithm which divides the task of routing a plurality of sources S to a single destination D over regions of varying bit width into distinct stages. As will be appreciated, depending on the values of n and m, any number of sources S can be merged to any number of destinations D.

The first step of the algorithm in accordance with the embodiment of Figure 4 is the selection of the first source amongst the plurality of sources S. This can be done arbitrarily or by any other known method. Then, the router finds the lowest cost route from that source to a first merging point. After that, if other merge points exist, the router will find the lowest cost route from that first source to every other merge point. The above steps are then repeated for every source in the plurality of sources S.

Once these costs have been found, the router calculates the lowest cost route from the destination to each of the merge points identified above. As will be appreciated by the skilled reader, the method of the present invention can also be used in a situation where a number of sources are being routed to a number of destinations. Then, the total cost for each merge point is calculated and the merge point with the lowest total cost is selected. As will be appreciated, whilst it is possible to replace this step with the step of calculating the lowest cost route from each merge point to the destination, for practical purposes, calculating the lowest cost route from the destination to each of the merge points is advantageous.
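The stages described above can be sketched as follows for a single destination. This is a minimal sketch under stated assumptions rather than the implementation of Figure 4: the function names, the toy graph and its costs are hypothetical, and the expansion from the destination is done on a reversed copy of the graph so that one search yields the cost to every merge point.

```python
import heapq

def shortest_costs(graph, start):
    """Single-source Dijkstra: least cost from start to every reachable node."""
    dist = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return dist

def reverse(graph):
    """Reverse every edge so routes can be expanded from the destination."""
    rev = {}
    for node, edges in graph.items():
        for neighbour, weight in edges.items():
            rev.setdefault(neighbour, {})[node] = weight
    return rev

def choose_merge_point(graph, sources, merge_points, destination):
    """Return (merge point with lowest total cost, total cost per merge point)."""
    inf = float("inf")
    from_source = {s: shortest_costs(graph, s) for s in sources}
    from_dest = shortest_costs(reverse(graph), destination)
    totals = {
        m: sum(from_source[s].get(m, inf) for s in sources) + from_dest.get(m, inf)
        for m in merge_points
    }
    return min(totals, key=totals.get), totals

# Hypothetical network: two sources, two merge points, one destination.
graph = {
    "S1": {"MA": 3, "MB": 6},
    "S2": {"MA": 5, "MB": 3},
    "MA": {"D": 4},
    "MB": {"D": 4},
}
best, totals = choose_merge_point(graph, ["S1", "S2"], ["MA", "MB"], "D")
```

Because every source is costed against the same candidate merge point before a choice is made, all routes are guaranteed to enter the wide region through a single merging device.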

With reference to Figures 5 and 6, an example of how the method of Figure 4 can be used will now be described. Figure 5 represents a table showing calculated costs between two points, and Figure 6 represents the route found in accordance with the method of Figure 4.

First, S1 is identified as the first source. Then, for each merge point (in this example, the merging devices M1, M2, M3, M4 of Figure 6), a cost is calculated from the source S1 to that merge point. These values are shown in the "S1" column of the table of Figure 5. Then, this process is repeated for source S2, thereby populating the "S2" column of Figure 5.

Once the above is complete, the next step in the method is to calculate the cost between the destination D and each merging device M1, M2, M3, M4. These values are represented in the "D" column of Figure 5. The next step is to calculate the total cost for each merge point M1, M2, M3, M4. Finally, the merge point with the lowest total cost is used and the shortest routes to and from that merge point are used to route from the sources S1, S2 to the destination D.
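The table-based selection described above can be illustrated with a small sketch. The cost values below are invented for illustration and are not the values of Figure 5, which is not reproduced in this text; the names `cost_table` and `total_costs` are likewise hypothetical.

```python
# Hypothetical cost table in the style of Figure 5 (all values invented).
cost_table = {
    "M1": {"S1": 8, "S2": 6, "D": 8},
    "M2": {"S1": 6, "S2": 8, "D": 6},
    "M3": {"S1": 7, "S2": 6, "D": 5},
    "M4": {"S1": 9, "S2": 5, "D": 6},
}

def total_costs(table):
    """Add up the source and destination costs for each merge point."""
    return {merge: sum(row.values()) for merge, row in table.items()}

totals = total_costs(cost_table)
best_merge = min(totals, key=totals.get)

# Print the table with its "Total" column, then the selected merge point.
for merge, row in cost_table.items():
    print(merge, row["S1"], row["S2"], row["D"], totals[merge])
print("selected:", best_merge)
```

With these invented values, the merge point that is cheapest for an individual source (M2 for S1, M4 for S2) is not the one with the lowest total.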

As will be appreciated, the order in which the steps of the method of Figure 4 are performed is not important. For example, the first step could be the calculation of the shortest route between the destination D and each of the merging points M1, M2, M3, M4.

Moreover, the example of Figures 4, 5 and 6 uses a very simple costing regime based on a single unit value for each connection C. The skilled reader will understand that any number of alternative cost functions could be used instead. For example, a more likely cost function for a given merge point M is one that chooses a route from the sources S1, S2 to the merge point M, and from the destination D to the merge point M, based on minimizing the costliest single connection C in the route. By minimizing the longest connection C between two switches 13, 14, it is possible to increase the maximum clock frequency at which the circuit will operate. Such a costing scheme can be represented by Cost = max(cost(Si->M)) + cost(M->D).
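The formula Cost = max(cost(Si->M)) + cost(M->D) can be evaluated per merge point as in the sketch below. The function name `max_based_cost` and the numbers are hypothetical and are chosen only to show the arithmetic.

```python
def max_based_cost(source_costs, dest_cost):
    """Cost = max(cost(Si->M)) + cost(M->D) for one merge point M."""
    return max(source_costs) + dest_cost

# Hypothetical per-merge-point costs: ([cost(S1->M), cost(S2->M)], cost(M->D)).
candidates = {
    "MA": ([3, 5], 4),
    "MB": ([6, 3], 4),
}

costs = {m: max_based_cost(src, dst) for m, (src, dst) in candidates.items()}
best = min(costs, key=costs.get)
```

Here only the most expensive source route contributes to the cost of a merge point, so a merge point is penalised for its worst source rather than for the sum over all sources.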

Other factors such as timing criticality and congestion costs may also be introduced to differentiate between different sources of the same net and between different nets. The criticality of a path can be defined as the ratio of the delay of that path to the delay of the longest path in a circuit.
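The definition of criticality given above amounts to a simple ratio, sketched below. The function name `criticality` and the path delays are hypothetical.

```python
def criticality(path_delay, longest_path_delay):
    """Ratio of a path's delay to the delay of the longest path in the circuit."""
    if longest_path_delay <= 0:
        raise ValueError("longest path delay must be positive")
    return path_delay / longest_path_delay

# Hypothetical path delays in arbitrary time units.
delays = {"p1": 2.0, "p2": 8.0, "p3": 4.0}
longest = max(delays.values())
crit = {name: criticality(d, longest) for name, d in delays.items()}
```

The longest path always has a criticality of 1.0, and every other path has a value between 0 and 1.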

Because the different routes through the narrow region come from different sources, they may have varying timing criticalities. This will happen when, for example, one narrow signal was driven directly by a register, whereas the other was driven by a long chain of logic functions driven by a register. The narrow route with the extra logic would have accumulated a significant amount of delay before starting and therefore would be far more likely to be the critical path. In this case, it would be ascribed a high criticality value. One way of costing involves multiplying the delays of each of the narrow routes by their criticality value. This would mean that a merge point would be chosen which was better for the more critical route, rather than the one that was best on average.
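The effect of criticality weighting on the choice of merge point can be sketched as follows. The description only states that the delay of each narrow route is multiplied by its criticality value; summing the weighted delays per merge point is an assumption made for this example, as are the function name `weighted_cost` and all numbers.

```python
def weighted_cost(delays, criticalities):
    """Sum of each narrow route's delay multiplied by its criticality value."""
    return sum(delays[s] * criticalities[s] for s in delays)

# Hypothetical delays from each source to each candidate merge point.
delays_to_merge = {
    "MA": {"S1": 3, "S2": 5},
    "MB": {"S1": 6, "S2": 3},
}
# S1 is driven directly by a register; S2 follows a long chain of logic.
criticalities = {"S1": 0.25, "S2": 1.0}

unweighted = {m: sum(d.values()) for m, d in delays_to_merge.items()}
weighted = {m: weighted_cost(d, criticalities) for m, d in delays_to_merge.items()}

best_unweighted = min(unweighted, key=unweighted.get)
best_weighted = min(weighted, key=weighted.get)
```

In this example the unweighted sum favours `MA`, whereas the weighted cost favours `MB`, the merge point that is better for the more critical route from S2.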

Claims

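1. A routing method for routing a plurality of sources to at least one destination across a first region of a first bit width and a second region of a second bit width, the first bit width being smaller than the second bit width and the first and second regions being connected to each other by a plurality of merging means which merge a plurality of connections of a first bit width into a single connection of a second bit width, the method comprising the steps of:

calculating, using a known costing algorithm, the cost of each route between each source and each merging means;

selecting the lowest cost route for each combination of source and merging means;

calculating, using a known costing algorithm, the cost of each route between each of the at least one destination and each merging means;

selecting the lowest cost route for each route between each of the at least one destination and each merging means;

calculating, for each merging means, the total cost of the routes found in the above selecting steps;

determining the merging means with the lowest total cost; and

selecting the lowest cost route through the merging means determined in the determining step.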

2. The routing method of claim 1, wherein the at least one destination is a single destination.

3. The routing method of any of the preceding claims, wherein the costing algorithm is based on associating a delay value to each routing resource.

4. The routing method of any of the preceding claims, wherein the costing algorithm is based on associating a criticality value to each routing resource.

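5. A router for routing a plurality of sources to at least one destination across a first region of a first bit width and a second region of a second bit width, the first bit width being smaller than the second bit width and the first and second regions being connected to each other by a plurality of merging means which merge a plurality of connections of a first bit width into a single connection of a second bit width, the router comprising:

first calculating means arranged to calculate, using a known costing algorithm, the cost of each route between each source and each merging means;

first selecting means, arranged to select the lowest cost route for each combination of source and merging means;

second calculating means arranged to calculate, using a known costing algorithm, the cost of each route between each of the at least one destination and each merging means;

second selecting means, arranged to select the lowest cost route for each route between each of the at least one destination and each merging means;

third calculating means arranged to calculate, for each merging means, the total cost of the routes found by the first and second selecting means;

determining means, for determining the merging means with the lowest total cost; and

third selecting means arranged to select the lowest cost route through the merging means determined by the determining means.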

6. The router of claim 5, wherein the costing algorithm is based on associating a delay value to each routing resource.

7. The router of any of claims 5 or 6, wherein the costing algorithm is based on associating a criticality value to each routing resource.

8. A computer-readable medium, having computer executable code written thereon, the computer executable code being arranged to cause a computer to perform the method steps of any of claims 1 to 4.

9. A system for reconfiguring a reconfigurable device, the reconfigurable device having a plurality of routing regions having varying bit widths, the system comprising:

a router in accordance with any of claims 5 to 7.