LU502060B1 - Method for constructing practical architecture-level fpga router for logic verification - Google Patents

Method for constructing practical architecture-level fpga router for logic verification

Info

Publication number
LU502060B1
Authority
LU
Luxembourg
Prior art keywords
net
fpga
tdm
edge
ratio
Prior art date
Application number
LU502060A
Other languages
French (fr)
Inventor
Genggeng Liu
Wenzhong Guo
Guolong Chen
Original Assignee
Univ Fuzhou
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Fuzhou
Priority to LU502060A
Application granted
Publication of LU502060B1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/10 Packet switching elements characterised by the switching fabric construction
    • H04L49/109 Integrated on microchip, e.g. switch-on-chip
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G06F15/7871 Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/33 Design verification, e.g. functional simulation or model checking
    • G06F30/3308 Design verification, e.g. functional simulation or model checking using simulation
    • G06F30/331 Design verification, e.g. functional simulation or model checking using simulation with hardware acceleration, e.g. by using field programmable gate array [FPGA] or emulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/34 Circuit design for reconfigurable circuits, e.g. field programmable gate arrays [FPGA] or programmable logic devices [PLD]
    • G06F30/347 Physical level, e.g. placement or routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/48 Routing tree calculation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/20 Support for services
    • H04L49/205 Quality of Service based
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/25 Routing or path finding in a switch fabric
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/40 Constructional details, e.g. power supply, mechanical construction or backplane
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/12 Timing analysis or timing optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Quality & Reliability (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

A method for constructing a practical architecture-level FPGA router for logic verification comprises: S1: generating a routing topology of each net: generating a routing prototype for each net, and before TDM ratio assignment, routing FPGAs in each net or routing all nets in parallel to guarantee the connectivity of the nets; S2: performing TDM ratio assignment: assigning a TDM ratio to each edge of each net according to a delay of each net group; and S3: optimizing a system delay of a system: continuously optimizing net groups with large TDM ratios in parallel by iteration until an iteration end condition is met, so that processing of an entire router is ended. The method can improve chip performance by decreasing a corresponding system delay.

Description

METHOD FOR CONSTRUCTING PRACTICAL ARCHITECTURE-LEVEL FPGA ROUTER FOR LOGIC VERIFICATION
BACKGROUND OF THE INVENTION
[0001] 1. Technical Field
[0002] The invention relates to technical fields relating to computer-aided design of integrated circuits, in particular to a method for constructing a practical architecture-level FPGA router for logic verification.
[0003] 2. Description of Related Art
[0004] With the advance of technology nodes, logic verification has become a time-consuming stage. It is estimated that 60%-80% of the design time of an SoC or application-specific integrated circuit (ASIC) is spent on verification. Software logic simulation, hardware emulation and FPGA prototyping are used for logic verification. However, software logic simulation takes a large amount of runtime because each logic gate must be simulated, and hardware emulation is expensive to implement. FPGA prototyping shortens the runtime and reduces the cost, and is therefore widely used in industry to make logic verification cheaper and faster. Because a prototyping system cannot be implemented in a single FPGA, multiple FPGAs are connected to form a complete system.
[0005] To design a multi-FPGA prototyping system, first, a complete circuit is divided into multiple sub-circuits, each of which fits in one FPGA; then, each sub-circuit is placed on a different FPGA board; and finally, routing between FPGAs is performed with system performance and routing resources taken into account. Because the number of signals between FPGAs is often greater than the number of I/O pins, time-division multiplexing (TDM) is used to transmit different signals at different times through the same wire. However, time-division multiplexing of the signals increases the signal delay between FPGAs.
[0006] The TDM ratio of signals may be used to measure the system delay. In the overall design process, the TDM ratio is generally determined after inter-FPGA routing. Common methods for optimizing the TDM ratio of signals are based on integer linear programming (ILP). However, existing methods for optimizing the TDM ratio of signals regard the TDM ratio as an integer, which deviates greatly from the actual condition. Although many methods have been used to optimize TDM, a good solution can hardly be obtained within a reasonable runtime.
[0007] With the rapid increase of the scale of VLSI, multi-FPGA prototyping systems have been widely applied to logic verification. However, limited connections between FPGAs seriously restrain the routability of the prototyping systems. TDM is therefore used to improve the availability of the prototyping systems, but it leads to a sharp increase of the system delay. How to decrease the corresponding system delay and thereby improve chip performance has thus become an urgent problem.
BRIEF SUMMARY OF THE INVENTION
[0008] In view of this, the objective of the invention is to provide a method for constructing a practical architecture-level FPGA router for logic verification to decrease a corresponding system delay to improve chip performance.
[0009] The invention is implemented through the following solution: a method for constructing a practical architecture-level FPGA router for logic verification comprises the following steps:
[0010] S1: generating a routing topology of each net: generating a routing prototype for each net, and before TDM ratio assignment, connecting FPGAs in each net or routing all nets in parallel to guarantee connectivity of the nets;
[0011] S2: performing TDM ratio assignment: assigning a TDM ratio to each edge of each net or assigning TDM ratios in parallel according to a delay of each net group; and
[0012] S3: optimizing a system delay of a multi-FPGA prototyping system subjected to routing: continuously optimizing net groups with large TDM ratios in parallel by iteration until an iteration end condition is met, so that processing of an entire router is ended.
[0013] Further, S1 specifically comprises the following steps:
[0014] S11: sorting all the nets by priority;
[0015] S12: establishing a routing graph of a current net based on an FPGA connection pair set and an FPGA set in input data of a data set formed by the FPGA set, FPGA connection pairs and net groups, marking out FPGAs to be connected, and marking out a cost of each FPGA connection pair;
[0016] S13: in terms of the established routing graph, routing the current net through a Dijkstra-based Steiner tree algorithm to construct a Steiner tree to connect the FPGAs to be routed;
[0017] S14: saving and recording a routing topology of the current net;
[0018] S15: updating costs of edges in the routing graph; initializing the cost of each FPGA connection pair to 1 before a first loop, and increasing the cost of an FPGA connection pair selected by the current net to route the FPGAs by 1; and
[0019] S16: traversing each net through a for-loop to determine whether FPGAs of all the nets are routed; if so, ending routing; otherwise, performing S12.
[0020] Further, in S15, the costs of the edges in the routing graph are updated specifically as follows: the cost of each FPGA pair is updated; if an FPGA pair is selected by the current net, the selected FPGA pair is used to route the FPGAs, and the cost of the selected FPGA pair is increased by 1.
[0021] Further, S11 is performed specifically as follows:
[0022] Before routing of each net, all the nets are sorted according to indicators: first, all the net groups are sorted in a decreasing order according to the number of nets; then, all nets in each net group are sorted in a decreasing order according to the number of FPGAs; and finally, all the nets are extracted in order.
[0023] Further, S2 specifically comprises the following steps:
[0024] S21: preprocessing each net group, that is, calculating, by counting, a maximum number $ngec_{j,m}$ of edges of a net group $ng_{j,m}$ including an edge $e_{j,k}$;
[0025] S22: calculating a weight ratio $pct_{j,k}$ of each edge $e_{j,k}$ of a current FPGA connection pair to obtain a TDM ratio to be assigned to each edge of each net;
[0026] S23: traversing each net through a for-loop to determine whether all edges are processed; if so, performing S24; otherwise, performing S22;
[0027] S24: calculating a TDM ratio of a current edge, and recording the TDM ratio of the current edge;
[0028] S25: traversing each net through a for-loop to determine whether all edges are processed; if so, performing S26; otherwise, performing S24; and
[0029] S26: traversing each net through a for-loop to determine whether all connection pairs are processed; if so, ending the process; otherwise, performing S22.
[0030] Further, S22 is performed specifically as follows:
[0031] The weight ratio of each edge $e_{j,k}$ is calculated as follows:
$$pct_{j,k} = \frac{ngmec_{j,k}}{\sum_{e_{i,k} \in el_k} ngmec_{i,k}}$$
[0032] $$ngmec_{j,k} = \{x \mid x = \max(ngec_{j,1}, \ldots, ngec_{j,p})\}$$
[0033] Wherein, $ng_{j,m}$ is an m-th net group in $ngl_j$, $ngec_{j,m}$ is the number of edges of the net group $ng_{j,m}$, $ngmec_{j,k}$ is the number of edges of the net group that has the maximum number of edges among the net groups including $e_{j,k}$, $p$ is the number of net groups in $ngl_j$, and $pct_{j,k}$ is the weight ratio; based on the weight ratio, the TDM ratio $etr_{j,k}$ of the edge is calculated as follows:
[0034] $$etr_{j,k} = \frac{1}{pct_{j,k}}$$
[0035] Further, S3 specifically comprises the following steps:
[0036] S31: sorting net groups: sorting all the nets in a decreasing order according to maximum TDM ratios of the net groups to which the nets belong;
[0037] S32: updating $etr_{j,k}$ of each edge $e_{j,k}$ in the current net group;
[0038] S33: determining whether $etr_{j,k}$ of all edges in the net group is updated according to S32; if so, updating $ngl_j$; otherwise, performing S32;
[0039] S34: determining whether all net groups are traversed; if so, performing S35; otherwise, performing S32;
[0040] S35: determining whether the FPGA connection pair $p_k$ meets a TDM ratio constraint; if so, updating $etr_{j,k}$; otherwise, updating $etr_{j,k}$ in the case where $etr_{j,k}$ of the edge $e_{j,k}$ is increased, or validating the edge $e_{j,k}$ in the case where $etr_{j,k}$ of the edge is decreased; and
[0041] S36: determining whether $etr_{j,k}$ of all FPGA connection pairs is updated; if so, ending the process; otherwise, performing S35.
[0042] Further, S32 is performed specifically as follows:
[0043] The TDM ratio of each edge is decreased, and the TDM ratio of each edge is updated according to the following formula:
[0044] $$etr'_{j,k} = \frac{staratio \times etr_{j,k}}{mng_j}$$
[0045] Wherein, $etr'_{j,k}$ is a new TDM ratio of $e_{j,k}$; $staratio$ is an optimization goal of $ngmtr$ and is defined by users; and $mng_j$ is a net group of $ngl_j$ with a maximum TDM ratio.
[0046] Further, S35 is performed specifically as follows:
[0047] The sum $totalpct$ of reciprocals of the TDM ratios of all the edges of the current FPGA connection pair is calculated, and whether $totalpct$ is less than or equal to 1 is determined; if $totalpct$ is less than or equal to 1, it is determined that the current FPGA connection pair meets the TDM ratio constraint, and processing of the current FPGA connection pair is skipped; otherwise, it is determined that the current FPGA connection pair does not meet the TDM ratio constraint, and the following steps are performed for subsequent legalization, that is, if the TDM ratio of the edge $e_{j,k}$ is increased, the new TDM ratio $etr'_{j,k}$ is used directly; or, if the TDM ratio of the edge $e_{j,k}$ is decreased, the TDM ratio of the edge $e_{j,k}$ is validated according to the following equation:
[0048] $$etr''_{j,k} = \frac{etr'_{j,k} \times rec \times (rec + ad)}{1 - ad}$$
[0049] Compared with the prior art, the invention has the following beneficial effects:
[0050] The invention decreases the corresponding system delay, and can also optimize the routing capacity and system delay of the multi-FPGA prototyping system. Through effective optimization of the maximum TDM ratio of net groups, the system delay is effectively stabilized, and the performance of the prototyping system is improved.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0051] FIG. 1 is a schematic diagram of TDM according to one embodiment of the invention.
[0052] FIG. 2 is an overall flow diagram of a router according to one embodiment of the invention.
[0053] FIG. 3 is a flow diagram of routing topology generation according to one embodiment of the invention.
[0054] FIG. 4 is a flow diagram of TDM ratio assignment according to one embodiment of the invention.
[0055] FIG. 5 is an overall flow diagram of key system delay optimization according to one embodiment of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0056] The invention will be further described below in conjunction with the accompanying drawings and embodiments.
[0057] It should be pointed out that the following detailed description is merely illustrative and is intended to further explain the invention. Unless otherwise stated, all technical and scientific terms used in this specification have meanings commonly understood by those ordinarily skilled in the art.
[0058] It should be noted that terms used in this specification are merely for describing specific embodiments, and are not intended to limit illustrative embodiments of the application. For example, unless otherwise expressly stated in the context, the singular form used here is also intended to include the plural form. In addition, it should be understood that the terms “include” and/or “comprise” in this specification indicate the presence of a feature, step, operation, device, assembly and/or a combination thereof.
[0059] This embodiment provides a method for constructing a practical architecture-level FPGA router for logic verification, comprising the following steps:
[0060] S1: a routing topology of each net is generated: a routing prototype is generated for each net, and before TDM ratio assignment, FPGAs in each net are routed or all nets are routed in parallel to guarantee the connectivity of the nets;
[0061] S2: TDM ratio assignment is performed: a TDM ratio is assigned to each edge of each net or TDM ratios are assigned in parallel according to a delay of each net group; and
[0062] S3: system delay of a multi-FPGA prototyping system subjected to routing is optimized: net groups with large TDM ratios are continuously optimized in parallel by iteration until an iteration end condition is met, so that processing of an entire router is ended.
[0063] In this embodiment, S1 specifically comprises the following steps:
[0064] S11: all the nets are sorted by priority;
[0065] S12: a routing graph of a current net is established based on an FPGA connection pair set and an FPGA set in input data of a data set formed by the FPGA set, FPGA connection pairs and net groups, wherein the net groups are defined according to a design goal; FPGAs to be routed are marked out, and a cost of each FPGA connection pair is marked out;
[0066] S13: in terms of the established routing graph, the current net is routed through a Dijkstra-based Steiner tree algorithm (K. Mehlhorn, “A faster approximation algorithm for the Steiner problem in graphs,” Information Processing Letters, vol. 27, no. 3, pp. 125-128, 1988.) to construct a Steiner tree to route the FPGAs to be routed; it should be noted that any Steiner tree algorithm may be applied to the router designed in the invention, so that the flexibility of the router is improved;
[0067] S14: a routing topology of the current net is saved and recorded;
[0068] S15: costs of edges in the routing graph are updated; the cost of each FPGA connection pair is initialized to 1 before a first loop, and the cost of an FPGA connection pair selected by the current net to route the FPGAs is increased by 1; and
[0069] S16: whether the FPGAs of all the nets are routed is determined; if so, routing is ended; otherwise, S12 is performed.
[0070] In this embodiment, in S15, the costs of the edges in the routing graph are updated specifically as follows: the cost of each FPGA pair is updated; if an FPGA pair is selected by the current net, the selected FPGA pair is used to route the FPGAs, and the cost of the selected FPGA pair is increased by 1.
[0071] In this embodiment, S11 is performed specifically as follows:
[0072] Before routing of each net, all the nets are sorted according to indicators: first, because the delay is related to the number of nets and a net group with more nets is more likely to cause a delay, all the net groups are sorted in a decreasing order according to the number of nets; then, because routing will become more difficult with the increase of the number of FPGAs in the nets, all nets in each net group are sorted in a decreasing order according to the number of FPGAs; and finally, all the nets are extracted in order.
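The two-level ordering described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `order_nets` and the input layout (a dict from net id to its FPGA set, and a list of net groups) are assumptions made for the example, not part of the described method. The sketch also assumes that a net shared by several net groups is emitted only once, at its first occurrence.

```python
def order_nets(net_fpgas, net_groups):
    """Return net ids in routing order (hypothetical helper).

    net_fpgas:  dict net_id -> collection of FPGA ids the net must connect
    net_groups: list of lists of net ids (a net may appear in several groups)
    """
    # Net groups with more nets come first.
    groups = sorted(net_groups, key=len, reverse=True)
    ordered, seen = [], set()
    for group in groups:
        # Inside a group, nets touching more FPGAs come first.
        for net in sorted(group, key=lambda n: len(net_fpgas[n]), reverse=True):
            if net not in seen:  # assumption: a shared net is extracted once
                seen.add(net)
                ordered.append(net)
    return ordered


net_fpgas = {"a": {0, 1}, "b": {0, 1, 2}, "c": {1, 2}, "d": {0, 1, 2, 3}}
net_groups = [["a", "d"], ["a", "b", "c"]]
print(order_nets(net_fpgas, net_groups))
```

Python's `sorted` is stable, so nets with the same FPGA count keep their input order inside a group.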
[0073] In this embodiment, S2 specifically comprises the following steps:
[0074] S21: each net group is preprocessed, that is, a maximum number $ngec_{j,m}$ of edges of a net group $ng_{j,m}$ including an edge $e_{j,k}$ is calculated by counting;
[0075] S22: a weight ratio $pct_{j,k}$ of each edge $e_{j,k}$ of a current FPGA connection pair is calculated to obtain a TDM ratio to be assigned to each edge of each net;
[0076] S23: each net is traversed through a for-loop to determine whether all edges are processed; if so, S24 is performed; otherwise, S22 is performed;
[0077] S24: a TDM ratio of a current edge is calculated, and the TDM ratio of the current edge is recorded;
[0078] S25: each net is traversed through a for-loop to determine whether all edges are processed; if so, S26 is performed; otherwise, S24 is performed; and
[0079] S26: each net is traversed through a for-loop to determine whether all connection pairs are processed; if so, the process is ended; otherwise, S22 is performed.
[0080] In this embodiment, S22 is performed specifically as follows:
[0081] The weight ratio of each edge $e_{j,k}$ is calculated as follows:
$$pct_{j,k} = \frac{ngmec_{j,k}}{\sum_{e_{i,k} \in el_k} ngmec_{i,k}}$$
[0082] $$ngmec_{j,k} = \{x \mid x = \max(ngec_{j,1}, \ldots, ngec_{j,p})\}$$
[0083] Wherein, $ng_{j,m}$ is an m-th net group in $ngl_j$, $ngec_{j,m}$ is the number of edges of the net group $ng_{j,m}$, $ngmec_{j,k}$ is the number of edges of the net group that has the maximum number of edges among the net groups including $e_{j,k}$, $p$ is the number of net groups in $ngl_j$, and $pct_{j,k}$ is the weight ratio; based on the weight ratio, the TDM ratio $etr_{j,k}$ of the edge is calculated as follows:
[0084] $$etr_{j,k} = \frac{1}{pct_{j,k}}$$
[0085] In this embodiment, S3 specifically comprises the following steps:
[0086] S31: net groups are sorted: all the nets are sorted in a decreasing order according to maximum TDM ratios of the net groups to which the nets belong;
[0087] S32: $etr_{j,k}$ of each edge $e_{j,k}$ in the current net group is updated;
[0088] S33: whether $etr_{j,k}$ of all edges in the net group is updated according to S32 is determined; if so, $ngl_j$ is updated; otherwise, S32 is performed;
[0089] S34: whether all net groups are traversed is determined; if so, S35 is performed; otherwise, S32 is performed;
[0090] S35: whether the FPGA connection pair $p_k$ meets a TDM ratio constraint is determined; if so, $etr_{j,k}$ is updated; otherwise, $etr_{j,k}$ is updated in the case where $etr_{j,k}$ of the edge $e_{j,k}$ is increased, or the edge $e_{j,k}$ is validated in the case where $etr_{j,k}$ of the edge is decreased; and
[0091] S36: whether $etr_{j,k}$ of all FPGA connection pairs is updated is determined; if so, the process is ended; otherwise, S35 is performed.
[0092] In this embodiment, S32 is performed specifically as follows:
[0093] The TDM ratio of each edge is decreased, and the TDM ratio of each edge is updated according to the following formula:
[0094] $$etr'_{j,k} = \frac{staratio \times etr_{j,k}}{mng_j}$$
[0095] Wherein, $etr'_{j,k}$ is a new TDM ratio of $e_{j,k}$; $staratio$ is an optimization goal of $ngmtr$ and is defined by users; and $mng_j$ is a net group of $ngl_j$ with a maximum TDM ratio.
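The update in S32 can be illustrated with a small Python sketch. It rests on one loudly stated assumption: the denominator of the formula is taken to be the TDM ratio of the net group of the net that has the maximum TDM ratio, so that every edge ratio is scaled by the factor staratio divided by that maximum. The function name `reduce_edge_ratios` and the data layout are hypothetical, and exact fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction


def reduce_edge_ratios(edge_ratios, staratio, max_group_ratio):
    """Scale every edge TDM ratio of one net by staratio / max_group_ratio.

    edge_ratios:     dict edge id -> current TDM ratio of that edge
    staratio:        user-defined target for the maximum net-group TDM ratio
    max_group_ratio: assumed meaning of the denominator, i.e. the TDM ratio
                     of the net's net group with the maximum TDM ratio
    """
    scale = Fraction(staratio) / Fraction(max_group_ratio)
    return {edge: ratio * scale for edge, ratio in edge_ratios.items()}


# A net whose worst net group has TDM ratio 24, with a target of 12:
# every edge ratio is halved.
new_ratios = reduce_edge_ratios({"e1": 8, "e2": 4}, staratio=12, max_group_ratio=24)
print(new_ratios)
```

The scaled values are generally not even integers, and lowering ratios can break the per-pair constraint, which is why the text follows this step with a legalization step.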
[0096] In this embodiment, S35 is specifically as follows:
[0097] The sum $totalpct$ of reciprocals of the TDM ratios of all the edges of the current FPGA connection pair is calculated, and whether $totalpct$ is less than or equal to 1 is determined; if $totalpct$ is less than or equal to 1, it is determined that the current FPGA connection pair meets the TDM ratio constraint, and processing of the current FPGA connection pair is skipped; otherwise, it is determined that the current FPGA connection pair does not meet the TDM ratio constraint, and the following steps are performed for subsequent legalization, that is, if the TDM ratio of the edge $e_{j,k}$ is increased, the new TDM ratio $etr'_{j,k}$ is used directly; or, if the TDM ratio of the edge $e_{j,k}$ is decreased, the TDM ratio of the edge $e_{j,k}$ is validated according to the following equation:
[0098] $$etr''_{j,k} = \frac{etr'_{j,k} \times rec \times (rec + ad)}{1 - ad}$$
[0099] In this embodiment, TDM is used to transmit multiple signals over one wire at different times to solve the problem of insufficient routing resources. FIG. 1 illustrates a simple diagram of TDM used between two FPGAs, wherein the two big rectangles represent two FPGAs, and the two trapezoids in the two big rectangles represent modules to be routed in the two FPGAs. The six small rectangles represent examples. The three dotted arrows represent three different signals. The solid arrow represents a unique metal wire between the two FPGAs. As shown in FIG. 1, without TDM, only one signal can be transmitted through the metal wire within one system clock period. By adoption of TDM, three signals can be transmitted at different times within one system clock period through the metal wire. TDM thus improves the routability of a system at the cost of an increased system delay.
[00100] In this embodiment, F represents all FPGA nets; P represents all FPGA connection pairs, and each FPGA connection pair $p_k$ connects two FPGAs; N represents a net group formed by two or more FPGA nets; NG represents a net group set, and each net group belongs to the net group set, $ng_i \in NG$; each net $n_j$ may belong to different net groups, which form a subset $ngl_j \subseteq NG$. The net group is defined according to the design goal. For example, nets with the same attribute or the same power may be configured in the same net group. The goal is to route all nets and assign a TDM ratio to each net so as to minimize the maximum TDM ratio of the net groups.
[00101] As required by the actual condition, the TDM ratio meets the following requirements:
[00102] $$1 \ge \sum_{e_{j,k} \in el_k} \frac{1}{etr_{j,k}}$$
[00103] $$etr_{j,k} \in \{x \mid x = 2 \times y,\ y \in N^{+},\ 2 \le x \le 4294967296\}$$
[00104] Wherein, $etr_{j,k}$ is the TDM ratio of the edge $e_{j,k}$, and $el_k$ is a set of edges using the FPGA connection pair $p_k$. The TDM ratio is an even number as required by the implementation of multiplexing hard-wiring.
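The requirement on the TDM ratios of one FPGA connection pair can be checked mechanically. The following Python sketch is illustrative only; the helper names `ratio_is_legal` and `pair_is_legal` are assumptions. It checks that each ratio is an even integer within the quoted bounds and that the reciprocals of the ratios on one connection pair sum to at most 1, using exact fractions to avoid floating-point rounding at the boundary.

```python
from fractions import Fraction

MAX_RATIO = 4294967296  # upper bound quoted in the text (equal to 2**32)


def ratio_is_legal(etr):
    """A single TDM ratio must be an even integer in [2, MAX_RATIO]."""
    return isinstance(etr, int) and etr % 2 == 0 and 2 <= etr <= MAX_RATIO


def pair_is_legal(ratios):
    """ratios: TDM ratios of all edges that share one FPGA connection pair."""
    if not all(ratio_is_legal(r) for r in ratios):
        return False
    # The reciprocals of the ratios on one connection pair may sum to at most 1.
    return sum(Fraction(1, r) for r in ratios) <= 1


print(pair_is_legal([2, 4, 4]))  # 1/2 + 1/4 + 1/4 equals 1
print(pair_is_legal([2, 2, 4]))  # 1/2 + 1/2 + 1/4 exceeds 1
```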
[00105] The system clock period is a time of arrival from a source point to a meeting point (end point). When one FPGA connection pair is used by one net, the system clock period of the connection pair is the TDM ratio assigned to the connection pair. First, the maximum system clock period has a great influence on the system delay; second, the net group with a maximum system clock period determines the delay of the whole system. Meanwhile, because the TDM ratio reflects the duration of the system clock period, the optimization goal of the invention is to minimize the maximum TDM ratio of the net groups.
[00106] The TDM ratio of each net $n_j$ and the TDM ratio of each net group $ng_i$ are defined as follows:
[00107] $$ntr_j = \sum_{e_{j,k} \in el_j} etr_{j,k}$$
[00108] $$ngtr_i = \sum_{n_j \in nl_i} ntr_j$$
[00109] Wherein, $el_j$ is a set of edges of the net $n_j$, $nl_i$ is a set of nets of the net group $ng_i$, $ntr_j$ is the TDM ratio of the net $n_j$, and $ngtr_i$ is the TDM ratio of the net group $ng_i$.
[00110] The optimization goal of this embodiment is defined as follows:
[00111] $$\text{Minimize: } ngmtr = \{x \mid x = \max(ngtr_1, \ldots, ngtr_{|NG|})\}$$
[00112] Wherein, $|NG|$ is the number of all net groups, and $ngmtr$ is the maximum TDM ratio of the net groups.
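The quantity being minimized can be computed directly from the edge ratios. The Python sketch below is an illustration with hypothetical function names and a made-up data layout: the TDM ratio of a net is the sum of the TDM ratios of its edges, the TDM ratio of a net group is the sum over its nets, and the objective value is the maximum over all net groups.

```python
def net_tdm_ratio(edge_ratios):
    """TDM ratio of one net: sum of the TDM ratios of its edges."""
    return sum(edge_ratios)


def group_tdm_ratios(net_edge_ratios, groups):
    """TDM ratio of every net group: sum of the TDM ratios of its nets.

    net_edge_ratios: dict net id -> list of edge TDM ratios of that net
    groups:          dict group id -> list of net ids in the group
    """
    ntr = {net: net_tdm_ratio(r) for net, r in net_edge_ratios.items()}
    return {g: sum(ntr[n] for n in nets) for g, nets in groups.items()}


def max_group_tdm_ratio(net_edge_ratios, groups):
    """Objective value: the largest net-group TDM ratio."""
    return max(group_tdm_ratios(net_edge_ratios, groups).values())


net_edge_ratios = {"n1": [2, 4], "n2": [8], "n3": [4, 4, 2]}
groups = {"g1": ["n1", "n2"], "g2": ["n2", "n3"]}
print(group_tdm_ratios(net_edge_ratios, groups))     # g1: 6 + 8, g2: 8 + 10
print(max_group_tdm_ratio(net_edge_ratios, groups))
```

Note that net n2 contributes to both groups, which is why lowering the ratio of one edge can improve several net groups at once.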
[00113] Preferably, as shown in FIG. 2, the construction of a router in this embodiment comprises three steps.
[00114] First, routing topology generation: in this step, a routing prototype is generated for each net, and before TDM ratio assignment, FPGAs in each net are routed or all nets are routed in parallel to guarantee the connectivity of the nets.
[00115] Second, TDM ratio assignment: a proper TDM ratio is assigned to each edge of each net according to a delay of each net group.
[00116] Third, system delay optimization: net groups with large TDM ratios are continuously optimized in parallel by iteration until an iteration end condition is met, so that processing of the entire router is ended.
[00117] (1) First step for constructing the router (routing topology generation):
[00118] In the first step for constructing the router, a routing topology of each net is generated to ensure the connectivity of the nets. The process of this step, as shown in FIG. 3, mainly comprises five steps. In the first step, all the nets are sorted by priority; and from the second step to the fifth step, a whole net set is processed cyclically, and FPGAs of one net are routed in each cycle until the routing topologies of all the nets are determined.
[00119] First step: before routing of each net, all the nets are sorted according to indicators: first, because the delay is related to the number of nets and a net group with more nets is more likely to cause a delay, all the net groups are sorted in a decreasing order according to the number of nets; then, because routing will become more difficult with the increase of the number of FPGAs in the nets, all nets in each net group are sorted in a decreasing order according to the number of FPGAs; and finally, all the nets are extracted in order.
[00120] Second step: a routing graph of a current net is established based on an FPGA connection pair set and an FPGA set in input data of a data set formed by the FPGA set, FPGA connection pairs and net groups, FPGAs to be routed are marked out, and a cost of each FPGA connection pair is marked. The cost of each FPGA connection pair is initialized to 1 before a first loop and is updated every time one net is routed according to the fifth step.
[00121] Third step: in terms of the established routing graph, the current net is routed through a Dijkstra-based Steiner tree algorithm (K. Mehlhorn, “A faster approximation algorithm for the Steiner problem in graphs,” Information Processing Letters, vol. 27, no. 3, pp. 125-128, 1988.) to construct a Steiner tree to route the FPGAs to be routed.
[00122] Fourth step: a routing topology of the current net is saved and recorded.
[00123] Fifth step: a cost of each FPGA pair is updated; and if an FPGA pair is selected by the current net, the FPGAs are routed through the selected FPGA pair, and the cost of the selected FPGA pair is increased by 1.
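The loop formed by the second to fifth steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the Mehlhorn algorithm cited in the text: it grows each tree by repeatedly attaching the nearest unconnected FPGA with a multi-source Dijkstra search, which is a simpler shortest-path Steiner heuristic swapped in for brevity. The names `route_net` and `route_all` and the data layout are hypothetical. Connection-pair costs start at 1 and grow by 1 for every net that uses the pair, as described above.

```python
import heapq


def route_net(terminals, adj, cost):
    """Connect the FPGAs in `terminals`; returns a set of frozenset({u, v}) pairs."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:]) - tree_nodes
    while remaining:
        # Multi-source Dijkstra started from every FPGA already in the tree.
        dist = {v: 0 for v in tree_nodes}
        prev = {}
        heap = [(0, v) for v in tree_nodes]
        heapq.heapify(heap)
        target = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            if u in remaining:
                target = u  # nearest FPGA that is not yet connected
                break
            for v in adj.get(u, ()):
                nd = d + cost[frozenset((u, v))]
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(heap, (nd, v))
        if target is None:
            raise ValueError("terminals are not connected in the FPGA graph")
        # Walk back from the target to the tree and add the path.
        v = target
        while v not in tree_nodes:
            tree_edges.add(frozenset((prev[v], v)))
            tree_nodes.add(v)
            v = prev[v]
        remaining -= tree_nodes
    return tree_edges


def route_all(ordered_nets, net_fpgas, pairs):
    """Route nets one by one; a used connection pair becomes 1 more expensive."""
    adj = {}
    for u, v in pairs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    cost = {frozenset(p): 1 for p in pairs}  # every pair starts at cost 1
    topology = {}
    for net in ordered_nets:
        topology[net] = route_net(net_fpgas[net], adj, cost)
        for edge in topology[net]:
            cost[edge] += 1
    return topology, cost
```

With three FPGAs connected in a triangle and three nets that all connect FPGA 0 to FPGA 1, the first two nets take the direct pair, after which the rising cost pushes the third net onto the detour through FPGA 2. FPGA ids must be mutually comparable (for example all integers), because they are used as tie-breakers in the heap.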
[00124] (2) Second step for constructing the router (TDM ratio assignment):
[00125] Although the routing topology has a great influence on the whole solution, it is hardly possible to assign TDM ratios in the stage of routing topology generation, so a suitable TDM ratio assignment method is designed for use after routing topologies are generated. The second step designed in the invention is TDM ratio assignment. The process of this step, as shown in FIG. 4, comprises seven steps. From the second step to the seventh step, the entire FPGA connection pair set is cyclically processed, and in each cycle, TDM ratios are assigned to edges of the current FPGA connection pair, and the cycle is ended when all FPGA connection pairs are processed. From the fourth step to the sixth step, a set of edges of the current FPGA connection pair is processed cyclically, a TDM ratio is assigned to the current edge in each cycle, and the cycle is ended when all edges are processed.
[00126] To assign suitable TDM ratios to the nets, each FPGA connection pair should take into consideration a TDM ratio constraint. The edges of one FPGA connection pair cannot be processed according to the same TDM ratio. Since the goal is to optimize the maximum TDM ratio of the net group, the TDM ratio of each edge may be determined according to the position of the net group and the TDM ratio constraint. Moreover, the TDM ratio of the net group is closely related to the number of edges, so the TDM ratio can be effectively assigned when the FPGA connection pairs are processed one by one according to the number of edges of the net group.
[00127] So, this embodiment provides a weight ratio calculation method. The weight ratio of each edge $e_{j,k}$ is calculated as follows:
$$pct_{j,k} = \frac{ngmec_{j,k}}{\sum_{e_{i,k} \in el_k} ngmec_{i,k}}$$
[00128] $$ngmec_{j,k} = \{x \mid x = \max(ngec_{j,1}, \ldots, ngec_{j,p})\}$$
[00129] Wherein, $ng_{j,m}$ is an m-th net group in $ngl_j$, $ngec_{j,m}$ is the number of edges of the net group $ng_{j,m}$, $ngmec_{j,k}$ is the number of edges of the net group that has the maximum number of edges among the net groups including $e_{j,k}$, $p$ is the number of net groups in $ngl_j$, and $pct_{j,k}$ is the weight ratio; based on the weight ratio, the TDM ratio $etr_{j,k}$ of the edge is calculated as follows:
[00130] $$etr_{j,k} = \frac{1}{pct_{j,k}}$$
[00131] First, each net group is preprocessed, that is, the number $ngec_{j,m}$ of edges of each net group $ng_{j,m}$, which is required for assigning the TDM ratio to each edge, is calculated.
[00132] Second, the weight ratio $pct_{j,k}$ of each edge $e_{j,k}$ using the current FPGA connection pair is calculated according to the formula $pct_{j,k} = ngmec_{j,k} / \sum_{e_{j,k} \in el_j} ngmec_{j,k}$.
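The weight ratio calculation for one FPGA connection pair can be illustrated with a short sketch. The function name `assign_tdm_ratios`, the dictionary layout and the edge and net group labels are assumptions introduced for this example; exact fractions are used only to make the arithmetic easy to check, and no rounding of the resulting TDM ratios is applied because this section does not state a rounding rule.

```python
from fractions import Fraction


def assign_tdm_ratios(edge_net_groups, group_edge_count):
    """Assign a TDM ratio to every edge of one FPGA connection pair.

    edge_net_groups:  edge id -> ids of the net groups containing the edge's net
    group_edge_count: net group id -> number of edges of that net group (ngec)
    """
    # ngmec: edge count of the largest net group that contains the edge.
    ngmec = {
        edge: max(group_edge_count[g] for g in groups)
        for edge, groups in edge_net_groups.items()
    }
    total = sum(ngmec.values())
    # pct is the weight ratio of the edge; etr = 1 / pct is its TDM ratio.
    pct = {edge: Fraction(count, total) for edge, count in ngmec.items()}
    etr = {edge: 1 / share for edge, share in pct.items()}
    return pct, etr


# Hypothetical pair with three edges and three net groups.
pct, etr = assign_tdm_ratios(
    {"e1": ["A", "B"], "e2": ["B"], "e3": ["C"]},
    {"A": 6, "B": 3, "C": 1},
)
```

In this example the weight ratios are 3/5, 3/10 and 1/10, so the edge belonging to the largest net group receives the smallest TDM ratio (5/3), and the reciprocals of the assigned TDM ratios sum to exactly 1.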
[00133] (3) Third step for constructing the router (system delay optimization):
[00134] In the stage of TDM ratio assignment, an initial TDM ratio is assigned to each edge. Because the delay is estimated according to the number of edges of the net group, which may differ from the actual condition, the TDM ratio needs to be further optimized. To optimize the maximum TDM ratio of each net group, a system delay optimization method is introduced in the third step for constructing the router. The process of this step, as shown in FIG. 5, comprises a step of TDM ratio reduction and a step of edge legalization which are performed sequentially. These two steps are performed cyclically until the end conditions are met, at which point the third step is ended, and the construction process of the entire router is also ended.
[00135] As shown in FIG. 6, the process of the first step of TDM ratio reduction of system delay optimization comprises three steps. From the second step to the third step, a whole net group set is processed cyclically, and the TDM ratio of the current net group is decreased in each cycle until all net groups are processed.
[00136] First step of TDM ratio reduction: all the nets are sorted in a decreasing order according to maximum TDM ratios of the net groups to which the nets belong.
[00137] Second step of TDM ratio reduction: TR (the TDM ratio) of each edge is reduced. TR of each edge is updated according to the following formula:
[00138] $$etr'_{j,k} = \frac{staratio \times etr_{j,k}}{mng_{j,k}}$$
[00139] Third step of TDM ratio reduction: TR of the net group is updated.
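The three steps of TDM ratio reduction can be sketched as follows. Several points are assumptions made for illustration and are not stated in this section: the TR of a net group is taken as the sum of the TRs of its edges, the divisor of the update formula is taken as the TR of the net group with the maximum TR among the groups containing the edge, and the names `reduce_tdm_ratios`, `group_edges` and the sample values are hypothetical.

```python
def reduce_tdm_ratios(etr, group_edges, staratio):
    """Scale edge TDM ratios toward the user-defined target `staratio`.

    etr:         edge id -> current TDM ratio
    group_edges: net group id -> list of edge ids of that net group
    """
    # Assumption: TR of a net group is the sum of the TRs of its edges.
    group_tr = {g: sum(etr[e] for e in edges) for g, edges in group_edges.items()}

    # First step: visit the net groups in decreasing order of TR.
    order = sorted(group_tr, key=group_tr.get, reverse=True)

    # Second step: update each edge once. Because of the ordering, the first
    # group that reaches an edge is the max-TR group containing that edge.
    new_etr = {}
    for g in order:
        for e in group_edges[g]:
            if e not in new_etr:
                new_etr[e] = staratio * etr[e] / group_tr[g]

    # Third step: update the TR of every net group.
    new_group_tr = {
        g: sum(new_etr[e] for e in edges) for g, edges in group_edges.items()
    }
    return new_etr, new_group_tr


# Hypothetical data: two net groups sharing edge e2, target TR of 6.
new_etr, new_group_tr = reduce_tdm_ratios(
    {"e1": 4, "e2": 8, "e3": 2},
    {"A": ["e1", "e2"], "B": ["e2", "e3"]},
    staratio=6,
)
```

Under these assumptions, net group A, whose TR was 12, is brought down to the target of 6, and no net group ends above the target. The reduced ratios may violate the TDM ratio constraint of an FPGA connection pair, which is why edge legalization follows.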
[00140] The process of the second step of edge legalization of system delay optimization is shown in FIG. 5. This process is used to guarantee that a result obtained by TDM ratio reduction meets the TDM ratio constraint. Different from the process of TDM ratio reduction, this process is mainly used to process each FPGA connection pair rather than each net group. In the process of edge legalization, the whole FPGA connection pair set is processed cyclically, and the current FPGA connection pair is validated in each cycle until all FPGA connection pairs are processed.
[00141] If the FPGA connection pair $p_j$ meets the TDM ratio constraint, the TDM ratio of the edge $e_{j,k}$ in $p_j$ is replaced with the new TDM ratio $etr'_{j,k}$. If $p_j$ does not meet the TDM ratio constraint, the edge $e_{j,k}$ directly uses the new TDM ratio $etr'_{j,k}$ in case where the TDM ratio of the edge $e_{j,k}$ is increased, or the edge $e_{j,k}$ is validated according to the following equation in case where the TDM ratio of the edge $e_{j,k}$ is decreased:
[00142] $$etr''_{j,k} = \frac{etr'_{j,k} \times rec \times (rec + ad)}{1 - ad}$$
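The branching of edge legalization can be sketched as follows. The constraint check, a sum of reciprocals of the TDM ratios that must not exceed 1, follows Claim 9. Everything else is an assumption for illustration: the names `meets_tdm_constraint` and `legalize_pair` are hypothetical, an unchanged ratio is treated like an increased one, and the validation of a decreased edge is left to a caller-supplied `validate` function because the terms of the validation equation are not defined in this section. The fallback used in the example (restoring the old ratio) is a placeholder, not the equation of the invention.

```python
from fractions import Fraction


def meets_tdm_constraint(ratios):
    """True if the reciprocals of the TDM ratios sum to at most 1."""
    totalpct = sum(1 / Fraction(r) for r in ratios)
    return totalpct <= 1


def legalize_pair(old, new, validate):
    """Legalize the edges of one FPGA connection pair.

    old, new: edge id -> TDM ratio before and after TDM ratio reduction
    validate: hypothetical callback (edge, new_ratio) -> legal ratio
    """
    if meets_tdm_constraint(new.values()):
        return dict(new)  # constraint met: accept the new ratios
    result = {}
    for edge, ratio in new.items():
        if ratio >= old[edge]:
            result[edge] = ratio  # increased (or unchanged): use directly
        else:
            result[edge] = validate(edge, ratio)  # decreased: validate
    return result


old = {"e1": 2, "e2": 4, "e3": 4}
new = {"e1": 3, "e2": 2, "e3": 4}  # 1/3 + 1/2 + 1/4 > 1, so not legal
# Placeholder validation: fall back to the ratio before reduction.
legal = legalize_pair(old, new, validate=lambda edge, ratio: old[edge])
```

With the placeholder validation, the decreased edge e2 returns to its old ratio of 4 and the pair satisfies the constraint again.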
[00143] (4) Parallelization method
[00144] To further improve the running efficiency of the ALIFRouter, a multi-thread parallelization method is used in all stages of the ALIFRouter. In the stage of routing topology generation, routing of the nets may be performed in parallel. However, S15 in this stage should be locked to avoid resource conflicts between different nets. In the stage of TDM ratio assignment, each FPGA connection pair is processed completely independently, so in this stage, the parallelization method can greatly increase the processing speed. In the stage of system delay optimization, TDM ratio reduction of the nets may be performed in parallel. However, the third step in the stage should be locked to avoid resource conflicts between different nets. The step of edge legalization may be performed completely in parallel like the stage of TDM ratio assignment in S2.
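The locking pattern described for routing topology generation can be sketched as follows. This is only an illustration of the structure: `route_net` is a hypothetical placeholder that chains the FPGAs of a net in the listed order instead of building a Steiner tree, and it ignores the current costs when choosing pairs. Only the shared cost update, which corresponds to S15, is protected by the lock. In CPython, threads illustrate the locking discipline rather than a CPU speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

cost_lock = threading.Lock()


def route_net(net, cost):
    """Placeholder routing: connect consecutive FPGAs of the net."""
    pairs = list(zip(net, net[1:]))
    # The cost table is shared by all nets, so its update is locked.
    with cost_lock:
        for pair in pairs:
            cost[pair] = cost.get(pair, 1) + 1
    return pairs


def route_all(nets, workers=4):
    """Route all nets in parallel and return topologies and final costs."""
    cost = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        topologies = list(pool.map(lambda net: route_net(net, cost), nets))
    return topologies, cost


# Hypothetical nets given as lists of FPGA labels.
topologies, cost = route_all([["F1", "F2", "F3"], ["F1", "F2"], ["F2", "F3"]])
```

TDM ratio assignment and edge legalization need no such lock in this scheme, since each FPGA connection pair is processed independently of the others.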
[00145] Preferably, this embodiment designs an overall routing process. Input data of this process includes an FPGA set, an FPGA connection pair set, a net set and a net group set, and output data is a routing solution. This process mainly comprises three sub-processes, routing topology generation, TDM ratio assignment and system delay optimization, which are performed in sequence.
[00146] Preferably, in this embodiment, before TDM ratio assignment, a routing topology of each net is generated through the sub-process of routing topology generation to optimize the routability of a multi-FPGA prototyping system. After the routing topology is obtained, a TDM ratio is assigned to each edge of each net through the sub-process of TDM ratio assignment.
[00147] Preferably, in this embodiment, the process of system delay optimization comprises two sub-processes, TDM ratio reduction and edge legalization, which are performed in sequence. In this process, an initial TDM ratio assignment solution generated in the sub-process of TDM ratio assignment is optimized to minimize a maximum system delay.
[00148] Preferably, this embodiment designs a multi-thread parallelization method to improve the running efficiency of a router and shorten the runtime of the router.
[00149] The above embodiments are merely preferred ones of the invention. All equivalent variations and modifications made according to the patent application scope of the invention should fall within the protection scope of the invention.

Claims (9)

CLAIMS
1. A method for constructing a practical architecture-level FPGA router for logic verification, comprising the following steps: S1: generating a routing topology of each net: generating a routing prototype for each net, and before TDM ratio assignment, routing FPGAs in each net or routing all nets in parallel to guarantee the connectivity of the nets; S2: performing TDM ratio assignment: assigning a TDM ratio to each edge of each net or assigning TDM ratios in parallel according to a delay of each net group; and S3: optimizing a system delay of a multi-FPGA prototyping system subjected to routing: continuously optimizing net groups with large TDM ratios in parallel by iteration until an iteration end condition is met, so that an entire router is processed.
2. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 1, wherein S1 specifically comprises the following steps: S11: sorting all the nets by priority; S12: establishing a routing graph of a current net based on an FPGA connection pair set and an FPGA set in input data of a data set formed by the FPGA set, FPGA connection pairs and net groups, marking out FPGAs to be routed, and marking out a cost of each FPGA connection pair; S13: in terms of the established routing graph, routing the current net through a Dijkstra-based Steiner tree algorithm to construct a Steiner tree to connect the FPGAs to be connected; S14: saving and recording a routing topology of the current net; S15: updating costs of edges in the routing graph; initializing the cost of each FPGA connection pair to 1 before a first loop, and increasing the cost of an FPGA connection pair selected by the current net to route the FPGAs by 1; and S16: traversing each net through a for-loop to determine whether FPGAs of all the nets are routed; if so, ending routing; otherwise, performing S12.
3. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 2, wherein in S15, the costs of the edges in the routing graph are updated specifically as follows: the cost of each FPGA pair is updated; if an FPGA pair is selected by the current net, the selected FPGA pair is used to route the FPGAs, and the cost of the selected FPGA pair is increased by 1.
4. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 2, wherein S11 is performed specifically as follows: before routing of each net, all the nets are sorted according to indicators: first, all the net groups are sorted in a decreasing order according to the number of nets; then, all nets in each net group are sorted in a decreasing order according to the number of FPGAs; and finally, all the nets are extracted in order.
5. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 1, wherein S2 specifically comprises the following steps: S21: preprocessing each net group, that is, calculating, by counting, a maximum number $ngec_{j,m}$ of edges of a net group $ng_{j,m}$ including an edge $e_{j,k}$; S22: calculating a weight ratio $pct_{j,k}$ of each edge $e_{j,k}$ of a current FPGA connection pair to obtain a TDM ratio to be assigned to each edge of each net; S23: traversing each net through a for-loop to determine whether all edges are processed; if so, performing S24; otherwise, performing S22; S24: calculating a TDM ratio of a current edge, and recording the TDM ratio of the current edge; S25: traversing each net through a for-loop to determine whether all edges are processed; if so, performing S26; otherwise, performing S24; and S26: traversing each net through a for-loop to determine whether all connection pairs are processed; if so, ending the process; otherwise, performing S22.
6. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 5, wherein S22 is performed specifically as follows: the weight ratio of each edge $e_{j,k}$ is calculated as follows:
$$pct_{j,k} = \frac{ngmec_{j,k}}{\sum_{e_{j,k} \in el_j} ngmec_{j,k}}$$
$$ngmec_{j,k} = \{x \mid x = \max(ngec_{j,1}, \ldots, ngec_{j,p})\}$$
wherein, $ng_{j,m}$ is the $m$-th net group in $ngl_{j,k}$, $ngec_{j,m}$ is the number of edges of the net group $ng_{j,m}$, $ngmec_{j,k}$ is the number of edges of the net group with a maximum number of edges among the net groups of $e_{j,k}$, $p$ is the number of net groups in $ngl_{j,k}$, and $pct_{j,k}$ is the weight ratio; based on the weight ratio, the TDM ratio $etr_{j,k}$ of the edge is calculated as follows:
$$etr_{j,k} = \frac{1}{pct_{j,k}}$$
7. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 1, wherein S3 specifically comprises the following steps: S31: sorting net groups: sorting all the nets in a decreasing order according to maximum TDM ratios of the net groups to which the nets belong; S32: updating $etr_{j,k}$ of the edge $e_{j,k}$ in the net group $ng_{j,m}$; S33: determining whether $etr_{j,k}$ of all edges in the net group is updated according to S32; if so, updating the TDM ratio of the net group; otherwise, performing S32; S34: determining whether all net groups are traversed; if so, performing S35; otherwise, performing S32; S35: determining whether the FPGA connection pair $p_j$ meets a TDM ratio constraint; if so, updating $etr_{j,k}$; otherwise, updating $etr_{j,k}$ in case where $etr_{j,k}$ of the edge $e_{j,k}$ is increased, or validating the edge $e_{j,k}$ in case where $etr_{j,k}$ of the edge is decreased; S36: determining whether $etr_{j,k}$ of all FPGA connection pairs is updated; if so, ending the process; otherwise, performing S35.
8. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 7, wherein S32 is performed specifically as follows: the TDM ratio of each edge is decreased, and the TDM ratio of each edge is updated according to the following formula:
$$etr'_{j,k} = \frac{staratio \times etr_{j,k}}{mng_{j,k}}$$
wherein, $etr'_{j,k}$ is a new TDM ratio of $e_{j,k}$, $staratio$ is an optimization goal of $ngmtr$ and is defined by users; $mng_{j,k}$ is a net group of $ngl_{j,k}$ with a maximum TDM ratio.
9. The method for constructing a practical architecture-level FPGA router for logic verification according to Claim 7, wherein S35 is performed specifically as follows: the sum totalpct of reciprocals of the TDM ratios of all the edges of the current FPGA connection pair is calculated, and whether totalpct is less than or equal to 1 is determined; if totalpct is less than or equal to 1, it is determined that the current FPGA connection pair meets the TDM ratio constraint, and processing of the current FPGA connection pair is skipped; otherwise, it is determined that the current FPGA connection pair does not meet the TDM ratio constraint, and the following steps are performed for subsequent legalization, that is, if the TDM ratio of the edge $e_{j,k}$ is increased, the new TDM ratio $etr'_{j,k}$ is used directly; or, if the TDM ratio of the edge $e_{j,k}$ is decreased, the TDM ratio of the edge $e_{j,k}$ is validated according to the following equation:
$$etr''_{j,k} = \frac{etr'_{j,k} \times rec \times (rec + ad)}{1 - ad}$$
LU502060A 2022-05-10 2022-05-10 Method for constructing practical architecture-level fpga router for logic verification LU502060B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
LU502060A LU502060B1 (en) 2022-05-10 2022-05-10 Method for constructing practical architecture-level fpga router for logic verification


Publications (1)

Publication Number Publication Date
LU502060B1 true LU502060B1 (en) 2022-11-10

Family

ID=84046086


Country Status (1)

Country Link
LU (1) LU502060B1 (en)


Legal Events

Date Code Title Description
FG Patent granted

Effective date: 20221110