US7191426B1 - Method and apparatus for performing incremental compilation on field programmable gate arrays

Publication number
US7191426B1
Authority
US
United States
Prior art keywords
design
node
placement
determining whether
equivalent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US10/931,953
Inventor
Deshanand Singh
Stephen Brown
Kevin Chan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Altera Corp
Original Assignee
Altera Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Altera Corp
Priority to US10/931,953
Assigned to ALTERA CORPORATION. Assignment of assignors interest (see document for details). Assignors: BROWN, STEPHEN; CHAN, KEVIN; SINGH, DESHANAND

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/34 Circuit design for reconfigurable circuits, e.g. field programmable gate arrays [FPGA] or programmable logic devices [PLD]

Definitions

  • the present invention relates to the field of field programmable gate arrays (FPGAs). More specifically, the present invention relates to a method and apparatus for performing incremental compilation on systems on FPGAs using tools such as electronic design automation (EDA) tools.
  • FPGAs may be used to implement large systems that include millions of gates and megabits of embedded memory.
  • placement of components on the FPGAs, and routing connections between components on the FPGA utilizing available resources can be the most challenging and time consuming.
  • several iterations are often required to determine how components are to be placed on the target device and which routing resources to allocate to the components.
  • the complexity of large systems often requires the use of EDA tools to manage and optimize their design onto physical target devices.
  • Automated placement and routing algorithms in EDA tools perform the time consuming task of placement and routing of components onto physical devices.
  • a method and apparatus to support incremental compilation of a new design utilizing placement and/or routing strategies generated from a previous design on an FPGA. Differences between two netlists are identified to determine which nodes are equivalent and which nodes are new or have changed since a last compilation. An initial placement is created by assigning locations to the nodes that are equivalent or similar. Any illegal placement of the nodes is corrected using an incremental placement procedure. Greedy optimization is further performed if necessary. Incremental routing of the placed design is performed where routing associated with equivalent nodes from the first netlist is preserved.
  • FIG. 1 is a flow chart illustrating a method for designing a system according to an embodiment of the present invention
  • FIG. 2 illustrates a target device utilizing FPGAs according to an embodiment of the present invention
  • FIG. 3 illustrates a LAB according to an embodiment of the present invention
  • FIG. 4 is a flow chart illustrating a method for performing incremental placement according to an embodiment of the present invention
  • FIG. 5 illustrates fanin, fanout, and sibling relationship move proposals according to an embodiment of the present invention
  • FIG. 6 illustrates an exemplary critical vector move proposal according to an embodiment of the present invention
  • FIG. 7 illustrates horizontal and vertical cut-lines used for local congestion estimation according to an embodiment of the present invention
  • FIG. 8 is a flow chart illustrating a method for performing incremental placement utilizing directed hill-climbing according to an embodiment of the present invention
  • FIG. 9 illustrates a component trapped in a local minimum according to an embodiment of the present invention.
  • FIG. 10 illustrates basin-filling according to an embodiment of the present invention.
  • FIG. 11 illustrates a method for performing routing according to an embodiment of the present invention.
  • FIG. 1 is a flow chart that illustrates a method for designing a system according to an embodiment of the present invention.
  • the method may be performed with the assistance of an EDA tool, for example.
  • synthesis is performed.
  • Synthesis includes generating a logic design of the system to be implemented by a target device.
  • synthesis generates an optimized logical representation of the system from a Hardware Description Language (HDL) design definition.
  • the optimized logical representation of the system may include a representation that includes a minimized number of logic gates and logic elements required for the system.
  • the optimized logical representation of the system may include a representation that has a reduced depth of logic and that generates a lower signal propagation delay.
  • Synthesis also includes mapping the optimized logic design.
  • Mapping includes determining how to implement the logic components such as logic gates in the optimized logical representation with general resources available on the target device.
  • a netlist is generated from mapping.
  • the netlist illustrates how the general resources available on the target device are utilized to implement the system.
  • the netlist may, for example, include a representation of the resources on the target device as nodes and how the nodes are connected.
  • the netlist may be an optimized technology-mapped netlist generated from the HDL.
  • FIG. 2 illustrates an exemplary target device 200 utilizing FPGAs according to an embodiment of the present invention.
  • the present invention may be used to design a system onto the target device 200 .
  • the target device 200 is a chip having a hierarchical structure that may take advantage of wiring locality properties of circuits formed therein.
  • the lowest level of the hierarchy is a logic element (LE) (not shown).
  • An LE is a small unit of logic providing efficient implementation of user logic functions.
  • an LE may include a 4-input lookup table (LUT) with a configurable flip-flop.
  • the target device 200 includes a plurality of logic-array blocks (LABs). Each LAB is formed from 10 LEs, LE carry chains, LAB control signals, LUT chain, and register chain connection lines. LUT chain connections transfer the output of one LE's LUT to the adjacent LE for fast sequential LUT connections within the same LAB. Register chain connection lines transfer the output of one LE's register to the adjacent LE's register within a LAB. LABs are grouped into rows and columns across the target device 200 . A first column of LABs is shown as 210 and a second column of LABs is shown as 211 .
  • the target device 200 includes memory blocks (not shown).
  • the memory blocks may be, for example, dual port random access memory (RAM) blocks that provide dedicated true dual-port, simple dual-port, or single port memory up to various bits wide at up to various frequencies.
  • the memory blocks may be grouped into columns across the target device in between selected LABs or located individually or in pairs within the target device 200 .
  • the target device 200 includes digital signal processing (DSP) blocks (not shown).
  • the DSP blocks may be used to implement multipliers of various configurations with add or subtract features.
  • the DSP blocks include shift registers, multipliers, adders, and accumulators.
  • the DSP blocks may be grouped into columns across the target device 200 .
  • the target device 200 includes a plurality of input/output elements (IOEs) (not shown). Each IOE feeds an I/O pin (not shown) on the target device 200 .
  • the IOEs are located at the end of LAB rows and columns around the periphery of the target device 200 .
  • Each IOE includes a bidirectional I/O buffer and a plurality of registers for registering input, output, and output-enable signals. When used with dedicated clocks, the registers provide performance and interface support with external memory devices.
  • the target device 200 includes LAB local interconnect lines 220 – 221 that transfer signals between LEs in the same LAB.
  • the LAB local interconnect lines are driven by column and row interconnects and LE outputs within the same LAB.
  • Neighboring LABs, memory blocks, or DSP blocks may also drive the LAB local interconnect lines 220 – 221 through direct link connections.
  • the target device 200 also includes a plurality of row interconnect lines (“H-type wires”) 230 that span fixed distances.
  • Dedicated row interconnect lines 230, which include H4 231, H8 232, and H24 233 interconnects, route signals to and from LABs, DSP blocks, and memory blocks within the same row.
  • the H4 231, H8 232, and H24 233 interconnects span a distance of up to four, eight, and twenty-four LABs respectively, and are used for fast row connections in a four-LAB, eight-LAB, and twenty-four-LAB region.
  • the row interconnects 230 may drive and be driven by LABs, DSP blocks, RAM blocks, and horizontal IOEs.
  • the target device 200 also includes a plurality of column interconnect lines (“V-type wires”) 240 that operate similarly to the row interconnect lines 230 .
  • the column interconnect lines 240 vertically route signals to and from LABs, memory blocks, DSP blocks, and IOEs.
  • Each column of LABs is served by a dedicated column interconnect, which vertically routes signals to and from LABs, memory blocks, DSP blocks, and IOEs.
  • These column interconnect lines 240 include V4 241, V8 242, and V16 243 interconnects that traverse a distance of four, eight, and sixteen blocks respectively, in a vertical direction.
  • FIG. 2 illustrates an exemplary embodiment of a target device.
  • a system may include a plurality of target devices, such as that illustrated in FIG. 2 , cascaded together.
  • the target device may include programmable logic devices arranged in a manner different than that on the target device 200 .
  • a target device may also include components other than those described in reference to the target device 200 .
  • Although the invention described herein may be utilized on the architecture described in FIG. 2, it should be appreciated that it may also be utilized on different architectures, such as those employed by Altera® Corporation in its APEX™, Mercury™, Stratix™, and Stratix™ II family of chips and those employed by Xilinx®, Inc. in its Virtex™ and Virtex™ II line of chips.
  • FIG. 3 illustrates a LAB or clustered logic block 300 according to an embodiment of the present invention.
  • the LAB 300 may be used to implement any of the LABs shown in FIG. 2 .
  • LEs 301–303 illustrate a first, second, and tenth LE in the LAB 300.
  • the LEs 301–303 each have a 4-input lookup table 311–313, respectively, and a configurable register 321–323, respectively, connected at its output.
  • the LAB 300 includes a set of input pins 340 and a set of output pins 350 that connect to the general-purpose routing fabric so that LAB can communicate with other LABs.
  • the inputs to lookup tables 311 – 313 can connect to any one of the input pins 340 and output pins 350 using the appropriate configuration bits for each of the multiplexers 330 .
  • the number of LEs, $n_E$, input pins, $n_I$, and output pins, $n_O$, in a LAB impose important architectural constraints on a system.
  • the configurable registers 321 – 323 must be clocked by the same signal and initialized by the same signal.
  • the number of clock lines available in a LAB is represented by $n_C$.
  • the number of reset lines available in a LAB is represented by $n_R$.
  • placement works on the optimized technology-mapped netlist to produce a placement for each of the logic components.
  • placement includes fitting the system on the target device by determining the specific resources on the target device to be used for implementing the general resources mapped for logic components at 101 .
  • the placement procedure may be performed by a placer in an EDA tool that utilizes placement algorithms.
  • a user may provide input to the placer by specifying placement constraints.
  • routing of the system is performed.
  • routing resources on the target device are allocated to provide interconnections between logic gates, logic elements, and other components on the target device.
  • the routing procedure may be performed by a router in an EDA tool that utilizes routing algorithms.
  • synthesis is performed to generate a new logic design of the system to be implemented by the target device.
  • a new netlist is generated for the new logic design.
  • the synthesis is performed in response to layout-driven optimizations.
  • the layout-driven optimizations may be generated by using routing delays for connections on the netlist that are estimated by calculating a fastest possible route.
  • Timing-driven netlist optimization techniques may be applied to perturb the first netlist generated at 101 to reduce the critical path(s).
  • the first netlist may be perturbed by an EDA tool, a user of the EDA tool, or by a third party.
  • Perturbing the netlist may include adding, deleting, or moving components.
  • preferred locations are identified for the components that have been added or moved from the layout-driven optimization.
  • the locations assigned to components of the existing system from the placement procedure are identified as preferred locations for the components.
  • a netlist generated during synthesis at 101 is compared with a netlist generated at 104 (second netlist).
  • the comparison may be used to determine changes with regard to addition of (new) nodes in the second netlist, deletion of (old) nodes in the first netlist, or movement of (old) nodes in the second netlist.
  • a cost function is used to determine the likelihood that a first node synthesized in the first netlist is equivalent to a second node synthesized in the second netlist. If this probability exceeds a first threshold value, the two nodes are considered equivalent.
  • the cost function may be based on timing and/or placement constraints. If the first and second nodes have similar timing and/or placement constraints, such as for example maximum operating frequency restrictions on the node's connections or boundary restrictions for the node's placement, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
  • the cost function may be based on the number and/or the identity of the input connections (fanins) of a node. If the first and second nodes are driven by a same number of nodes or nodes that have been identified as being equivalent, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
  • the cost function may be based on the number and/or the identity of the output connections (fanouts) of a node. If the first and second nodes drive a same number of nodes or nodes that have been identified as being equivalent, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
  • the cost function may be based on a bit string (LUT mask) that effectively represents the truth table for a function being implemented by the node. If the first and second nodes have the same LUT mask, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
  • the cost function may be based on the identity of neighboring nodes (siblings) of a node. If the first and second nodes are surrounded by siblings that are equivalent, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
  • the cost function may be based on the resource type of the node.
  • a resource type may be a category of resource such as logic element, pin, memory block, or other type of resource. If the first and second nodes are of the same resource type, the cost function may indicate that there is a probability that the first and second nodes are equivalent.
  • the cost function may be based on the synthesized name of a node. If the first and second nodes have the same or similar name, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
  • one or more of the parameters described may be utilized by the cost function to determine equivalence and that other criteria may also be used. According to an embodiment of the present invention, a match between the first and second nodes with respect to one of the parameters may not guarantee a determination of equivalence. Similarly, a cost function may determine that a first and a second node are equivalent regardless of whether there are differences with respect to one of the parameters.
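  • The equivalence test described above can be sketched as a weighted score compared against the first threshold. This is only an illustration: the weights, the dictionary layout of a node, the use of `difflib` for name similarity, and the threshold of 0.8 are all assumptions, since the text does not say how the parameters are combined.

```python
from difflib import SequenceMatcher

# Hypothetical weights; the source does not specify how parameters are combined.
WEIGHTS = {"name": 0.3, "lut_mask": 0.3, "resource": 0.1, "fanin": 0.15, "fanout": 0.15}


def equivalence_score(a, b):
    """Return a score in [0, 1] estimating how likely nodes a and b are equivalent."""
    # Same or similar synthesized names raise the score.
    name_similarity = SequenceMatcher(None, a["name"], b["name"]).ratio()
    score = WEIGHTS["name"] * name_similarity
    # An identical LUT mask means both nodes implement the same truth table.
    score += WEIGHTS["lut_mask"] * (a["lut_mask"] == b["lut_mask"])
    # Same resource type (logic element, pin, memory block, ...).
    score += WEIGHTS["resource"] * (a["resource"] == b["resource"])
    # Same number of fanins and fanouts.
    score += WEIGHTS["fanin"] * (len(a["fanins"]) == len(b["fanins"]))
    score += WEIGHTS["fanout"] * (len(a["fanouts"]) == len(b["fanouts"]))
    return score


def are_equivalent(a, b, threshold=0.8):
    """Two nodes are treated as equivalent when the score exceeds the threshold."""
    return equivalence_score(a, b) > threshold


old = {"name": "add_1~carry", "lut_mask": "96E8", "resource": "LE",
       "fanins": ["a", "b", "cin"], "fanouts": ["sum"]}
renamed = dict(old, name="add_1~carry_dup")
unrelated = {"name": "mux_sel", "lut_mask": "CACA", "resource": "pin",
             "fanins": ["x"], "fanouts": ["y", "z"]}
print(are_equivalent(old, renamed), are_equivalent(old, unrelated))
```

  • Note that no single parameter decides the outcome here: the renamed node still matches because its other attributes agree, which mirrors the statement that a mismatch in one parameter need not prevent a determination of equivalence.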
  • a set of equivalent nodes that exist in both the first and second netlist and a set of new nodes which may be new to the first netlist or moved from the first netlist are identified.
  • the number of differences between the first and second netlists may be measured by the number of nodes in the second netlist that are not equivalent to nodes in the first netlist. If it is determined that the number of differences between the first and second netlists does not exceed a second threshold value, control proceeds to 107 . If it is determined that the number of differences between the first and second netlists exceeds the second threshold value, control proceeds to 114 .
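  • As a small illustration of this branch, the sketch below counts the nodes of the second netlist that have no equivalent and returns the step number control would proceed to. The function name and the set-based representation are assumptions made for the example.

```python
def next_step(second_netlist_nodes, equivalent_nodes, second_threshold):
    """Return 107 (initial placement) or 114 depending on the difference count.

    second_netlist_nodes: iterable of node identifiers in the second netlist.
    equivalent_nodes: set of second-netlist nodes matched to a first-netlist node.
    """
    differences = sum(1 for node in second_netlist_nodes if node not in equivalent_nodes)
    # Control proceeds to 107 only when the count does not exceed the threshold.
    return 107 if differences <= second_threshold else 114


print(next_step(["a", "b", "c", "d"], {"a", "b", "c"}, second_threshold=1))
```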
  • initial placement is performed.
  • initial placement is performed by attempting to put all the nodes in the second netlist at an optimal or preferred location without considering the legality or illegality of the placement. This allows nodes to be placed in the same location as other nodes. Nodes in the second netlist that are determined to be equivalent to nodes in the first netlist are placed at the previous locations assigned to the nodes at 102.
  • Nodes in the second netlist that are not determined to have equivalent nodes in the first netlist have initial locations assigned.
  • assignment of initial locations may be achieved by considering locations of inputs and outputs of a node and placing the node in a location relatively central to the inputs and outputs.
  • timing analysis information from a previous compile that may have been performed during procedures 101 – 103 or timing analysis information from a timing analysis performed during a different procedure may be utilized to produce an optimal placement that considers the most critical path associated with the node.
  • the node may be placed at one of its inputs or outputs along the most critical path. The placement of the nodes may involve assigning initial locations that are not legal.
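  • A minimal sketch of the initial placement step follows, assuming the centroid variant in which a new node is placed centrally to its inputs and outputs. The data layout, the rounding to integer grid coordinates, and the `(0, 0)` fallback for nodes with no placed neighbors are assumptions of this example. Legality is deliberately not checked, so several nodes may share a location.

```python
def initial_placement(nodes, equivalent_to, previous_location, neighbors):
    """Assign a preferred (x, y) location to every node of the second netlist.

    equivalent_to: maps a second-netlist node to its equivalent first-netlist node.
    previous_location: maps a first-netlist node to its (x, y) from the earlier placement.
    neighbors: maps a node to the nodes on its inputs and outputs.
    """
    placement = {}
    # Equivalent nodes reuse the location assigned by the previous placement.
    for node in nodes:
        if node in equivalent_to:
            placement[node] = previous_location[equivalent_to[node]]
    # New nodes are placed centrally to their already placed inputs and outputs.
    for node in nodes:
        if node in placement:
            continue
        placed = [placement[n] for n in neighbors.get(node, ()) if n in placement]
        if placed:
            x = round(sum(p[0] for p in placed) / len(placed))
            y = round(sum(p[1] for p in placed) / len(placed))
            placement[node] = (x, y)
        else:
            placement[node] = (0, 0)  # arbitrary fallback (assumption)
    return placement


result = initial_placement(
    nodes=["a2", "b2", "new"],
    equivalent_to={"a2": "a1", "b2": "b1"},
    previous_location={"a1": (0, 0), "b1": (4, 2)},
    neighbors={"new": ["a2", "b2"]},
)
print(result)
```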
  • an incremental placement procedure is performed in order to resolve any illegalities of placement generated from 107 .
  • Incremental placement involves evaluating resources on a target device such as LABs that have architectural violations or illegalities from initial placement. Incremental placement attempts to perturb the preferred locations as little as possible to ensure that the final placement respects all architectural constraints. Incremental placement attempts to identify non-critical LEs that may be moved from their preferred locations to resolve architectural violations in order that truly critical elements may stay at their preferred locations. Incremental placement may be performed by an incremental placement engine (not shown) in the EDA tool that utilizes incremental placement algorithms.
  • an architectural description of the target device, A, and a netlist, N(E,C), that includes a set of logic elements, E, and a set of connections, C, are processed.
  • Each element, e, is associated with a preferred physical location, $(p_x(e), p_y(e))$.
  • all atoms of the netlist have a preferred location.
  • Incremental placement generates a set of mapped locations, M, for each logic element in N. Incremental placement tries to find a mapping from preferred locations to mapped locations, $P \to M$, such that the mapped locations are architecturally feasible as well as being minimally disruptive. The definition of minimal disruption depends on the goal of netlist optimization.
  • the goal of netlist optimization is to optimize timing of the system.
  • T(S) represents an estimate of the critical path delay if all logic elements in E are mapped to $(s_x(e), s_y(e))$. The estimate may ignore the legality of locations and may be computed assuming a best case route is possible for each connection.
  • The mapping $P \to M$ is minimally disruptive if incremental placement minimizes $T(M) - T(P)$. Any logic element can be moved from its preferred location as long as it does not degrade the critical path.
  • routing area is also tracked to control excessive routing congestion.
  • A(S) represents the routing area consumed if the logic elements are mapped to $(s_x(e), s_y(e))$. Minimal disruptiveness is satisfied by minimizing the relationship shown below. $\bigl(T(M) - T(P)\bigr) + \bigl(A(M) - A(P)\bigr)$ (1)
  • FIG. 4 is a flow chart illustrating a method for performing incremental placement according to an embodiment of the present invention.
  • the method described in FIG. 4 may be used to perform incremental placement as shown as 105 in FIG. 1 .
  • proposed moves for all LEs in a LAB having architectural violations are generated.
  • proposed moves may include a move-to-fanin, move-to-fanout, move-to-sibling, move-to-neighbor, move-to-space, a move towards a critical vector, and other moves.
  • a move-to-fanin involves moving an LE to a LAB that is a fanin of the LE.
  • a move-to-fanout involves moving an LE to a LAB that is a fanout of the LE.
  • a move-to-sibling involves moving an LE to a LAB that is fanout of a LAB that fans in to the LAB of the LE.
  • FIG. 5 illustrates examples of a move-to-fanin, move-to-fanout, and move-to-sibling.
  • When a first LE in a first LAB transmits a signal to a second LE in a second LAB, the first LAB is said to be a fanin of the second LE.
  • When a first LE in a first LAB receives a signal from a second LE in a second LAB, the first LAB is said to be a fanout of the second LE.
  • When a first LE from a first LAB receives a signal from a second LE from a second LAB that also transmits to a third LE in a third LAB, the first and third LABs are said to be siblings.
  • Blocks 501 – 509 illustrate a plurality of LABs. Each of the LABs 501 – 509 has a number of shown LEs. A plurality of arrows 511 – 518 are shown to illustrate the direction of a signal transmitted between LEs. Relative to LAB 506 , LABs 501 – 504 are considered fanins, LABs 505 and 507 are considered siblings, and LABs 508 and 509 are considered fanouts.
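  • The fanin, fanout, and sibling relationships can be derived from the netlist connections, as the sketch below illustrates. The function name and the representation of connections as `(source LE, sink LE)` pairs are assumptions; the LE's own LAB is excluded from each set because moving there would not be a move.

```python
def candidate_labs(le, lab_of, connections):
    """Return (fanin, fanout, sibling) LAB sets for a logic element.

    lab_of: maps each LE to the LAB that currently holds it.
    connections: list of (source_le, sink_le) pairs.
    """
    home = lab_of[le]
    drivers = {src for src, dst in connections if dst == le}
    # Fanin LABs hold LEs that transmit a signal to this LE.
    fanin = {lab_of[src] for src in drivers} - {home}
    # Fanout LABs hold LEs that receive a signal from this LE.
    fanout = {lab_of[dst] for src, dst in connections if src == le} - {home}
    # Sibling LABs hold other LEs driven by one of this LE's drivers.
    sibling = {lab_of[dst] for src, dst in connections
               if src in drivers and dst != le} - {home}
    return fanin, fanout, sibling


lab_of = {"a": "LAB1", "x": "LAB2", "y": "LAB3", "z": "LAB4"}
connections = [("a", "x"), ("a", "y"), ("x", "z")]
print(candidate_labs("x", lab_of, connections))
```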
  • Proposed moves may also include move-to-neighbor, move-to-space, and move towards critical vector.
  • a move-to-neighbor involves moving an LE to an adjacent LAB.
  • a move-to-space involves a move to any random free LE location in a target device.
  • a move towards critical vector involves moving an LE towards a vector that is computed by summing the directions of all critical connections associated with the moving LE.
  • FIG. 6 illustrates an exemplary critical vector 601 .
  • Vector 601 is the critical vector of LE 611 which has critical connections to LEs 612 and 613 , and a non-critical connection with LE 614 .
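  • A critical vector can be computed as in the following sketch. The text says only that the directions of all critical connections are summed; using raw displacement vectors (rather than, say, unit vectors) is an assumption of this example, as are the names used.

```python
def critical_vector(le_position, connections):
    """Sum the displacement vectors of the critical connections of one LE.

    le_position: (x, y) of the LE being moved.
    connections: iterable of ((x, y) of the connected LE, is_critical) pairs.
    """
    x0, y0 = le_position
    vx = sum(x - x0 for (x, y), is_critical in connections if is_critical)
    vy = sum(y - y0 for (x, y), is_critical in connections if is_critical)
    return vx, vy


# Two critical connections and one non-critical connection, loosely after FIG. 6.
vector = critical_vector((0, 0), [((2, 0), True), ((0, 3), True), ((-5, -5), False)])
print(vector)
```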
  • the cost function may include parameters which measure the legality of a LAB (cluster legality cost), timing (timing cost), and an amount of routing resources that is required for a placement (wirelength cost).
  • the cost function guides the reduction of architectural violations while ensuring minimal disruption.
  • This cost function, C, is illustrated with the relationship shown below.
  • $C = K_L \cdot \mathrm{ClusterCost} + K_T \cdot \mathrm{TimingCost} + K_W \cdot \mathrm{WirelengthCost}$ (2)
  • $K_L$, $K_T$, and $K_W$ represent weighting coefficients that normalize the contributions of each parameter. It should be appreciated that other parameters may be used in addition to or in place of the parameters described.
  • the cluster legality cost is a cost associated with each LAB $CL_i$. This cost may be represented as shown below.
  • $\mathrm{ClusterCost}(CL_i) = kE_i \cdot \mathrm{legality}(CL_i, n_E) + kI_i \cdot \mathrm{legality}(CL_i, n_I) + kR_i \cdot \mathrm{legality}(CL_i, n_R) + kO_i \cdot \mathrm{legality}(CL_i, n_O) + kC_i \cdot \mathrm{legality}(CL_i, n_C)$ (3)
  • the $\mathrm{legality}(CL_i, \ldots)$ function returns a measure of legality for a particular constraint. A value of 0 indicates legality, while any positive value is proportional to the amount to which the constraint has been violated.
  • Functions $\mathrm{legality}(CL_i, n_E)$, $\mathrm{legality}(CL_i, n_I)$, $\mathrm{legality}(CL_i, n_O)$, $\mathrm{legality}(CL_i, n_R)$, and $\mathrm{legality}(CL_i, n_C)$ evaluate if LAB $CL_i$ has a feasible number of logic elements, inputs, outputs, reset lines and clock lines, respectively.
  • the weighting coefficients $kE_i$, $kI_i$, $kO_i$, $kR_i$, and $kC_i$ are all initially set to 1 for every LAB $CL_i$ in the target device.
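  • The cluster legality cost for one LAB can be sketched as follows. Measuring a violation as the amount by which usage exceeds the limit is an assumption that satisfies the stated properties (0 when legal, positive and proportional to the violation otherwise); the dictionary keys E, I, R, O, and C stand for logic elements, inputs, reset lines, outputs, and clock lines.

```python
CONSTRAINTS = ("E", "I", "R", "O", "C")


def legality(used, limit):
    """Return 0 when the constraint is met, otherwise the amount of excess."""
    return max(0, used - limit)


def cluster_cost(usage, limits, weights=None):
    """Weighted sum of the legality measures of one LAB.

    usage and limits map each constraint key to a count; weights default to 1,
    matching the initial setting of the weighting coefficients.
    """
    if weights is None:
        weights = {key: 1 for key in CONSTRAINTS}
    return sum(weights[key] * legality(usage[key], limits[key]) for key in CONSTRAINTS)


limits = {"E": 10, "I": 26, "R": 2, "O": 10, "C": 2}   # hypothetical LAB limits
usage = {"E": 11, "I": 20, "R": 1, "O": 9, "C": 2}     # one LE too many
print(cluster_cost(usage, limits))
```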
  • $\mathrm{TimingCost} = TC_{VPR} + k_{DAMP} \cdot TC_{DAMP}$ (4)
  • the first parameter, $TC_{VPR}$, is based upon the cost used by a versatile placement and routing (VPR) placer. This cost may be represented with the following relationship.
  • $TC_{VPR} = \sum_{c \in C} \mathrm{crit}(c) \cdot \mathrm{delay}(c)$ (5)
  • This function encourages critical connections to reduce delay while allowing non-critical connections to optimize wirelength and other optimization criteria.
  • The second parameter, $TC_{DAMP}$, operates as a damping component of the timing cost function and can be represented with the following relationships.
  • $TC_{DAMP} = \sum_{c \in C} \max(\mathrm{delay}(c) - \mathrm{maxdelay}(c),\ 0.0)$ (6)
  • $\mathrm{maxdelay}(c) = \mathrm{delay}(c) + \alpha \cdot \mathrm{slack}(c)$ (7)
  • the damping component penalizes any connection c whose delay(c) exceeds a maximum value maxdelay(c). This allows arbitrary moves to be made along a plateau defined by the maximum delays.
  • the maxdelay values may be updated every time a timing analysis of the system is executed.
  • the maxdelay values are controlled by the slack on the connection considered.
  • the parameter $\alpha$ determines how much of a connection's slack will be allocated to the delay growth of the connection.
  • the plateau is defined by the connection slack so that connections with large amounts of slack are free to move large distances in order to resolve architectural violations, while connections with small slack values are relatively confined.
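  • The timing cost and its damping term can be sketched as below. Connections are represented as dictionaries with `crit`, `delay`, `slack`, and `maxdelay` entries, which is an assumption of this example; `update_maxdelay` stands for the refresh that happens whenever a timing analysis is run.

```python
def update_maxdelay(connections, alpha):
    """Refresh maxdelay(c) = delay(c) + alpha * slack(c) after a timing analysis."""
    for c in connections:
        c["maxdelay"] = c["delay"] + alpha * c["slack"]


def timing_cost(connections, k_damp=1.0):
    """TimingCost = TC_VPR + k_DAMP * TC_DAMP."""
    # Criticality-weighted delay of every connection.
    tc_vpr = sum(c["crit"] * c["delay"] for c in connections)
    # Penalty only for connections whose delay grew beyond their plateau.
    tc_damp = sum(max(c["delay"] - c["maxdelay"], 0.0) for c in connections)
    return tc_vpr + k_damp * tc_damp


conns = [{"crit": 0.5, "delay": 2.0, "slack": 1.0}]
update_maxdelay(conns, alpha=0.5)          # maxdelay becomes 2.5
cost_before = timing_cost(conns, k_damp=2.0)
conns[0]["delay"] = 3.0                    # a move lengthens the connection
cost_after = timing_cost(conns, k_damp=2.0)
print(cost_before, cost_after)
```

  • In this example the connection may grow from a delay of 2.0 up to 2.5 without any damping penalty; only the 0.5 beyond the plateau is penalized.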
  • Wirelength cost of a placement may be measured by determining a number of routing wires that cross cut-lines that outline a LAB.
  • FIG. 7 illustrates the utilization of cut-lines according to an embodiment of the present invention.
  • Blocks 701 – 709 represent LABs having a plurality of shown LEs.
  • Horizontal cut-lines 711 and 712 and vertical cut-lines 713 and 714 are placed in each horizontal channel of a target device. Cut-lines provide a method to measure congestion by finding the regions that have the largest number of routing wires 721 – 724 . This measurement may be used to prevent the formation of localized congested areas that can cause circuitous routes.
  • the total number of routing wires that intersect a particular cut may be calculated by finding all the signals that intersect a particular cut-line and summing the average crossing-count for each of these signal wires.
  • the function q is given as a number of discrete crossing counts as a function of signal pin count.
  • the argument to the function q is the number of clustered logic block pins used to wire the signal.
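  • A rough sketch of the cut-line estimate for a single vertical cut-line is shown below. The values of q are not given in this text, so `example_q` is a placeholder with made-up numbers, and treating a signal as crossing the cut-line when the cut lies strictly inside the horizontal extent of its pins is also an assumption.

```python
def example_q(pin_count):
    """Placeholder crossing-count function; the numbers are illustrative only."""
    return 1.0 if pin_count <= 3 else 1.5


def cut_line_usage(signals, cut_x, q=example_q):
    """Estimate the number of routing wires crossing a vertical cut-line.

    signals: list of signals, each a list of (x, y) pin positions.
    cut_x: x coordinate of the vertical cut-line.
    """
    total = 0.0
    for pins in signals:
        xs = [x for x, _ in pins]
        # The signal crosses the cut when it has pins on both sides of it.
        if min(xs) < cut_x < max(xs):
            total += q(len(pins))
    return total


signals = [
    [(0, 0), (2, 1)],                      # crosses x = 1, two pins
    [(0, 3), (0, 4)],                      # stays left of the cut
    [(0, 0), (3, 0), (3, 2), (2, 2)],      # crosses, four pins
]
print(cut_line_usage(signals, cut_x=1))
```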
  • it is determined whether the cost associated with any of the proposed moves is better than the cost associated with the current placement.
  • the costs associated with the proposed moves and current placement may be obtained by using cost function values generated from using the cost function described with respect to 402 . If it is determined that the cost associated with any of the proposed moves is better than the cost associated with the current placement, control proceeds to 404 . If it is determined that the cost associated with any of the proposed moves is not better than the cost associated with the current placement, control proceeds to 405 .
  • the proposed move associated with the best cost is selected as the current placement.
  • a counter may be used to track the number of proposed moves that have been generated, or the number of LEs or LABs that have had proposed moves generated. In this embodiment, when this number exceeds a threshold value, instead of proceeding to 401 , control terminates the procedure and returns an indication that a fit was not found.
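  • The greedy loop of FIG. 4 can be illustrated on a toy model in which a placement is just a tuple of LE counts per LAB, arranged in a row, and the only constraint is a LAB capacity. Everything about this model (the names, the neighbor-only moves, the overflow cost, and stopping when no proposal improves the cost) is an assumption made to keep the example short; it is not the full procedure.

```python
def overflow(placement, capacity):
    """Toy cost: total number of LEs above capacity across all LABs."""
    return sum(max(0, count - capacity) for count in placement)


def propose_moves(placement, capacity):
    """Propose moving one LE from each overfull LAB to an adjacent LAB."""
    moves = []
    for i, count in enumerate(placement):
        if count > capacity:
            for j in (i - 1, i + 1):
                if 0 <= j < len(placement):
                    candidate = list(placement)
                    candidate[i] -= 1
                    candidate[j] += 1
                    moves.append(tuple(candidate))
    return moves


def greedy_incremental_placement(placement, capacity, max_proposals=1000):
    """Accept the best improving move until legal; return None if no fit is found."""
    generated = 0
    while overflow(placement, capacity) > 0:
        moves = propose_moves(placement, capacity)
        generated += len(moves)
        if generated > max_proposals:
            return None  # counter exceeded: report that a fit was not found
        best = min(moves, key=lambda m: overflow(m, capacity), default=None)
        if best is None or overflow(best, capacity) >= overflow(placement, capacity):
            return None  # no proposal is better than the current placement
        placement = best
    return placement


print(greedy_incremental_placement((3, 1, 1), capacity=2))
print(greedy_incremental_placement((2, 3, 2, 1), capacity=2))
```

  • The second call fails even though a legal placement exists, because every single move out of the overfull LAB creates an equally costly violation next door. That is the situation directed hill-climbing is meant to address.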
  • FIG. 8 is a flow chart illustrating a method for performing incremental placement utilizing directed hill-climbing according to an embodiment of the present invention. The method described in FIG. 8 may be used to perform incremental placement as shown as 105 in FIG. 1 .
  • a loop iteration index, L is set to 1.
  • proposed moves for all LEs in a LAB having architectural violations are generated.
  • the proposed moves may be generated similarly as described in 401 shown in FIG. 4 .
  • the number of LEs having proposed moves generated is recorded.
  • a current placement of LEs in a LAB with architectural violations and proposed moves of the LEs in the LAB are evaluated by a cost function.
  • the evaluation performed may be similarly conducted as described in 402 of FIG. 4 .
  • it is determined whether the cost associated with any of the proposed moves is better than the cost associated with the current placement.
  • the costs associated with the proposed moves and current placement may be obtained by using values generated from using the cost function described with respect to 402 . If the cost associated with any of the proposed moves is better than the cost associated with the current placement, control proceeds to 804 . If the cost associated with any of the proposed moves is not better than the cost associated with the current placement, control proceeds to 805 .
  • the proposed move associated with the best cost is selected as the current placement.
  • timing analysis is performed.
  • the values for maxdelay and crit(c), used for evaluating timing cost, are updated to reflect the current configuration of the system.
  • the cost function is updated.
  • weighting coefficients in the ClusterCost parameter are incremented in proportion to an amount of violation. Updating the cost function allows directed hill-climbing to be performed. Directed hill-climbing is a technique that is used for generating proposed moves when moves cannot be found to decrease the current cost of a placement.
  • FIG. 9 illustrates an example where directed hill-climbing may be applied.
  • the target device 900 includes a plurality of LABs 901 – 905 each having a plurality of shown LEs.
  • LAB 903 has one LE more than is allowed by its architectural specification. Every possible move attempt to resolve the architectural constraints of the center LAB 903 results in another architectural violation. If all architectural violations are costed in the same manner, then the method described in FIG. 4 may have difficulties resolving the constraint violation.
  • FIG. 10 illustrates a two dimensional slice of the multi-dimensional cost function described.
  • the current state 1001 represents the situation shown in FIG. 9 .
  • No single move in the neighborhood of the current state finds a solution with a lower cost.
  • the cost function itself could be modified to allow for the current state 1001 to climb the hill.
  • the weighting coefficients of the cost function may be gradually increased for LABs that have unsatisfied constraints. A higher weight may be assigned to unsatisfied constraints that have been violated over a long period of time or over many iterations. This results in the cost function being reshaped to allow for hill climbing.
  • the reshaping of the cost function has the effect of filling a basin where the local minimum is trapped. Referring back to FIG. 9 , once the weighting coefficients have been increased for LAB 903 , a proposed move to one of the adjacent clusters may be made to allow for shifting the violation “outwards” to a free space.
  • Updating a cost function also allows for a quick convergence by preventing a phenomenon known as thrashing. Thrashing occurs when incremental placement is trapped in an endless cycle where an LE is moved between two points in the configuration space which both result in architectural violations. By increasing the cost or penalty for moving to the two points, a move to a third point would eventually be more desirable and accepted.
  • control terminates the procedure and returns an indication that a fit was not found.
  • a method for designing a system on a target device utilizing FPGAs includes placing new LEs at preferred locations on a layout of an existing system. Illegalities in placement of the components are resolved. According to one embodiment, resolving the illegalities in placement may be achieved by generating proposed moves for an LE, generating cost function values for a current placement of the LE and for placements associated with the proposed moves, and accepting a proposed move if its associated cost function value is better than the cost function value for the current placement.
  • a timing analysis may be performed to make the determination.
  • a timing analysis conducted during incremental placement at procedure 108 may be used to make the determination. If timing constraints are satisfied, control proceeds to 112 . If timing constraints are not satisfied, control proceeds to 110 .
  • Greedy optimizations are performed to improve the placement of nodes made at 108. According to an embodiment of the present invention, this may be achieved by first swapping components assigned to be implemented by LABs on the target device (LAB swapping). Afterwards, components assigned to be implemented by LEs on the target device are swapped (LE swapping).
  • a cost function may be used based on, but not limited to, wire length, criticality, power, or other metric. The procedure does not utilize hill-climbing. Thus, any move that improves the cost function is accepted.
  • 110 may be performed before 108 and the procedures may be reversed.
  • it is determined whether the new locations for the nodes determined at 110 allow for the new design to satisfy timing constraints.
  • a timing analysis may be performed to make the determination. If timing constraints are satisfied, control proceeds to 112 . If timing constraints are not satisfied, control proceeds to 114 .
  • the system is incrementally routed.
  • routing resources that correspond to a node in the second netlist that is equivalent to a node in the first netlist are identified. If a routing resource has source and sink nodes that are also equivalent, the routing resource may be preserved for the node's use. The identified routing resources may be preserved for the node's use by using routing constraints.
  • FIG. 11 is a flow chart illustrating a method for performing routing according to an embodiment of the present invention.
  • the method described in FIG. 11 may be used to implement some of the procedures in 111 shown in FIG. 1 .
  • the routing resources that are determined to be preserved are specified as routing constraints.
  • the net, n is set to the first net, 1.
  • index i is set to 1.
  • a source represents a start point for a net or connection on the target device.
  • a sink represents an end or destination point for a net or connection on the device.
  • routing wires for segment i are included in a list referred to as “routing wires for segment i”.
  • the identified routing resources in the routing wires for segment i list that satisfy the routing constraints for the system are determined.
  • the routing wires for segment i list is updated to include only the routing resources that satisfy the routing constraints.
  • the routing resources in the routing wires for segment i list are potential segments on the connection.
  • an indication is generated that there is a routing failure.
  • a procedure that updates the routing constraints to remove a constraint that could not be satisfied may be called. After updating the routing constraints, control would return to step 1101 to retry the routing. This provides flexibility to allow some of the preserved routing to be altered if necessary.
  • control prepares to route the next connection. Control proceeds to 1103 .
  • control goes to the source of the next net and prepares to route the first connection in the next net.
  • Net n is set to n+1. Control proceeds to 1101 .
  • a route is selected for the connection. According to an embodiment of the present invention, if a plurality of routed paths that connect the source to the sink is available, the path that provides the shortest path, that utilizes routing resources having the smallest cost function value that yields the smallest delay, or that satisfies some other criteria is selected to be the routed path for the connection. If no routed path is available to select from, a routing failure is indicated.
  • index i is set to i+1.
  • routing wires for segment i are included in a list referred to as “routing wires for segment i”.
  • the identified routing resources in the routing wires list for segment i that satisfy the routing constraints for the system are determined.
  • the routing wires list for segment i is updated to include only the routing resources that satisfy the routing constraints.
  • the routing resources in the routing wires list for segment i are potential segments on the connection.
  • full placement is performed.
  • full placement of the new design on the second netlist is performed.
  • the full placement may be performed similarly to how the first netlist was placed at 102 .
  • full routing is performed.
  • full routing of the new design on the second netlist is performed.
  • the full routing may be performed similarly to how the first netlist was routed at 103 .
  • FIGS. 1 , 4 , 8 , and 11 are flow charts illustrating a method for designing a system on a PLD, and methods for performing incremental placement. Some of the techniques illustrated in these figures may be performed sequentially, in parallel or in an order other than that which is described. It should be appreciated that not all of the techniques described are required to be performed, that additional techniques may be added, and that some of the illustrated techniques may be substituted with other techniques.
  • Embodiments of the present invention may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions.
  • the machine-readable medium may be used to program a computer system or other electronic device.
  • the machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing electronic instructions.
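The directed hill-climbing described with respect to FIGS. 8 through 10 may be illustrated by the following minimal sketch. It is not the claimed implementation. It assumes a one-dimensional row of LABs, a cost made up only of weighted capacity violations, and moves of a single LE to an adjacent LAB; the function name `directed_hill_climb` and its parameters are hypothetical. When no proposed move lowers the cost, the weights of LABs that remain in violation are increased, which reshapes the cost function until moving the violation outwards becomes acceptable.

```python
def directed_hill_climb(counts, capacity, max_iters=50):
    """counts[i] is the number of LEs in LAB i; returns legal counts or None."""
    counts = list(counts)
    weights = [1] * len(counts)  # per-LAB violation weights (assumed start value)

    def cost(state):
        # Weighted sum of how far each LAB exceeds its capacity.
        return sum(w * max(0, n - capacity) for w, n in zip(weights, state))

    for _ in range(max_iters):
        violating = [i for i, n in enumerate(counts) if n > capacity]
        if not violating:
            return counts  # every architectural constraint is satisfied
        best, best_cost = None, cost(counts)
        for i in violating:
            for j in (i - 1, i + 1):  # propose moving one LE to a neighbor LAB
                if 0 <= j < len(counts):
                    trial = counts[:]
                    trial[i] -= 1
                    trial[j] += 1
                    trial_cost = cost(trial)
                    if trial_cost < best_cost:
                        best, best_cost = trial, trial_cost
        if best is not None:
            counts = best  # accept the proposed move with the best cost
        # Update the cost function: raise weights in proportion to the violation.
        for i, n in enumerate(counts):
            if n > capacity:
                weights[i] += n - capacity
    return None  # indication that a fit was not found
```

With a capacity of 2 and starting counts `[1, 2, 3, 2, 1]`, the first iteration finds no improving move, the weight of the center LAB is raised, and the surplus LE is then shifted outwards until it reaches free space. Because weights keep growing on LABs that stay in violation, moving back and forth between two violating positions becomes progressively less attractive.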

Abstract

A method for designing a system on a target device utilizing field programmable gate arrays (FPGAs) includes generating a first design for the system that includes a first netlist describing a first logical design, and placement and routing of the first logical design. A second design for the system is generated that includes a second netlist describing a second logical design. Changes made to the first design in the second design are identified. Placement is performed on the changes made to the first design on the second design.

Description

FIELD OF THE INVENTION
The present invention relates to the field of field programmable gate arrays (FPGAs). More specifically, the present invention relates to a method and apparatus for performing incremental compilation on systems on FPGAs using tools such as electronic design automation (EDA) tools.
BACKGROUND
FPGAs may be used to implement large systems that include millions of gates and megabits of embedded memory. Of the tasks required in managing and optimizing a design, placement of components on the FPGAs, and routing connections between components on the FPGA utilizing available resources can be the most challenging and time consuming. In order to satisfy placement and timing specifications, several iterations are often required to determine how components are to be placed on the target device and which routing resources to allocate to the components. The complexity of large systems often requires the use of EDA tools to manage and optimize their design onto physical target devices. Automated placement and routing algorithms in EDA tools perform the time consuming task of placement and routing of components onto physical devices.
When modifications are made to a design of a system, current EDA tools require a re-work of the entire placement and routing procedures. This may require a significant amount of time. In situations when the modifications are minor, re-work of the entire placement and routing procedures is inefficient and undesirable and discourages designers from effectively carrying out the typical design flow which involves making small changes to a design and analyzing the effects of the changes.
Thus, what is needed is an efficient method and apparatus for performing incremental compilation on FPGAs.
SUMMARY
According to an embodiment of the present invention, a method and apparatus is disclosed to support incremental compilation of a new design utilizing placement and/or routing strategies generated from a previous design on an FPGA. Differences between two netlists are identified to determine which nodes are equivalent and which nodes are new or have changed since a last compilation. An initial placement is created by assigning locations to the nodes that are equivalent or similar. Any illegal placement of the nodes is corrected using an incremental placement procedure. Greedy optimization is further performed if necessary. Incremental routing of the placed design is performed where routing associated with equivalent nodes from the first netlist is preserved.
BRIEF DESCRIPTION OF THE DRAWINGS
The features and advantages of the present invention are illustrated by way of example and are by no means intended to limit the scope of the present invention to the particular embodiments shown, and in which:
FIG. 1 is a flow chart illustrating a method for designing a system according to an embodiment of the present invention;
FIG. 2 illustrates a target device utilizing FPGAs according to an embodiment of the present invention;
FIG. 3 illustrates a LAB according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating a method for performing incremental placement according to an embodiment of the present invention;
FIG. 5 illustrates fanin, fanout, and sibling relationship move proposals according to an embodiment of the present invention;
FIG. 6 illustrates an exemplary critical vector move proposal according to an embodiment of the present invention;
FIG. 7 illustrates horizontal and vertical cut-lines used for local congestion estimation according to an embodiment of the present invention;
FIG. 8 is a flow chart illustrating a method for performing incremental placement utilizing directed hill-climbing according to an embodiment of the present invention;
FIG. 9 illustrates a component trapped in a local minimum according to an embodiment of the present invention;
FIG. 10 illustrates basin-filling according to an embodiment of the present invention; and
FIG. 11 illustrates a method for performing routing according to an embodiment of the present invention.
DETAILED DESCRIPTION
FIG. 1 is a flow chart that illustrates a method for designing a system according to an embodiment of the present invention. The method may be performed with the assistance of an EDA tool, for example. At 101, synthesis is performed. Synthesis includes generating a logic design of the system to be implemented by a target device. According to an embodiment of the present invention, synthesis generates an optimized logical representation of the system from a Hardware Description Language (HDL) design definition. The optimized logical representation of the system may include a representation that includes a minimized number of logic gates and logic elements required for the system. Alternatively, the optimized logical representation of the system may include a representation that has a reduced depth of logic and that generates a lower signal propagation delay. Synthesis also includes mapping the optimized logic design. Mapping includes determining how to implement the logic components such as logic gates in the optimized logical representation with general resources available on the target device. According to an embodiment of the present invention, a netlist is generated from mapping. The netlist illustrates how the general resources available on the target device are utilized to implement the system. The netlist may, for example, include a representation of the resources on the target device as nodes and how the nodes are connected. The netlist may be an optimized technology-mapped netlist generated from the HDL.
FIG. 2 illustrates an exemplary target device 200 utilizing FPGAs according to an embodiment of the present invention. The present invention may be used to design a system onto the target device 200. According to one embodiment, the target device 200 is a chip having a hierarchical structure that may take advantage of wiring locality properties of circuits formed therein. The lowest level of the hierarchy is a logic element (LE) (not shown). An LE is a small unit of logic providing efficient implementation of user logic functions. According to one embodiment of the target device 200, an LE may include a 4-input lookup table (LUT) with a configurable flip-flop.
The target device 200 includes a plurality of logic-array blocks (LABs). Each LAB is formed from 10 LEs, LE carry chains, LAB control signals, LUT chain, and register chain connection lines. LUT chain connections transfer the output of one LE's LUT to the adjacent LE for fast sequential LUT connections within the same LAB. Register chain connection lines transfer the output of one LE's register to the adjacent LE's register within a LAB. LABs are grouped into rows and columns across the target device 200. A first column of LABs is shown as 210 and a second column of LABs is shown as 211.
The target device 200 includes memory blocks (not shown). The memory blocks may be, for example, dual port random access memory (RAM) blocks that provide dedicated true dual-port, simple dual-port, or single port memory up to various bits wide at up to various frequencies. The memory blocks may be grouped into columns across the target device in between selected LABs or located individually or in pairs within the target device 200.
The target device 200 includes digital signal processing (DSP) blocks (not shown). The DSP blocks may be used to implement multipliers of various configurations with add or subtract features. The DSP blocks include shift registers, multipliers, adders, and accumulators. The DSP blocks may be grouped into columns across the target device 200.
The target device 200 includes a plurality of input/output elements (IOEs) (not shown). Each IOE feeds an I/O pin (not shown) on the target device 200. The IOEs are located at the end of LAB rows and columns around the periphery of the target device 200. Each IOE includes a bidirectional I/O buffer and a plurality of registers for registering input, output, and output-enable signals. When used with dedicated clocks, the registers provide performance and interface support with external memory devices.
The target device 200 includes LAB local interconnect lines 220–221 that transfer signals between LEs in the same LAB. The LAB local interconnect lines are driven by column and row interconnects and LE outputs within the same LAB. Neighboring LABs, memory blocks, or DSP blocks may also drive the LAB local interconnect lines 220–221 through direct link connections.
The target device 200 also includes a plurality of row interconnect lines (“H-type wires”) 230 that span fixed distances. Dedicated row interconnect lines 230, that include H4 231, H8 232, and H24 233 interconnects, route signals to and from LABs, DSP blocks, and memory blocks within the same row. The H4 231, H8 232, and H24 233 interconnects span a distance of up to four, eight, and twenty-four LABs respectively, and are used for fast row connections in a four-LAB, eight-LAB, and twenty-four-LAB region. The row interconnects 230 may drive and be driven by LABs, DSP blocks, RAM blocks, and horizontal IOEs.
The target device 200 also includes a plurality of column interconnect lines (“V-type wires”) 240 that operate similarly to the row interconnect lines 230. The column interconnect lines 240 vertically route signals to and from LABs, memory blocks, DSP blocks, and IOEs. Each column of LABs is served by a dedicated column interconnect, which vertically routes signals to and from LABs, memory blocks, DSP blocks, and IOEs. These column interconnect lines 240 include V4 241, V8 242, and V16 243 interconnects that traverse a distance of four, eight, and sixteen blocks respectively, in a vertical direction.
FIG. 2 illustrates an exemplary embodiment of a target device. It should be appreciated that a system may include a plurality of target devices, such as that illustrated in FIG. 2, cascaded together. It should also be appreciated that the target device may include programmable logic devices arranged in a manner different than that on the target device 200. A target device may also include components other than those described in reference to the target device 200. Thus, while the invention described herein may be utilized on the architecture described in FIG. 2, it should be appreciated that it may also be utilized on different architectures, such as those employed by Altera® Corporation in its APEX™, Mercury™, Stratix™, and Stratix™ II family of chips and those employed by Xilinx®, Inc. in its Virtex™ and Virtex™ II line of chips.
FIG. 3 illustrates a LAB or clustered logic block 300 according to an embodiment of the present invention. The LAB 300 may be used to implement any of the LABs shown in FIG. 2. LEs 301–303 illustrate a first, second, and tenth LE in the LAB 300. The LEs 301–303 each have a 4-input lookup table 311–313, respectively, and a configurable register 321–323, respectively, connected at its output. The LAB 300 includes a set of input pins 340 and a set of output pins 350 that connect to the general-purpose routing fabric so that the LAB can communicate with other LABs. The inputs to lookup tables 311–313 can connect to any one of the input pins 340 and output pins 350 using the appropriate configuration bits for each of the multiplexers 330. The number of LEs, nE, input pins, nI, and output pins, nO, in a LAB impose important architectural constraints on a system. In addition, since a single clock line 361 and a single asynchronous set/reset line 362 are attached to each configurable register 321–323, the configurable registers 321–323 must be clocked by the same signal and initialized by the same signal. The number of clock lines available in a LAB is represented by nC. The number of reset lines available in a LAB is represented by nR.
At 102, the mapped logical system design is placed. Placement works on the optimized technology-mapped netlist to produce a placement for each of the logic components. According to an embodiment of the present invention, placement includes fitting the system on the target device by determining the specific resources on the target device to be used for implementing the general resources mapped for logic components at 101. The placement procedure may be performed by a placer in an EDA tool that utilizes placement algorithms. According to an embodiment of the present invention, a user (designer) may provide input to the placer by specifying placement constraints.
At 103, routing of the system is performed. During routing, routing resources on the target device are allocated to provide interconnections between logic gates, logic elements, and other components on the target device. The routing procedure may be performed by a router in an EDA tool that utilizes routing algorithms.
At 104, synthesis is performed to generate a new logic design of the system to be implemented by the target device. According to an embodiment of the present invention, a new netlist is generated for the new logic design. In one embodiment, the synthesis is performed in response to layout-driven optimizations. The layout-driven optimizations may be generated by using routing delays for connections on the netlist that are estimated by calculating a fastest possible route. Timing-driven netlist optimization techniques may be applied to perturb the first netlist generated at 101 to reduce the critical path(s). The first netlist may be perturbed by an EDA tool, a user of the EDA tool, or by a third party. Perturbing the netlist may include adding, deleting, or moving components. According to an embodiment of the present invention, preferred locations are identified for the components that have been added or moved from the layout-driven optimization. The locations assigned to components of the existing system from the placement procedure are identified as preferred locations for the components.
At 105, differences between the logic designs are identified. According to an embodiment of the present invention, a netlist generated during synthesis at 101 (first netlist) is compared with a netlist generated at 104 (second netlist). The comparison may be used to determine changes with regard to addition of (new) nodes in the second netlist, deletion of (old) nodes in the first netlist, or movement of (old) nodes in the second netlist. According to one embodiment, a cost function is used to determine the likelihood that a first node synthesized in the first netlist is equivalent to a second node synthesized in the second netlist. If this probability exceeds a first threshold value, the two nodes are considered equivalent.
According to an embodiment of the present invention, the cost function may be based on timing and/or placement constraints. If the first and second nodes have similar timing and/or placement constraints, such as for example maximum operating frequency restrictions on the node's connections or boundary restrictions for the node's placement, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
According to an embodiment of the present invention, the cost function may be based on the number and/or the identity of the input connections (fanins) of a node. If the first and second nodes are driven by a same number of nodes or nodes that have been identified as being equivalent, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
According to an embodiment of the present invention, the cost function may be based on the number and/or the identity of the output connections (fanouts) of a node. If the first and second nodes drive a same number of nodes or nodes that have been identified as being equivalent, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
According to an embodiment of the present invention, the cost function may be based on a bit string (LUT mask) that effectively represents the truth table for a function being implemented by the node. If the first and second nodes have the same LUT mask, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
According to an embodiment of the present invention, the cost function may be based on the identity of neighboring nodes (siblings) of a node. If the first and second nodes are surrounded by siblings that are equivalent, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
According to an embodiment of the present invention, the cost function may be based on the resource type of the node. A resource type may be a category of resource such as logic element, pin, memory block, or other type of resource. If the first and second nodes are of the same resource type, the cost function may indicate that there is a probability that the first and second nodes are equivalent.
According to an embodiment of the present invention, the cost function may be based on the synthesized name of a node. If the first and second nodes have the same or similar name, the cost function may indicate that there is a high probability that the first and second nodes are equivalent.
It should be appreciated that one or more of the parameters described may be utilized by the cost function to determine equivalence and that other criteria may also be used. According to an embodiment of the present invention, a match between the first and second nodes with respect to one of the parameters may not guarantee a determination of equivalence. Similarly, a cost function may determine that a first and a second node are equivalent regardless of whether there are differences with respect to one of the parameters.
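As an illustration of how several of these parameters might be combined, the following minimal sketch scores two nodes by summing a weight for each parameter on which they match and compares the score against the first threshold value. The `Node` fields, the integer weights, and the threshold of 70 are all hypothetical; the description does not specify how the cost function weighs its parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str           # synthesized name
    resource_type: str  # e.g. "LE", "pin", "memory"
    lut_mask: str       # bit string representing the node's truth table
    fanin_count: int
    fanout_count: int


# Hypothetical integer weights (summing to 100); not taken from the source.
WEIGHTS = {
    "name": 30,
    "resource_type": 10,
    "lut_mask": 30,
    "fanin_count": 15,
    "fanout_count": 15,
}


def equivalence_score(a: Node, b: Node) -> int:
    """Sum the weights of every parameter on which the two nodes match."""
    return sum(w for key, w in WEIGHTS.items()
               if getattr(a, key) == getattr(b, key))


def is_equivalent(a: Node, b: Node, first_threshold: int = 70) -> bool:
    """Treat the nodes as equivalent when the score exceeds the threshold."""
    return equivalence_score(a, b) > first_threshold


old = Node("add0", "LE", "96E8", 3, 2)
same_logic = Node("add0", "LE", "96E8", 3, 4)  # only the fanout count differs
other = Node("mux7", "LE", "CACA", 3, 2)       # name and LUT mask differ
```

Here `old` and `same_logic` are judged equivalent even though one parameter differs, while `old` and `other` are not, although they match on resource type and connection counts.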
According to an embodiment of the present invention, a set of equivalent nodes that exist in both the first and second netlist and a set of new nodes which may be new to the first netlist or moved from the first netlist are identified.
At 106, it is determined whether the number of differences between the first and second netlists exceeds a second threshold value. According to an embodiment of the present invention, the number of differences between the first and second netlists may be measured by the number of nodes in the second netlist that are not equivalent to nodes in the first netlist. If it is determined that the number of differences between the first and second netlists does not exceed the second threshold value, control proceeds to 107. If it is determined that the number of differences between the first and second netlists exceeds the second threshold value, control proceeds to 114.
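The decision at 106 may be sketched as follows, assuming the second netlist is available as a collection of node identifiers and the equivalent nodes as a set. The function name `choose_flow` is hypothetical, and the return values simply reuse the reference numerals 107 and 114 from FIG. 1.

```python
def choose_flow(second_netlist, equivalent_nodes, second_threshold):
    """Return 107 (initial/incremental placement) or 114 (full placement)."""
    # Count nodes in the second netlist with no equivalent in the first netlist.
    differences = sum(1 for node in second_netlist
                      if node not in equivalent_nodes)
    return 114 if differences > second_threshold else 107
```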
At 107, initial placement is performed. According to an embodiment of the present invention, initial placement is performed by attempting to put all the nodes in the second netlist at an optimal or preferred location without considering the legality or illegality of the placement. This would allow for nodes to be placed in the same location as another node. Nodes in the second netlist that are determined to be equivalent to nodes in the first netlist are placed at the previous locations assigned to the nodes at 102.
Nodes in the second netlist that are not determined to have equivalent nodes in the first netlist have initial locations assigned. According to an embodiment of the present invention, assignment of initial locations may be achieved by considering locations of inputs and outputs of a node and placing the node in a location relatively central to the inputs and outputs. According to an alternate embodiment of the present invention, timing analysis information from a previous compile that may have been performed during procedures 101–103 or timing analysis information from a timing analysis performed during a different procedure may be utilized to produce an optimal placement that considers the most critical path associated with the node. In one embodiment, the node may be placed at one of its inputs or outputs along the most critical path. The placement of the nodes may involve assigning initial locations that are not legal.
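One way to pick a location "relatively central" to a node's inputs and outputs is to take the rounded centroid of their coordinates. This is only a sketch under that assumption; the function name `initial_location` is hypothetical, and the resulting location is allowed to be illegal, for example already occupied.

```python
def initial_location(neighbor_locations):
    """Return the rounded centroid of the (x, y) locations of a node's
    inputs and outputs; legality is deliberately not checked."""
    xs = [x for x, _ in neighbor_locations]
    ys = [y for _, y in neighbor_locations]
    return (round(sum(xs) / len(xs)), round(sum(ys) / len(ys)))
```

For a node with connections at (0, 0), (4, 0), and (2, 6), this yields the location (2, 2).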
At 108, illegalities in placement are addressed. According to an embodiment of the present invention, an incremental placement procedure is performed in order to resolve any illegalities of placement generated from 107. Incremental placement involves evaluating resources on a target device such as LABs that have architectural violations or illegalities from initial placement. Incremental placement attempts to perturb the preferred locations as little as possible to ensure that the final placement respects all architectural constraints. Incremental placement attempts to identify non-critical LEs that may be moved from their preferred locations to resolve architectural violations in order that truly critical elements may stay at their preferred locations. Incremental placement may be performed by an incremental placement engine (not shown) in the EDA tool that utilizes incremental placement algorithms.
In performing incremental placement, an architectural description of the target device, A, and a netlist, N(E,C), that includes a set of logic elements, E, and a set of connections, C, are processed. Each element, e, is associated with a preferred physical location, (px(e), py(e)). According to an embodiment of the present invention, all atoms of the netlist have a preferred location. Incremental placement generates a set of mapped locations, M, for each logic element in N. Incremental placement tries to find a mapping from preferred locations to mapped locations, P→M, such that the mapped locations are architecturally feasible as well as being minimally disruptive. The definition of minimal disruption depends on the goal of netlist optimization.
According to an embodiment of the present invention, the goal of netlist optimization is to optimize timing of the system. In this embodiment, T(S) represents an estimate of the critical path delay if all logic elements in E are mapped to (sx(e), sy(e)). The estimate may ignore the legality of locations and may be computed assuming a best case route is possible for each connection. In this example, P→M is minimally disruptive if incremental placement minimizes {T(M)−T(P)}. Any logic element can be moved from its preferred location as long as it does not degrade the critical path. According to one embodiment, routing area is also tracked to control excessive routing congestion. In this embodiment, A(S) represents the routing area consumed if the logic elements are mapped to (sx(e), sy(e)). Minimal disruptiveness is satisfied by minimizing the relationships shown below.
{T(M)−T(P)}+{A(M)−A(P)}  (1)
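As a minimal sketch of relationship (1), the Python fragment below evaluates the disruption of a mapping P→M. The estimators `toy_timing` and `toy_area`, the `CONNECTIONS` list, and the Manhattan-distance model are assumptions introduced only for illustration; the specification does not define how T(S) or A(S) are computed.

```python
from typing import Callable, Dict, Tuple

Location = Tuple[int, int]
Placement = Dict[str, Location]  # logic element name -> (x, y)

# Hypothetical two-connection netlist used only for this example.
CONNECTIONS = [("a", "b"), ("b", "c")]


def manhattan(p: Location, q: Location) -> int:
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def toy_timing(s: Placement) -> float:
    """Stand-in for T(S): the longest connection, assuming a best-case route."""
    return float(max(manhattan(s[u], s[v]) for u, v in CONNECTIONS))


def toy_area(s: Placement) -> float:
    """Stand-in for A(S): total wire length consumed by all connections."""
    return float(sum(manhattan(s[u], s[v]) for u, v in CONNECTIONS))


def disruption(preferred: Placement, mapped: Placement,
               timing: Callable[[Placement], float],
               area: Callable[[Placement], float]) -> float:
    """Relationship (1): {T(M) - T(P)} + {A(M) - A(P)}."""
    return (timing(mapped) - timing(preferred)) + (area(mapped) - area(preferred))


P = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
M = {"a": (0, 0), "b": (1, 0), "c": (2, 1)}  # element c displaced by one row
print(disruption(P, M, toy_timing, toy_area))
```

A mapping that leaves every element at its preferred location scores zero under this sketch, which matches the intent that such a mapping is not disruptive at all.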
FIG. 4 is a flow chart illustrating a method for performing incremental placement according to an embodiment of the present invention. The method described in FIG. 4 may be used to perform incremental placement as shown as 108 in FIG. 1. At 401, proposed moves for all LEs in a LAB having architectural violations are generated. According to an embodiment of the present invention, proposed moves may include a move-to-fanin, move-to-fanout, move-to-sibling, move-to-neighbor, move-to-space, a move towards a critical vector, and other moves. A move-to-fanin involves moving an LE to a LAB that is a fanin of the LE. A move-to-fanout involves moving an LE to a LAB that is a fanout of the LE. A move-to-sibling involves moving an LE to a LAB that is a fanout of a LAB that fans in to the LAB of the LE.
FIG. 5 illustrates examples of a move-to-fanin, move-to-fanout, and move-to-sibling. When a first LE in a first LAB transmits a signal to a second LE in a second LAB, the first LAB is said to be a fanin of the second LE. When a first LE in a first LAB receives a signal from a second LE in a second LAB, the first LAB is said to be a fanout of the second LE. When a first LE from a first LAB receives a signal from a second LE from a second LAB that also transmits to a third LE in a third LAB, the first and third LABs are said to be siblings. Blocks 501–509 illustrate a plurality of LABs. Each of the LABs 501–509 has a number of shown LEs. A plurality of arrows 511–518 are shown to illustrate the direction of a signal transmitted between LEs. Relative to LAB 506, LABs 501–504 are considered fanins, LABs 505 and 507 are considered siblings, and LABs 508 and 509 are considered fanouts.
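The fanin, fanout, and sibling relationships can be derived mechanically from the LE-level connections. The sketch below is one possible way to do so; the function name `classify_labs`, the one-LE-per-LAB naming, and in particular which fanin LABs drive the sibling LABs 505 and 507 are assumptions, since the text does not say where each of arrows 511–518 starts and ends.

```python
from typing import Dict, List, Set, Tuple


def classify_labs(target: str, lab_of: Dict[str, str],
                  edges: List[Tuple[str, str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (fanins, fanouts, siblings) of LAB `target`.

    `edges` holds (driver LE, receiver LE) pairs; `lab_of` maps an LE to its LAB.
    """
    fanins: Set[str] = set()
    fanouts: Set[str] = set()
    for src, dst in edges:
        s, d = lab_of[src], lab_of[dst]
        if d == target and s != target:
            fanins.add(s)      # a LAB that drives the target LAB
        if s == target and d != target:
            fanouts.add(d)     # a LAB driven by the target LAB
    siblings: Set[str] = set()
    for src, dst in edges:
        s, d = lab_of[src], lab_of[dst]
        if s in fanins and d != target and d != s:
            siblings.add(d)    # another fanout of one of the target's fanins
    return fanins, fanouts, siblings


# One LE per LAB, named after the LAB numerals used for FIG. 5.
lab_of = {f"le{n}": str(n) for n in range(501, 510)}
edges = [("le501", "le506"), ("le502", "le506"), ("le503", "le506"),
         ("le504", "le506"),
         ("le501", "le505"), ("le504", "le507"),   # assumed sibling drivers
         ("le506", "le508"), ("le506", "le509")]
fanins, fanouts, siblings = classify_labs("506", lab_of, edges)
print(sorted(fanins), sorted(siblings), sorted(fanouts))
```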
Proposed moves may also include move-to-neighbor, move-to-space, and move towards critical vector. A move-to-neighbor involves moving an LE to an adjacent LAB. A move-to-space involves a move to any random free LE location in a target device. A move towards critical vector involves moving an LE towards a vector that is computed by summing the directions of all critical connections associated with the moving LE. FIG. 6 illustrates an exemplary critical vector 601. Vector 601 is the critical vector of LE 611 which has critical connections to LEs 612 and 613, and a non-critical connection with LE 614.
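A critical vector of this kind could be computed as in the following sketch. Normalizing each critical connection to a unit direction before summing is an assumption; the text only says that the directions are summed. The coordinates given to LEs 611–614 are likewise invented for illustration.

```python
import math
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float]


def critical_vector(le: str, position: Dict[str, Point],
                    critical_neighbors: Iterable[str]) -> Point:
    """Sum the unit directions from `le` to each critically connected LE."""
    x0, y0 = position[le]
    vx = vy = 0.0
    for other in critical_neighbors:
        dx = position[other][0] - x0
        dy = position[other][1] - y0
        norm = math.hypot(dx, dy)
        if norm > 0:               # ignore LEs stacked at the same location
            vx += dx / norm
            vy += dy / norm
    return vx, vy


# Hypothetical coordinates: 612 and 613 are critical, 614 is not and is ignored.
position = {"611": (0.0, 0.0), "612": (3.0, 0.0),
            "613": (0.0, 4.0), "614": (-5.0, -5.0)}
vector = critical_vector("611", position, ["612", "613"])
print(vector)
```

The non-critical connection to LE 614 does not influence the result, so the proposed move pulls LE 611 only towards the LEs on its critical connections.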
Referring back to FIG. 4, at 402, a current placement of LEs in a LAB with architectural violations and proposed moves of the LEs in the LAB are evaluated by a cost function. The cost function may include parameters which measure the legality of a LAB (cluster legality cost), timing (timing cost), and an amount of routing resources that is required for a placement (wirelength cost). According to an embodiment of the present invention, the cost function guides the reduction of architectural violations while ensuring minimal disruption. This cost function, C, is illustrated with the relationship shown below.
$$C = K_L \cdot \mathrm{ClusterCost} + K_T \cdot \mathrm{TimingCost} + K_W \cdot \mathrm{WirelengthCost} \tag{2}$$
KL, KT, and KW represent weighting coefficients that normalize the contributions of each parameter. It should be appreciated that other parameters may be used in addition to or in place of the parameters described.
The cluster legality cost is a cost associated with each LAB CLi. This cost may be represented as shown below.
$$\mathrm{ClusterCost}(CL_i) = k^{E}_{i}\,\mathrm{legality}(CL_i, n_E) + k^{I}_{i}\,\mathrm{legality}(CL_i, n_I) + k^{R}_{i}\,\mathrm{legality}(CL_i, n_R) + k^{O}_{i}\,\mathrm{legality}(CL_i, n_O) + k^{C}_{i}\,\mathrm{legality}(CL_i, n_C) \tag{3}$$
The legality(CLi, ...) function returns a measure of legality for a particular constraint. A value of 0 indicates legality, while any positive value is proportional to the amount by which the constraint has been violated. Functions legality(CLi, nE), legality(CLi, nI), legality(CLi, nO), legality(CLi, nR), and legality(CLi, nC) evaluate whether LAB CLi has a feasible number of logic elements, inputs, outputs, reset lines, and clock lines, respectively. According to an embodiment of the present invention, the weighting coefficients kEi, kIi, kOi, kRi, and kCi are all initially set to 1 for every LAB CLi in the target device.
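The cluster legality cost of relationship (3) can be sketched as follows. The `Lab` class, the choice of overflow (usage minus capacity) as the legality measure, and the capacity numbers are assumptions made for illustration; the text only requires that the measure be 0 when legal and grow with the amount of violation.

```python
from dataclasses import dataclass, field
from typing import Dict

# E: logic elements, I: inputs, R: reset lines, O: outputs, C: clock lines
CONSTRAINTS = ("E", "I", "R", "O", "C")


@dataclass
class Lab:
    usage: Dict[str, int]       # resources currently demanded by the LAB
    capacity: Dict[str, int]    # resources the architecture provides
    # Weighting coefficients, all initially 1 as described above.
    weights: Dict[str, float] = field(
        default_factory=lambda: {k: 1.0 for k in CONSTRAINTS})


def legality(lab: Lab, constraint: str) -> float:
    """0 when the constraint is met, otherwise the amount of overflow."""
    return float(max(0, lab.usage[constraint] - lab.capacity[constraint]))


def cluster_cost(lab: Lab) -> float:
    """Relationship (3): weighted sum of the five legality measures."""
    return sum(lab.weights[k] * legality(lab, k) for k in CONSTRAINTS)


# Hypothetical LAB with one LE and two inputs too many.
capacity = {"E": 10, "I": 22, "R": 2, "O": 10, "C": 2}
lab = Lab(usage={"E": 11, "I": 24, "R": 1, "O": 8, "C": 2}, capacity=capacity)
print(cluster_cost(lab))
```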
The timing cost associated with a placement may be represented as shown below.
$$\mathrm{TimingCost} = TC_{VPR} + k_{DAMP} \cdot TC_{DAMP} \tag{4}$$
The first parameter, TCVPR, is based upon the cost used by a versatile placement and routing (VPR) placer. This cost may be represented with the following relationship.
$$TC_{VPR} = \sum_{c \in C} \mathrm{crit}(c) \cdot \mathrm{delay}(c) \tag{5}$$
This function encourages critical connections to reduce delay while allowing non-critical connections to optimize wirelength and other optimization criteria.
The second parameter, TCDAMP, operates as a damping component of the timing cost function and can be represented with the following relationships.
$$TC_{DAMP} = \sum_{c \in C} \max(\mathrm{delay}(c) - \mathrm{maxdelay}(c),\ 0.0) \tag{6}$$
maxdelay(c)=delay(c)+α*slack(c)  (7)
The damping component penalizes any connection c whose delay(c) exceeds a maximum value maxdelay(c). This allows arbitrary moves to be made along a plateau defined by the maximum delays. The maxdelay values may be updated every time a timing analysis of the system is executed. The maxdelay values are controlled by the slack on the connection considered. The parameter α determines how much of a connection's slack will be allocated to the delay growth of the connection. Thus, the plateau is defined by the connection slack, so that connections with large amounts of slack are free to move large distances in order to resolve architectural violations, while connections with small slack values are relatively confined.
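Relationships (4) through (7) can be combined in a short sketch. The `Connection` class and all numeric values are hypothetical; the point of the example is that maxdelay is frozen at the last timing analysis, so only delay growth beyond the allotted share of slack is penalized.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Connection:
    crit: float       # criticality from the last timing analysis
    delay: float      # current estimated delay
    maxdelay: float   # plateau limit fixed at the last timing analysis


def max_delay(delay_at_analysis: float, slack: float, alpha: float) -> float:
    """Relationship (7): delay plus the fraction alpha of the slack."""
    return delay_at_analysis + alpha * slack


def timing_cost(connections: Iterable[Connection], k_damp: float) -> float:
    """Relationships (4)-(6): VPR-style cost plus the damping penalty."""
    conns = list(connections)
    tc_vpr = sum(c.crit * c.delay for c in conns)
    tc_damp = sum(max(c.delay - c.maxdelay, 0.0) for c in conns)
    return tc_vpr + k_damp * tc_damp


# First connection grew from 2.0 to 3.0 after a move; half of its slack of
# 1.0 was allotted to growth, so it exceeds its plateau by 0.5.
grown = Connection(crit=0.5, delay=3.0, maxdelay=max_delay(2.0, 1.0, 0.5))
# Second connection has no slack and has not moved, so it is not penalized.
fixed = Connection(crit=1.0, delay=1.0, maxdelay=max_delay(1.0, 0.0, 0.5))
print(timing_cost([grown, fixed], k_damp=2.0))
```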
Wirelength cost of a placement may be measured by determining a number of routing wires that cross cut-lines that outline a LAB. FIG. 7 illustrates the utilization of cut-lines according to an embodiment of the present invention. Blocks 701–709 represent LABs having a plurality of shown LEs. Horizontal cut-lines 711 and 712 and vertical cut-lines 713 and 714 are placed in each horizontal channel of a target device. Cut-lines provide a method to measure congestion by finding the regions that have the largest number of routing wires 721–724. This measurement may be used to prevent the formation of localized congested areas that can cause circuitous routes. The total number of routing wires that intersect a particular cut may be calculated by finding all the signals that intersect a particular cut-line and summing the average crossing-count for each of these signal wires. The average crossing count for a signal may be computed using the following relationship.
CrossingCount(net)=q(NumCLBlockPins(net))  (8)
The function q is given as a number of discrete crossing counts as a function of signal pin count. The argument to the function q is the number of clustered logic block pins used to wire the signal. With respect to the functions shown in (3)–(8), it should be appreciated that other types of functions may be used in addition to or in place of the functions represented.
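The cut-line measurement can be sketched as below. The entries of `Q_TABLE` are placeholder values, not the crossing counts used by any real tool, and the bounding-box test for whether a signal intersects a vertical cut-line is an assumption; the text does not specify how the intersection is detected.

```python
from typing import Dict, List, Tuple

Pin = Tuple[int, int]  # (column, row) of the LAB holding the pin

# Hypothetical discrete crossing counts indexed by pin count.
Q_TABLE: Dict[int, float] = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.1, 5: 1.2}


def q(num_pins: int) -> float:
    """Relationship (8); pin counts beyond the table reuse the last entry."""
    return Q_TABLE.get(num_pins, Q_TABLE[max(Q_TABLE)])


def cut_congestion(nets: List[List[Pin]], cut_x: float) -> float:
    """Sum the average crossing counts of nets spanning a vertical cut-line."""
    total = 0.0
    for pins in nets:
        xs = [x for x, _ in pins]
        if min(xs) < cut_x < max(xs):   # bounding box straddles the cut
            total += q(len(pins))
    return total


NETS = [
    [(0, 0), (1, 0)],                   # crosses the cut between columns 0 and 1
    [(0, 0), (0, 1)],                   # stays within column 0
    [(0, 0), (1, 1), (2, 0), (1, 2)],   # four-pin net spanning columns 0 to 2
]
print(cut_congestion(NETS, cut_x=0.5))
```

Evaluating the same function at every cut-line and taking the largest value gives one way to locate the most congested region.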
Referring back to FIG. 4, at 403, it is determined whether the cost associated with any of the proposed moves is better than the cost associated with the current placement. The costs associated with the proposed moves and current placement may be obtained by using cost function values generated from using the cost function described with respect to 402. If it is determined that the cost associated with any of the proposed moves is better than the cost associated with the current placement, control proceeds to 404. If it is determined that the cost associated with any of the proposed moves is not better than the cost associated with the current placement, control proceeds to 405.
At 404, the proposed move associated with the best cost is selected as the current placement.
At 405, it is determined whether any additional LABs in the system have architectural violations. If additional LABs in the system have architectural violations, control moves to one of these LABs and proceeds to 401. If no additional LABs in the system have architectural violations, control proceeds to 406 and terminates the procedure. According to an embodiment of the present invention, a counter may be used to track the number of proposed moves that have been generated, or the number of LEs or LABs that have had proposed moves generated. In this embodiment, when this number exceeds a threshold value, instead of proceeding to 401, control terminates the procedure and returns an indication that a fit was not found.
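The loop of FIG. 4 can be sketched on a deliberately small model. Everything concrete here is an assumption: LABs are laid out in a single row, only move-to-neighbor is generated, the cost is a weighted sum of LAB overflow and displacement from the preferred location (standing in for the cluster, timing, and wirelength terms), and the counter tracks the number of LABs visited.

```python
from typing import Dict, Optional

K_L = 10.0  # assumed weight on cluster legality
K_W = 1.0   # assumed weight on displacement (stand-in for wirelength)


def overflow(placement: Dict[str, int], lab: int, capacity: int) -> int:
    used = sum(1 for l in placement.values() if l == lab)
    return max(0, used - capacity)


def cost(placement: Dict[str, int], preferred: Dict[str, int],
         num_labs: int, capacity: int) -> float:
    illegal = sum(overflow(placement, lab, capacity) for lab in range(num_labs))
    moved = sum(abs(placement[le] - preferred[le]) for le in placement)
    return K_L * illegal + K_W * moved


def incremental_place(preferred: Dict[str, int], num_labs: int, capacity: int,
                      max_labs_visited: int = 100) -> Optional[Dict[str, int]]:
    placement = dict(preferred)
    visited = 0
    while True:
        bad = [lab for lab in range(num_labs)
               if overflow(placement, lab, capacity) > 0]
        if not bad:
            return placement                 # 406: every LAB is legal
        if visited >= max_labs_visited:
            return None                      # counter exceeded: no fit found
        lab = bad[visited % len(bad)]        # 405: go to a violating LAB
        visited += 1
        best, best_cost = None, cost(placement, preferred, num_labs, capacity)
        for le in [e for e, l in placement.items() if l == lab]:    # 401
            for target in (lab - 1, lab + 1):                       # neighbors
                if 0 <= target < num_labs:
                    trial = {**placement, le: target}
                    c = cost(trial, preferred, num_labs, capacity)  # 402
                    if c < best_cost:                               # 403
                        best, best_cost = trial, c
        if best is not None:
            placement = best                 # 404: accept the best move


result = incremental_place({"a": 1, "b": 1, "c": 1}, num_labs=3, capacity=2)
print(result)
```

When no legal arrangement exists, for example three LEs on a device with a single two-LE LAB, the counter stops the loop and the sketch returns `None` as the indication that a fit was not found.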
FIG. 8 is a flow chart illustrating a method for performing incremental placement utilizing directed hill-climbing according to an embodiment of the present invention. The method described in FIG. 8 may be used to perform incremental placement as shown as 108 in FIG. 1. At 800, a loop iteration index, L, is set to 1.
At 801 proposed moves for all LEs in a LAB having architectural violations are generated. According to an embodiment of the present invention, the proposed moves may be generated similarly as described in 401 shown in FIG. 4. The number of LEs having proposed moves generated is recorded.
At 802, a current placement of LEs in a LAB with architectural violations and proposed moves of the LEs in the LAB are evaluated by a cost function. According to an embodiment of the present invention, the evaluation performed may be similarly conducted as described in 402 of FIG. 4.
At 803, it is determined whether the cost associated with any of the proposed moves is better than the cost associated with the current placement. The costs associated with the proposed moves and current placement may be obtained by using values generated from using the cost function described with respect to 402. If the cost associated with any of the proposed moves is better than the cost associated with the current placement, control proceeds to 804. If the cost associated with any of the proposed moves is not better than the cost associated with the current placement, control proceeds to 805.
At 804, the proposed move associated with the best cost is selected as the current placement.
At 805, it is determined whether any additional LABs in the system have architectural violations. If additional LABs in the system have architectural violations, control moves to one of these LABs and proceeds to 807. If no additional LABs in the system have architectural violations, control proceeds to 806 and terminates the procedure.
At 807, it is determined whether the number of LEs that have proposed moves generated exceeds the value K where K is a predefined value. If the number of LEs that have proposed moves generated exceeds the value K, control proceeds to 809. If the number of LEs that have proposed moves generated does not exceed the value K, control proceeds to 808.
At 808, the loop iteration index, L, is incremented. Control returns to 801.
At 809, timing analysis is performed. According to an embodiment of the present invention, the values for maxdelay and crit(c), used for evaluating timing cost, are updated to reflect the current configuration of the system.
At 810, the cost function is updated. According to an embodiment of the present invention, weighting coefficients in the ClusterCost parameter are incremented in proportion to an amount of violation. Updating the cost function allows directed hill-climbing to be performed. Directed hill-climbing is a technique that is used for generating proposed moves when moves cannot be found to decrease the current cost of a placement.
FIG. 9 illustrates an example where directed hill-climbing may be applied. The target device 900 includes a plurality of LABs 901–905 each having a plurality of shown LEs. In this example, LAB 903 has one LE more than is allowed by its architectural specification. Every possible move that attempts to resolve the architectural constraints of the center LAB 903 results in another architectural violation. If all architectural violations are costed in the same manner, then the method described in FIG. 4 may have difficulties resolving the constraint violation.
FIG. 10 illustrates a two dimensional slice of the multi-dimensional cost function described. The current state 1001 represents the situation shown in FIG. 9. No single move in the neighborhood of the current state finds a solution with a lower cost. However, the cost function itself could be modified to allow for the current state 1001 to climb the hill. The weighting coefficients of the cost function may be gradually increased for LABs that have unsatisfied constraints. A higher weight may be assigned to unsatisfied constraints that have been violated over a long period of time or over many iterations. This results in the cost function being reshaped to allow for hill climbing. The reshaping of the cost function has the effect of filling the basin in which the current state is trapped at a local minimum. Referring back to FIG. 9, once the weighting coefficients have been increased for LAB 903, a proposed move to one of the adjacent clusters may be made to allow for shifting the violation “outwards” to a free space.
Updating a cost function also allows for a quick convergence by preventing a phenomenon known as thrashing. Thrashing occurs when incremental placement is trapped in an endless cycle where an LE is moved between two points in the configuration space which both result in architectural violations. By increasing the cost or penalty for moving to the two points, a move to a third point would eventually be more desirable and accepted.
Referring back to FIG. 8, at 811, it is determined whether the loop index, L, is greater than a threshold value. If the loop index, L, is not greater than the threshold value, control proceeds to 808. If the loop index, L, is greater than the threshold value, control proceeds to 812.
At 812, control terminates the procedure and returns an indication that a fit was not found.
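The effect of updating the weighting coefficients can be sketched on a one-row model resembling FIG. 9. This is a simplification under stated assumptions: only move-to-neighbor is generated, displacement stands in for the timing and wirelength terms, the weights are raised whenever a violating LAB yields no improving move (rather than after K LEs and a timing analysis), and the names `directed_hill_climb` and `update_weights` are hypothetical.

```python
from typing import Dict, Optional

K_W = 0.1  # assumed small weight on displacement from the preferred location


def directed_hill_climb(preferred: Dict[str, int], num_labs: int, capacity: int,
                        max_loops: int = 50,
                        update_weights: bool = True) -> Optional[Dict[str, int]]:
    placement = dict(preferred)
    weight = [1.0] * num_labs            # per-LAB legality weights, initially 1

    def over(p: Dict[str, int], lab: int) -> int:
        return max(0, sum(1 for l in p.values() if l == lab) - capacity)

    def cost(p: Dict[str, int]) -> float:
        illegal = sum(weight[lab] * over(p, lab) for lab in range(num_labs))
        return illegal + K_W * sum(abs(p[e] - preferred[e]) for e in p)

    for _ in range(max_loops):           # loop index L against its threshold
        bad = [lab for lab in range(num_labs) if over(placement, lab) > 0]
        if not bad:
            return placement             # 806: no violations remain
        lab = bad[0]
        best, best_cost = None, cost(placement)
        for le in [e for e, l in placement.items() if l == lab]:
            for target in (lab - 1, lab + 1):
                if 0 <= target < num_labs:
                    trial = {**placement, le: target}
                    c = cost(trial)
                    if c < best_cost:
                        best, best_cost = trial, c
        if best is not None:
            placement = best             # 804: accept the best move
        elif update_weights:
            for b in bad:                # 810: raise weights by the violation
                weight[b] += over(placement, b)
    return None                          # 812: a fit was not found


# Center LAB 2 holds one LE too many; LABs 1 and 3 are full, 0 and 4 have room.
start = {"a0": 0, "b0": 1, "b1": 1, "c0": 2, "c1": 2, "c2": 2,
         "d0": 3, "d1": 3, "e0": 4}
legal = directed_hill_climb(start, num_labs=5, capacity=2)
stuck = directed_hill_climb(start, num_labs=5, capacity=2, update_weights=False)
print(legal, stuck)
```

With equal weights every move out of the center LAB merely relocates the violation at extra displacement cost, so nothing is accepted. Once the center LAB's weight is raised, pushing the violation into a neighboring LAB lowers the cost, and from there an LE can move on to a LAB with free space.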
The incremental placement techniques disclosed allow logic changes to be incorporated into an existing system design without reworking placement of the entire system. The incremental placement techniques attempt to minimize disruption to the original placement and maintain the original timing characteristics. According to an embodiment of the present invention, a method for designing a system on a target device utilizing FPGAs is disclosed. The method includes placing new LEs at preferred locations on a layout of an existing system. Illegalities in placement of the components are resolved. According to one embodiment, resolving the illegalities in placement may be achieved by generating proposed moves for an LE, generating cost function values for a current placement of the LE and for placements associated with the proposed moves, and accepting a proposed move if its associated cost function value is better than the cost function value for the current placement.
Referring back to FIG. 1, at 109, it is determined whether the new locations for the nodes determined at 108 allow for the new design to satisfy timing constraints. According to an embodiment of the present invention, a timing analysis may be performed to make the determination. According to an alternate embodiment of the present invention, a timing analysis conducted during incremental placement at procedure 108 may be used to make the determination. If timing constraints are satisfied, control proceeds to 112. If timing constraints are not satisfied, control proceeds to 110.
At 110, greedy optimizations are performed. Greedy optimizations are performed to improve the placement of nodes made at 108. According to an embodiment of the present invention, this may be achieved by first swapping components assigned to be implemented by LABs on the target device (LAB swapping). Afterwards, components assigned to be implemented by LEs on the target device are swapped (LE swapping). A cost function may be used based on, but not limited to, wire length, criticality, power, or other metric. The procedure does not utilize hill-climbing. Thus, any move that improves the cost function is accepted. According to an embodiment of the present invention, 110 may be performed before 108 and the procedures may be reversed.
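A greedy swapping pass of this kind can be sketched as follows. The one-dimensional locations, the wire-length cost, and the `NETS` list are assumptions for illustration; the same loop could be applied first to LABs and then to LEs, with any of the cost metrics mentioned above.

```python
from itertools import combinations
from typing import Callable, Dict

# Hypothetical two-pin nets used by the example cost function.
NETS = [("a", "b"), ("c", "d")]


def wire_length(placement: Dict[str, int]) -> int:
    return sum(abs(placement[u] - placement[v]) for u, v in NETS)


def greedy_swaps(placement: Dict[str, int],
                 cost: Callable[[Dict[str, int]], float]) -> Dict[str, int]:
    """Accept any pairwise swap that strictly improves the cost; no hill-climbing."""
    placement = dict(placement)
    improved = True
    while improved:                      # stop when a full pass finds nothing
        improved = False
        for u, v in combinations(sorted(placement), 2):
            trial = dict(placement)
            trial[u], trial[v] = placement[v], placement[u]
            if cost(trial) < cost(placement):
                placement = trial
                improved = True
    return placement


initial = {"a": 0, "b": 3, "c": 1, "d": 2}
optimized = greedy_swaps(initial, wire_length)
print(wire_length(initial), wire_length(optimized))
```

Because only strictly improving swaps are accepted, the cost decreases monotonically and the loop terminates, though possibly at a local minimum.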
At 111, it is determined whether the new locations for the nodes determined at 110 allow for the new design to satisfy timing constraints. According to an embodiment of the present invention, a timing analysis may be performed to make the determination. If timing constraints are satisfied, control proceeds to 112. If timing constraints are not satisfied, control proceeds to 114.
At 112, the system is incrementally routed. According to an embodiment of the present invention, routing resources that correspond to a node in the second netlist that is equivalent to a node in the first netlist are identified. If a routing resource has source and sink nodes that are also equivalent, the routing resource may be preserved for the node's use. The identified routing resources may be preserved for the node's use by using routing constraints.
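Deriving routing constraints from the first design's routing can be sketched as below. The data layout (routes keyed by source and sink node, an `equivalent` map from first-netlist nodes to second-netlist nodes) and all node and wire names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Connection = Tuple[str, str]  # (source node, sink node)


def preserved_routes(old_routes: Dict[Connection, List[str]],
                     equivalent: Dict[str, str],
                     new_connections: Set[Connection]) -> Dict[Connection, List[str]]:
    """Keep the old wires for connections whose source and sink both have
    equivalents in the second netlist and that still exist there."""
    constraints: Dict[Connection, List[str]] = {}
    for (src, dst), wires in old_routes.items():
        new_src, new_dst = equivalent.get(src), equivalent.get(dst)
        if (new_src is not None and new_dst is not None
                and (new_src, new_dst) in new_connections):
            constraints[(new_src, new_dst)] = list(wires)
    return constraints


old_routes = {("n1", "n2"): ["w1", "w2"], ("n2", "n3"): ["w3"]}
equivalent = {"n1": "m1", "n2": "m2"}          # n3 has no equivalent node
new_connections = {("m1", "m2"), ("m2", "m9")}
constraints = preserved_routes(old_routes, equivalent, new_connections)
print(constraints)
```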
FIG. 11 is a flow chart illustrating a method for performing routing according to an embodiment of the present invention. The method described in FIG. 11 may be used to implement some of the procedures in 112 shown in FIG. 1. In this embodiment, the routing resources that are determined to be preserved are specified as routing constraints. At 1100, the net, n, is set to the first net, 1.
At 1101, index i is set to 1.
At 1102, the source and sinks are determined for net n. According to an embodiment of the present invention, a source represents a start point for a net or connection on the target device. A sink represents an end or destination point for a net or connection on the device.
At 1103, for the current connection on the current net, all possible routing resources that may be used to route from the source are identified. The identified routing resources may be included in a list referred to as “routing wires for segment i”.
At 1104, the identified routing resources in the routing wires for segment i list that satisfy the routing constraints for the system are determined. The routing wires for segment i list is updated to include only the routing resources that satisfy the routing constraints. The routing resources in the routing wires for segment i list are potential segments on the connection.
At 1105, if none of the identified routing resources in the routing wires for segment i list satisfies the constraints for the system, control proceeds to 1106. If at least one of the identified routing resources in the routing wires for segment i list satisfies the constraints for the system, control proceeds to 1107.
At 1106, an indication is generated that there is a routing failure. Alternatively, a procedure that updates the routing constraints to remove a constraint that could not be satisfied may be called. After updating the routing constraints, control would return to step 1101 to retry the routing. This provides flexibility to allow some of the preserved routing to be altered if necessary.
At 1107, it is determined whether a sink for the connection has been reached from each of the identified routing resources in the routing wires for segment i list. If a sink for the connection has been reached, control proceeds to 1108. If a sink for the connection has not been reached, control proceeds to 1113.
At 1108, it is determined whether additional connections are to be routed for the current net. If additional connections are to be routed, control proceeds to 1109. If additional connections are not to be routed, control proceeds to 1110.
At 1109, control prepares to route the next connection. Control proceeds to 1103.
At 1110, it is determined whether additional nets are to be routed. If additional nets are to be routed, control proceeds to 1111. If additional nets are not to be routed, control proceeds to 1112.
At 1111, control goes to the source of the next net and prepares to route the first connection in the next net. Net n is set to n+1. Control proceeds to 1101.
At 1112, a route is selected for the connection. According to an embodiment of the present invention, if a plurality of routed paths that connect the source to the sink is available, the path that provides the shortest path, that utilizes routing resources having the smallest cost function value, that yields the smallest delay, or that satisfies some other criteria is selected to be the routed path for the connection. If no routed path is available to select from, a routing failure is indicated.
At 1113, index i is set to i+1.
At 1114, for the current identified routing resource in the connection, all possible routing resources that may be used to route from the identified routing resource are determined. The identified routing resources may be included in a list referred to as “routing wires for segment i”.
At 1115, the identified routing resources in the routing wires list for segment i that satisfy the routing constraints for the system are determined. The routing wires list for segment i is updated to include only the routing resources that satisfy the routing constraints. The routing resources in the routing wires list for segment i are potential segments on the connection.
At 1116, if none of the identified routing resources in the routing wire list satisfies the constraints for the system, control proceeds to 1106. If at least one of the identified routing resources in the routing wire list satisfies the constraints for the system, control proceeds to 1107.
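The segment-by-segment expansion of FIG. 11 for a single connection can be sketched as a breadth-first search over a routing-resource graph. The graph representation, the `allowed` predicate standing in for the routing constraints, and the wire names are assumptions; breadth-first order is chosen here so that the first path found uses the fewest segments, which is only one of the selection criteria mentioned at 1112.

```python
from collections import deque
from typing import Callable, Dict, List, Optional


def route_connection(graph: Dict[str, List[str]], source: str, sink: str,
                     allowed: Callable[[str], bool]) -> Optional[List[str]]:
    """Return a path of routing resources from source to sink, or None."""
    frontier = deque([[source]])
    seen = {source}
    while frontier:
        path = frontier.popleft()
        for nxt in graph.get(path[-1], ()):     # 1103/1114: candidate wires
            if nxt == sink:
                return path + [nxt]             # 1107: sink reached
            if nxt in seen or not allowed(nxt): # 1104/1115: apply constraints
                continue
            seen.add(nxt)
            frontier.append(path + [nxt])       # 1113: extend to segment i+1
    return None                                 # 1106: routing failure


graph = {"src": ["w1", "w2"], "w1": ["w3"], "w2": ["sink"], "w3": ["sink"]}
reserved = {"w2"}                               # wire preserved for another net
path = route_connection(graph, "src", "sink", lambda w: w not in reserved)
print(path)
```

If the constrained search fails, a caller could relax the constraint that blocked it and retry, which corresponds to the alternative described at 1106.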
Referring back to FIG. 1, at 113, it is determined whether valid routing has been achieved. If valid routing has been achieved, control proceeds to 116. If valid routing has not been achieved, control proceeds to 115.
At 114, full placement is performed. According to an embodiment of the present invention, full placement of the new design on the second netlist is performed. The full placement may be performed similarly to how the first netlist was placed at 102.
At 115, full routing is performed. According to an embodiment of the present invention, full routing of the new design on the second netlist is performed. The full routing may be performed similarly to how the first netlist was routed at 103.
FIGS. 1, 4, 8, and 11 are flow charts illustrating a method for designing a system on a PLD, and methods for performing incremental placement. Some of the techniques illustrated in these figures may be performed sequentially, in parallel or in an order other than that which is described. It should be appreciated that not all of the techniques described are required to be performed, that additional techniques may be added, and that some of the illustrated techniques may be substituted with other techniques.
Embodiments of the present invention (e.g. exemplary process described with respect to FIGS. 1, 4, and 5) may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions. The machine-readable medium may be used to program a computer system or other electronic device. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing electronic instructions.
In the foregoing specification the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense.

Claims (32)

1. A method for designing a system on a target device utilizing field programmable gate arrays (FPGAs) comprising:
generating a first design for the system that includes a first netlist describing a first logical design, and placement and routing of the first logical design;
generating a second design for the system that includes a second netlist describing a second logical design;
identifying changes made to the first design in the second design;
performing placement on the changes made to the first design on the second design;
determining whether the placement on the changes made satisfies timing constraints on the second design; and
performing incremental routing in response to determining that the placement satisfies the timing constraints.
2. The method of claim 1, wherein identifying the changes comprises determining whether a first node in the first design is equivalent to a second node in the second design.
3. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes have similar timing constraints.
4. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes have similar placement constraints.
5. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes have a similar number of input connections.
6. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes have a similar number of outputs.
7. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes have matching logic unit table (LUT) masks.
8. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes are surrounded by same neighbors.
9. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes are of matching type.
10. The method of claim 2, wherein determining whether the first node in the first design is equivalent to the second node in the second design comprises determining whether the first and second nodes have a same name.
11. The method of claim 1, wherein performing placement on the changes made to the first design comprises:
placing new logic elements (LEs) at preferred locations;
resolving illegalities in placement of the new LEs.
12. The method of claim 11, wherein resolving illegalities in the placement of the new LEs comprises:
generating proposed moves for an LE;
generating cost function values for a current placement and placements with the proposed moves; and
accepting a proposed move if its associated cost function value is better than the cost function value for the current placement.
13. The method of claim 12, wherein generating the proposed moves comprises moving the LE to a logic-array block (LAB) that is a fanin of the LE.
14. The method of claim 12, wherein generating the proposed moves comprises moving the LE to a logic-array block (LAB) that is a fanout of the LE.
15. The method of claim 12, wherein generating the proposed moves comprises moving the LE to a logic-array block (LAB) that is a sibling of a LAB where the LE resides.
16. The method of claim 12, wherein generating the proposed moves comprises moving the LE to a logic-array block (LAB) that is adjacent to the LE.
17. The method of claim 12, wherein generating the proposed moves comprises moving the LE to any random free LE.
18. The method of claim 12, wherein generating the proposed moves comprises moving the LE in a direction of a critical vector.
19. The method of claim 12, wherein generating the cost function values comprises computing values using cluster legality as a parameter.
20. The method of claim 11, further comprising:
determining whether placement on the identified changes made to the first design satisfies timing constraints; and
performing logic array block and logic element moves if the timing constraints are not satisfied.
21. The method of claim 1, further comprising utilizing placement information of nodes in the first design that have not changed in the second design.
22. The method of claim 1, wherein determining whether the placement on the changes made satisfies timing constraints comprises performing timing analysis on the placement of the system.
23. The method of claim 1, wherein performing incremental routing comprises utilizing routing resources identified for a first node in the first design for an equivalent second node in the second design having equivalent sources and destinations.
24. The method of claim 1, wherein performing incremental routing comprises utilizing routing resources identified for a first node in the first design for an equivalent second node in the second design having equivalent sources and destinations when valid routing can be performed.
25. A method for designing a system on a target device utilizing field programmable gate arrays (FPGAs) comprising:
identifying changes made to a first design in a second design by determining whether a first node in the first design is equivalent to a second node in the second design by comparing timing constraints of the first and second nodes;
performing placement on the changes made to the first design in the second design; and
utilizing placement information from the first design for nodes that have not changed in the second design.
26. The method of claim 25, wherein identifying the changes further comprises comparing placement constraints on the first and second nodes.
27. The method of claim 25, further comprising performing incremental routing in response to determining that the placement satisfies timing constraints of the second design.
28. The method of claim 27, wherein performing incremental routing comprises utilizing routing resources identified for a first node in the first design for an equivalent second node in the second design having equivalent sources and destinations.
29. A machine-readable medium having stored thereon sequences of instructions, the sequences of instructions including instructions which, when executed by a processor, causes the processor to perform:
generating a first design for a system that includes a first netlist describing a first logical design, and placement and routing of the first logical design;
generating a second design for the system that includes a second netlist describing a second logical design;
identifying changes made to the first design in the second design by determining whether a first node in the first design is equivalent to a second node in the second design by determining whether the first and second nodes have matching logic unit table (LUT) masks that are bit strings that represent truth tables for functions; and
performing placement on the changes made to the first design on the second design.
30. The machine-readable medium of claim 29, further including sequences of instructions that when executed by the processor cause the processor to perform utilizing placement information from the first design for nodes that have not changed in the second design.
31. The machine-readable medium of claim 29, further comprising performing incremental routing in response to determining that the placement satisfies timing constraints of the second design.
32. The machine-readable medium of claim 31, wherein performing incremental routing comprises utilizing routing resources identified for a first node in the first design for an equivalent second node in the second design having equivalent sources and destinations.
US10/931,953 2004-09-01 2004-09-01 Method and apparatus for performing incremental compilation on field programmable gate arrays Active 2025-05-12 US7191426B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/931,953 US7191426B1 (en) 2004-09-01 2004-09-01 Method and apparatus for performing incremental compilation on field programmable gate arrays

Publications (1)

Publication Number Publication Date
US7191426B1 (en) 2007-03-13

Family

ID=37833528

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/931,953 Active 2025-05-12 US7191426B1 (en) 2004-09-01 2004-09-01 Method and apparatus for performing incremental compilation on field programmable gate arrays

Country Status (1)

Country Link
US (1) US7191426B1 (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060225021A1 (en) * 2005-04-01 2006-10-05 Altera Corporation Automatic adjustment of optimization effort in configuring programmable devices
US7401314B1 (en) * 2005-06-09 2008-07-15 Altera Corporation Method and apparatus for performing compound duplication of components on field programmable gate arrays
US20090183134A1 (en) * 2008-01-15 2009-07-16 International Business Machines Corporation Design structure for identifying and implementing flexible logic block logic for easy engineering changes
US20090183135A1 (en) * 2008-01-15 2009-07-16 Herzl Robert D Method and Device for Identifying and Implementing Flexible Logic Block Logic for Easy Engineering Changes
US7620925B1 (en) * 2006-09-13 2009-11-17 Altera Corporation Method and apparatus for performing post-placement routability optimization
US20100257497A1 (en) * 2007-12-21 2010-10-07 David Mallon System and method for solving connection violations
CN102116839A (en) * 2009-12-30 2011-07-06 中国科学院沈阳自动化研究所 Method for testing field programmable gate array (FPGA) based on maximum flow method
US8196081B1 (en) * 2010-03-31 2012-06-05 Xilinx, Inc. Incremental placement and routing
CN103412253A (en) * 2013-08-05 2013-11-27 电子科技大学 Interconnection structure modeling method and interconnection resource allocation vector automatic generation method
US8621411B1 (en) * 2012-07-19 2013-12-31 International Business Machines Corporation Generating and selecting bit-stack candidates from a graph using dynamic programming
US8638120B2 (en) 2011-09-27 2014-01-28 International Business Machines Corporation Programmable gate array as drivers for data ports of spare latches
US8856713B1 (en) * 2010-01-08 2014-10-07 Altera Corporation Method and apparatus for performing efficient incremental compilation
US9449133B2 (en) 2014-05-07 2016-09-20 Lattice Semiconductor Corporation Partition based design implementation for programmable logic devices
US20170286585A1 (en) * 2016-03-29 2017-10-05 Wipro Limited Methods and Systems for Reducing Congestion in Very Large Scale Integrated (VLSI) Chip Design
US10275557B1 (en) * 2010-01-08 2019-04-30 Altera Corporation Method and apparatus for performing incremental compilation using structural netlist comparison
US10339241B1 (en) * 2016-05-13 2019-07-02 Altera Corporation Methods for incremental circuit design legalization during physical synthesis
TWI684987B (en) * 2019-05-31 2020-02-11 創意電子股份有限公司 Circuit correction system and method for increasing coverage of scan test
US20200050729A1 (en) * 2014-01-14 2020-02-13 Altera Corporation Method and apparatus for relocating design modules while preserving timing closure

Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5084824A (en) * 1990-03-29 1992-01-28 National Semiconductor Corporation Simulation model generation from a physical data base of a combinatorial circuit
JPH0974138A (en) * 1995-09-04 1997-03-18 Matsushita Electric Ind Co Ltd Layout verification method and delayed value calculation method
US5659484A (en) * 1993-03-29 1997-08-19 Xilinx, Inc. Frequency driven layout and method for field programmable gate arrays
JPH10283382A (en) * 1997-04-03 1998-10-23 Nec Corp Cad system for integrated circuit
US5867396A (en) * 1995-08-31 1999-02-02 Xilinx, Inc. Method and apparatus for making incremental changes to an integrated circuit design
US6031981A (en) * 1996-12-19 2000-02-29 Cirrus Logic, Inc. Reconfigurable gate array cells for automatic engineering change order
US6035106A (en) * 1997-04-28 2000-03-07 Xilinx, Inc. Method and system for maintaining hierarchy throughout the integrated circuit design process
US6167558A (en) * 1998-02-20 2000-12-26 Xilinx, Inc. Method for tolerating defective logic blocks in programmable logic devices
US6209123B1 (en) * 1996-11-01 2001-03-27 Motorola, Inc. Methods of placing transistors in a circuit layout and semiconductor device with automatically placed transistors
US20010001881A1 (en) * 1998-03-27 2001-05-24 Sundararajarao Mohan Methods and media for utilizing symbolic expressions in circuit modules
US20020124234A1 (en) * 2001-01-04 2002-09-05 Stefan Linz Method for designing circuits with sections having different supply voltages
US6449761B1 (en) * 1998-03-10 2002-09-10 Monterey Design Systems, Inc. Method and apparatus for providing multiple electronic design solutions
US20020129325A1 (en) * 2001-03-06 2002-09-12 Genichi Tanaka Engineering-change method of semiconductor circuit
US6453454B1 (en) * 1999-03-03 2002-09-17 Oridus Inc. Automatic engineering change order methodology
US20020162086A1 (en) * 2001-04-30 2002-10-31 Morgan David A. RTL annotation tool for layout induced netlist changes
US20020188918A1 (en) * 2001-06-08 2002-12-12 Cirit Mehmet A. Apparatus and methods for wire load independent logic synthesis and timing closure with constant replacement delay cell libraries
US6539536B1 (en) * 2000-02-02 2003-03-25 Synopsys, Inc. Electronic design automation system and methods utilizing groups of multiple cells having loop-back connections for modeling port electrical characteristics
US20030154458A1 (en) * 2000-05-11 2003-08-14 Quickturn Design Systems, Inc. Emulation circuit with a hold time algorithm, logic analyzer and shadow memory
US20040088671A1 (en) * 2002-11-05 2004-05-06 Qinghong Wu Adaptive adjustment of constraints during PLD placement processing
US6760899B1 (en) * 2002-08-08 2004-07-06 Xilinx, Inc. Dedicated resource placement enhancement
US6779169B1 (en) * 2002-05-31 2004-08-17 Altera Corporation Method and apparatus for placement of components onto programmable logic devices
US20040199880A1 (en) * 2003-03-31 2004-10-07 Kobi Kresh Hierarchical evaluation of cells
US20050091627A1 (en) * 2003-10-23 2005-04-28 Lalita Satapathy Comparison of two hierarchical netlist to generate change orders for updating an integrated circuit layout
US6910200B1 (en) * 1997-01-27 2005-06-21 Unisys Corporation Method and apparatus for associating selected circuit instances and for performing a group operation thereon
US20060048085A1 (en) * 2004-09-01 2006-03-02 Sean Tyler Method and system for performing timing analysis on a circuit

Patent Citations (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5084824A (en) * 1990-03-29 1992-01-28 National Semiconductor Corporation Simulation model generation from a physical data base of a combinatorial circuit
US5659484A (en) * 1993-03-29 1997-08-19 Xilinx, Inc. Frequency driven layout and method for field programmable gate arrays
US5867396A (en) * 1995-08-31 1999-02-02 Xilinx, Inc. Method and apparatus for making incremental changes to an integrated circuit design
JPH0974138A (en) * 1995-09-04 1997-03-18 Matsushita Electric Ind Co Ltd Layout verification method and delayed value calculation method
US6370677B1 (en) * 1996-05-07 2002-04-09 Xilinx, Inc. Method and system for maintaining hierarchy throughout the integrated circuit design process
US6209123B1 (en) * 1996-11-01 2001-03-27 Motorola, Inc. Methods of placing transistors in a circuit layout and semiconductor device with automatically placed transistors
US6031981A (en) * 1996-12-19 2000-02-29 Cirrus Logic, Inc. Reconfigurable gate array cells for automatic engineering change order
US6910200B1 (en) * 1997-01-27 2005-06-21 Unisys Corporation Method and apparatus for associating selected circuit instances and for performing a group operation thereon
JPH10283382A (en) * 1997-04-03 1998-10-23 Nec Corp Cad system for integrated circuit
US6035106A (en) * 1997-04-28 2000-03-07 Xilinx, Inc. Method and system for maintaining hierarchy throughout the integrated circuit design process
US6167558A (en) * 1998-02-20 2000-12-26 Xilinx, Inc. Method for tolerating defective logic blocks in programmable logic devices
US6449761B1 (en) * 1998-03-10 2002-09-10 Monterey Design Systems, Inc. Method and apparatus for providing multiple electronic design solutions
US20010001881A1 (en) * 1998-03-27 2001-05-24 Sundararajarao Mohan Methods and media for utilizing symbolic expressions in circuit modules
US6453454B1 (en) * 1999-03-03 2002-09-17 Oridus Inc. Automatic engineering change order methodology
US6539536B1 (en) * 2000-02-02 2003-03-25 Synopsys, Inc. Electronic design automation system and methods utilizing groups of multiple cells having loop-back connections for modeling port electrical characteristics
US20030154458A1 (en) * 2000-05-11 2003-08-14 Quickturn Design Systems, Inc. Emulation circuit with a hold time algorithm, logic analyzer and shadow memory
US20020124234A1 (en) * 2001-01-04 2002-09-05 Stefan Linz Method for designing circuits with sections having different supply voltages
US20020129325A1 (en) * 2001-03-06 2002-09-12 Genichi Tanaka Engineering-change method of semiconductor circuit
US6530073B2 (en) * 2001-04-30 2003-03-04 Lsi Logic Corporation RTL annotation tool for layout induced netlist changes
US20020162086A1 (en) * 2001-04-30 2002-10-31 Morgan David A. RTL annotation tool for layout induced netlist changes
US20030088842A1 (en) * 2001-06-08 2003-05-08 Library Technologies, Inc. Apparatus and methods for wire load independent logic synthesis and timing closure with constant replacement delay cell libraries
US20020188918A1 (en) * 2001-06-08 2002-12-12 Cirit Mehmet A. Apparatus and methods for wire load independent logic synthesis and timing closure with constant replacement delay cell libraries
US6779169B1 (en) * 2002-05-31 2004-08-17 Altera Corporation Method and apparatus for placement of components onto programmable logic devices
US6760899B1 (en) * 2002-08-08 2004-07-06 Xilinx, Inc. Dedicated resource placement enhancement
US20040088671A1 (en) * 2002-11-05 2004-05-06 Qinghong Wu Adaptive adjustment of constraints during PLD placement processing
US20040199880A1 (en) * 2003-03-31 2004-10-07 Kobi Kresh Hierarchical evaluation of cells
US20050091627A1 (en) * 2003-10-23 2005-04-28 Lalita Satapathy Comparison of two hierarchical netlist to generate change orders for updating an integrated circuit layout
US20060048085A1 (en) * 2004-09-01 2006-03-02 Sean Tyler Method and system for performing timing analysis on a circuit

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
Bian et al., "Local Logic Substitution Algorithm for Post-Layout Re-Synthesis", Proceedings of 5th International Conference on ASIC, Oct. 21-24, 2003, vol. 1, pp. 136-139. *
Emmert et al., "Incremental Routing in FPGAs", Eleventh Annual IEEE International ASIC Conference, Sep. 13-16, 1998, pp. 217-221. *
Emmert et al., "On-Line Incremental Routing for Interconnect Fault Tolerance in FPGAs Minus the Router", Proceedings of 2001 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, Oct. 24-26, 2001, pp. 149-157. *
Kannan et al., "A Methodology and Algorithms for Post-Placement Delay Optimization", 31st Conference on Design Automation, Jun. 6-10, 1994, pp. 327-332. *
Raman et al., "A Timing-Constrained Incremental Routing Algorithm for Symmetrical FPGAs", Proceedings of European Design and Test Conference, Mar. 11-14, 1996, pp. 170-174. *
Singh et al., "Incremental Placement for Layout-Driven Optimization on FPGAs", IEEE/ACM International Conference on Computer Aided Design, Nov. 10-14, 2002, pp. 752-759. *
Suaris et al., "Smart Move: A Placement-Aware Retiming and Replication Method for Field Programmable Gate Arrays", Proceedings of 5th International Conference on ASIC, vol. 1, Oct. 21-24, 2003, pp. 67-70. *
Tessier, "Incremental Compilation for Logic Emulation", IEEE International Workshop on Rapid System Prototyping, Jul. 1999, pp. 236-241. *
Togawa et al., "An Incremental Placement and Global Routing Algorithm for Field-Programmable Gate Arrays", Proceedings of the Asia and South Pacific Design Automation Conference, Feb. 10-13, 1998, pp. 519-526. *
Verma et al., "A Search-Based Bump-and-Refit Approach to Incremental Routing for ECO Applications in FPGAs", IEEE/ACM International Conference on Computer Aided Design, Oct. 4-8, 2001, pp. 144-151. *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7415682B2 (en) * 2005-04-01 2008-08-19 Altera Corporation Automatic adjustment of optimization effort in configuring programmable devices
US20060225021A1 (en) * 2005-04-01 2006-10-05 Altera Corporation Automatic adjustment of optimization effort in configuring programmable devices
US7401314B1 (en) * 2005-06-09 2008-07-15 Altera Corporation Method and apparatus for performing compound duplication of components on field programmable gate arrays
US7620925B1 (en) * 2006-09-13 2009-11-17 Altera Corporation Method and apparatus for performing post-placement routability optimization
US20100257497A1 (en) * 2007-12-21 2010-10-07 David Mallon System and method for solving connection violations
US8713493B2 (en) * 2007-12-21 2014-04-29 Cadence Design Systems, Inc. System and method for solving connection violations
US20090183135A1 (en) * 2008-01-15 2009-07-16 Herzl Robert D Method and Device for Identifying and Implementing Flexible Logic Block Logic for Easy Engineering Changes
US8141028B2 (en) 2008-01-15 2012-03-20 International Business Machines Corporation Structure for identifying and implementing flexible logic block logic for easy engineering changes
US8181148B2 (en) 2008-01-15 2012-05-15 International Business Machines Corporation Method for identifying and implementing flexible logic block logic for easy engineering changes
US20090183134A1 (en) * 2008-01-15 2009-07-16 International Business Machines Corporation Design structure for identifying and implementing flexible logic block logic for easy engineering changes
CN102116839A (en) * 2009-12-30 2011-07-06 中国科学院沈阳自动化研究所 Method for testing field programmable gate array (FPGA) based on maximum flow method
US11507722B2 (en) * 2010-01-08 2022-11-22 Altera Corporation Method and apparatus for performing incremental compilation using structural netlist comparison
US10073941B1 (en) 2010-01-08 2018-09-11 Altera Corporation Method and apparatus for performing efficient incremental compilation
US8856713B1 (en) * 2010-01-08 2014-10-07 Altera Corporation Method and apparatus for performing efficient incremental compilation
US10275557B1 (en) * 2010-01-08 2019-04-30 Altera Corporation Method and apparatus for performing incremental compilation using structural netlist comparison
US8196081B1 (en) * 2010-03-31 2012-06-05 Xilinx, Inc. Incremental placement and routing
US8638120B2 (en) 2011-09-27 2014-01-28 International Business Machines Corporation Programmable gate array as drivers for data ports of spare latches
US8621411B1 (en) * 2012-07-19 2013-12-31 International Business Machines Corporation Generating and selecting bit-stack candidates from a graph using dynamic programming
CN103412253B (en) * 2013-08-05 2016-01-20 电子科技大学 Interconnect architecture modeling method and interconnect resources configuration vector automatic generation method
CN103412253A (en) * 2013-08-05 2013-11-27 电子科技大学 Interconnection structure modeling method and interconnection resource allocation vector automatic generation method
US20200050729A1 (en) * 2014-01-14 2020-02-13 Altera Corporation Method and apparatus for relocating design modules while preserving timing closure
US10909296B2 (en) * 2014-01-14 2021-02-02 Altera Corporation Method and apparatus for relocating design modules while preserving timing closure
US9449133B2 (en) 2014-05-07 2016-09-20 Lattice Semiconductor Corporation Partition based design implementation for programmable logic devices
US20170286585A1 (en) * 2016-03-29 2017-10-05 Wipro Limited Methods and Systems for Reducing Congestion in Very Large Scale Integrated (VLSI) Chip Design
US10169517B2 (en) * 2016-03-29 2019-01-01 Wipro Limited Methods and systems for reducing congestion in very large scale integrated (VLSI) chip design
US10339241B1 (en) * 2016-05-13 2019-07-02 Altera Corporation Methods for incremental circuit design legalization during physical synthesis
TWI684987B (en) * 2019-05-31 2020-02-11 創意電子股份有限公司 Circuit correction system and method for increasing coverage of scan test

Similar Documents

Publication Publication Date Title
US7318210B1 (en) Method and apparatus for performing incremental placement for layout-driven optimizations on field programmable gate arrays
US7500216B1 (en) Method and apparatus for performing physical synthesis hill-climbing on multi-processor machines
US7191426B1 (en) Method and apparatus for performing incremental compilation on field programmable gate arrays
US7251800B2 (en) Method and apparatus for automated circuit design
US7207020B1 (en) Method and apparatus for utilizing long-path and short-path timing constraints in an electronic-design-automation tool
US6557145B2 (en) Method for design optimization using logical and physical information
US7594204B1 (en) Method and apparatus for performing layout-driven optimizations on field programmable gate arrays
US9589090B1 (en) Method and apparatus for performing multiple stage physical synthesis
US8151228B2 (en) Method and apparatus for automated circuit design
JP2891328B2 (en) A Method of Generating Delay Time Values for Multilevel Hierarchical Circuit Design
US8156463B1 (en) Method and apparatus for utilizing long-path and short-path timing constraints in an electronic-design-automation tool for routing
US8296696B1 (en) Method and apparatus for performing simultaneous register retiming and combinational resynthesis during physical synthesis
US7412680B1 (en) Method and apparatus for performing integrated global routing and buffer insertion
US10339243B2 (en) Method and apparatus for automatic hierarchical design partitioning
US7257800B1 (en) Method and apparatus for performing logic replication in field programmable gate arrays
US7853911B1 (en) Method and apparatus for performing path-level skew optimization and analysis for a logic design
Li et al. Guiding a physical design closure system to produce easier-to-route designs with more predictable timing
Dhar et al. An effective timing-driven detailed placement algorithm for FPGAs
Singh et al. Incremental placement for layout driven optimizations on FPGAs
US8898609B1 (en) Method and apparatus for integrating signal transition time modeling during routing
US9443054B1 (en) Method and apparatus for utilizing constraints for the routing of a design on a programmable logic device
US7509618B1 (en) Method and apparatus for facilitating an adaptive electronic design automation tool
Iida Design Methodology
Yang et al. Blockage-aware terminal propagation for placement wirelength minimization
Jahanian et al. Performance and timing yield enhancement using Highway-on-Chip Planning

Legal Events

Date Code Title Description
AS Assignment Owner name: ALTERA CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SINGH, DESHANAND;BROWN, STEPHEN;CHAN, KEVIN;REEL/FRAME:015768/0079;SIGNING DATES FROM 20040810 TO 20040824
STCF Information on status: patent grant Free format text: PATENTED CASE
FPAY Fee payment Year of fee payment: 4
FPAY Fee payment Year of fee payment: 8
MAFP Maintenance fee payment Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 12