EP1248989A2 - Behavioral-synthesis electronic design automation tool and business-to-business application service provider - Google Patents
Behavioral-synthesis electronic design automation tool and business-to-business application service provider
- Publication number
- EP1248989A2
- Authority
- EP
- European Patent Office
- Prior art keywords
- business
- tree
- service provider
- application service
- business application
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
- G06F30/32—Circuit design at the digital level
- G06F30/327—Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
Definitions
- the present invention relates to electronic design automation and more particularly to behavioral-synthesis tools provided on a pay-per-use basis over the Internet
- An EDA tool needs to be offered for all front-end design phases, e.g., architectural, RTL, datapath, logic.
- Architectural synthesis would enable true system-level design at ten times larger capacity.
- Faster runtimes would permit better quality-reliability (QoR) because global optimization is possible.
- QoR quality-reliability
- a new EDA tool is needed with two-to-five times reduction in code and design time.
- a business-to-business application service provider embodiment of the present invention includes an Internet website and webserver with EDA-on-demand solutions for system-on-a-chip designers.
- Such website allows electronic designs in a hardware description language to be uploaded into a front-end EDA design environment.
- a behavioral model simulation tool hosted privately on the webserver tests and validates the design.
- Such tool executes only in the secure environment of the business-to-business application service provider.
- the validated solution is then downloaded back over the Internet for a pay-per-use fee to the customer, in a form ready to be placed and routed by a back-end EDA tool.
- Such validated design solutions are also downloadable to others in exchange for other designs, or available technology libraries.
- the intellectual property created can be re-used, sold, shared, exchanged, and otherwise distributed efficiently and easily from a central for-profit clearinghouse.
- Fig. 1 is a functional block diagram of a business-to-business application service provider embodiment of the present invention that includes an Internet website and webserver with EDA-on-demand solutions for system-on-a-chip designers;
- Fig. 2 is a flowchart diagram of an electronic design automation method embodiment of the present invention
- Fig. 3 is a flowchart diagram of a timing analysis method embodiment of the present invention
- Figs. 4A, 4B, and 4C are diagrams that represent the transitions from a circuit, to a logic tree, and a simplified tree;
- Fig. 5 is a diagram representing a design that comprises a set of complex-model arcs at the input boundary, a set of simplified-model arcs inside, and another set of complex-model arcs at the output boundary;
- Fig. 6 is a diagram of a general graph-matching problem in electronic design automation
- Fig. 7 is a diagram of a circuit and its corresponding bipartite graph representation
- Fig. 8 is a diagram of a first step in technology mapping, which is to partition a network graph into a collection of trees;
- Fig. 9 is a diagram representing a decomposed technology library
- Fig. 10 diagrams a circuit tree on the right, and the only two pattern trees on the left that are needed to match every part of the circuit tree;
- Fig. 11 is a diagram of a covering selection method embodiment of the present invention.
- Fig. 12 is a diagram representing the partitioning of a circuit into trees, and the ordering of those trees into a list that conforms to a basic rule
- Fig. 13 is a diagram of an example that illustrates how control signals can dominate a critical-timing path
- Fig. 14 is a diagram of a Verilog sequential block "begin ... end" statement being transformed into a control-flow graph arc-A between a source node-S and a sink node-T;
- Fig. 15 is a diagram of an example of an "if" statement with all the options, and its control flow graph reduction;
- Figs. 16A-16D are diagrams representing various forms of loops in Verilog and their corresponding control flow graph reductions
- Fig. 17 is a diagram of a simple Verilog HDL text suitable for high-level synthesis at the left, and a fully reduced control flow graph that corresponds to the text on the right;
- Fig. 18 is a diagram illustrating a procedure for building the function MAP
- Figs. 19A-19D are diagrams that represent various piecewise constructions by which a complete one-hot FSM can be constructed by the procedure of Table I;
- Fig. 20 is a diagram that shows the correspondence between a Verilog text sample, its control flow graph, and its ultimate one-hot FSM.
- Fig. 1 represents an Internet system embodiment of the present invention, and is referred to herein by the general reference numeral 100.
- the system 100 includes an Internet connection 102 for a business-to-business application service provider 104 that sells high-level synthesis (HLS) services and intellectual property.
- application service provider 104 has a webserver 106 with, e.g., WINDOWS-NT, IIS, and ASP, commercial software from Microsoft to host client web browser visits.
- a pay-per-use electronic design automation (EDA) tool 108 is installed as a software application on the webserver 106. Any of a number of users and customers are represented by a web client 110 and browser 112.
- EDA electronic design automation
- HDL hardware description language
- the EDA tool 108 is supported by a subscriptions module 118 that charges users a per-use fee and allows uploading and downloading of designs.
- An HDL conversion module 120 translates HDL into control-flow (CF) graphs.
- An operations scheduling module 122 maps each HDL statement to an appropriate CF graph node.
- a resource allocation module 124 optimizes the hardware necessitated by the schedule.
- a collection of user tools 126 are included to help the users navigate, understand, and use the website. User designs, e.g., after uploading and for downloading, are stored in a database 128.
- the business-to-business application service provider 104 preferably includes on-the-fly timing analysis of a digital design during a scheduling phase of a high-level synthesis.
- An abstract timing model expresses a bit-level timing of a component without incurring the complexity penalties of a conventional full timing analysis.
- a fast, accurate estimate of the timing consequences of each scheduling decision is available. Such estimates can then be used to determine if any scheduling decision should be rejected.
- High-level synthesis automates certain subtasks of a digital system design in an electronic design automation (EDA) system.
- a system architect begins by designing and validating an overall algorithm to be implemented, e.g., using C, C++, a specialized language, or a capture system.
- the resulting architectural specification is partitioned into boards, chips, and blocks. Each block is a single process having its own control flow. There are usually tens to hundreds of such blocks in a modern large-scale chip design. Typical blocks represent whole filters, queues, and pipeline stages.
- Scheduling process 122 assigns operations such as additions and multiplications to states of a finite-state machine (FSM).
- FSM finite-state machine
- Such FSM describes a control flow in an algorithm performed by a block being synthesized. Some operations are locked into particular states, and represent communication with other blocks.
- Allocation process 124 maps the operations of a scheduled FSM to particular hardware resources. For example, three addition operations can be scheduled to only require a single adder. An appropriate adder is constructed, and the operations are assigned to the adder. But complications can arise when more than one hardware resource of a given bitwidth and function is needed, because it must then be decided which resource to use for each operation. Considerations include multiplexing cost, the creation of false timing paths, register assignment, and even using large resources for small operations. Hardware resources can be used for multiple functions. Calculating a minimum set of resources for an entire process is difficult but rewarding. Sometimes alternative implementations will be possible. It is often possible to choose implementations that meet the overall timing constraints and minimize the gate count. Resource allocation also includes mapping resources (abstract functions) to gate-level implementations.
- Allocation includes calculating a register set and assigning data to registers for use in later states. For example, temporary variables are used to store intermediate results of a larger calculation. But the contents of such temporary variables could share a common register in different states. The contents are only needed in one state each. So it is possible to save on hardware by assigning the data that needs to be stored to such storage elements. But register and storage allocations can be complicated if data values can form mutually exclusive sets or can share storage. Data values often drive functional resources, and in turn are often produced by functional resources. A good assignment of data to storage will result in reduced multiplexing costs and delays. The allocation is also made more complex if any register and functional hardware interact.
- Technology mapping follows Boolean optimization. The abstract Boolean gates of the circuit are mapped to standard cells from a technology library. Standard library cells include simple AND, OR, or NOT functions, and much more complex functions, for example, full adders, and-or-invert gates, and multiplexers. Technology-library gates are available in a variety of drive strengths, delays, input loadings, etc. Technology mapping is made more complex by the fact that there are many ways to map an individual Boolean gate, each way having its own unique advantages.
- Technology mapping can sometimes be avoided by constructing custom gate layouts for the gates of a circuit, instead of selecting cells from a library of preconstructed and precharacterized cells. But this method is not commonly associated with automatic synthesis.
- Fig. 2 represents an electronic design automation (EDA) method embodiment of the present invention, and is referred to herein by the general reference numeral 200
- EDA method begins with an algorithm design step 202
- the system design is partitioned into blocks and protocol design in a step 204
- Verilog or other kind of hardware description language (HDL) coding is done in a step 206
- a high-level synthesis (HLS) step 208 includes an operation scheduling step 210 and a resource allocation step 212
- a timing analysis is applied each time an individual operation is scheduled, and may be called many times to get a single operation scheduled
- a technology-independent (Boolean) optimization step 214 follows
- a technology mapping step 216 maps the abstract Boolean gates of the circuit to standard cells from a technology library, for example
- a placement step 218 locates the gates on the chip real estate, and a routing step 220 interconnects them with wires
- the timing analysis starts with a state diagram, a collection of resources, a technology library, and at least partially scheduled and allocated operations. It determines whether a design as a whole meets its timing requirements. The total delays of such circuit must be such that valid data can be latched into destination registers at the end of each clock cycle.
- a scheduling system can be used to construct realistic schedules, and result in allocated circuits that have a good probability of meeting timing after layout.
- Timing questions must be answered quickly and efficiently, because they will be asked many times in the course of scheduling a single design
- the transitions of the design are considered individually, and in order.
- a set of "ready" operations is constructed
- the ready operations are those with input data that is available in the source state of the transition
- One of these operations is selected using some criterion, e.g., most urgent first, and is removed from the ready list and assigned to a resource that is otherwise unused in the current transition.
- Its result data is then added to the set of data that is available. Other operations that depend on its results may then be added to the ready list.
- Such process continues until no more operations are in the ready list, or there are no more resources available to perform the operations on the ready list, or the operations on the ready list cannot be scheduled on the current transition. This repeats until all of the arcs have been considered and all of the operations have been scheduled.
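The ready-list loop described above can be sketched in Python for a single transition. This is a minimal illustration, not the patent's implementation: the function name `schedule_transition`, the dictionaries `inputs`, `result`, and `urgency`, and the resource names are all hypothetical, every resource is assumed able to perform every operation, and the per-candidate timing check is omitted.

```python
from typing import Dict, List, Set


def schedule_transition(inputs: Dict[str, Set[str]],
                        result: Dict[str, str],
                        available: Set[str],
                        resources: List[str],
                        urgency: Dict[str, int]) -> Dict[str, str]:
    """Greedy ready-list scheduling of one transition (hypothetical helper).

    inputs[op] -> data names the operation reads
    result[op] -> data name the operation produces
    Returns a mapping op -> resource for the operations that were scheduled.
    """
    available = set(available)      # data available in the source state
    free = list(resources)          # resources still unused on this transition
    assigned: Dict[str, str] = {}
    while free:
        # Ready operations: not yet scheduled, with all input data available.
        ready = [op for op in inputs
                 if op not in assigned and inputs[op] <= available]
        if not ready:
            break
        # Selection criterion: most urgent first, ties broken by name.
        op = min(ready, key=lambda o: (-urgency[o], o))
        assigned[op] = free.pop(0)
        available.add(result[op])   # may make dependent operations ready
    return assigned


inputs = {"add1": {"a", "b"}, "mul1": {"add1_out", "c"}, "add2": {"c", "d"}}
result = {"add1": "add1_out", "mul1": "mul1_out", "add2": "add2_out"}
urgency = {"add1": 3, "mul1": 2, "add2": 1}
plan = schedule_transition(inputs, result, {"a", "b", "c", "d"},
                           ["r0", "r1"], urgency)
print(plan)
```

With two free resources, `add1` is scheduled first, its result makes `mul1` ready, and `add2` is left for a later transition because no resource remains.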
- the timing process stops if the design cannot be scheduled using a given resource set
- Timing analysis must be done each time an individual operation is scheduled. If a first candidate operation-transition-resource scheduling tuple is not accepted, the timing analysis procedure must be called repeatedly until one is accepted. The timing analysis procedure must be able to evaluate the timing for all of the resources of a design each time a scheduling tuple for an operation is considered.
- timing analysis must be bit-true, as opposed to bitwise-lumped
- a single number or a single load/delay function associated with each resource is inadequate.
- a delay or load/delay function must be associated with each output bit of a resource.
- a fast, accurate, bit-level timing model for a combinational resource can be constructed because a graph representing combinational logic can always be partitioned into a collection of trees.
- a logic tree uses its nodes to represent gates and terminals, and its arcs represent connections. Electrical drivers are always below the gates they drive in a logic tree, e.g., farther from the root, or else the root of the tree and the gates it drives are in another tree.
- the terminals of a tree represent its connections to gates outside the logic tree. When a gate in the network has a fan-out of two or more, the root of a maximal logic tree is the output terminal of such gate. All other trees become subtrees.
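A short Python sketch of this partitioning rule follows. It assumes a hypothetical netlist format in which `gates` maps each gate name to the list of signals feeding it; any gate whose output feeds exactly one other gate is absorbed into that gate's tree, and every other gate (fan-out of two or more, or a primary output) becomes a tree root.

```python
from collections import Counter
from typing import Dict, List


def partition_into_trees(gates: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each tree root to the sorted list of gates in its logic tree."""
    # Fan-out of a signal = number of gate inputs it drives.
    fanout = Counter(src for ins in gates.values() for src in ins)
    roots = [g for g in gates if fanout[g] != 1]
    trees: Dict[str, List[str]] = {}
    for root in roots:
        members, stack = [], [root]
        while stack:
            g = stack.pop()
            members.append(g)
            # Descend only into driver gates that feed this gate alone.
            stack.extend(s for s in gates[g] if s in gates and fanout[s] == 1)
        trees[root] = sorted(members)
    return trees


# Hypothetical four-gate circuit: g1 fans out to g2 and g3, which feed g4.
gates = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g1", "d"], "g4": ["g2", "g3"]}
trees = partition_into_trees(gates)
print(trees)
```

Here `g1` has a fan-out of two, so it roots its own tree, while `g2` and `g3` join the tree rooted at `g4`.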
- Fig. 3 represents a timing analysis method of the present invention, and is referred to herein by the general reference numeral 300.
- the method 300 begins with a step 302 that partitions a circuit design into its corresponding logic trees. Once a circuit has been partitioned into logic trees, it becomes possible to construct a compacted model of the circuit in a step 304. Logic trees are replaced with equivalent trees having no interior nodes, e.g., as in a step 306. These equivalent trees are often substantially simpler than the original trees.
- the timing in the original circuit is analyzed along each path from a tree leaf to its root. A propagation delay is calculated for each path.
- such computed delays are annotated onto the corresponding arcs of the simplified trees.
- any dependency of a propagation delay of the original circuit on the slew rate of the input signal is annotated onto the corresponding leaf of the simplified tree.
- a step 314 copies capacitive loads from the leaves of the logic tree to the leaves of the simplified tree.
- the load/delay response curve of the output gate e.g., at the apex of the logic tree, is copied in a step 316 to the root of the simplified tree.
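The collapse of a logic tree into a tree with no interior nodes can be illustrated with a small Python sketch. It is a simplification under stated assumptions: each gate has a single fixed delay, leaf names are unique, and the slew, capacitive-load, and load/delay annotations described above are left out. The `collapse` function and the nested-tuple tree encoding are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A tree is either a leaf name, or (gate_delay, [subtrees]).
Tree = Union[str, Tuple[float, List["Tree"]]]


def collapse(tree: Tree, accumulated: float = 0.0) -> Dict[str, float]:
    """Return one leaf-to-root delay per leaf: the arcs of the simplified tree."""
    if isinstance(tree, str):
        return {tree: accumulated}
    delay, children = tree
    arcs: Dict[str, float] = {}
    for child in children:
        # Every path through this gate pays the gate's delay once.
        arcs.update(collapse(child, accumulated + delay))
    return arcs


# Root gate (delay 1.0) fed by an inner gate (delay 2.0) and by leaf "c".
logic_tree = (1.0, [(2.0, ["a", "b"]), "c"])
simplified = collapse(logic_tree)
print(simplified)
```

Each entry of `simplified` plays the role of one annotated arc from a leaf directly to the root, so a later timing query becomes a lookup instead of a tree traversal.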
- Figs. 4A, 4B, and 4C represent the transitions from a circuit 400, to a logic tree 410, and a simplified tree 420.
- the logic gates are represented as nodes in the tree.
- the AND-gates have a pointed-up chevron symbol, and the NOR-gates have the down-pointed chevron.
- In Fig. 4C, the nodes are eliminated entirely.
- the simplified tree 420 is decorated with annotations that provide quick answers about delay issues in circuit 400 (Fig. 4A).
- the dotted envelopes show the subtrees.
- Fig. 5 represents a design 500 that comprises a set of complex-model arcs 502 at the input boundary, a set of simplified-model arcs 504 inside, and another set of complex-model arcs 506 at the output boundary.
- There may be some loss of accuracy associated with the use of a simplified model of propagation delay in the interior of a model, but in practice these inaccuracies appear to be insubstantial. If a loss of accuracy is an issue, the more complex models can be used on additional interior arcs. The accuracy would be increased, but so would the solution's run time. Only arcs touching circuit boundaries really need complex models. Arcs having simplified models are shown with dashed arrows.
- the goal is to find a one-to-one mapping 606 from the elements of the pattern to the elements of the target graph.
- a subgraph of G2, defined by the matched nodes, must be isomorphic to G1.
- An example of such a matching is shown here, with the matched nodes and arcs shown in an envelope 608.
- Fig. 7 illustrates a circuit 702 and its corresponding bipartite graph representation 704.
- the gates are converted to nodes, and the interconnects to arcs.
- the edges are ordered pairs, e.g., they have a direction.
- the nodes in v2 represent gates, and the nodes in v1 represent circuit nets.
- Graph matching for bipartite directed graphs is such that a net node can only map to a net node, a gate node can only map to a gate node, and isomorphism is used to preserve the direction of edges.
- the nodes in bipartite graph representation 704 that represent gates have types, e.g., AND (∧), OR (∨), and NOT ('). These form part of the isomorphism construction.
- a node of type X in G is mapped to a node of type X
- DAG's Directed acyclic graphs
- DAG's can be used to represent some types of multiple-output gates, and so combinational logic circuits can be represented by a DAG, as long as they contain no cycles. DAG's must also be well-formed and avoid including cyclic false paths.
- Fig. 8 illustrates a first step in technology mapping, which is to partition a network graph 802 into a collection of trees 804-808 or DAG's. Each tree 804-808 can then be worked on as an individual mapping problem.
- the simplest formulation is to use trees, but an extension to DAG's would not be difficult.
- the trees are defined so that their roots and leaves are all net nodes, as opposed to gate nodes. Root and leaf nodes are duplicated as many times as necessary, otherwise they will have to be shared between trees.
- the trees 804-808 should be as large as possible, e.g., maximal trees. In any extension to DAG's, the number of output terminals of a DAG must be limited to two or some other small number. Otherwise, the matching computations get too complex.
- a typical technology library includes a number of cells that represent primitive elements. Combinational cells in the library have Boolean functions. A selection of bipartite directed graph representations is constructed for each cell. Each of these graphs is associated with the cell's Boolean function expressed in a small primitive-type alphabet.
- One convenient alphabet of primitive type is that of two-input NAND gates and inverters
- Each cell of the library is described by a tree (or DAG) comprising only net nodes, two-input NAND nodes, and NOT nodes
- the exact alphabet chosen is not crucial as long as it is relatively simple, and it is logically complete. All Boolean functions can be expressed as networks comprising only units of the alphabet.
- Such a decomposed library is shown in Fig. 9. Cell names are listed in the left column. The middle column lists the corresponding Boolean functions. And each cell is represented by one or more pattern trees as shown in the right column.
- Any circuit designs submitted to technology mapping processes are usually represented as networks of simple gates, e.g. NAND, AND, NOR, OR, XOR, and NOT.
- Each network can be converted to a functionally equivalent network using only the library-tree gate types and fan-ins.
- the circuit would be converted into an equivalent circuit comprising only inverters and two-input NAND gates. It is then mapped into a bipartite directed graph representation, e.g., in the same style as the library-pattern graphs. After that, it is possible to do the graph matching.
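The conversion to an inverter and two-input NAND alphabet can be sketched in Python using De Morgan's laws. The nested-tuple expression format and the names `to_nand_inv` and `evaluate` are assumptions made for this illustration; XOR and wider fan-ins are left out, and no attempt is made to cancel double inversions.

```python
from itertools import product


def to_nand_inv(expr):
    """Rewrite a two-input gate expression using only NAND and NOT nodes."""
    if isinstance(expr, str):          # a net name
        return expr
    op, *args = expr
    a = [to_nand_inv(x) for x in args]
    if op == "NOT":
        return ("NOT", a[0])
    if op == "NAND":
        return ("NAND", a[0], a[1])
    if op == "AND":                    # a.b = NOT(NAND(a, b))
        return ("NOT", ("NAND", a[0], a[1]))
    if op == "OR":                     # a+b = NAND(NOT a, NOT b)
        return ("NAND", ("NOT", a[0]), ("NOT", a[1]))
    if op == "NOR":                    # NOT(a+b) = NOT(NAND(NOT a, NOT b))
        return ("NOT", ("NAND", ("NOT", a[0]), ("NOT", a[1])))
    raise ValueError(f"unsupported gate type: {op}")


def evaluate(expr, env):
    """Evaluate an expression for a dict of net name -> bool."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    v = [evaluate(x, env) for x in args]
    if op == "NOT":
        return not v[0]
    if op == "AND":
        return v[0] and v[1]
    if op == "OR":
        return v[0] or v[1]
    if op == "NAND":
        return not (v[0] and v[1])
    if op == "NOR":
        return not (v[0] or v[1])
    raise ValueError(f"unsupported gate type: {op}")


original = ("OR", ("AND", "a", "b"), ("NOR", "c", "a"))
mapped = to_nand_inv(original)
# Exhaustively confirm functional equivalence over all input combinations.
equivalent = all(
    evaluate(original, dict(zip("abc", bits))) == evaluate(mapped, dict(zip("abc", bits)))
    for bits in product([False, True], repeat=3)
)
print(mapped, equivalent)
```

The exhaustive check is practical only for small examples; it is included to show that the rewritten network is functionally equivalent to the original.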
- Both the library cells and the circuit to be mapped are represented using the same graph formalism.
- the trees of a circuit are individual matching problems. Any matching results are preferably encoded by attaching a list of matching pattern trees to each net node N. Such list represents a set of pattern trees whose roots match N.
- the following pseudocode implements such, and refers to this list as matchings(N):
- Let R be the root of a circuit tree T.
- Let S be a set initially comprising only R.
- a recursive algorithm can recognize that a circuit tree matches a pattern tree if the roots match, and all the circuit-tree subtrees map to all the pattern-tree subtrees in matching pairs.
- the subtree mapping must be one-to-one from circuit subtrees to pattern subtrees: each subtree of a circuit tree must map to exactly one subtree of a pattern tree. Every subtree of the pattern tree must be mapped to by some subtree of the circuit tree, e.g., an "onto" mapping. Without this, more than one subtree of the circuit tree might be mapped to a single subtree of the pattern tree, or fail to map to some subtree of the pattern at all.
- the trees might be asymmetric. All permutations of the ordered list of subtrees of the pattern tree are tested against the ordered list of subtrees of the circuit tree. If all permutations fail, the match attempt as a whole is abandoned. A match is found if any permutation succeeds.
- the following pseudocode implements a matching algorithm. All of the named nodes are net nodes.
- the gate nodes are net node drivers.
- a list "U" referred to in the if statement is a list of drivers of the net N, e.g., drivers of gate G that drive net node N.
- "M" is a net node of the pattern tree, and "N" is a net node of the circuit tree.
- the matching step produces a one-to-many mapping from the net nodes of the circuit to root nodes of pattern trees.
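A minimal Python sketch of the recursive match and of matchings(N) is given here. It is not the patent's pseudocode: the tree encoding (a pattern leaf is `None`, a circuit leaf is a net name, an interior node is a `(gate_type, [subtrees])` pair) and the three-cell `library` are assumptions chosen to keep the example short.

```python
from itertools import permutations


def matches(pattern, circuit) -> bool:
    """True if the pattern tree matches the circuit tree at its root."""
    if pattern is None:                  # pattern leaf: matches any net
        return True
    if not isinstance(circuit, tuple):   # circuit leaf, but pattern needs a gate
        return False
    p_type, p_subs = pattern
    c_type, c_subs = circuit
    if p_type != c_type or len(p_subs) != len(c_subs):
        return False
    # Trees may be asymmetric, so try every ordering of the pattern subtrees.
    # Equal lengths plus a permutation give a one-to-one, onto pairing.
    return any(all(matches(p, c) for p, c in zip(perm, c_subs))
               for perm in permutations(p_subs))


def matchings(circuit, library):
    """Names of the library pattern trees whose roots match this net node."""
    return [name for name, pattern in library.items() if matches(pattern, circuit)]


# Hypothetical decomposed library over the NAND2/NOT alphabet.
library = {
    "NAND2": ("NAND", [None, None]),
    "INV": ("NOT", [None]),
    "NAND3": ("NAND", [("NOT", [("NAND", [None, None])]), None]),
}
# Circuit tree: NAND(n1, NOT(NAND(a, b))), i.e. a three-input NAND.
circuit = ("NAND", ["n1", ("NOT", [("NAND", ["a", "b"])])])
found = matchings(circuit, library)
print(found)
```

The circuit root matches `NAND3` only after the pattern's subtrees are permuted, which is the asymmetric case described above.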
- mapping candidates are functions that take a net node as their argument, and return a list of pattern trees.
- An implementation of the circuit becomes a set of net nodes, for which one member of the candidate set is selected. The set of net nodes chosen will usually be smaller than the entire set of net nodes because some net nodes will be "buried" inside patterns that have interior net nodes.
- Figure 10 shows a circuit tree 1002 on the right, and the only two pattern trees 1004 and 1006 on the left that are needed to match every part of the circuit tree.
- Each of the possible matches of a pattern tree to a piece of the circuit tree net nodes is shown as a dashed line.
- the entire circuit can be "covered" with as few as four gates if a proper subset of six matchings shown is chosen.
- the circuit tree can be decomposed into four constituent gate types.
- Subsets of the six matchings shown could produce redundant or incomplete coverings.
- the challenge is to select a subset of the matchings that covers the circuit while minimizing delay and redundancy.
- Fig. 11 represents a covering selection method embodiment of the present invention, and is referred to herein by the general reference numeral 1100.
- a circuit is partitioned into trees.
- such trees are ordered using a topological sorting algorithm.
- a depth-first graph traversal algorithm can be used.
- a rule of ordering states that a tree "T" must precede all trees whose leaves "L" it drives, and it must succeed all trees that drive any leaves "L" of tree "T". This ordered list of trees is called "O".
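The ordering rule can be met with a depth-first traversal, sketched below in Python. The `drivers` dictionary (tree name to the set of trees that drive its leaves) and the tree names are hypothetical and do not reproduce the connectivity of Fig. 12; the graph is assumed to be acyclic, as combinational logic must be.

```python
from typing import Dict, List, Set


def order_trees(drivers: Dict[str, Set[str]]) -> List[str]:
    """Return a list in which every tree follows all trees that drive it."""
    order: List[str] = []
    done: Set[str] = set()

    def visit(tree: str) -> None:
        if tree in done:
            return
        done.add(tree)
        for d in sorted(drivers.get(tree, ())):
            visit(d)                 # place all driving trees first
        order.append(tree)           # post-order: drivers are already listed

    for tree in sorted(drivers):
        visit(tree)
    return order


# t4 and t5 sit at the input boundary; t1 is at the output boundary.
drivers = {"t1": {"t2", "t3"}, "t2": {"t4"}, "t3": {"t4", "t5"},
           "t4": set(), "t5": set()}
order = order_trees(drivers)
print(order)
```

Many orderings satisfy the rule; this traversal returns one of them, and sorting the names only makes the result deterministic.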
- a sweep forward in the ordered linear list is made while computing a set of Pareto-optimal load/arrival curves for each of a plurality of net nodes that match a technology-library element.
- a sweep backward in the ordered linear list is made while using the set of Pareto-optimal load/arrival curves for each of the net nodes and a capacitive load to select a best one of the technology-library elements with a shortest signal arrival time. Only net nodes that correspond to a gate input are considered, and any capacitive loads are always predetermined.
- Such trees "T" representing a circuit, and an ordered list, are represented in Fig. 12.
- a first tree 1201 in "O" is arbitrarily labeled "A".
- the other trees 1202-1209 are labeled "B" through "J".
- a topological ordering of the trees that satisfies the rule is A, H, G, J, B, C, F, E, D.
- trees-A, J, and H are at the input boundary; no other trees can drive any of trees-A, J, or H. So it is permissible under the rule to place tree-A at the head of the ordered list "O".
- the ordering of trees A, H, G, J, B, C, F, E, D allows each tree to be evaluated in an order where no tree will have a still-to-be-evaluated input tree.
- the signal arrival times at the leaves of tree-A can be postulated. They represent primary inputs of the combinational part of the circuit, and such signal arrival times can be obtained from the circuit's environment or from user constraints.
- Each leaf L of tree-A has no candidate matchings.
- a load/delay curve can be attached to each leaf of tree-A.
- Such load/delay curve is any load/delay curve associated with a postulated driver of a primary input that L represents.
- a root of tree-A is its output net. At least one candidate from the technology library must match, or the process must stop. An aggregate load/delay curve can be computed for the root of tree-A, e.g., using an algorithm with a recursive procedure.
- this algorithm scans the matching candidates of N, and for each candidate "c" computes arrival times at the inputs of "c" under the assumption that "c" will be selected. It can then calculate a load/arrival curve at N from the known characteristics of "c" and consequently the signal arrival times at the inputs of "c". The set of load/arrival curves is generated to represent each of all the possible matchings of cells to N.
- the aggregate load/delay curve representations can be made to be efficient. At any load there is a single optimal delay for a particular cell "c". All other members of "c" can do no better than to match the same arrival time for that load.
- the load/arrival curve at N is preferably a function that maps loads to optimal members of "c". This function can be generated by taking the piecewise minimum of the aggregate load/delay curve.
- the load range for which a gate-G2 is best is from zero load to the breakpoint. Best is defined as having the shortest signal arrival time (least delay).
- the load range for which a gate-G1 is best is from the breakpoint to infinity.
- An aggregate curve at N can be used to find both an arrival time and a best gate to map to N, given the capacitive load being driven.
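The piecewise minimum and its breakpoint can be illustrated with a short Python sketch. It assumes a linear load/delay model (intrinsic delay plus a slope times the capacitive load), which the text does not specify; the numbers attached to G1 and G2 are invented for illustration.

```python
from typing import Dict, Tuple

Curve = Tuple[float, float]  # (intrinsic delay, delay per unit load), assumed linear


def best_gate(curves: Dict[str, Curve], input_arrival: float,
              load: float) -> Tuple[float, str]:
    """Evaluate the piecewise minimum at one load: (arrival time, gate name)."""
    return min((input_arrival + d0 + k * load, name)
               for name, (d0, k) in curves.items())


def crossover_load(a: Curve, b: Curve) -> float:
    """Load at which two non-parallel linear curves give the same delay."""
    return (b[0] - a[0]) / (a[1] - b[1])


# G2 is fast when lightly loaded; G1 has more drive strength.
curves = {"G1": (2.0, 0.5), "G2": (1.0, 1.0)}
light = best_gate(curves, 0.0, 1.0)
heavy = best_gate(curves, 0.0, 4.0)
bp = crossover_load(curves["G2"], curves["G1"])
print(light, heavy, bp)
```

With these assumed figures, G2 is best from zero load up to the breakpoint at a load of 2.0, and G1 is best beyond it, so a single lookup at a known load returns both the arrival time and the gate to select.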
- the first tree to consider is one driven only by primary inputs. Any succeeding trees will either be driven by primary inputs, or by trees that have already been processed. So every tree-T to be considered in its order is guaranteed to be driven only by primary inputs or trees that have been previously considered. Every driver at the input of a tree-T will have a known Pareto-optimal load/arrival curve at the time tree-T is considered. A Pareto-optimal load/arrival curve is computed for each of the primary outputs of the circuit.
- a final step chooses a load for each output-O.
- load can be chosen arbitrarily, as long as it falls in the range of an aggregate load/arrival curve for output-O.
- the aggregate curve at output-O maps both the arrival time and the optimal gate G to drive output-O. Setting the load dictates which library gate-G should be selected to drive output-O.
- the tree whose root is output-O can be toured, beginning with the inputs of library gate-G.
- the input capacitances are known, because library gate-G is a known element in the technology library. This allows a simple look-up of the optimal gate to drive each of its inputs. This is repeated until there is a complete covering of tree-T.
- this covering process is applied to the ordered list of trees in reverse order, no tree will be considered unless all of the trees it drives have already been covered. The load on the root gate of that tree is considered. The covering process proceeds until all trees are covered. The technology mapping process is thus completed.
- the selection process sweeps an ordered list of trees. First forward and then backward. In the forward pass, a set of Pareto-optimal load/arrival curves are computed for each net node matching a library element. In the backward pass, the load/arrival curves and load values are used to select a best gate to drive each node considered. Only nodes that correspond to gate inputs are considered. Because of this, loads are always known so the process can be completed.
- Fig. 13 is an example that illustrates how control signals can dominate a critical-timing path. Control signals are shown in dashed lines, and critical-path signals are shown in heavier lines.
- a circuit 1300 includes a control FSM 1302. A critical signal propagation timing path will exist between a set of input ports 1304 and a set of output ports 1306. The circuit 1300 includes a number of multiplexers 1308-1312 controlled by the control FSM 1302, a number of adders 1314-1316, and a number of latches 1318-1321. A status signal 1322 is input by the control FSM 1302.
- the multiplexers 1308-1312 and the latches 1318-1321 depend on a set of control signals 1324-1333 output by the control FSM 1302.
- the ripple-through of signals from input to output will therefore be highly dependent on when the control signals 1324-1333 are issued and settle.
- the propagation delays of individual gates are relatively unimportant.
- a control-flow graph is a directed graph that describes the flow of control in a source-code HDL description. There exists a direct mapping from an HDL description to a unique control flow graph. There also exists a direct mapping from such unique control flow graph to a finite-state machine, e.g., in a "bubble graph" representation. A direct mapping can also be made from the unique control-flow graph to a "one-hot" FSM circuit. A control FSM can therefore be generated in two steps that are independent of any later-constructed schedule. The point is, control FSM timing can be determined during scheduling.
- a control-flow graph-G includes nodes "V" connected by directed arcs "E".
- one node V in a control-flow graph-G is labeled "reset".
- Such node represents an initial point in control-flow graph-G.
- a "reset" node can have no in-arcs and only one out-arc. Any number of the other nodes in control-flow graph-G can be labeled as state nodes, or not labeled. These nodes must have at least one in-arc and at least one out-arc.
- arcs can be labeled with conditions and actions. These are parse trees, or lists of parse trees, and are analogous to the conditions and actions of an FSM.
- the condition labels reveal which way control branches will go, e.g., "if".
- the action labels describe the operations that will take place if the flow of control goes along the arc labeled with the actions.
- "Join nodes" are nodes with more than one in-arc, e.g., the "if" node of Fig. 15.
- "Fork nodes" are nodes with more than one out-arc, e.g., the "fi" node of Fig. 15.
- Other names or labels can be used that correspond to common programming constructs, e.g., "begin", "loop", and "end".
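The structural rules above (one reset node with no in-arcs and a single out-arc, every other node with at least one in-arc and one out-arc, join and fork nodes defined by arc counts) can be captured in a small data structure. This is a minimal sketch; the class and method names are hypothetical, and strings stand in for parse trees.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Arc:
    src: str
    dst: str
    condition: Optional[str] = None              # stands in for a parse tree
    actions: List[str] = field(default_factory=list)


@dataclass
class ControlFlowGraph:
    nodes: dict = field(default_factory=dict)    # node name -> label or None
    arcs: List[Arc] = field(default_factory=list)

    def in_arcs(self, node):
        return [a for a in self.arcs if a.dst == node]

    def out_arcs(self, node):
        return [a for a in self.arcs if a.src == node]

    def is_join(self, node):
        return len(self.in_arcs(node)) > 1

    def is_fork(self, node):
        return len(self.out_arcs(node)) > 1

    def check(self):
        """Return a list of violations of the structural rules."""
        problems = []
        resets = [n for n, label in self.nodes.items() if label == "reset"]
        if len(resets) != 1:
            problems.append("exactly one reset node expected")
        for node, label in self.nodes.items():
            n_in, n_out = len(self.in_arcs(node)), len(self.out_arcs(node))
            if label == "reset":
                if n_in != 0 or n_out != 1:
                    problems.append(f"{node}: reset needs 0 in-arcs, 1 out-arc")
            elif n_in < 1 or n_out < 1:
                problems.append(f"{node}: needs an in-arc and an out-arc")
        return problems


# A reset node feeding a join node that carries a trivial self-loop.
g = ControlFlowGraph(
    nodes={"reset": "reset", "loop": None},
    arcs=[Arc("reset", "loop"), Arc("loop", "loop", actions=["statement"])],
)
violations = g.check()
```

Here `loop` is a join node (two in-arcs) but not a fork node (one out-arc).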
- a control-flow graph-G is typically constructed from a parsed HDL text in a step-by- step reduction of the parse tree. Particular parse tree structures are recognized, and then corresponding subgraphs of G are constructed.
- Verilog is used herein as the example HDL, together with Verilog construct names; corresponding mappings can be made to VHDL and other imperative simulation-based HDLs that allow edge events.
- In Fig. 14, the translation of a single process from a parse tree P 1400 to a control-flow graph-G 1402 begins with a simple graph 1404 having a reset node 1406 and a join node 1408 with a trivial self-loop 1410.
- a process is created using the "always" keyword, followed by a simple or composite "statement".
- the word "statement" is temporarily annotated onto the self-loop arc 1410 as its action, or operation. There is no branch, so there is no "condition" label.
- the simple graph-G 1404 can be transformed into a more-elaborate control-flow graph-G 1402 by applying a procedure to statements annotated on the arcs.
- an arc that has a statement is replaced with two or more arcs and one or more nodes.
- the new arcs are then decorated with simpler statements and/or conditions.
- the new nodes may also be similarly labeled.
- Such procedure continues recursively until no more decomposable statements remain.
- the statements and conditions labeled onto the new arcs are particular subtrees of the statement's parse tree.
- Verilog has the Backus-Naur form (BNF) syntactic definition
- a Verilog sequential block "begin ... end" statement can be transformed into a control-flow graph arc-A between a source node-S and a sink node-T.
- the arc-A is disconnected from sink node-T, and the sequential block parse tree-P is removed from arc-A 1412.
- Two new nodes 1414 and 1416 are constructed, "begin" and "end".
- the arrow-end of arc-A 1412 is connected to "begin" node 1414.
- a new arc-B 1418 is constructed, and its feather-end is connected to the "begin" node 1414, and its arrow end to the "end" node 1416.
- a new arc-C 1420 is constructed, and the feather end is connected to "end" node 1416, and its arrow end to a "loop" node 1422. All of the statements of the sequential block (e.g., the subtrees of P) are attached to arc-B 1418 as an ordered list of parse trees 1424.
- the name of the block must be saved in a table that maps the name to the node "end". This node may become a jump destination of a Verilog "disable" statement.
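The sequential-block reduction described above can be sketched as follows. Arcs are plain dictionaries and the parse tree is a hypothetical `("begin", name, statements)` tuple; the node-naming counter and the choice to reconnect arc-C to the original sink are assumptions made for illustration.

```python
import itertools

_block_ids = itertools.count(1)  # hypothetical source of fresh node names


def reduce_sequential_block(arcs, arc_a, end_table):
    """Reduce arc_a, whose action is ("begin", name, statements)."""
    _tag, name, statements = arc_a["action"]
    sink = arc_a["dst"]                       # original sink node T
    n = next(_block_ids)
    begin, end = f"begin{n}", f"end{n}"
    arc_a["dst"] = begin                      # arrow end of A goes to "begin"
    arc_a["action"] = None                    # block parse tree leaves A
    # Arc B carries the block's statements as an ordered list.
    arcs.append({"src": begin, "dst": end, "action": list(statements)})
    # Arc C leaves "end"; reconnecting it to T is an assumption.
    arcs.append({"src": end, "dst": sink, "action": None})
    if name is not None:
        end_table[name] = end                 # possible "disable" target
    return begin, end


arcs = [
    {"src": "reset", "dst": "loop", "action": None},
    {"src": "loop", "dst": "loop", "action": ("begin", "blk", ["s1", "s2"])},
]
end_table = {}
begin, end = reduce_sequential_block(arcs, arcs[1], end_table)
```

Each statement left on arc-B would be reduced in turn by the same kind of rewriting until nothing decomposable remains.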
- Verilog conditionals "if...else" and "case...endcase" are handled roughly the same.
- a Verilog "if...else" statement starts with keyword "if", followed by an expression E, a statement S1 to be executed if the condition is true, and an optional keyword "else" and its statement S2 to be executed if the condition is false.
- An example of an "if" statement with all the options, and its control flow graph reduction, is shown in Fig. 15.
- the "if" statement is reduced by introducing a fork node 1506 and a join node 1508, and arcs 1510 and 1512 connecting them, as well as a new arc 1514 that connects the join to T.
- the condition "cond" and its Boolean negation "!cond" are annotated onto the conditional arcs 1510 and 1512. These signify the conditions under which each branch will be taken. There might not be a statement associated with either the true or the false branch, in which case nothing is annotated onto the arc.
- the statements of the true (S1) and false (S2) branches are annotated onto the true and false arcs 1510 and 1512, respectively. Case statements are handled in a similar way.
- the difference is that the conditions may include a default, and the number of branches can be greater than two. The default is the logical negation of the logical sum of all of the other conditions.
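A minimal sketch of the conditional reduction follows. It assumes a hypothetical `("if", cond, s_true, s_false)` tuple as the parse tree and uses strings for conditions; the function names and the `n` suffix for node names are illustrative only.

```python
def reduce_if(arcs, arc_a, n=0):
    """Replace arc_a, carrying ("if", cond, s_true, s_false), with a
    fork node, a join node, two conditional arcs, and an exit arc."""
    _tag, cond, s_true, s_false = arc_a["action"]
    sink = arc_a["dst"]
    fork, join = f"fork{n}", f"join{n}"
    arc_a["dst"], arc_a["action"] = fork, None
    arcs.extend([
        # True branch: condition plus S1 (None if there is no statement).
        {"src": fork, "dst": join, "cond": cond, "action": s_true},
        # False branch: negated condition plus S2 (None without "else").
        {"src": fork, "dst": join, "cond": f"!({cond})", "action": s_false},
        # New arc connecting the join node to the original sink T.
        {"src": join, "dst": sink, "cond": None, "action": None},
    ])
    return fork, join


def default_condition(conds):
    """Case default: negation of the logical sum of the other conditions."""
    return "!(" + " || ".join(f"({c})" for c in conds) + ")"


arcs = [{"src": "S", "dst": "T", "cond": None,
         "action": ("if", "a > b", "x = 1", None)}]
fork, join = reduce_if(arcs, arcs[0])
default = default_condition(["sel == 0", "sel == 1"])
```

In this example the false arc carries only the negated condition, since the statement has no "else" part.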
- Loops in Verilog can take any of several forms. These are shown in Figs. 16A-16D, along with corresponding control flow graph reductions. Additional parse trees may need to be constructed to make the loops operate correctly.
- a new variable, an iteration counter, is introduced. Its value must be initialized before entering the loop, and it is incremented each time the loop executes. In a "repeat", "while", or "for" loop, a condition must be attached to each of the two out-arcs of the node labeled "Iter".
- Verilog "disable" is, in effect, a jump to the end node of a labeled begin-end block. If a disable is annotated onto an arc A going from node S to node T, the disable is reduced by disconnecting A from T.
- the end node E corresponding to the block being disabled is looked up in a table that maps block names to corresponding end nodes. This table is constructed during the reduction of the labeled block. A's arrow end is connected to E.
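The disable reduction amounts to a table lookup followed by redirecting one arc. The sketch below assumes a hypothetical `("disable", block_name)` tuple as the annotation and a dictionary as the name-to-end-node table; clearing the annotation afterwards is an assumption, not something the text states.

```python
def reduce_disable(arc_a, end_table):
    """Redirect arc_a, whose action is ("disable", block_name)."""
    _tag, block_name = arc_a["action"]
    arc_a["dst"] = end_table[block_name]   # arrow end of A now points at E
    arc_a["action"] = None                 # assumption: the jump is consumed
    return arc_a


end_table = {"blk": "end1"}                # filled in when "blk" was reduced
arc = {"src": "S", "dst": "T", "action": ("disable", "blk")}
reduce_disable(arc, end_table)
```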
- An event-control statement of the form "@(posedge clock)" is reduced by adding a single node and labeling it a state. Statements that have no direct effect on flow of control are not reduced, and are annotated onto their respective arcs.
- An example of a simple Verilog HDL text 1702 suitable for high-level synthesis is shown at the left of Fig.
- a fully reduced control flow graph 1704 that corresponds to the text is shown on the right. Once the graph 1704 has been fully constructed, some basic pruning is needed so the efficiency of later steps can be improved without changing the semantics of the graph. Any nodes and arcs that are unreachable from the reset node can be removed. Sets of arcs resulting from branches and having no further graph structure can be collapsed together, and conditional parse trees can be re-annotated onto the control flow graph. Graph structures include loops, states, and disables. Alternatively, the conditionals from which these sprang can be detected as having no effect on the control flow graph other than the creation of surplus branches. In such case the reduction is simply not applied, and the conditionals are annotated as they stand.
- Simple nodes can be removed and their in-arcs and out-arcs merged, e.g., nodes with one in-arc and one out-arc that are not marked as states.
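Two of the pruning steps, removing what is unreachable from the reset node and merging through simple nodes, can be sketched as below. The representation (a dictionary of node name to an is-state flag, arcs as dictionaries) and the function name `prune` are assumptions; collapsing redundant branch arcs is not covered.

```python
def prune(nodes, arcs, reset="reset"):
    """nodes: name -> is_state flag; arcs: dicts with src, dst, actions."""
    # Step 1: drop nodes and arcs that are unreachable from reset.
    reachable, stack = {reset}, [reset]
    while stack:
        n = stack.pop()
        for a in arcs:
            if a["src"] == n and a["dst"] not in reachable:
                reachable.add(a["dst"])
                stack.append(a["dst"])
    nodes = {n: s for n, s in nodes.items() if n in reachable}
    arcs = [a for a in arcs if a["src"] in reachable]

    # Step 2: merge the in-arc and out-arc of each simple node, i.e. a
    # non-state node with exactly one in-arc and one out-arc.
    changed = True
    while changed:
        changed = False
        for n, is_state in list(nodes.items()):
            ins = [a for a in arcs if a["dst"] == n]
            outs = [a for a in arcs if a["src"] == n]
            if n == reset or is_state or len(ins) != 1 or len(outs) != 1:
                continue
            if ins[0] is outs[0]:
                continue                      # lone self-loop, leave alone
            merged = {"src": ins[0]["src"], "dst": outs[0]["dst"],
                      "cond": ins[0].get("cond"),
                      "actions": ins[0]["actions"] + outs[0]["actions"]}
            arcs = [a for a in arcs if a is not ins[0] and a is not outs[0]]
            arcs.append(merged)
            del nodes[n]
            changed = True
    return nodes, arcs


nodes = {"reset": False, "a": False, "s1": True, "b": False, "dead": False}
arcs = [
    {"src": "reset", "dst": "a", "actions": []},
    {"src": "a", "dst": "s1", "actions": ["x = 0"]},
    {"src": "s1", "dst": "b", "actions": ["y = 1"]},
    {"src": "b", "dst": "s1", "actions": ["z = 2"]},
    {"src": "dead", "dst": "s1", "actions": []},
]
pruned_nodes, pruned_arcs = prune(nodes, arcs)
```

Actions are concatenated in execution order, so the merged arcs describe the same behavior as the original chain.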
- a one-hot FSM has its states coded with a unary code. All states are represented by binary numbers having all but one bit set false. Examples of a one-hot code include 0001, 0010, 0100, 1000. All but one of the bits are zero and the single "hot bit" is one. Such code can be inverted without loss of the one-hot property, e.g., 1110, 1101, 1011, 0111.
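The one-hot property and its inverted form can be checked with a few lines of code. The helper names below are hypothetical; codes are represented as bit strings.

```python
def one_hot_codes(n_states):
    """Return the n one-hot codes of width n as bit strings."""
    return [format(1 << i, f"0{n_states}b") for i in range(n_states)]


def is_one_hot(code, hot="1"):
    """True when exactly one bit of the code equals the hot value."""
    return set(code) <= {"0", "1"} and code.count(hot) == 1


def invert(code):
    """Flip every bit of the code."""
    return "".join("1" if bit == "0" else "0" for bit in code)


codes = one_hot_codes(4)
inverted = [invert(c) for c in codes]
```

After inversion the single distinguished bit is a zero, so the check is made with `hot="0"`.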
- a one-hot FSM is extracted from a control flow graph by noting that each state node of the control flow graph can be mapped in a one-to-one way onto a single state flip-flop.
- a one-hot FSM is constructed with as many flip-flops as there are state nodes in the control flow graph. Each flip-flop "F" is assigned to one state node. When F's output is 1, we will infer that we are in the corresponding state.
- a table MAP is constructed which maps arcs of the control flow graph to output ports of the FSM, and is a one-to-one mapping. The FSM is constructed with as many output ports as there are arcs in the control flow graph.
- the table MAP functions by mapping arcs onto output ports of the FSM.
- a location, MAP(A) maps a control flow graph arc A onto an output pin of the FSM.
- An inverse function, PAM is used to map an output pin of the FSM to a corresponding control flow graph arc.
- a function FLOP maps a state node to a flip-flop.
- Fig. 18 helps illustrate a procedure for building the function MAP.
- a control flow graph 1802 is shown after mapping a reset arc to an FSM 1804. The out-arc of a reset node is assigned to an otherwise unassigned output pin P. An input pin named "reset" is constructed and connected directly to P.
- MAP(A) is set to P. All the state nodes of the control flow graph 1802 are considered next.
- Node N is assigned to an unassigned flip-flop F.
- MAP(C) is assigned to the D-pin of F.
- MAP(D) is assigned to the Q-pin of F.
- the FLOP(N) is set to F. This is repeated until all the state nodes are assigned. All in-arcs and out-arcs of state nodes and the reset are then in MAP.
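The tables MAP, PAM, and FLOP can be sketched as dictionaries. In this illustration arcs are a hypothetical mapping from arc name to a `(source, sink)` pair, pins and flip-flops are named by made-up strings such as `out0` and `F0`, and each state node is assumed to have exactly one in-arc and one out-arc.

```python
def build_tables(arcs, state_nodes, reset="reset"):
    """arcs: arc name -> (src, dst). Returns MAP, PAM, FLOP and wiring."""
    MAP = {arc: f"out{i}" for i, arc in enumerate(arcs)}   # one pin per arc
    PAM = {pin: arc for arc, pin in MAP.items()}           # inverse of MAP
    FLOP = {}
    wires = []                                             # (driver, load)
    for name, (src, _dst) in arcs.items():
        if src == reset:
            wires.append(("reset", MAP[name]))             # reset input -> P
    for i, node in enumerate(state_nodes):
        FLOP[node] = f"F{i}"
        for name, (src, dst) in arcs.items():
            if dst == node:
                wires.append((MAP[name], f"F{i}.D"))       # in-arc to D-pin
            if src == node:
                wires.append((f"F{i}.Q", MAP[name]))       # Q-pin to out-arc
    return MAP, PAM, FLOP, wires


arcs = {"A": ("reset", "j"), "B": ("j", "s0"),
        "C": ("s0", "s1"), "D": ("s1", "j")}
MAP, PAM, FLOP, wires = build_tables(arcs, ["s0", "s1"])
```

Arc B still needs a driver at this point; that is the job of the combinational circuit built in the next step.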
- the next step is to look at the arcs going into state nodes. Let C be such an arc.
- the MAP(C) is connected to the D-pin of a state flip-flop. A circuit is constructed that is driven by primary inputs and state flip-flops, and which drives MAP(C). A recursive procedure like that in Table I will build such a circuit.
- Figs. 19A-19D show the various piecewise constructions by which a complete one-hot FSM can be constructed by the procedure of Table I.
- the OR-gate and the flip-flop have been named after the nodes with which they are associated.
- the AND-gates are named after the arcs with which they are associated.
- the conditions associated with the branch arcs A and B are treated as primary inputs cond(A) and cond(B). All branch conditionals are assumed to be computed elsewhere.
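Table I is not reproduced in this text, so the following is not that procedure. It is a minimal sketch of one recursive evaluation that is consistent with the description: a state node contributes its flip-flop output, any other node acts as an OR over its in-arcs, and each arc ANDs its source with a branch condition treated as a primary input. All names are hypothetical, and every cycle in the graph is assumed to pass through a state node so that the recursion terminates.

```python
def arc_signal(arc, arcs, state, conds, reset=False):
    """Value driven onto the output pin of `arc` in the current cycle."""
    src, _dst = arcs[arc]
    if src == "reset":
        return reset                          # reset input wired to the pin
    if src in state:
        active = state[src]                   # Q output of the state flip-flop
    else:
        # OR-gate named after the node: any active in-arc activates it.
        active = any(arc_signal(a, arcs, state, conds, reset)
                     for a, (_s, d) in arcs.items() if d == src)
    # AND-gate named after the arc: gate with the branch condition.
    return active and conds.get(arc, True)


def step(arcs, state, conds, reset=False):
    """Next flip-flop values: D-pin of each state is the OR of its in-arcs."""
    return {n: any(arc_signal(a, arcs, state, conds, reset)
                   for a, (_s, d) in arcs.items() if d == n)
            for n in state}


arcs = {"R": ("reset", "s0"), "A": ("s0", "fork"), "T": ("fork", "s1"),
        "F": ("fork", "s0"), "B": ("s1", "s0")}
s_reset = step(arcs, {"s0": False, "s1": False}, {}, reset=True)
s_taken = step(arcs, s_reset, {"T": True, "F": False})
s_back = step(arcs, s_taken, {"T": True, "F": False})
```

In this example exactly one flip-flop is set after every step, which is the one-hot property the construction is meant to preserve.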
- Fig. 20 shows a Verilog text sample 2002, along with its corresponding control flow graph 2004 and one-hot FSM 2006.
- the redundant outputs associated with arcs connected by simple, non-state, non-fork, non-join, non-reset, nodes have been collapsed. Some control outputs have been removed for clarity.
- Winslett, "A Prescriptive Formal Model for Data-Path Hardware," IEEE Transactions on CAD, Feb. 1992, pages 158-184, describes a representation for datapath and control hardware that is similar in structure and function to the control flow graph and which can be used for register inferencing, scheduling, and allocation, a necessary subset of the representation's main capability, which is to supply information necessary for correcting all of the above after a user or other agency has "broken" the design by manual editing; D. Knapp, "Synthesis from Partial Structure," in Design Methodologies for VLSI and Computer Architecture, D. A. Edwards, editor.
Applications Claiming Priority (15)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US574693 | 1990-08-29 | ||
US579825 | 1990-09-10 | ||
US13612699P | 1999-05-26 | 1999-05-26 | |
US13590299P | 1999-05-26 | 1999-05-26 | |
US13612799P | 1999-05-26 | 1999-05-26 | |
US136126P | 1999-05-26 | ||
US135902P | 1999-05-26 | ||
US136127P | 1999-05-26 | ||
US09/574,693 US6470486B1 (en) | 1999-05-26 | 2000-05-17 | Method for delay-optimizing technology mapping of digital logic |
US09/574,572 US6516453B1 (en) | 1999-05-26 | 2000-05-17 | Method for timing analysis during automatic scheduling of operations in the high-level synthesis of digital systems |
US574572 | 2000-05-17 | ||
US57742600A | 2000-05-22 | 2000-05-22 | |
US577426 | 2000-05-22 | ||
US09/579,825 US6782511B1 (en) | 1999-05-26 | 2000-05-25 | Behavioral-synthesis electronic design automation tool business-to-business application service provider |
PCT/US2000/014617 WO2000072185A2 (en) | 1999-05-26 | 2000-05-26 | Behavioral-synthesis electronic design automation tool and business-to-business application service provider |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1248989A2 true EP1248989A2 (en) | 2002-10-16 |
Family
ID=27568902
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP00936347A Withdrawn EP1248989A2 (en) | 1999-05-26 | 2000-05-26 | Behavioral-synthesis electronic design automation tool and business-to-business application service provider |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP1248989A2 (en) |
JP (1) | JP4495865B2 (en) |
CN (1) | CN1408092A (en) |
AU (1) | AU5167100A (en) |
WO (1) | WO2000072185A2 (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6961773B2 (en) | 2001-01-19 | 2005-11-01 | Esoft, Inc. | System and method for managing application service providers |
MXPA03007375A (en) * | 2001-02-16 | 2004-09-14 | United Parcel Service Inc | Systems for selectively enabling and disabling access to software applications over a network and methods for using same. |
EP1582959B1 (en) * | 2001-02-16 | 2007-07-18 | United Parcel Service Of America, Inc. | Systems for selectively enabling and disabling access to software applications over a network and methods for using same |
US7734715B2 (en) * | 2001-03-01 | 2010-06-08 | Ricoh Company, Ltd. | System, computer program product and method for managing documents |
JP2003067453A (en) * | 2001-08-27 | 2003-03-07 | Nec Corp | Method for promoting design |
US10489212B2 (en) | 2013-09-26 | 2019-11-26 | Synopsys, Inc. | Adaptive parallelization for multi-scale simulation |
US10417373B2 (en) | 2013-09-26 | 2019-09-17 | Synopsys, Inc. | Estimation of effective channel length for FinFETs and nano-wires |
US9881111B2 (en) | 2013-09-26 | 2018-01-30 | Synopsys, Inc. | Simulation scaling with DFT and non-DFT |
WO2015048437A1 (en) | 2013-09-26 | 2015-04-02 | Synopsys, Inc. | Mapping intermediate material properties to target properties to screen materials |
US10402520B2 (en) | 2013-09-26 | 2019-09-03 | Synopsys, Inc. | First principles design automation tool |
US10516725B2 (en) | 2013-09-26 | 2019-12-24 | Synopsys, Inc. | Characterizing target material properties based on properties of similar materials |
US10734097B2 (en) | 2015-10-30 | 2020-08-04 | Synopsys, Inc. | Atomic structure optimization |
US10078735B2 (en) | 2015-10-30 | 2018-09-18 | Synopsys, Inc. | Atomic structure optimization |
CN112199918B (en) * | 2020-10-20 | 2021-09-21 | 芯和半导体科技(上海)有限公司 | Method for reconstructing physical connection relation of general EDA model layout |
CN113158599B (en) * | 2021-04-14 | 2023-07-18 | 广州放芯科技有限公司 | Quantum informatics-based chip and chip-based EDA device |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS62202268A (en) * | 1986-02-28 | 1987-09-05 | Nec Corp | Circuit processor |
JPS6376065A (en) * | 1986-09-19 | 1988-04-06 | Nec Corp | Graphic structure data display system |
US5557531A (en) * | 1990-04-06 | 1996-09-17 | Lsi Logic Corporation | Method and system for creating and validating low level structural description of electronic design from higher level, behavior-oriented description, including estimating power dissipation of physical implementation |
US5787010A (en) * | 1992-04-02 | 1998-07-28 | Schaefer; Thomas J. | Enhanced dynamic programming method for technology mapping of combinational logic circuits |
US5544071A (en) * | 1993-12-29 | 1996-08-06 | Intel Corporation | Critical path prediction for design of circuits |
JPH08101861A (en) * | 1994-09-30 | 1996-04-16 | Toshiba Corp | Logic circuit synthesizing device |
US5535145A (en) * | 1995-02-03 | 1996-07-09 | International Business Machines Corporation | Delay model abstraction |
JP2856141B2 (en) * | 1996-04-01 | 1999-02-10 | 日本電気株式会社 | Delay information processing method and delay information processing apparatus |
GB2325996B (en) * | 1997-06-04 | 2002-06-05 | Lsi Logic Corp | Distributed computer aided design system and method |
JPH11282884A (en) * | 1998-03-30 | 1999-10-15 | Mitsubishi Electric Corp | Network cad system |
-
2000
- 2000-05-26 WO PCT/US2000/014617 patent/WO2000072185A2/en not_active Application Discontinuation
- 2000-05-26 EP EP00936347A patent/EP1248989A2/en not_active Withdrawn
- 2000-05-26 JP JP2000620508A patent/JP4495865B2/en not_active Expired - Fee Related
- 2000-05-26 CN CN00810690A patent/CN1408092A/en active Pending
- 2000-05-26 AU AU51671/00A patent/AU5167100A/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See references of WO0072185A2 * |
Also Published As
Publication number | Publication date |
---|---|
JP2003500745A (en) | 2003-01-07 |
CN1408092A (en) | 2003-04-02 |
AU5167100A (en) | 2000-12-12 |
WO2000072185A3 (en) | 2001-11-15 |
JP4495865B2 (en) | 2010-07-07 |
WO2000072185A2 (en) | 2000-11-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20020801 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
|
AX | Request for extension of the european patent |
Free format text: AL;LT;LV;MK;RO;SI |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: FERNANDES, PRADEEP Inventor name: BRAUNE, BERND Inventor name: KNAPP, DAVID Inventor name: SCHMIDT, HANS-JOACHIM Inventor name: FRANK, ELOF |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: BRAUNE, BERND Inventor name: FRANK, ELOF Inventor name: SCHMIDT, HANS-JOACHIM Inventor name: KNAPP, DAVID Inventor name: FERNANDES, PRADEEP |
|
17Q | First examination report despatched |
Effective date: 20040212 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20050809 |