US8266561B2 - Systems and techniques for developing high-speed standard cell libraries - Google Patents
Systems and techniques for developing high-speed standard cell libraries
- Publication number
- US8266561B2 (Application US11/941,286)
- Authority
- US
- United States
- Prior art keywords
- cell
- cells
- computer
- time delay
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
- G06F30/32—Circuit design at the digital level
- G06F30/33—Design verification, e.g. functional simulation or model checking
- G06F30/3308—Design verification, e.g. functional simulation or model checking using simulation
- G06F30/3312—Timing analysis
Definitions
- This invention relates to developing high-speed standard cell libraries. More particularly, this invention relates to systems and methods that can obtain a substantial speed improvement in the use of standard cell libraries.
- State-of-the-art design of integrated circuits includes specifying the functionality of the chip in a standard hardware description language such as Verilog (a language used to design and document electronic systems); synthesizing/mapping the circuit description into basic gates of a standard cell library using computer-aided design (CAD) tools such as Synopsys' DesignCompiler, produced by Synopsys, Inc. of Mountain View, Calif.; placing and routing the gate netlist using CAD tools such as Magma's BlastFusion, produced by Magma, Inc. of San Jose, Calif.; and finally verifying proper connectivity (layout versus schematic, LVS) and functionality of the circuit.
- FIG. 1 is a schematic diagram of one exemplary cell
- FIG. 2 is a flow diagram of a method according to the invention that can be used to optimize a cell library according to the invention
- FIG. 3 is a schematic diagram of a second exemplary cell
- FIG. 4 is a flow diagram of a method according to the invention that can be used to add extra drive strength to existing logic functions
- FIG. 5 shows a flow diagram of a method according to the invention that can be used to reduce timing slack in a circuit
- FIG. 6 shows a schematic diagram of one embodiment of an AND-OR-21 (two-to-one) gate
- FIG. 7 shows a schematic diagram of a second embodiment of an AND-OR-21 (two-to-one) gate
- FIG. 8 shows a flow diagram of a method according to the invention for determining which implementations of a cell can be useful to a cell library
- FIG. 9 shows a flow diagram of a method according to the invention of merging cells.
- FIG. 10 is a schematic diagram of an illustrative single or multi-chip module of this invention in a data processing system.
- Aspects described herein may be embodied as a method, a data processing system, or a computer program product. Accordingly, those aspects may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, such aspects may take the form of a computer program product stored by one or more computer-readable storage media having computer-readable program code, or instructions, embodied in or on the storage media. Any suitable computer-readable storage media may be utilized, including hard disks, CD-ROMs, optical storage devices, magnetic storage devices, and/or any combination thereof.
- Signals representing data or events as described herein may be transferred between a source and a destination in the form of electromagnetic waves traveling through signal-conducting media such as metal wires, optical fibers, and/or wireless transmission media (e.g., air and/or space).
- This patent application describes a set of original techniques that, when applied to a standard cell library, such as the Broadcom Standard Cell Library manufactured by Broadcom Corporation of Irvine, Calif., for use on Broadcom integrated circuits, may result in a speed improvement of more than 10%.
- One goal of a high-speed library is to minimize the delay of circuits synthesized using the high-speed library.
- A secondary goal is to achieve these speeds at minimal area and power cost. In other words, it would be desirable to have the smallest physical implementation of the logical function for a given speed.
- Standard cells are designed so that they can be tiled next to each other in any suitable combination. This architecture restriction can impose a certain design style on the cells.
- CMOS: Complementary Metal Oxide Semiconductor
- Physical design rules require a certain distance between N-diffusion and P-diffusion islands (the areas that form the N and P transistors, respectively). This distance is larger than the necessary distance between distinct N-diffusion regions or distinct P-diffusion regions (in 65 nanometer low-power integrated circuits, 0.32 micron for N to P versus 0.13 micron for N to N or P to P). In order to allow close tiling of these cells, it is customary to pick a certain N-diffusion/P-diffusion template for all cells of the standard cell library.
- Each individual cell may require a particular N-diffusion/P-diffusion area allowance, which might be quite different from the one for the entire library.
- One factor in library development is isolating and/or selecting a preferably small set of commonly-occurring logic functions that are directly implementable in a single stage of CMOS logic.
- The set of commonly-occurring logic functions can be determined experimentally by collecting statistics of cell usage on the critical paths of circuits. Typically, complex logic gates do not appear on the critical path of circuits.
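The statistics-gathering step described above can be sketched as a simple frequency count. This is only an illustrative sketch, not the patent's tooling: the function names, the representation of a critical path as a list of cell-type labels, and the example paths are all assumptions made here.

```python
from collections import Counter


def critical_path_cell_usage(critical_paths):
    """Count how often each cell type appears across a set of critical paths.

    `critical_paths` is a list of paths; each path is a list of cell-type
    labels (hypothetical names such as "NAND2" or "INV").
    """
    usage = Counter()
    for path in critical_paths:
        usage.update(path)
    return usage


def most_common_functions(critical_paths, k):
    """Return the k most frequently used cell types, most frequent first."""
    usage = critical_path_cell_usage(critical_paths)
    return [name for name, _count in usage.most_common(k)]


# Hypothetical critical paths, invented purely for illustration.
example_paths = [
    ["INV", "NAND2", "NAND2", "AOI21"],
    ["NAND2", "INV", "NAND3"],
    ["NAND2", "INV", "NAND2"],
]

top_two = most_common_functions(example_paths, 2)
```

In a real flow the paths would come from a timing report rather than a hand-written list; the cutoff `k` is a tuning choice the source does not specify.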
- A method according to the invention may provide a physical implementation that delivers a pre-established unity drive strength for the given height selected for the cell template.
- The method may also select an initial N-diffusion to P-diffusion area allowance, which in the case of the chosen cells could be one, i.e., the area allotted for the N-diffusion may be substantially equal to the area allotted for the P-diffusion.
- The initial allowance choice may only be a starting point for a selection technique according to the invention.
- The ultimate selection of area allotment between the N-diffusion and the P-diffusion may be independent of the initial selection, as will be explained in more detail below.
- A next step in the method according to the invention may be to generate a layout of the small set of commonly-occurring cells under the pre-established conditions set forth above, i.e., the drive strength for the given height and the area allotted for the N-diffusion relative to the area allotted for the P-diffusion.
- FIG. 1 shows NAND cell 100 having an exemplary total cell height of 2 microns.
- The optimum height for the N-diffusion area 102 can be 1.2 microns and the optimum height for the P-diffusion area 104 can be 0.8 microns.
- Semicircles 106 and 108, which represent notches in the cell that dictate where the P area is allocated with respect to the N area (P/N) and whose shape allows each cell to fit easily with an abutting cell, may be designed to have midpoints that are both 1.2 microns from the bottom edge of the cell and 0.8 microns from the top edge of the cell.
- FIG. 1 shows one exemplary cell. Other cells have different constraints with respect to the height of the N-diffusion area and the P-diffusion area.
- These cells may be extracted and parameterized, i.e., each cell is evaluated to determine which height is preferable for that cell, in order to obtain the resulting netlist.
- The netlist includes the most preferred heights for the N-diffusion and P-diffusion areas, and, consequently, the P/N allocations, for each of the cells.
- A set of “synthetic” libraries may be generated, where the N-diffusion to P-diffusion area allowance is varied, and, consequently, the P/N allocation is varied, while keeping the total cell height the same.
- These exemplary libraries change the P/N allocation while keeping the sum of the device heights constant.
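As a rough illustration of varying the allocation at constant total height, the sketch below enumerates candidate N/P height splits. The 2 micron total comes from the FIG. 1 example; the minimum device height, the step size, and the function name are assumptions introduced here, not values from the source.

```python
def synthetic_allocations(total_height_um, min_height_um, step_um):
    """Enumerate (n_height, p_height) splits that keep the total height fixed.

    Heights are in microns. Integer step counts are used internally so that
    repeated floating-point additions do not drift.
    """
    total_steps = round(total_height_um / step_um)
    min_steps = round(min_height_um / step_um)
    allocations = []
    for n_steps in range(min_steps, total_steps - min_steps + 1):
        n_height = round(n_steps * step_um, 6)
        p_height = round((total_steps - n_steps) * step_um, 6)
        allocations.append((n_height, p_height))
    return allocations


# 2 micron total height as in FIG. 1; the 0.6 micron minimum and the
# 0.2 micron step are hypothetical values chosen for illustration.
allocs = synthetic_allocations(2.0, 0.6, 0.2)
```

Each tuple would correspond to one synthetic library; the split of 1.2 and 0.8 microns from the FIG. 1 example is one of the generated candidates.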
- A large set of representative circuits, which may form a group referred to herein as a library benchmark, may then be taken and synthesized using the parameters set forth with respect to the various synthetic libraries.
- The average of the resulting circuits' speeds is determined and, in response to the determination, a delay number is attached to each of the synthetic libraries.
- The synthetic library that resulted in the highest average circuit speed may then be selected.
- With the corresponding N-diffusion to P-diffusion allowance, a layout of the corresponding library is physically regenerated. Through this process, the single P/N allocation with the highest average circuit speed for the entire library is established. Specifically, this P/N allocation method ensures that the most-used cells are functioning at close to their respective optimum speeds.
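The selection step can be sketched as follows. This is a minimal sketch under stated assumptions: the delay figures are invented, the function names are hypothetical, and the lowest mean delay is treated here as a stand-in for the "highest average circuit speed" in the text.

```python
from statistics import mean


def attach_delay_numbers(benchmark_delays):
    """Map each synthetic library to its mean delay over the benchmark circuits."""
    return {lib: mean(delays) for lib, delays in benchmark_delays.items()}


def select_fastest_library(benchmark_delays):
    """Pick the library whose delay number (mean delay) is smallest."""
    delay_numbers = attach_delay_numbers(benchmark_delays)
    return min(delay_numbers, key=delay_numbers.get)


# Keys are (N height, P height) in microns; values are per-circuit delays in
# nanoseconds. All numbers are hypothetical, for illustration only.
benchmark_delays = {
    (1.0, 1.0): [1.00, 1.20, 0.95],
    (1.2, 0.8): [0.92, 1.10, 0.90],
    (1.4, 0.6): [0.98, 1.25, 1.00],
}

delay_numbers = attach_delay_numbers(benchmark_delays)
best = select_fastest_library(benchmark_delays)
```

The layout of the whole library would then be regenerated with the allocation stored in `best`.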
- FIG. 2 shows a method according to the invention that can be used to optimize a cell library according to the invention.
- Step 210 shows selecting a preferably small set of commonly-occurring logic functions that are directly implementable in a single stage of CMOS logic.
- Step 220 shows obtaining a netlist of preferred area distributions for each of the small set of commonly-occurring logic functions.
- Step 230 shows selecting an initial N-diffusion to P-diffusion area allowance for a synthetic cell library.
- Step 240 shows using the initial area allowance to generate a library benchmark.
- Step 250 shows using the netlist to synthesize a set of cell libraries wherein the area of the N-diffusion to P-diffusion allowance is varied between the synthesized cell libraries.
- Step 260 shows testing the additional synthetic libraries and comparing their respective delays to the library benchmark.
- Step 270 shows attaching a delay number to each of the synthetic libraries based on the comparison.
- The cell libraries are ranked based on the respective delay numbers associated with each of the cell libraries.
- Step 280 shows determining the synthetic library that resulted in the highest average circuit speed, and, according to the corresponding N-diffusion to P-diffusion allowance, physically generating a layout of the corresponding library.
- In static CMOS circuits, both the rising and the falling transitions can become speed critical. Accordingly, it is important to optimize a cell in a library for minimal rise transition plus fall transition. This value may alternatively be characterized as the minimal average transition.
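Minimizing rise plus fall transition is the same as minimizing their average, since the two differ only by a factor of two. A small sketch, with hypothetical variant names and transition times in picoseconds:

```python
def average_transition(rise, fall):
    """Average of the rise and fall transition times (same unit as the inputs)."""
    return (rise + fall) / 2.0


def best_variant(variants):
    """Return the name of the variant with the smallest average transition.

    `variants` maps a variant name to a (rise, fall) pair.
    """
    return min(variants, key=lambda name: average_transition(*variants[name]))


# Hypothetical (rise, fall) transition times in picoseconds.
variants = {
    "variant_a": (40.0, 60.0),
    "variant_b": (48.0, 50.0),
    "variant_c": (58.0, 44.0),
}

chosen = best_variant(variants)
```

Note that `variant_a` has the fastest rise and `variant_c` the fastest fall, yet neither wins once both transitions are weighed equally.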
- FIG. 3 shows a cell that meets its speed minimum at distance x at 302 from the common allocation of the library.
- Side sections 304 may be appended to cell 300 such that, while cell 300 may be somewhat wider than otherwise required, cell 300 can still be adapted to fit with other cells using the optimal P/N allocation as determined using method 200 above.
- FIG. 4 shows that, for each logic function, a higher drive cell may be added (preferably in pre-established drive increments), as shown in step 410 .
- The naming convention for different drive strengths uses the letter X followed by a number “n”: Xn, where n denotes the relative drive strength.
- The drive strength of a gate that is not folded and occupies the entire cell template size is X2. If that gate is folded once, for example, then its drive strength is denoted X4. If the relative drive strength “n” is not an integer, say it is “1.7”, the letter P may be used instead of the period; thus the name would be X1P7.
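The naming rule above is mechanical enough to express directly. The helper names below are hypothetical; only the Xn form and the use of P in place of the period come from the text.

```python
def drive_strength_name(n):
    """Format a relative drive strength as Xn, with P replacing the decimal point.

    The "g" format drops a trailing ".0" (2.0 -> "2") and keeps up to six
    significant digits, which is an assumption made for this sketch.
    """
    text = f"{n:g}"
    return "X" + text.replace(".", "P")


def parse_drive_strength(name):
    """Inverse of drive_strength_name: "X1P7" -> 1.7."""
    if not name.startswith("X"):
        raise ValueError(f"not a drive-strength name: {name!r}")
    return float(name[1:].replace("P", "."))
```

For example, `drive_strength_name(1.7)` yields `X1P7`, and a gate folded once from X2 would be named by `drive_strength_name(4)`.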
- FIG. 5 shows a method of reducing timing slack in a circuit.
- Timing slack could be traded off, via high/low skewed gates, by speeding up the slower path (for example, by enlarging the slower path channel) while slowing down the faster one (e.g., by reducing the faster path channel) until there is no more slack to be taken away, as shown in step 520.
- In the Yn designation, n denotes the ratio between the P transistors and N transistors.
- The cell with the locally optimal P/N allocation may be considered just another cell with a particular Yn value.
- Two parameters have been obtained: one for the drive strength X and one for the P/N allocation Y. If all these cells were to be implemented in the standard cell library, the number of entries would be very large.
- Using a method according to the invention, most of the remaining slack of the circuit could be eliminated by adding different Y entries for only a handful of commonly used cells. Simulations have determined that the most common gates with different Y entries include the following: INVERTER gates, 2/3/4-input inverting NAND gates, and inverting AND-OR 21 gates.
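To see why restricting the extra Y entries matters, the arithmetic can be sketched as below. The counts of functions, X values, and Y values are hypothetical, and the formula assumes every function is offered at all X values with the library-wide Y, while only the selected functions receive the additional Y values.

```python
def full_library_size(num_functions, num_x, num_y):
    """Entries if every function is built at every X and every Y value."""
    return num_functions * num_x * num_y


def selective_library_size(num_functions, num_x, num_y, num_skewed_functions):
    """Entries if only a handful of functions get the additional Y values."""
    base = num_functions * num_x  # all functions at the library-wide Y
    extra = num_skewed_functions * num_x * (num_y - 1)  # extra Y values only
    return base + extra


# Hypothetical counts: 40 functions, 6 drive strengths, 5 P/N ratios,
# with extra Y entries for only 5 commonly used functions.
full = full_library_size(40, 6, 5)
selective = selective_library_size(40, 6, 5, 5)
```

With these made-up numbers the selective library is less than a third of the size of the full cross product.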
- Certain gates are symmetric in their respective inputs—i.e., the computed logic function remains the same independent of the input permutation.
- An AND-OR-21 (two-to-one) gate 600 as shown in FIG. 6 is symmetric in its inputs, such that it computes the same function even if the two inputs are interchanged.
- This property is not true for many circuits.
- The arrow 601 shows one discharge scenario of gate 600.
- Node 602 is pulled low when input signals (associated with transistor gates) 604 are high and at least one of the inputs at gates 606 or 608 is low.
- Node 602 is pulled high when inputs 604 are low and one of the inputs 606 or 608 is low as well.
- This cell topology is optimal for the case when input signals 604 are the critical signals coming into the gate. This is because input signals 604 are physically closest to the output node 602 for both rising and falling transitions. Furthermore, under the assumption that inputs 604 receive the critical signal, the widths of the N-transistors controlled by gates 606 and 608 could be shrunk to reduce the capacitance on output node 602 and thus further increase the operational speed of this gate.
- FIG. 7 shows another discharge scenario 701 based on a different implementation of an AND-OR-21 (two-to-one) gate 700.
- Input signal 708 is the critical one (note that this gate is symmetric in 706 and 708, i.e., these gates can be interchanged without altering the value of the logic function; however, this is not true if one were to interchange 704 with any other input).
- The second topology is optimal for the case when 708 is late, as now it is 708 that is physically closest to output 702.
- The N-transistor controlled by 704 can be shrunk to reduce the capacitance on the output 702.
- Arrow 701 shows an optimal discharge path for circuit 700 .
- Some other examples that can take advantage of different variations of cells include, but are not limited to: inverting AND or inverting OR with 2/3/4 inputs having one of the inputs inverted, inverting OR-AND 21 gate, inverting AND-OR 31, and inverting OR-AND 31.
- FIG. 8 shows a flow diagram of a method according to the invention.
- Step 810 shows selecting a cell for analysis to determine which implementations of the cell may be useful, for example, to a cell library.
- Step 820 shows querying whether a predetermined cell is symmetric in its inputs.
- Step 830 shows that if the cell is not symmetric, determining which of the implementations of the cell may be useful for inclusion in a cell library.
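The symmetry query of step 820 can be illustrated with an exhaustive truth-table check. This sketch assumes the conventional inverting AND-OR-21 function, NOT((a AND b) OR c); the mapping of a, b, and c to the reference numerals in FIGS. 6 and 7 is not asserted here, and all function names are hypothetical.

```python
from itertools import product


def is_symmetric_in(func, arity, i, j):
    """True if swapping inputs i and j never changes the output of func."""
    for bits in product((False, True), repeat=arity):
        swapped = list(bits)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        if func(*bits) != func(*swapped):
            return False
    return True


def is_fully_symmetric(func, arity):
    """True if func is invariant under every pairwise swap of its inputs.

    Invariance under all pairwise swaps implies invariance under every
    permutation, because swaps generate all permutations.
    """
    return all(
        is_symmetric_in(func, arity, i, j)
        for i in range(arity)
        for j in range(i + 1, arity)
    )


def nand2(a, b):
    return not (a and b)


def aoi21(a, b, c):
    # Conventional inverting AND-OR-21: NOT((a AND b) OR c). Assumed here.
    return not ((a and b) or c)
```

Under this definition the two AND inputs are interchangeable while the OR input is not, so a cell of this kind would reach step 830 with more than one implementation worth evaluating.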
- The number of combinations in which cells could be combined is quite large, considering that each cell already has 3 parameters X, Y and Z, as described above.
- A stage ratio of between 2 and 4 between two consecutive single-stage cells works well in practice.
- The two cells are preferably not combined sub-optimally into one merged cell. Rather, the two cells are preferably re-optimized such that the delay through the merged cell when the input is rising is the same as the delay through the cell when the input is falling. This re-optimization often adds an extra 10% speed improvement to the compound cell, when compared to the suboptimal juxtaposition of those cells.
- FIG. 9 shows a method according to the invention of merging cells.
- Step 910 shows collecting and analyzing cell statistics on netlists synthesized preferably using only single-stage cells.
- Step 920 shows querying whether any of the single cells can be advantageously merged into a merged cell.
- The method preferably obtains a list of cells that can be merged into a single merged cell, at step 930. This list of cells may be based on the statistics obtained in step 910. From the list of cells that are to be merged, the method may re-optimize such merged cells such that the delay through the merged cell is substantially the same when the input is rising as when the input is falling.
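One way to picture steps 910 through 930 is to count how often pairs of single-stage cells appear back to back and keep the frequent pairs as merge candidates. The source does not say which statistics are used, so adjacency frequency, the threshold, the function names, and the example paths are all assumptions of this sketch.

```python
from collections import Counter


def adjacent_pair_counts(netlist_paths):
    """Count consecutive (driver, load) cell-type pairs along each path."""
    pairs = Counter()
    for path in netlist_paths:
        pairs.update(zip(path, path[1:]))
    return pairs


def merge_candidates(netlist_paths, min_count):
    """Return pairs seen at least min_count times, most frequent first."""
    counts = adjacent_pair_counts(netlist_paths)
    return [pair for pair, count in counts.most_common() if count >= min_count]


# Hypothetical paths through netlists built from single-stage cells only.
paths = [
    ["NAND2", "INV", "NAND2", "INV"],
    ["NOR2", "INV", "NAND2"],
    ["NAND2", "INV"],
]

candidates = merge_candidates(paths, 2)
```

Each candidate pair would then be re-optimized as a single merged cell so that its rising-input and falling-input delays are substantially equal, rather than simply placed side by side.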
- FIG. 10 shows a single or multi-chip module 1002 according to the invention, which can be one or more integrated circuits, in an illustrative data processing system 1000 according to the invention.
- Data processing system 1000 may include one or more of the following components: I/O circuitry 1004 , peripheral devices 1006 , a processor 1008 and memory 1010 . These components are coupled together by a system bus or other interconnections 1012 and are populated on a circuit board 1020 which is contained in an end-user system 1030 .
- System 1000 may be configured for use with high-speed standard cell libraries according to the invention. It should be noted that system 1000 is only exemplary, and that the true scope and spirit of the invention should be indicated by the following claims.
- Program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Design And Manufacture Of Integrated Circuits (AREA)
Claims (17)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/941,286 US8266561B2 (en) | 2007-09-21 | 2007-11-16 | Systems and techniques for developing high-speed standard cell libraries |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US97411607P | 2007-09-21 | 2007-09-21 | |
US11/941,286 US8266561B2 (en) | 2007-09-21 | 2007-11-16 | Systems and techniques for developing high-speed standard cell libraries |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090083691A1 (en) | 2009-03-26 |
US8266561B2 (en) | 2012-09-11 |
Family
ID=40473065
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/941,286 Expired - Fee Related US8266561B2 (en) | 2007-09-21 | 2007-11-16 | Systems and techniques for developing high-speed standard cell libraries |
Country Status (1)
Country | Link |
---|---|
US (1) | US8266561B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8762904B2 (en) | 2012-03-28 | 2014-06-24 | Synopsys, Inc. | Optimizing logic synthesis for environmental insensitivity |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8015517B1 (en) * | 2008-06-06 | 2011-09-06 | Nangate A/S | Library sizing |
US8239799B2 (en) * | 2010-01-07 | 2012-08-07 | Freescale Semiconductor, Inc. | Placing filler cells in device design based on designation of sensitive feature in standard cell |
US9438237B1 (en) * | 2013-04-19 | 2016-09-06 | Pdf Solutions, Inc. | High-yielding standard cell library and circuits made therefrom |
TW202240455A (en) * | 2020-11-30 | 2022-10-16 | 美商新思科技股份有限公司 | Poly-bit cells |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6498515B2 (en) * | 1999-05-19 | 2002-12-24 | Matsushita Electric Industrial Co., Ltd. | Semiconductor integrated circuit and method for designing the same |
US20060215457A1 (en) * | 2005-03-25 | 2006-09-28 | Fujitsu Limited | Method of generating cell library data for large scale integrated circuits |
Also Published As
Publication number | Publication date |
---|---|
US20090083691A1 (en) | 2009-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7523430B1 (en) | Programmable logic device design tool with simultaneous switching noise awareness | |
US7225423B2 (en) | Method for automated design of integrated circuits with targeted quality objectives using dynamically generated building blocks | |
US6721926B2 (en) | Method and apparatus for improving digital circuit design | |
US10318686B2 (en) | Methods for reducing delay on integrated circuits by identifying candidate placement locations in a leveled graph | |
US7739098B2 (en) | System and method for providing distributed static timing analysis with merged results | |
US8860458B2 (en) | Integrated circuits with logic regions having input and output bypass paths for accessing registers | |
US7849422B2 (en) | Efficient cell swapping system for leakage power reduction in a multi-threshold voltage process | |
US8316339B2 (en) | Zone-based leakage power optimization | |
Lu et al. | Flip-flop and repeater insertion for early interconnect planning | |
US8266561B2 (en) | Systems and techniques for developing high-speed standard cell libraries | |
US9811621B2 (en) | Implementing integrated circuit designs using depopulation and repopulation operations | |
JP2004178285A (en) | Parasitic element extraction device | |
US8667435B1 (en) | Function symmetry-based optimization for physical synthesis of programmable integrated circuits | |
Chen et al. | Simultaneous timing driven clustering and placement for FPGAs | |
US7133819B1 (en) | Method for adaptive critical path delay estimation during timing-driven placement for hierarchical programmable logic devices | |
Vishnu et al. | Clock tree synthesis techniques for optimal power and timing convergence in soc partitions | |
US9940422B2 (en) | Methods for reducing congestion region in layout area of IC | |
Choy et al. | Incremental layout placement modification algorithms | |
US8543963B2 (en) | Global leakage power optimization | |
US7006962B1 (en) | Distributed delay prediction of multi-million gate deep sub-micron ASIC designs | |
Calimera et al. | Design of a family of sleep transistor cells for a clustered power-gating flow in 65nm technology | |
US20200293707A1 (en) | Programmable integrated circuit underlay | |
US6757885B1 (en) | Length matrix generator for register transfer level code | |
JP4003071B2 (en) | Semiconductor integrated circuit design method and design apparatus | |
US20220114321A1 (en) | Systems And Methods For Generating Placements For Circuit Designs Using Pyramidal Flows |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: BROADCOM CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:PENZES, PAUL;REEL/FRAME:020126/0263. Effective date: 20071112 |
AS | Assignment | Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA. Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:037806/0001. Effective date: 20160201 |
REMI | Maintenance fee reminder mailed | |
LAPS | Lapse for failure to pay maintenance fees | |
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20160911 |
AS | Assignment | Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD., SINGAPORE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:041706/0001. Effective date: 20170120 |
AS | Assignment | Owner name: BROADCOM CORPORATION, CALIFORNIA. Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS COLLATERAL AGENT;REEL/FRAME:041712/0001. Effective date: 20170119 |