CN113095033B - Superconducting RSFQ circuit layout method for dual-clock architecture - Google Patents

Superconducting RSFQ circuit layout method for dual-clock architecture

Info

Publication number
CN113095033B
Authority
CN
China
Prior art keywords
column
layout
logic
units
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110442343.3A
Other languages
Chinese (zh)
Other versions
CN113095033A (en)
Inventor
黄俊英
张阔中
叶笑春
张志敏
范东睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN202110442343.3A
Publication of CN113095033A
Application granted
Publication of CN113095033B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • G06F30/392Floor-planning or layout, e.g. partitioning or placement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • G06F30/394Routing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • G06F30/398Design verification or optimisation, e.g. using design rule check [DRC], layout versus schematics [LVS] or finite element methods [FEM]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/60Superconducting electric elements or equipment; Power systems integrating superconducting elements or equipment

Abstract

A layout method for a superconducting RSFQ circuit with a dual-clock architecture is provided, where the total number of logic cells in the circuit, excluding input IO and output IO, is N, and the aspect ratio of the chip on which the circuit is laid out is α. The layout method comprises: performing an initial layout of the N logic cells based on logic depth, including: calculating a reference height of a layout column, $H_0 = \lceil \sqrt{N/\alpha} \rceil$; arranging the logic cells in order starting from logic depth 1, such that the cells of each logic depth are placed in order of increasing vertical coordinate, the height of each column is not greater than $H_0$, and each new logic depth starts in a new column; sequentially merging columns whose number of cells is less than $H_0$, such that the height of each merged column is not greater than $H_0$; and removing the empty columns and outputting the initial coordinates of the N logic cells on the chip together with the columns in which each cell can be laid out; and perturbing and optimizing the initial layout based on a simulated annealing layout framework.

Description

Superconducting RSFQ circuit layout method for dual-clock architecture
Technical Field
The invention relates to the field of superconducting circuits, and in particular to a superconducting RSFQ circuit layout method for a dual-clock architecture.
Background
Superconducting single flux quantum (SFQ) circuit technology is listed by the ITRS as a very promising next-generation integrated circuit technology. The superconducting rapid single flux quantum (RSFQ) circuit is one type of SFQ circuit and offers ultra-high speed and ultra-low power consumption. Studies have shown that simple RSFQ circuits fabricated using sub-micron Josephson junction (JJ) technology can operate at frequencies up to 770 GHz, which is difficult to achieve with semiconductor integrated circuits. Moreover, under the same process conditions, both the logic gate delay and the per-bit operation power consumption of RSFQ circuits are two orders of magnitude lower than those of the corresponding semiconductor circuits.
The most basic device in an RSFQ circuit is a superconducting loop containing JJs, which act as switching elements. Unlike CMOS circuits, the storage component of RSFQ circuits is an inductance rather than a capacitance. The magnetic flux in a superconducting loop is quantized as $\Phi = n\Phi_0$, where $\Phi_0 = 2.07 \times 10^{-15}$ Wb. Information is stored in the form of flux quanta and transmitted in the form of SFQ voltage pulses. The presence of a pulse indicates a logic "1" and the absence of a pulse indicates a logic "0". Unlike CMOS circuits, in RSFQ logic circuits almost all logic cells require a clock to drive the stored flux quantum to the output. Since an RSFQ logic gate can be regarded as a one-stage pipeline, an RSFQ circuit is in this sense a gate-level pipeline, and logic depth refers to the number of stages of clocked logic gates.
In order to fully exploit the ultra-high frequency (tens or hundreds of GHz) of RSFQ devices, researchers have proposed clock mechanisms suitable for RSFQ circuits, including the clock-following-data clock, the zero-skew clock, and the concurrent clock (clock-flow clock). The zero-skew clock is the clock mechanism used in semiconductor circuits, whereas the concurrent clock, in which the clock and the data flow in the same direction, is the clock mechanism that achieves the highest circuit frequency.
To ensure that an RSFQ logic gate functions correctly, the logic gates connected to all of its inputs should have the same logic depth, a constraint called path balancing. If the logic depths of the fan-in gates differ, a D flip-flop (DFF) should be inserted at the output of the fan-in gate with the smaller logic depth. The conventional design method for RSFQ circuits therefore ensures correct operation by inserting a large number of flip-flops. Recently, researchers have proposed a new architecture that implements RSFQ circuits using a fast and a slow clock signal, referred to as a dual-clock architecture; see in particular Chinese patent application publication CN112116094A. In this new architecture, the flow of data is controlled by the two clocks, so that correct operation of the RSFQ circuit can be ensured without inserting any path-balancing DFFs. Considering that the number of path-balancing DFFs inserted in a typical RSFQ circuit is several times the number of ordinary logic gates, this new architecture can save a large amount of circuit area and power consumption.
On the one hand, although some research has been carried out on layout methods for dual-clock superconducting RSFQ circuits, that work uses the zero-skew clock of semiconductor circuits and does not consider the concurrent clock mechanism of RSFQ circuits, so the operating frequency of the circuit after layout is not high enough. On the other hand, the traditional layout method for superconducting RSFQ circuits (i.e., RSFQ circuits with a non-dual-clock architecture) does not take into account that, in the dual-clock architecture, the number of cells differs greatly between logic depths, so the circuit area overhead after layout is large. Therefore, none of the existing layout methods is suitable for a superconducting RSFQ circuit with a dual-clock architecture.
Disclosure of Invention
Based on the above-mentioned drawbacks of the prior art, the present invention provides a layout method for a superconducting RSFQ circuit of a dual-clock architecture, in which the total number of logic cells in the circuit, excluding input IO and output IO, is N, and the aspect ratio of the chip on which the circuit is laid out is α,
the layout method comprising:
performing an initial layout of the N logic cells based on logic depth, including:
calculating a reference height of a layout column, $H_0 = \lceil \sqrt{N/\alpha} \rceil$;
arranging the logic cells in order starting from logic depth 1, such that the cells of each logic depth are placed in order of increasing vertical coordinate, the height of each column is not greater than $H_0$, and each new logic depth starts in a new column;
sequentially merging columns whose number of cells is less than $H_0$, such that the height of each merged column is not greater than $H_0$; and
removing the empty columns and outputting the initial coordinates of the N logic cells on the chip together with the columns in which each cell can be laid out; and
perturbing and optimizing the initial layout based on a simulated annealing layout framework.
Preferably, the step of perturbing and optimizing the initial layout based on the simulated annealing layout framework comprises:
calculating the cost of the initial layout;
disturbing the initial layout to generate a new layout solution;
and calculating the cost of the new layout solution, and updating the coordinate values of the N logic units by using the layout solution with lower cost until the layout solution with the minimum cost is obtained as the final circuit layout.
Preferably, the initial layout further comprises:
for each logic depth i, calculating the number of full-height columns required by the cells of that logic depth, $C = \lfloor \mathrm{blk\_num}[i] / H_0 \rfloor$, where blk_num[i] is the number of cells with logic depth i; arranging, in each of the C columns starting from the current column, $H_0$ not-yet-placed cells with logic depth i; updating the number of cells in each of these columns to $H_0$; and updating the current column to current column + C.
Preferably, the initial layout further comprises:
arranging the remaining blk_num[i] % $H_0$ unplaced cells with logic depth i in the current column, updating the number of cells in the current column to blk_num[i] % $H_0$, and updating the current column to current column + 1.
Preferably, the initial layout further comprises:
storing the column numbers of the columns whose number of cells is less than $H_0$ in an array, moving the cells of columns array[i+1] to array[i+j] into column array[i], and setting the number of cells in columns array[i+1] to array[i+j] to 0, where, before merging, the sum of the numbers of cells in columns array[i], array[i+1], ..., array[i+j] is less than or equal to $H_0$, and $H_0$ is less than the sum of the numbers of cells in columns array[i], array[i+1], ..., array[i+j], array[i+j+1].
Preferably, the step of perturbing the initial layout further comprises:
the input IOs are arranged at the grid point positions on the left side of the chip, and the output IOs are arranged at the grid point positions on the right side of the chip.
Preferably, the step of perturbing the initial layout further comprises:
when a logic cell can be laid out in several columns, randomly selecting one column from the columns that contain only cells of the same logic depth, then randomly selecting a grid point position in the selected column, exchanging the logic cell with the cell at that grid point, and determining the new coordinates.
Preferably, the step of perturbing the initial layout further comprises:
when a logic cell can be laid out in only one column and more than one cell has the same logic depth as that logic cell, exchanging the logic cell with another cell of the same logic depth with probability P, and determining the new coordinates, where 0 ≤ P ≤ 1.
Preferably, the step of perturbing the initial layout further comprises:
when a logic cell can be laid out in only one column and more than one cell has the same logic depth as that logic cell, and the layout column contains two or more macro modules that have the same number of cells but different logic depths, exchanging two such macro modules as a whole with probability 1 - P, and determining the new coordinates of all cells in the macro modules, where 0 ≤ P ≤ 1.
Preferably, the step of perturbing the initial layout further comprises:
when a logic cell can be laid out in only one column and it is the only cell with its logic depth, and an empty position exists in the layout column, moving the logic cell to a randomly selected empty position to obtain the new coordinates.
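The case distinctions in the perturbation steps above can be summarized as a small decision function. This is only a sketch: the function name, the move labels, the default value of P, and the `"no_move"` fallback are assumptions made for illustration and do not appear in the source.

```python
def choose_move(num_placeable_columns, same_depth_cells, u, p=0.5,
                has_equal_macros=False, has_empty_slot=False):
    """Pick a perturbation type for one logic cell.

    u is a uniform random draw in [0, 1); p plays the role of P in the text.
    All labels returned here are hypothetical names for the moves described.
    """
    if num_placeable_columns > 1:
        # Cell may live in several columns: swap with a random grid point
        # in a column that holds only cells of the same logic depth.
        return "swap_in_same_depth_column"
    if same_depth_cells > 1:
        if u < p:
            # With probability P: swap with another cell of the same depth.
            return "swap_with_same_depth_cell"
        # With probability 1 - P: swap two equally sized macro modules.
        return "swap_macro_modules" if has_equal_macros else "no_move"
    # Only cell of its depth in a single column: move into an empty position.
    return "move_to_empty_slot" if has_empty_slot else "no_move"


print(choose_move(2, 8, u=0.3))
print(choose_move(1, 2, u=0.3))
print(choose_move(1, 2, u=0.9, has_equal_macros=True))
print(choose_move(1, 1, u=0.9, has_empty_slot=True))
```

Passing the random draw `u` in explicitly keeps the function deterministic, which makes the branch structure easy to check in isolation.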
The present invention also provides a computer readable storage medium having embodied thereon a computer program executable by a processor to perform the steps of a superconducting RSFQ circuit layout method for a dual-clock architecture as described above.
The present invention also provides an electronic device including: one or more processors; and a memory, wherein the memory is to store one or more executable instructions; the one or more processors are configured to implement the steps of one of the superconducting RSFQ circuit layout methods for a dual-clock architecture described above via execution of the one or more executable instructions.
The superconducting RSFQ circuit layout method for the dual-clock architecture adopts the concurrent clock to increase the operating frequency of the circuit, and provides a logic-depth-based initial layout method and a layout perturbation method that satisfies the logic depth constraint in order to reduce the circuit area. Compared with existing superconducting RSFQ circuit layout methods, the method provided by the invention obtains a layout with a smaller area while satisfying the logic depth constraint. Compared with layouts based on a zero-skew clock, the layout result of the invention takes the logic depth of the cells into account, so that the routing stage after layout can more easily meet the timing constraints under the concurrent clock, thereby increasing the operating frequency of the circuit.
Drawings
FIG. 1A is a schematic diagram of a concurrent clock distribution network;
FIG. 1B is an example of two logic gates in the concurrent clock distribution network of FIG. 1A;
FIG. 2 is a schematic diagram of concurrent clock timing constraints;
FIG. 3 is a schematic diagram of a grid-based chip layout area in accordance with one embodiment of the invention;
FIG. 4 is a schematic diagram of a circuit gate level netlist of a 4-bit adder circuit according to one embodiment of the invention;
FIG. 5 is a schematic diagram of the layout result of the 4-bit adder circuit of FIG. 4 using a conventional concurrent clocked superconducting RSFQ layout algorithm;
FIG. 6 is a logic depth based dual clock superconducting RSFQ circuit initial layout algorithm flow chart of one embodiment of the present invention;
FIG. 7 is one example of an initial layout of the 4-bit adder circuit in FIG. 4;
FIG. 8 is a schematic diagram of a net bounding box with 10 end points;
FIG. 9 is a final layout schematic of the 4-bit adder circuit of FIG. 4.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail by means of specific embodiments with reference to the accompanying drawings.
FIG. 1A is a schematic diagram of a concurrent clock distribution network in which the clock source and the data are transmitted to the logic gates through different signal networks. FIG. 1B is an example of two of the logic gates in the concurrent clock distribution network of FIG. 1A, illustrating the timing of the concurrent clock. As shown in FIG. 1B, gate 1 and gate 2 are RSFQ logic gates, and black dots 101 and 102 are splitters (SPL for short). Data is transferred to gate 1 and, after a clock pulse reaches gate 1, the data is output to gate 2. The clock pulse is transmitted to SPL 101. Assuming that the time at which SPL 101 outputs the clock pulse is 0, the clock pulse arrival time (Tclk) of gate 2 is equal to the sum of the delays of line 2, SPL 102 and line 3; the data pulse arrival time (Tdata) of gate 2 is equal to the sum of the delays of line 1, gate 1 and line 4.
For concurrent clock timing, the data pulse arrives later than the clock pulse, i.e., Tdata > Tclk, and gate 2 processes and outputs the data pulse when the next clock pulse arrives at gate 2. Thus, transferring data through an RSFQ circuit with N stages of logic gates requires N clock cycles (N+1 clock pulses). In this way, data can be processed sequentially and continuously through the gate-level pipeline. In an RSFQ circuit, concurrent clock timing can achieve higher circuit frequencies.
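As a small arithmetic illustration of the two arrival times at gate 2, the following sketch adds up the delays named in the description of FIG. 1B. The dictionary keys and the delay values (in picoseconds) are hypothetical and are chosen only to make the sums easy to follow.

```python
def arrival_times(delay):
    """Return (Tclk, Tdata) at gate 2, with the SPL 101 output time taken as 0."""
    t_clk = delay["line2"] + delay["spl102"] + delay["line3"]
    t_data = delay["line1"] + delay["gate1"] + delay["line4"]
    return t_clk, t_data


# Hypothetical delays in picoseconds, not taken from the source.
delays_ps = {"line1": 2, "line2": 3, "line3": 2, "line4": 4, "gate1": 6, "spl102": 4}
t_clk, t_data = arrival_times(delays_ps)
data_follows_clock = t_data > t_clk   # concurrent clocking expects Tdata > Tclk
print(t_clk, t_data, data_follows_clock)  # 9 12 True
```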
FIG. 2 is a schematic diagram of concurrent clock timing constraints. In timing design with concurrent clocks, the timing constraints shown in FIG. 2 must be satisfied, where $t_c$ is the time at which the clock arrives at the logic gate, $t_d$ is the time at which the data arrives at the logic gate, $t_{cycle}$ is the clock period, $t_{hold}$ is the hold time of the logic gate, and $t_{setup}$ is the setup time of the logic gate. Each logic gate must satisfy the timing constraint $t_c + t_{hold} < t_d < t_c + t_{cycle} - t_{setup}$. In superconducting RSFQ circuit layout, the logic cells must therefore be placed and routed so that this constraint is met. While satisfying the timing constraint, the layout should also be as compact as possible and the total wire length as short as possible, so that the circuit area is reduced. The superconducting RSFQ circuit layout method for the dual-clock architecture performs layout based on logic depth, so that the final layout result easily meets the concurrent clock timing constraint and a layout with a smaller area can be obtained.
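The per-gate constraint above is a simple double inequality, which can be checked as in the following sketch. The function name and the numeric values (in picoseconds) are hypothetical.

```python
def meets_concurrent_clock_timing(t_c, t_d, t_cycle, t_hold, t_setup):
    """True if t_c + t_hold < t_d < t_c + t_cycle - t_setup."""
    return t_c + t_hold < t_d < t_c + t_cycle - t_setup


# Hypothetical timing values in picoseconds.
ok = meets_concurrent_clock_timing(t_c=10, t_d=15, t_cycle=20, t_hold=2, t_setup=3)
too_early = meets_concurrent_clock_timing(t_c=10, t_d=11, t_cycle=20, t_hold=2, t_setup=3)
too_late = meets_concurrent_clock_timing(t_c=10, t_d=28, t_cycle=20, t_hold=2, t_setup=3)
print(ok, too_early, too_late)  # True False False
```

Data that arrives before $t_c + t_{hold}$ violates the hold window, and data that arrives after $t_c + t_{cycle} - t_{setup}$ misses the setup window of the next clock pulse.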
Layout is the process of determining the physical location of a logic cell in a circuit on a chip, which generally includes two inputs: 1) The process library file is used for describing the shape, the size, the port position, the time sequence parameters and the like of the logic unit; 2) A circuit gate level netlist describing the connection relationships between logic cells in a circuit. The output of the layout is the specific coordinate location of the logic cells on the chip. The invention adopts a layout mode based on grid points, the logic units can only be placed at the positions of the grid points, and the areas between the grid points are used for wiring of the circuit.
FIG. 3 is a schematic diagram of a grid-based chip layout area in accordance with one embodiment of the invention. The layout in fig. 3 includes three layout columns, layout column 0, layout column 1, and layout column 2, each layout column including 3 grid points. The box in fig. 3 represents a lattice point for placement of RSFQ logic cells (i.e., RSFQ logic devices). The positions of the grid points are located by coordinates (x, y) in a planar rectangular coordinate system. For example, the lattice point position in the lower left corner is (0, 0), which means that the lattice point is at the lattice point position where x=0, and y=0.
FIG. 4 is a schematic diagram of a circuit gate-level netlist of a 4-bit adder according to one embodiment of the invention, including the circuit inputs (cin, a0-a3, b0-b3), i.e., the input IOs; the circuit outputs (s0-s4), i.e., the output IOs; and the logic cells (g0-g19). The arrowed lines represent the connection relationships between the logic cells in the circuit. The 4-bit adder circuit in FIG. 4 uses a dual-clock architecture, so DFF devices do not need to be inserted to achieve clock alignment. In the present invention, the logic depth (level) refers to the number of stages of clocked logic gates. The logic depth level($g_i$) of a logic cell $g_i$ is equal to the maximum logic depth of the cells $g_{s_i}$ connected to its inputs, plus 1, i.e., $\mathrm{level}(g_i) = 1 + \max\{\mathrm{level}(g_{s_i})\}$, where the logic depth of an input IO is defined to be 0. For example, since all inputs of g0 are input IOs, $\max\{\mathrm{level}(g_{s_0})\} = 0$ and level(g0) = 1; as another example, g8 has two inputs, the first from an input IO and the second from g0, so $\max\{\mathrm{level}(g_{s_8})\} = 1$ and level(g8) = 2, and so on. The 20 logic cells of the gate-level netlist in FIG. 4 are divided into 9 logic depths, level 1 through level 9. The maximum logic depth is level 9. Level 1 contains 8 logic cells, g0-g7; level 2 contains 2 logic cells, g8-g9; and so on. For clarity, each logic cell in FIG. 4 is represented by a circle, which may stand for different RSFQ logic devices.
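The recursive definition of logic depth can be sketched as follows. The function name and the fan-in table are assumptions: the small netlist fragment below is hypothetical and only mimics the g0 and g8 cases described above; it is not the actual netlist of FIG. 4.

```python
def logic_depths(fanin, inputs):
    """Compute level(g) = 1 + max(level of drivers); input IOs have level 0.

    fanin maps each logic cell to the names of the cells or input IOs
    driving it. Assumes an acyclic netlist in which every cell has a driver.
    """
    depth = {name: 0 for name in inputs}

    def level(cell):
        if cell not in depth:
            depth[cell] = 1 + max(level(src) for src in fanin[cell])
        return depth[cell]

    for cell in fanin:
        level(cell)
    return {cell: d for cell, d in depth.items() if cell not in inputs}


# Hypothetical fragment: g0 and g1 are driven only by input IOs,
# g8 by an input IO and g0, g10 by g8 and g1.
inputs = {"cin", "a0", "b0", "a1", "b1"}
fanin = {
    "g10": ["g8", "g1"],
    "g8": ["cin", "g0"],
    "g0": ["a0", "b0"],
    "g1": ["a1", "b1"],
}
levels = logic_depths(fanin, inputs)
print(levels["g0"], levels["g8"], levels["g10"])  # 1 2 3
```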
Fig. 5 is a schematic diagram of the layout result of the 4-bit adder circuit of fig. 4 using a conventional concurrent clocked superconducting RSFQ layout algorithm. The connecting lines in the figure represent only the connection relations of signals. As can be seen from fig. 5, the conventional method is to place the units with the same logic depth in the same layout column, and if the conventional method is directly applied to a 4-bit adder circuit adopting a dual clock architecture, the circuit area is large, which causes space waste.
The invention provides a superconducting RSFQ circuit layout method for a dual-clock architecture that uses the concurrent clock, whereas existing research in this direction adopts the zero-skew clock mechanism of semiconductor circuits. In addition, the invention allows cells with different logic depths to be placed in the same layout column, which reduces the area overhead of the circuit.
In the prior art, the simulated annealing layout framework works as follows: first, an initial layout is generated and an initial temperature is calculated from it; the algorithm then enters the annealing stage, which consists of an inner loop and an outer loop. The inner loop perturbs the solution by exchanging logic cells a number of times at the same temperature, thereby obtaining new solutions; the outer loop checks whether the exit criterion of the algorithm is satisfied and, if not, updates the temperature according to the annealing schedule. On the basis of this simulated annealing layout framework, the invention provides a new initial layout method and a new perturbation method for the layout solution, tailored to the characteristics of dual-clock superconducting circuits.
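The inner and outer loops of such a framework can be sketched generically as below. This is not the patent's implementation: the Metropolis acceptance rule, the geometric cooling schedule, the fixed initial temperature, and every name and parameter value are common textbook choices assumed here for illustration. The toy problem (reordering a list) merely stands in for a layout.

```python
import math
import random


def simulated_annealing(initial, cost, perturb, t0=5.0, cooling=0.9,
                        moves_per_temp=100, t_min=0.01, seed=0):
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    while temperature > t_min:                 # outer loop: exit criterion
        for _ in range(moves_per_temp):        # inner loop: fixed temperature
            candidate = perturb(current, rng)
            delta = cost(candidate) - current_cost
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling                 # assumed geometric schedule
    return best, best_cost


def swap_two(order, rng):
    """Perturbation: exchange two randomly chosen positions."""
    i, j = rng.sample(range(len(order)), 2)
    new_order = list(order)
    new_order[i], new_order[j] = new_order[j], new_order[i]
    return new_order


def displacement(order):
    """Toy cost: how far each value sits from its own index."""
    return sum(abs(value - index) for index, value in enumerate(order))


start = [4, 3, 2, 1, 0]
best, best_cost = simulated_annealing(start, displacement, swap_two)
print(best, best_cost)
```

In a placement setting, `perturb` would be replaced by moves that respect the logic depth constraint and `cost` by a wire-length estimate.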
The initial layout method of the dual-clock superconducting circuit based on the logic depth and the layout solution disturbance method meeting the constraint of the logic depth according to the present invention will be described in detail with reference to fig. 6, algorithm 1 and algorithm 2. Fig. 6 is a logic depth based dual-clock superconducting RSFQ circuit initial layout algorithm flow chart of one embodiment of the present invention, algorithm 1 describing the logic depth based dual-clock superconducting circuit initial layout of one embodiment of the present invention.
Algorithm 1: Dual-clock superconducting circuit initial layout based on logic depth
Input: the process library file and the circuit gate-level netlist G(V, E), where V represents the logic cells and E represents the signal connections between the cells.
Output: the initial coordinate positions of the logic cells on the chip, and the columns in which each logic cell can be laid out.
As shown in Algorithm 1 and FIG. 6, N denotes the total number of logic cells in the circuit excluding input IO and output IO, and $H_0$ denotes the reference height of the layout columns; in the initial layout of Algorithm 1, the height of each column does not exceed $H_0$. Assuming that the aspect ratio of the chip to be laid out is α, then $\alpha \times H_0 \times H_0 = N$, and thus $H_0 = \lceil \sqrt{N/\alpha} \rceil$. When the laid-out chip is desired to be close to square, α = 1 and $H_0 = \lceil \sqrt{N} \rceil$; Algorithm 1 is described taking α = 1 as an example. The logic depths of all logic cells are initialized; for example, the logic depth of each input IO cell is initialized to 0 and the logic depths of the other logic cells are initialized to -1 (step S100). current_column marks the current layout column and has an initial value of 0. blk_num_at_column[current_column] stores the number of cells already laid out in the current layout column and is initially empty. Starting from the input IO cells, the logic depth of each logic cell is calculated according to the connection relationships of the circuit netlist, giving the maximum logic depth $L_{max}$ over all logic cells, and the numbers of the cells with logic depth i are stored in blk_num[i] (step S101 and the subroutine computer_block_logic_level(G) in line 2 of Algorithm 1); for example, referring to FIG. 4, blk_num[2] = [g8, g9].
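The reference height follows directly from N and α. In the sketch below, rounding up to the next integer is an assumption inferred from the 4-bit adder example in this description, where 20 logic cells and α = 1 give a reference height of 5; the function name is hypothetical.

```python
import math


def reference_height(n_cells, aspect_ratio=1.0):
    """H0 from alpha * H0 * H0 = N, rounded up to a whole number of grid points."""
    return math.ceil(math.sqrt(n_cells / aspect_ratio))


print(reference_height(20))        # 5
print(reference_height(16))        # 4
print(reference_height(20, 2.0))   # 4
```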
The columns in which the cells with logic depth i can be laid out are determined according to the number of cells (blocks) with logic depth i. Starting from logic depth i = 1 and continuing up to i = $L_{max}$, for each logic depth i the number of full-height columns needed by the level-i blocks is calculated as $C = \lfloor \mathrm{blk\_num}[i] / H_0 \rfloor$, i.e., each of the C columns holds $H_0$ blocks (step S103). From the current layout column up to column current_column + C - 1, each column is filled with $H_0$ unplaced blocks of level i (step S105), where the $H_0$ blocks are randomly selected from the unplaced level-i blocks, and the block count of each such column j is updated, i.e., blk_num_at_column[j] += $H_0$. The current layout column is then updated to current_column + C. The number of remaining unplaced cells with logic depth i is blk_num[i] % $H_0$. These blk_num[i] % $H_0$ level-i blocks are placed in the current column, stacked upward in ascending order of y-coordinate; the order among them is also random. The block count of the current column is updated, i.e., blk_num_at_column[current_column] += blk_num[i] % $H_0$, and the current layout column is then updated to current_column + 1 (step S107).
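The column-filling step just described can be sketched as follows. The function and variable names are hypothetical, and the sketch differs from the described steps in one detail: when a depth fills its columns exactly, no extra empty column is opened, which is equivalent once empty columns are removed.

```python
import random


def place_by_depth(cells_by_depth, h0, seed=0):
    """Fill columns depth by depth; return cell coordinates and column counts."""
    rng = random.Random(seed)
    coords = {}          # cell name -> (x, y)
    column_count = []    # number of cells placed in each column
    for depth in sorted(cells_by_depth):
        cells = list(cells_by_depth[depth])
        rng.shuffle(cells)                       # order within a depth is random
        # Full columns of h0 cells first, then one column for the remainder.
        for start in range(0, len(cells), h0):
            chunk = cells[start:start + h0]
            x = len(column_count)                # each chunk opens a new column
            for y, cell in enumerate(chunk):     # stack upward from y = 0
                coords[cell] = (x, y)
            column_count.append(len(chunk))
    return coords, column_count


# Hypothetical circuit: 7 cells at depth 1, 2 at depth 2, 1 at depth 3.
cells_by_depth = {1: list("abcdefg"), 2: ["h", "i"], 3: ["j"]}
coords, column_count = place_by_depth(cells_by_depth, h0=3)
print(column_count)  # [3, 3, 1, 2, 1]
```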
In the layout obtained at this point, every logic depth with fewer than $H_0$ cells occupies a separate column, and the blk_num[i] % $H_0$ remaining cells of each depth also occupy separate columns. The number of cells in these columns is relatively small and may be much smaller than $H_0$. It is therefore necessary to merge the columns that contain few cells.
As shown in lines 13-27 of Algorithm 1, starting from column 0 and continuing up to column current_column - 1, if the number of blocks in column col is less than $H_0$, the column number col is stored in an array (steps S108 to S110). The array thus holds the numbers of all columns whose block count is less than $H_0$. Several columns are then merged, i.e., the blocks of columns array[i+1] to array[i+j] are moved into column array[i], such that the number of blocks in column array[i] does not exceed $H_0$ and is closest to $H_0$ (step S112). Here, "closest to $H_0$" means that, before merging, the sum of the numbers of cells in columns array[i], array[i+1], ..., array[i+j] is less than or equal to $H_0$, and $H_0$ is less than the sum of the numbers of cells in columns array[i], array[i+1], ..., array[i+j], array[i+j+1]. After merging, the block count of column array[i] is updated, the block counts of columns array[i+1] to array[i+j] are set to 0, and i is set to i + j + 1; this is repeated until i = array.size() - 1. Then, starting from column 0 and continuing to the last column (i.e., current_column - 1), if the number of blocks in column col is not 0, the number of columns between column 0 and column col whose block count is 0 is calculated and denoted S, and S is subtracted from the x-coordinates of all blocks in column col, so that the x-coordinates of all blocks are updated (steps S114-S116).
Finally, the initial coordinates on the chip of the logic cells other than input IO and output IO, together with the columns in which each logic cell can be laid out, are output. They are stored in a map<block_id, t_block_inf>, where block_id stores the number of the logic cell and t_block_inf stores the name of the logic cell, the nets connected to it, its x-coordinate, its y-coordinate, and the columns in which it can be laid out.
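One possible shape for such a record is sketched below. The field names, the Python types, the net name, and the y value are assumptions for illustration; the source only lists what the record holds (name, connected nets, x, y, placeable columns). The placeable columns of g0 follow the 4-bit adder example in this description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BlockInfo:
    """Hypothetical counterpart of t_block_inf."""
    name: str
    nets: List[str] = field(default_factory=list)     # nets connected to the cell
    x: int = 0
    y: int = 0
    columns: List[int] = field(default_factory=list)  # columns the cell may occupy


# Keyed by block_id, mirroring map<block_id, t_block_inf>.
layout: Dict[int, BlockInfo] = {
    0: BlockInfo(name="g0", nets=["n0"], x=0, y=2, columns=[0, 1]),
}
print(layout[0].name, layout[0].columns)
```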
The initial layout method of the superconducting RSFQ circuit for the dual-clock architecture of the present invention is described below using the adder of FIG. 4 as a specific example. In this example, N is 20 (cells g0-g19) and, assuming α = 1, $H_0$ is 5. The initial layout process is as follows:
The first step: when i = 1, blk_num[1] = 8 and C = 1, so 5 of the cells g0-g7 are randomly selected and arranged in column 0, at random positions within the column; current_column = 1. blk_num[1] % $H_0$ = 3, so the remaining 3 cells of g0-g7 are randomly arranged at positions y = 0, 1, 2 of column 1; current_column = 2 at this time. The placeable columns of g0-g7 are column 0 and column 1.
The second step: when i = 2, blk_num[2] = 2, C = 0 and blk_num[2] % $H_0$ = 2, so the 2 cells g8-g9 are randomly arranged at positions y = 0, 1 of column 2, i.e., the order of g8 and g9 is random; current_column = 3 at this time.
The third step: when i = 3, blk_num[3] = 1, C = 0 and blk_num[3] % $H_0$ = 1, so g10 is arranged at position y = 0 of column 3; current_column = 4 at this time.
The fourth step: when i = 4, blk_num[4] = 2, C = 0 and blk_num[4] % $H_0$ = 2, so the cells g11-g12 are randomly arranged at positions y = 0, 1 of column 4; current_column = 5 at this time. Similarly, g13 is arranged at position y = 0 of column 5, g14 and g15 are randomly arranged at positions y = 0, 1 of column 6, g16 is arranged at position y = 0 of column 7, g17 and g18 are randomly arranged at positions y = 0, 1 of column 8, and g19 is arranged at position y = 0 of column 9.
Then the columns are merged. Column 0 contains 5 cells and therefore needs no merging. Column 1 contains 3 cells and column 2 contains 2 cells, so columns 1 and 2 are merged into column 1 and the cell count of column 2 is set to 0. In the same way, columns 3, 4 and 5 are merged into column 3 and the cell counts of columns 4 and 5 are set to 0; columns 6, 7 and 8 are merged into column 6 and the cell counts of columns 7 and 8 are set to 0. The x-coordinate of every cell is then reduced by the number of empty columns (cell count 0) that precede its column, which removes the empty columns and updates the x-coordinates of all cells. One example of the initial layout after merging is shown in FIG. 7. Note that FIG. 7 is only one example: the initial layout is not unique, because cells of the same logic depth are placed in random order during the layout process.
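The merging and empty-column removal can be replayed on the column counts of this example. The sketch below works on counts only and returns, for each original column, its final column index; the function name is hypothetical, and cell y-coordinates (stacking merged cells above the existing ones) are left out for brevity.

```python
def merge_columns(column_count, h0):
    """Greedily merge consecutive under-full columns, then drop empty columns."""
    counts = list(column_count)
    short = [c for c, n in enumerate(counts) if n < h0]   # the "array" of the text
    target = list(range(len(counts)))    # original column -> column merged into
    i = 0
    while i < len(short):
        base = short[i]
        j = i + 1
        # Keep absorbing the next short column while the total stays <= h0.
        while j < len(short) and counts[base] + counts[short[j]] <= h0:
            counts[base] += counts[short[j]]
            counts[short[j]] = 0
            target[short[j]] = base
            j += 1
        i = j
    # Shift each non-empty column left by the number of empty columns before it.
    new_x, empties = {}, 0
    for col, n in enumerate(counts):
        if n == 0:
            empties += 1
        else:
            new_x[col] = col - empties
    # .get yields None for a column that was already empty and never merged.
    final_column = [new_x.get(target[col]) for col in range(len(counts))]
    return final_column, [n for n in counts if n]


final_column, merged_counts = merge_columns([5, 3, 2, 1, 2, 1, 2, 1, 2, 1], h0=5)
print(final_column)   # [0, 1, 1, 2, 2, 2, 3, 3, 3, 4]
print(merged_counts)  # [5, 5, 4, 5, 1]
```

The ten initial columns collapse to five, with heights 5, 5, 4, 5 and 1, none exceeding the reference height of 5.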
In summary, the logic-depth-based initial layout method first determines the initial height of a layout column from the total number of logic cells in the circuit. It then determines the layout columns that can be allocated to the logic cells of each logic depth by comparing the number of cells of that depth with the initial column height. If one layout column contains cells of several logic depths, the cells are arranged in the order of their logic depths, which facilitates routing under the concurrent clock. Finally, a position is randomly selected for each cell within its layout columns, giving an initial layout that satisfies the gate-level pipelining characteristics of the superconducting circuit.
The cost of the initial layout is calculated from the initial coordinate positions. In the present invention the cost equals the sum of the bounding-box wire lengths of all nets, i.e.

$$\mathrm{cost} = \sum_{i=1}^{\mathrm{num\_nets}} q(i)\,\bigl[\,bb_x(i) + bb_y(i)\,\bigr] \qquad (1)$$

where i is the net index, num_nets is the number of nets, and, for each net i, bb_x(i) and bb_y(i) are the horizontal and vertical spans of its bounding box. Fig. 8 is a schematic diagram of the bounding box of a net with 10 endpoints: the dashed box encloses 42 grid points and the dark squares mark the 10 endpoints. The horizontal span is bb_x(i) = X_max - X_min + 1 = 7 and the vertical span is bb_y(i) = Y_max - Y_min + 1 = 6. When a net has more than 3 endpoints, the bounding-box model underestimates its wire length, so a compensation factor q(i) is generally defined, whose value depends on the number of endpoints of net i. The values of q(i) are usually computed in advance and stored in an array, from which the value for a given number of endpoints is looked up. Although wire length is the optimization target in the present invention, the invention is not limited to it; other optimization targets such as delay or power consumption may be defined as needed in practical applications.
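The bounding-box cost can be illustrated with a short Python sketch. It assumes that the compensation factor q(i) multiplies the sum of the two spans of net i, which is the usual form of this wire-length model; the function name `bounding_box_cost`, the representation of a net as a list of (x, y) endpoints and the contents of the q table are hypothetical, since the source gives no q values.

```python
def bounding_box_cost(nets, q_table):
    """Sum of compensated bounding-box wire lengths over all nets.

    nets is a list of nets, each net a list of (x, y) endpoint grid positions.
    q_table maps an endpoint count to its compensation factor; counts that are
    missing from the table use a factor of 1.0.
    """
    total = 0.0
    for endpoints in nets:
        xs = [x for x, _ in endpoints]
        ys = [y for _, y in endpoints]
        bb_x = max(xs) - min(xs) + 1      # horizontal span, in grid points
        bb_y = max(ys) - min(ys) + 1      # vertical span, in grid points
        total += q_table.get(len(endpoints), 1.0) * (bb_x + bb_y)
    return total

# A net whose endpoints span x = 2..8 and y = 1..6 has the spans 7 and 6
# quoted for Fig. 8 (the endpoint positions themselves are made up).
net = [(2, 1), (8, 6), (5, 3)]
print(bounding_box_cost([net], q_table={}))
```

The spans count grid points rather than distances, which is why 1 is added to each coordinate difference.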
After the cost of the initial layout is obtained, the layout solution is perturbed by Algorithm 2, and the layout solution with the minimum cost is taken as the final layout.
Algorithm 2. Layout solution perturbation method satisfying the logic depth constraint
Input: the gate-level netlist G(V, E), where V represents the logic cells and E represents the signal connections between the cells; the initial layout result.
Output: the new layout solution S_new
As shown in Algorithm 2, a new coordinate position is sought for every logic cell in the circuit. M denotes the total number of logic cells in the circuit, including input IOs and output IOs. If the cell is an input IO, a grid point on the left side of the chip is selected; if it is an output IO, a grid point on the right side of the chip is selected (lines 1-6 of Algorithm 2). Any other cell is handled according to the number of layout columns in which it can be placed.
If the cell (block j) can be laid out in multiple columns (more than 1), those columns may contain only cells of the same logic depth or may also contain cells of different logic depths. In this case the layout solution is perturbed only over the columns that contain cells of a single logic depth: one such column is randomly selected, a grid point on it is randomly selected, block j is exchanged with the block at that grid point, and their coordinates x_new and y_new are determined (Algorithm 2, line 8).
If block j can be laid out in only 1 column and more than 1 cell has the same logic depth as block j (Algorithm 2, line 9), a random number R between 0 and 1 is generated and compared with a given probability P (0 ≤ P ≤ 1). When R < P, block j is exchanged with another block of the same logic depth and the coordinates x_new and y_new are determined (Algorithm 2, line 10). When R ≥ P, if the layout column contains two or more macroblocks that have the same number of blocks but different logic depths, two such macroblocks are exchanged as a whole and x_new and y_new are determined for all blocks in them (Algorithm 2, line 11). Here a macroblock is a circuit block formed by several blocks that lie in the same layout column and have the same logic depth; for example, g11 and g12 in Fig. 7 can form one macroblock. A run of consecutive empty grid points can also be treated as a macroblock and exchanged with a macroblock of the same size in the same column. If block j is the only cell of its logic depth and the layout has a vacancy, block j is moved to a randomly selected vacancy, giving the new coordinates x_new and y_new (Algorithm 2, line 12).
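The case distinction of the perturbation step can be summarized as a small decision function. The sketch below only names the move to perform and does not modify a layout; the function name `choose_move`, its parameters and the returned labels are hypothetical, and the behaviour when no listed move applies (returning "no_move") is an assumption, since the text does not state it.

```python
import random

def choose_move(kind, n_columns, n_same_depth, macro_swap_possible,
                vacancy, p, rng=random):
    """Return a label for the perturbation that applies to one cell.

    kind: "input", "output" or "logic".
    n_columns: number of columns in which the cell may be placed.
    n_same_depth: number of cells with the same logic depth (including itself).
    macro_swap_possible: the column holds two macroblocks of equal size and
        different logic depths.
    vacancy: an empty grid point is available.
    p: probability of an exchange between two cells of the same logic depth.
    """
    if kind == "input":
        return "grid_point_on_left_side"
    if kind == "output":
        return "grid_point_on_right_side"
    if n_columns > 1:
        return "swap_with_grid_point_in_single_depth_column"
    if n_same_depth > 1:
        r = rng.random()                 # R in [0, 1)
        if r < p:
            return "swap_with_same_depth_cell"
        if macro_swap_possible:
            return "swap_macroblocks"
        return "no_move"                 # assumption: nothing else applies
    if vacancy:
        return "move_to_vacancy"
    return "no_move"

print(choose_move("logic", 1, 2, True, False, p=0.5, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing different values of P.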
In Algorithm 2, the value of P balances computation time against the convergence of the cost. Suppose a column contains 2 cells of logic depth 1, 2 cells of logic depth 2 and 1 cell of logic depth 3. The 2 cells of logic depth 1 can then be exchanged with each other, and the macroblock formed by the 2 cells of logic depth 1 can be exchanged as a whole with the macroblock formed by the 2 cells of logic depth 2. A whole-macroblock exchange reduces the cost faster but takes longer to compute than an exchange within a macroblock, so the probability P decides which of the two is performed, and different values of P trade computation time against the convergence speed of the cost.
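The whole-macroblock exchange in this example can be sketched in Python. The representation of a column as a list of (cell, depth) pairs from y=0 upward, the cell names and the function names `macroblocks` and `swap_macroblocks` are hypothetical choices made for illustration.

```python
from itertools import groupby

def macroblocks(column):
    """Split a column into runs of consecutive cells with the same logic depth.

    column is a list of (cell, depth) pairs ordered from y=0 upward. An empty
    grid point can be written as (None, None), so that a run of empty grid
    points also forms a macroblock.
    """
    return [list(run) for _, run in groupby(column, key=lambda cd: cd[1])]

def swap_macroblocks(column, i, j):
    """Exchange the i-th and j-th macroblocks of a column as a whole."""
    blocks = macroblocks(column)
    if len(blocks[i]) != len(blocks[j]):
        raise ValueError("only macroblocks of equal size can be exchanged")
    blocks[i], blocks[j] = blocks[j], blocks[i]
    # The new y coordinate of each cell is its index in the returned list.
    return [cell for block in blocks for cell in block]

# Column from the example: 2 cells of depth 1, 2 of depth 2, 1 of depth 3.
column = [("a", 1), ("b", 1), ("c", 2), ("d", 2), ("e", 3)]
print(swap_macroblocks(column, 0, 1))
```

Exchanging the depth-1 macroblock with the depth-3 macroblock raises an error here because their sizes differ, which mirrors the equal-size condition in the text.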
The cost of the new layout solution S_new is then calculated according to equation (1). If the cost of the initial layout is lower, the coordinates of the logic cells obtained from the initial layout are kept; if the cost of S_new is lower, the coordinates of the logic cells are updated according to S_new. All placeable x and y coordinates are cycled through in this way, and the layout solution with the minimum cost is taken as the final layout. Fig. 9 is a schematic diagram of the final layout of the 4-bit adder circuit of Fig. 4, which can be obtained from Fig. 7 by position exchanges.
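The keep-the-cheaper-solution rule of this paragraph can be sketched as a generic loop. This is only an illustration under stated assumptions: the names `refine`, `toy_cost` and `toy_perturb` are hypothetical, a fixed iteration count stands in for cycling through all placeable coordinates, and the toy problem is not a circuit layout. A full simulated annealing schedule would additionally accept some cost-increasing moves, which this sketch does not do.

```python
import random

def refine(layout, cost_fn, perturb_fn, iterations):
    """Repeatedly perturb the layout and keep a candidate only if it is cheaper."""
    best, best_cost = layout, cost_fn(layout)
    for _ in range(iterations):
        candidate = perturb_fn(best)
        candidate_cost = cost_fn(candidate)
        if candidate_cost < best_cost:      # lower cost: adopt the new solution
            best, best_cost = candidate, candidate_cost
    return best, best_cost

# Toy stand-in for a layout: a permutation whose cost is the total
# displacement of each value from its own index.
rng = random.Random(1)

def toy_cost(order):
    return sum(abs(value - index) for index, value in enumerate(order))

def toy_perturb(order):
    i, j = rng.sample(range(len(order)), 2)
    swapped = list(order)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return tuple(swapped)

start = (3, 1, 2, 0)
final, final_cost = refine(start, toy_cost, toy_perturb, iterations=50)
print(final, final_cost)
```

In a placer, `cost_fn` would be the wire-length cost and `perturb_fn` one of the depth-preserving moves described above.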
It should be noted that the cells are arranged in the same order as the order of the logic depths by taking the column direction (i.e., the vertical direction) as an example in the above-described embodiment of the present invention, but it should be understood by those of ordinary skill in the art that the present invention should not be limited thereto, and that the cells may be arranged in the same order as the order of the logic depths in the row direction (i.e., the horizontal direction) as well. That is, in the layout process of the present invention, columns may be replaced with rows, and vertical directions may be replaced with horizontal directions, and the final layout result on the chip corresponds to a result obtained by performing layout in the column direction rotated 90 degrees clockwise. The direction of the layout can be chosen by a person skilled in the art according to the actual application, these choices and modifications being within the scope of the invention.
In the layout solution perturbation method satisfying the logic depth constraint, the circuit is constrained so that input cells can only be placed in a single line on the left side of the chip and output cells only in a single line on the right side, so that the outputs arrive in the same clock cycle as far as possible. When the positions of the logic cells are optimized, each cell may only be placed in a layout column corresponding to its logic depth, and the total wire length is the optimization target, so that a layout with a better area is obtained while the logic depth constraint is satisfied. Wire length is used as the optimization target in the present invention, but other optimization targets such as delay or power consumption may be defined as needed.
Compared with the existing superconducting RSFQ circuit layout method, the superconducting RSFQ circuit layout method for the double-clock architecture provided by the invention can obtain a layout result with a better area under the condition of meeting logic depth constraint, and reduce the area of the circuit; and the logic depth of the units is considered, so that wiring under concurrent clocks is easier to realize, and the working frequency of the circuit is improved.
The present invention also provides a computer readable storage medium having embodied thereon a computer program executable by a processor to perform the steps of the superconducting RSFQ circuit layout method for a dual-clock architecture described above.
The present invention also provides an electronic device including: one or more processors; and a memory, wherein the memory is to store one or more executable instructions; the one or more processors are configured to implement the steps of the superconducting RSFQ circuit layout method for a dual-clock architecture described above via execution of the one or more executable instructions.
Finally, it should be noted that the above examples are only for explaining the technical solution of the present invention and are not limiting. Although the invention has been described in detail with reference to examples, it will be understood by those of ordinary skill in the art that the specific examples described herein are intended to be illustrative only and are not to be construed as limiting the scope of the invention. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention as set forth in the appended claims.

Claims (9)

1. A layout method for a superconducting RSFQ circuit of a dual-clock architecture, wherein the total number of logic units in the circuit other than input IOs and output IOs is N and the aspect ratio of the chip on which the circuit is laid out is α,
the layout method comprises the following steps:
performing initial layout on N logic units based on logic depth, including:
calculating a reference height of a layout column
starting from logic depth 1, laying out the logic units in sequence, so that the units of each logic depth are arranged in increasing order in the vertical direction, the height of each column is not greater than H_0, and each different logic depth starts from a new column;
sequentially merging the columns whose number of units is less than H_0, the height of each merged column being not greater than H_0; and
removing the empty columns and outputting the initial coordinates of the N logic units on the chip and the columns capable of being laid out; and
disturbing and optimizing the initial layout based on a simulated annealing layout framework;
wherein the initial layout further comprises:
for each logic depth i, calculating the number C of full-height columns required by the units of that logic depth, C = ⌊blk_num[i]/H_0⌋, where blk_num[i] is the number of units with logic depth i; arranging, in each of the C columns starting from the current column, H_0 not-yet-laid-out units with logic depth i, updating the number of units in each of these columns to H_0, and updating the current column to the current column + C;
arranging the remaining blk_num[i]%H_0 not-yet-laid-out units with logic depth i in the current column, updating the number of units in the current column to blk_num[i]%H_0, and updating the current column to the current column + 1;
storing the column numbers of the columns whose number of units is less than H_0 in an array, moving the units of columns array[i+1]~array[i+j] into column array[i], and setting the numbers of units of columns array[i+1]~array[i+j] to 0, where the sum of the numbers of units in columns array[i], array[i+1], …, array[i+j] before merging ≤ H_0 < the sum of the numbers of units in columns array[i], array[i+1], …, array[i+j], array[i+j+1] before merging.
2. A superconducting RSFQ circuit layout method for a dual-clock architecture according to claim 1 wherein said step of perturbing and optimizing said initial layout based on a simulated annealing layout framework comprises:
calculating the cost of the initial layout;
disturbing the initial layout to generate a new layout solution;
and calculating the cost of the new layout solution, and updating the coordinate values of the N logic units by using the layout solution with lower cost until the layout solution with the minimum cost is obtained as the final circuit layout.
3. A superconducting RSFQ circuit layout method for a dual-clock architecture according to claim 2 wherein said step of perturbing said initial layout further comprises:
the input IOs are arranged at the grid point positions on the left side of the chip, and the output IOs are arranged at the grid point positions on the right side of the chip.
4. A superconducting RSFQ circuit layout method for a dual-clock architecture according to claim 3 wherein said step of perturbing said initial layout further comprises:
when a logic unit can be laid out in a plurality of columns, randomly selecting a column from the columns containing only units of the same logic depth, then randomly selecting a grid point position on the selected column, exchanging the logic unit with the unit at that grid point, and determining new coordinates.
5. A superconducting RSFQ circuit layout method for a dual-clock architecture according to claim 3 wherein said step of perturbing said initial layout further comprises:
when a logic unit can be laid out in only 1 column and the number of units with the same logic depth as the logic unit is greater than 1, exchanging the logic unit, with a probability P, with a unit of the same logic depth, and determining new coordinates, where 0 ≤ P ≤ 1.
6. A superconducting RSFQ circuit layout method for a dual-clock architecture according to claim 3 wherein said step of perturbing said initial layout further comprises:
when a logic unit can be laid out in only 1 column, the number of units with the same logic depth as the logic unit is greater than 1, and the layout column contains two or more macroblocks with the same number of units and different logic depths, exchanging two such macroblocks as a whole with a probability 1-P, and determining new coordinates for all units in the macroblocks, where 0 ≤ P ≤ 1.
7. A superconducting RSFQ circuit layout method for a dual-clock architecture according to claim 3 wherein said step of perturbing said initial layout further comprises:
when a logic unit can be laid out in only 1 column and the number of units with the same logic depth as the logic unit is 1, if there is a vacancy in the layout column, moving the logic unit to a randomly selected vacancy to obtain new coordinates.
8. A computer readable storage medium having embodied thereon a computer program executable by a processor to perform the steps of the method of any of claims 1-7.
9. An electronic device, comprising: one or more processors; and a memory, wherein the memory is to store one or more executable instructions; the one or more processors are configured to implement the steps of the method of one of claims 1-7 via execution of the one or more executable instructions.
CN202110442343.3A 2021-04-23 2021-04-23 Superconducting RSFQ circuit layout method for dual-clock architecture Active CN113095033B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442343.3A CN113095033B (en) 2021-04-23 2021-04-23 Superconducting RSFQ circuit layout method for dual-clock architecture

Publications (2)

Publication Number Publication Date
CN113095033A CN113095033A (en) 2021-07-09
CN113095033B true CN113095033B (en) 2023-07-21

Family

ID=76679762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442343.3A Active CN113095033B (en) 2021-04-23 2021-04-23 Superconducting RSFQ circuit layout method for dual-clock architecture

Country Status (1)

Country Link
CN (1) CN113095033B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114296685B (en) * 2021-12-30 2023-06-09 北京中科睿芯科技集团有限公司 Approximate adder circuit based on superconducting SFQ logic and design method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007193671A (en) * 2006-01-20 2007-08-02 Hitachi Ltd Cell arrangement program for semiconductor integrated circuit
CN102890729A (en) * 2011-07-18 2013-01-23 中国科学院微电子研究所 Method for carrying out layout wiring on high fan-out programmable gate array
CN103914587A (en) * 2014-03-03 2014-07-09 西安电子科技大学 Field-programmable gate array (FPGA) layout method based on simulated annealing/tempering
CN111914500A (en) * 2020-07-23 2020-11-10 清华大学 Rapid single-flux quantum RSFQ circuit layout method and device
WO2020224035A1 (en) * 2019-05-08 2020-11-12 深圳职业技术学院 Digital integrated circuit layout method based on discrete optimization and terminal device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11645512B2 (en) * 2019-04-30 2023-05-09 Baidu Usa Llc Memory layouts and conversion to improve neural network inference performance

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Rongliang Fu et al..《Design Automation Methodology from RTL to Gate-level Netlist and Schematic for RSFQ Logic Circuits》.《GLSVLSI '20: Proceedings of the 2020 on Great Lakes Symposium on VLSI》.2020,145-150. *

Also Published As

Publication number Publication date
CN113095033A (en) 2021-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant