US20230045265A1 - Fast clocked storage element - Google Patents
- Publication number
- US20230045265A1 (application US17/551,610)
- Authority
- US
- United States
- Prior art keywords
- latch
- clocked
- transistor
- channel transistor
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K3/00—Circuits for generating electric pulses; Monostable, bistable or multistable circuits
- H03K3/02—Generators characterised by the type of circuit or by the means used for producing pulses
- H03K3/353—Generators characterised by the type of circuit or by the means used for producing pulses by the use, as active elements, of field-effect transistors with internal or external positive feedback
- H03K3/356—Bistable circuits
- H03K3/3562—Bistable circuits of the master-slave type
- H03K3/35625—Bistable circuits of the master-slave type using complementary field-effect transistors
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K3/00—Circuits for generating electric pulses; Monostable, bistable or multistable circuits
- H03K3/02—Generators characterised by the type of circuit or by the means used for producing pulses
- H03K3/027—Generators characterised by the type of circuit or by the means used for producing pulses by the use of logic circuits, with internal or external positive feedback
- H03K3/037—Bistable circuits
- H03K3/0372—Bistable circuits of the master-slave type
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K3/00—Circuits for generating electric pulses; Monostable, bistable or multistable circuits
- H03K3/01—Details
- H03K3/012—Modifications of generator to improve response time or to decrease power consumption
Definitions
- This disclosure describes a clocked storage element, sometimes referred to as a “flip-flop”, used for temporarily storing information in digital systems.
- Clocked storage elements are essential in constructing the Finite-State Machine (FSM) which is the core of every digital system.
- a few important characteristics of the clocked storage element include low “insertion delay” (Data-to-Q delay), low power consumption and small footprint (area).
- Clocked storage elements are very important elements in a digital system. They may take up to 20% of the clock cycle from the useful time allotted for computation. In addition, they may contribute to a quarter of the power consumed in the digital system, in dynamic power and more in the static power. The area taken by clocked storage elements similarly contributes to the total chip area, where chip area is directly proportional to the cost, performance, power, and the total amount of functionality that the chip can provide. Thus, there has been a continuous effort to design clocked storage elements which are: smaller, faster and less power consuming.
- a technology is described for implementation of clocked storage elements that according to various aspects, are compact and fast, and allow for flexible layouts and configurations.
- Embodiments are described having an insertion delay less than 50 picoseconds, and less than 40 picoseconds.
- a clocked storage element comprises a first latch having an input data node, a clock input node and a first latch output data node, the first latch having a current path consisting of two p-channel transistors between the first latch output data node and a VDD supply line, and two n-channel transistors between the first latch output data node and a VSS supply line; and a second latch having an input connected to the first latch output data node, a clock input node and a second latch output data node, the second latch having a current path consisting of two n-channel transistors between the second latch output data node and a VSS supply line, and two p-channel transistors between the second latch output data node and the VDD supply line.
- a clocked storage element comprises a first latch having an input data node, a clock input node and a first latch output data node; and a second latch having an input connected to the first latch output data node, a clock input node and a second latch output data node, wherein a critical timing path from the input data node of the first latch to the second latch output data node has only two transistor path delays, and two transistors in the path of the first latch output to the second latch data node.
- a clocked storage element comprising a first latch and a second latch does not require the clock input to be inverted. That is, the first latch and second latch have respective clock input nodes which receive the clock signal with the same polarity.
- an integrated circuit having a rising edge clocked storage element having a master latch with a first circuit configuration (e.g., a merged OR-NAND configured transistor stack and a NAND transistor stack configured as feedback) and a slave latch with a second circuit configuration (e.g., a merged AND-NOR configured transistor stack and a NOR transistor stack configured as feedback), and a negative edge clocked storage element having a master latch with the second circuit configuration and a slave latch with the first circuit configuration.
- FIG. 1 illustrates a Master-Slave latch constructed by combining the two individual latches, with inverters added at the input and the output.
- FIG. 2 illustrates a “rising edge” Master-Slave latch constructed by swapping the “master” and “slave” latches of FIG. 1 .
- FIG. 3 illustrates a Master-Slave latch like that of FIG. 1 , where the input inverter is being replaced by an arbitrary function.
- FIG. 4 illustrates a transistor schematic diagram of the falling-edge M-S latch of FIG. 1 .
- FIG. 5 is a timing diagram for operation of the circuit of FIG. 4 .
- FIG. 6 is a timing diagram showing D to Q delay (insertion delay) of the circuit of FIG. 4 .
- FIG. 7 illustrates a transistor schematic diagram of the rising-edge M-S latch of FIG. 2 .
- FIG. 8 is a timing diagram for operation of the circuit of FIG. 7 .
- A detailed description of embodiments of the technology is provided with reference to FIGS. 1 to 8 .
- FIG. 1 is a logic diagram of a clocked storage element.
- the clocked storage element configured as a falling-edge triggered flip-flop, has a buffered input receiving a data signal D and a buffered output producing an output signal Q.
- the data signal D is applied to the input of an inverter 101 acting as a buffer.
- the output of the inverter 101 is a data signal D 0 which can be considered the input of a first latch in the clocked storage element.
- the first latch is implemented using a first circuit configuration, which includes a first transistor stack 110 A and a second transistor stack 110 B.
- the first transistor stack 110 A implements a merged AND-NOR gate 102 , 103 and generates a first latch output data signal D 1 at a first latch output data node.
- the second transistor stack 110 B implements a NOR gate 104 , which generates a first feedback signal FB 1 .
- the inputs to the merged AND-NOR gate 102 , 103 include the data signal D 0 and a clock signal CLK logically as inputs to the AND function.
- the output of the AND function is logically applied as input to the NOR function.
- the first feedback signal FB 1 is applied logically as input to the NOR function.
- the inputs to the NOR gate 104 in the second transistor stack include the first latch output data signal D 1 and the clock signal CLK.
- the second latch is implemented using a second circuit configuration, which includes a third transistor stack 111 A and a fourth transistor stack 111 B.
- the third transistor stack 111 A implements a merged OR-NAND gate 105 , 106 and generates a data output signal D 2 at a second latch output data node, which is applied as an input to inverter 108 .
- the output of the inverter 108 is the buffered output signal Q.
- the fourth transistor stack 111 B implements a NAND gate 107 which generates a second feedback signal FB 2 .
- the inputs to the merged OR-NAND gate 105 , 106 include the first latch output data signal D 1 and the clock signal CLK logically as inputs to the OR function, the output of which is logically applied as input to the NAND function.
- the second feedback signal FB 2 is also logically applied as an input to the NAND function.
- the inputs to the NAND gate 107 in the fourth transistor stack include the output data signal D 2 and the clock signal CLK.
- the critical timing path between the input signal D 0 and the output data signal D 2 traverses only two transistor stacks 110 A, 111 A.
- a critical timing path can be established using techniques described herein that has only four transistor delays from D data input to Q output, one in each stack, during some conditions.
- embodiments as described herein implement the transistor stack 110 A of the AND-NOR gate such that it includes a clocked pull-up current path consisting of two p-channel transistors between the first latch output data node (signal D 1 ) and a VDD supply line, and a pull-down current path consisting of two n-channel transistors between the first latch output data node and VSS supply line.
- embodiments described herein implement the transistor stack 111 A of the OR-NAND gate such that it includes a clocked pull-up current path consisting of two p-channel transistors between the second latch output data node (signal D 2 ) and a VDD supply line, and a pull-down current path consisting of two n-channel transistors between the second latch output data node (signal D 2 ) and VSS supply line.
- the two p-channel transistors in the clocked pull-up current path of the first latch and the two p-channel transistors in the clocked pull-up current path of the second latch have channel lengths of about 7 nm or less, manufacturable for example using so-called 7 nanometer or 5 nanometer nodes.
- FIG. 1 implements a clocked storage element which triggers the transition of the output data signal D 2 on the negative, or falling, edge of the clock signal CLK.
- an inverse, /CLK, of the clock signal CLK can be applied instead.
- the polarity of the CLK signal applied on the latch clock input nodes of the first and second latches is the same.
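The gate-level behavior described for FIG. 1 can be sketched in Python as a minimal zero-delay model. The function names, the starting node values, and the evaluation order are assumptions made for illustration and are not part of the disclosure; in particular, each feedback gate is evaluated before the merged gate it feeds, which assumes the feedback node settles first when a latch closes.

```python
def nor(a, b):
    return int(not (a or b))


def nand(a, b):
    return int(not (a and b))


def settle_falling(d0, clk, state, max_passes=8):
    """Re-evaluate the four gates of FIG. 1 until no node changes.

    state is (D1, FB1, D2, FB2). Evaluation order is a modeling
    assumption: each feedback gate is updated before its merged gate.
    """
    d1, fb1, d2, fb2 = state
    for _ in range(max_passes):
        prev = (d1, fb1, d2, fb2)
        fb1 = nor(d1, clk)          # NOR gate 104
        d1 = nor(d0 & clk, fb1)     # merged AND-NOR gate 102, 103
        fb2 = nand(d2, clk)         # NAND gate 107
        d2 = nand(d1 | clk, fb2)    # merged OR-NAND gate 105, 106
        if (d1, fb1, d2, fb2) == prev:
            return prev
    raise RuntimeError("nodes did not settle")


def run_falling(stimulus, state=(1, 0, 1, 0)):
    """Apply (D0, CLK) pairs in order and record D2 after each one."""
    trace = []
    for d0, clk in stimulus:
        state = settle_falling(d0, clk, state)
        trace.append(state[2])
    return trace


# Hypothetical stimulus: D0 changes during both clock phases.
STIMULUS = [(0, 1), (0, 0), (1, 0), (1, 1), (1, 0), (1, 1), (0, 1), (0, 0)]
D2_TRACE = run_falling(STIMULUS)
```

In this model D2 changes only at the samples where CLK goes from 1 to 0, taking the value D0 had at that moment, and both latches receive the same CLK value with no inversion.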
- FIG. 2 implements a clocked storage element, configured as a rising-edge triggered flip-flop, which triggers the transition of the output data signal D 2 on the rising edge of the clock signal CLK, without an added clock signal inverter.
- the clocked storage element has a buffered input receiving a data signal D and a buffered output producing an output signal Q.
- the data signal D is applied to the input of an inverter 201 acting as a buffer.
- the output of the inverter 201 is a data signal D 0 which can be considered the input of the first latch.
- the output Q is produced by an inverter 208 connected to the D 2 signal of the second latch.
- the D 2 signal can be considered the output of the second latch.
- the first latch has the second circuit configuration as described with reference to FIG. 1 , including the third transistor stack 111 A that implements a merged OR-NAND gate 202 , 203 and generates a first latch output data signal D 1 at a first latch output data node.
- the first latch in this embodiment includes the fourth transistor stack 111 B which implements a NAND gate 204 which generates a first feedback signal FB 1 .
- the inputs to the merged OR-NAND gate 202 , 203 include the data signal D 0 and a clock signal CLK applied logically as inputs to the OR function.
- the output of the OR function is applied logically as input to the NAND function.
- the first feedback signal FB 1 is applied logically as input to the NAND function.
- the inputs to the NAND gate 204 in the fourth transistor stack 111 B include the first latch output data signal D 1 and the clock signal CLK.
- the second latch is implemented using the first circuit configuration as described above, including the first transistor stack 110 A and the second transistor stack 110 B.
- the first transistor stack 110 A implements a merged AND-NOR gate 205 , 206 and generates a data output signal D 2 at a second latch output data node, which is applied as an input to inverter 208 .
- the output of the inverter 208 is the buffered output signal Q.
- the second transistor stack 110 B implements a NOR gate 207 which generates a second feedback signal FB 2 .
- the inputs to the merged AND-NOR gate 205 , 206 include the first latch data signal D 1 and the clock signal CLK applied logically as inputs to the AND function, the output of which is applied logically as input to the NOR function.
- the second feedback signal FB 2 is also applied logically as an input to the NOR function.
- the inputs to the NOR gate 207 in the second transistor stack 110 B include the output data signal D 2 and the clock signal CLK.
- a critical timing path between the input signal D 0 and the output data signal D 2 traverses only two transistor stacks.
- a critical timing path from data to output traverses only four transistor gate delays.
- embodiments as described herein implement the transistor stack forming the OR-NAND gate 202 , 203 such that it includes a clocked pull-up current path consisting of two p-channel transistors between the first latch output data node (signal D 1 ) and a VDD supply line, and a clocked pull-down current path consisting of two n-channel transistors between the first latch output data node (signal D 1 ) and VSS supply line.
- embodiments described herein implement the transistor stack forming the AND-NOR gate 205 , 206 such that it includes a clocked pull-up current path consisting of two p-channel transistors between the second latch output data node (signal D 2 ) and a VDD supply line, and a clocked pull-down current path consisting of two n-channel transistors between the second latch output data node (signal D 2 ) and VSS supply line.
- the two p-channel transistors in the clocked pull-up current path of the first latch and the two p-channel transistors in the clocked pull-up current path of the second latch have channel lengths of about 7 nm or less.
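The rising-edge arrangement of FIG. 2 can be sketched the same way, as a minimal zero-delay Python model. As before, the function names, starting node values, and evaluation order (feedback gate before its merged gate) are illustrative assumptions rather than anything specified in the disclosure.

```python
def nor(a, b):
    return int(not (a or b))


def nand(a, b):
    return int(not (a and b))


def settle_rising(d0, clk, state, max_passes=8):
    """Re-evaluate the four gates of FIG. 2 until no node changes.

    state is (D1, FB1, D2, FB2). Both latches see the same CLK value.
    """
    d1, fb1, d2, fb2 = state
    for _ in range(max_passes):
        prev = (d1, fb1, d2, fb2)
        fb1 = nand(d1, clk)         # NAND gate 204
        d1 = nand(d0 | clk, fb1)    # merged OR-NAND gate 202, 203
        fb2 = nor(d2, clk)          # NOR gate 207
        d2 = nor(d1 & clk, fb2)     # merged AND-NOR gate 205, 206
        if (d1, fb1, d2, fb2) == prev:
            return prev
    raise RuntimeError("nodes did not settle")


def run_rising(stimulus, state=(1, 1, 0, 1)):
    """Apply (D0, CLK) pairs in order and record D2 after each one."""
    trace = []
    for d0, clk in stimulus:
        state = settle_rising(d0, clk, state)
        trace.append(state[2])
    return trace


# Hypothetical stimulus: D0 changes during both clock phases.
STIMULUS = [(0, 0), (0, 1), (1, 1), (1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]
D2_TRACE = run_rising(STIMULUS)
```

Here D2 changes only at the samples where CLK goes from 0 to 1. Swapping which latch uses the OR-NAND configuration and which uses the AND-NOR configuration is what moves the active edge; the clock itself is not inverted in the model.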
- the input data signal D is applied through inverters 101 , 201 as input data signal D 0 .
- other functional circuits can be utilized instead of the inverters, as illustrated schematically in FIG. 3 .
- the circuit shown in FIG. 3 is the same as that of FIG. 1 , except that the inverter 101 is replaced with a functional block 310 .
- the same reference numerals are applied in FIG. 3 as in FIG. 1 for like elements.
- the functional element 310 shown in FIG. 3 is a combination of NAND and NOR gates.
- FIG. 4 is a transistor schematic diagram of a clocked storage element like that of FIG. 1 .
- the input D is applied through inverter 400 to the input data node for signal D 0 (first stack input data node).
- the output Q is driven by the output inverter 410 , which receives as input the second latch output data signal D 2 (second stack output data node).
- Other types of circuitry can be used to buffer the inputs and outputs of the clocked storage element of FIG. 4 .
- the embodiment shown in FIG. 4 includes a first transistor stack 401 (like 110 A), a second transistor stack 402 (like 110 B), a third transistor stack 403 (like 111 A) and a fourth transistor stack 404 (like 111 B).
- a transistor stack as the term is used herein includes a pull-up circuit path between a VDD supply line and an output data node, and a pull-down circuit path between the same output data node and a VSS supply line.
- the first transistor stack 401 includes a first p-channel transistor P 1 and a second p-channel transistor P 2 connected in series between a VDD supply line and a first latch output data node (signal D 1 ), a first n-channel transistor N 1 and a second n-channel transistor N 2 connected in series between the first latch output data node (signal D 1 ) and a VSS supply line, a third p-channel transistor P 3 connected in parallel with the first p-channel transistor P 1 , and a third n-channel transistor N 3 connected in parallel with the first and second n-channel transistors N 1 , N 2 .
- the first p-channel transistor P 1 and first n-channel transistor N 1 have gates connected to a data input node (signal D 0 ), and the third p-channel transistor P 3 and the second n-channel transistor N 2 have gates connected to a clock input node CLK.
- the pull-up circuit in the stack 401 includes two current paths, P 2 -P 3 and P 2 -P 1 . These current paths each consist of only two p-channel transistors.
- the pull-down circuit in the stack 401 includes two current paths, N 1 -N 2 and N 3 .
- the N 1 -N 2 current path is the longest current path and consists of only two n-channel transistors.
- the first transistor stack 401 implements a function (D 0 AND CLK) NOR FB 1 , as illustrated in FIG. 1 .
- the second transistor stack 402 includes a fourth p-channel transistor P 4 and a fifth p-channel transistor P 5 connected in series between the VDD supply line and a first stack feedback node (signal FB 1 ), and a fourth n-channel transistor N 4 and a fifth n-channel transistor N 5 connected in parallel between the first stack feedback node (signal FB 1 ) and the VSS supply line.
- the fourth p-channel transistor P 4 and the fourth n-channel transistor N 4 have gates connected to the clock input node CLK.
- the fifth p-channel transistor P 5 and the fifth n-channel transistor N 5 have gates connected to the first latch output data node (signal D 1 ).
- the second p-channel transistor P 2 and the third n-channel transistor N 3 in the first stack 401 have gates connected to the first stack feedback node FB 1 .
- the second transistor stack 402 implements a function (D 1 NOR CLK), as illustrated in FIG. 1 .
- the third transistor stack 403 includes a sixth p-channel transistor P 6 and a seventh p-channel transistor P 7 connected in series between a VDD power supply line and a data output node (signal D 2 ) (D 2 is also a third stack data output node), a sixth n-channel transistor N 6 and a seventh n-channel transistor N 7 connected between the data output node and a VSS supply line.
- An eighth p-channel transistor P 8 is connected in parallel with the sixth and seventh p-channel transistors P 6 , P 7 .
- An eighth n-channel transistor N 8 is connected in parallel with the seventh n-channel transistor N 7 .
- the seventh p-channel transistor P 7 and seventh n-channel transistor N 7 have gates connected to the first stack output data node (signal D 1 ).
- the sixth p-channel transistor P 6 and the eighth n-channel transistor N 8 have gates connected to the clock input node.
- the pull-up circuit in the stack 403 includes two current paths, P 7 -P 6 and P 8 .
- the P 7 -P 6 current path is the longest current path and consists of only two p-channel transistors.
- the pull-down circuit in the stack 403 includes two current paths, N 6 -N 7 and N 6 -N 8 . These current paths each consist of only two n-channel transistors.
- the third transistor stack 403 implements a function (D 1 OR CLK) NAND FB 2 , as illustrated in FIG. 1 .
- the fourth transistor stack 404 includes a ninth p-channel transistor P 9 and a tenth p-channel transistor P 10 connected in parallel between a VDD power supply line and a third stack feedback node (signal FB 2 ). Also, the fourth transistor stack 404 includes a ninth n-channel transistor N 9 and a tenth n-channel transistor N 10 connected in series between the third stack feedback node (signal FB 2 ) and the VSS supply line.
- the ninth p-channel transistor P 9 and the tenth n-channel transistor N 10 have gates connected to the clock input node CLK, and the tenth p-channel transistor P 10 and the ninth n-channel transistor N 9 have gates connected to the data output node (signal D 2 ).
- the fourth transistor stack 404 implements a function (D 2 NAND CLK), as illustrated in FIG. 1 .
- the circuit illustrated in FIG. 4 , excluding the input buffer 400 and the output buffer 410 , consists of 20 CMOS transistors.
- the third p-channel transistor P 3 in the first transistor stack 401 , and the sixth p-channel transistor P 6 in the third transistor stack 403 are combined and implemented as a single transistor (represented by box 410 ).
- the circuit shown in FIG. 4 can be implemented in an embodiment consisting of 19 CMOS transistors.
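The series and parallel connections described for stacks 401 and 403 can be checked against the stated logic functions with a small switch-level sketch in Python. It treats each transistor as an ideal switch (p-channel conducting when its gate is low, n-channel when its gate is high). The function names are hypothetical, and the gate connections of P 8 and N 6 to FB 2 are an assumption inferred from the stated function (D 1 OR CLK) NAND FB 2.

```python
from itertools import product


def stack_401(d0, clk, fb1):
    """Return (pull_up_on, pull_down_on) for the first transistor stack."""
    pull_up = (not fb1) and ((not d0) or (not clk))   # P2 in series with P1 || P3
    pull_down = (d0 and clk) or fb1                   # N1-N2 in series, in parallel with N3
    return bool(pull_up), bool(pull_down)


def stack_403(d1, clk, fb2):
    """Return (pull_up_on, pull_down_on) for the third transistor stack."""
    pull_up = (not fb2) or ((not d1) and (not clk))   # P8 in parallel with P7-P6 (P8 gate assumed FB2)
    pull_down = fb2 and (d1 or clk)                   # N6 in series with N7 || N8 (N6 gate assumed FB2)
    return bool(pull_up), bool(pull_down)


def and_nor(d0, clk, fb1):
    return not ((d0 and clk) or fb1)


def or_nand(d1, clk, fb2):
    return not ((d1 or clk) and fb2)


def drives_function(stack, function):
    """True if exactly one network conducts for every input and the
    driven level equals the logic function."""
    for a, clk, fb in product((0, 1), repeat=3):
        up, down = stack(a, clk, fb)
        if up == down or up != bool(function(a, clk, fb)):
            return False
    return True


# Transistor counts per stack as enumerated in the description of FIG. 4.
TRANSISTORS = {"401": 6, "402": 4, "403": 6, "404": 4}
TOTAL = sum(TRANSISTORS.values())
TOTAL_WITH_SHARED_CLOCK_PMOS = TOTAL - 1   # P3 and P6 merged into one device
```

Under these assumptions both stacks are fully complementary (never floating, never shorting the supplies) and the counts come to 20 transistors, or 19 when P 3 and P 6 are merged.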
- When the clock transitions from 1-to-0, the circuit 403 will pass the change on the D 1 line to D 2 .
- the time for this change to propagate to D 2 will be the time from the clock transition 1-to-0 to the time D 2 changes its value. This is designated as the CLK-to-Q delay, tCQ (as D 2 represents the Q signal when the input and output inverters are removed).
- the delay a signal incurs in traveling through the latch (designated as “insertion delay”) is the sum of the setup time U and the CLK-to-Q delay; that is, it is the time from the latest allowed change on the input data D to the change of the output Q, and is designated as the D-Q delay (tDQ), or insertion delay.
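The relationship between setup time, CLK-to-Q delay and insertion delay can be illustrated with a short Python sketch. All numeric values below are hypothetical and chosen only to show the arithmetic; they are not measurements from the disclosure.

```python
def insertion_delay_ps(setup_ps, clk_to_q_ps):
    """D-Q (insertion) delay as the sum of setup time and CLK-to-Q delay."""
    return setup_ps + clk_to_q_ps


def cycle_fraction(d_to_q_ps, clock_period_ps):
    """Fraction of the clock period consumed by the storage element."""
    return d_to_q_ps / clock_period_ps


# Hypothetical numbers: 10 ps setup, 25 ps CLK-to-Q, 250 ps clock period.
T_DQ = insertion_delay_ps(10, 25)
OVERHEAD = cycle_fraction(T_DQ, 250)
```

With these assumed values the insertion delay is 35 ps, which is 14% of the assumed 250 ps clock period; the remainder of the cycle is available for computation.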
- FIG. 5 is a timing diagram based on simulation illustrating operation of the circuit of FIG. 4 for a condition in which the input data signal D 0 transitions from high to low while the clock signal CLK is high. It is noted that the first latch in the circuit of FIG. 4 is transparent while the clock is high but generates an inverted output D 1 . Also, the second latch in the circuit of FIG. 4 is transparent while the clock is low, generating the clocked output D 2 on the falling edge of the clock signal CLK.
- In FIG. 5 , the signal names are shown on the left, and match the corresponding signal names shown in FIG. 4 .
- the internal data signal D 1 falls or is set low on the first rising edge of the clock signal CLK because transistors N 1 and N 2 turn on while transistors P 1 and P 3 turn off. While D 1 remains low, the first feedback signal FB 1 is an inverse of the clock signal CLK, controlled by the clock signal CLK on the gates of transistors P 4 and N 4 .
- D 1 is held low while D 0 is high by the feedback signal FB 1 on the gate of transistor N 3 , because the feedback signal FB 1 is held high by the low clock CLK on the gate of transistor P 4 and low D 1 on the gate of transistor P 5 .
- the second feedback signal FB 2 follows the inverse of the clock signal CLK while D 2 is high turning on transistor N 9 , as a result of transistors P 9 and N 10 .
- the first latch output data signal D 1 transitions high on the next rising edge of the clock signal CLK. This causes the first feedback signal FB 1 to go low and remain low as long as D 1 is high, as a result of transistor N 5 .
- the output data signal D 2 remains high until the next falling edge of the clock signal CLK, because the second feedback signal FB 2 is low.
- the second feedback signal FB 2 transitions high, turning on transistors N 6 and N 7 , and the data signal D 2 transitions low, capturing the input data signal D 0 .
- the second feedback signal FB 2 is held high.
- FIG. 6 illustrates a simulation result for the circuit of FIG. 4 .
- the functionality of the circuit of FIG. 4 is demonstrated by running it on the HSPICE circuit simulator utilizing 5 nm technology node transistor parameters under worst-case environmental conditions, with parasitic parameters extracted from the technology.
- the insertion delay of the clocked storage element, D-to-Q, is determined by moving the change in the data signal D closer to the falling edge of the clock until the output Q fails.
- the last stable D-Q transition simulated shows a D-to-Q delay tDQ equal to about 39 ps.
- embodiments of the present technology achieve insertion delays of less than 50 ps, or less than 40 ps, for accessible technology nodes, which is substantially faster than comparable clocked storage elements implemented in the same 5 nm technology.
- FIG. 7 is a transistor schematic diagram of a clocked storage element like that of FIG. 2 .
- the input D is applied through inverter 700 to the input data node for signal D 0 .
- the output Q is driven by the output inverter 710 , which receives as input the second latch output data signal D 2 .
- Other types of circuitry can be used to buffer the inputs and outputs of the clocked storage element of FIG. 7 .
- the embodiment shown in FIG. 7 includes a first transistor stack 701 (like 111 A), a second transistor stack 702 (like 111 B), a third transistor stack 703 (like 110 A) and a fourth transistor stack 704 (like 110 B).
- a transistor stack as the term is used herein includes a pull-up circuit path between a VDD supply line and an output data node, and a pull-down circuit path between the same output data node and a VSS supply line.
- the first transistor stack 701 is like the third transistor stack 403 of FIG. 4 , and the transistors have the same labels.
- the second transistor stack 702 is like the fourth transistor stack 404 of FIG. 4 , and the transistors have the same labels.
- the third transistor stack 703 is like the first transistor stack 401 of FIG. 4 , and the transistors have the same labels.
- the fourth transistor stack 704 is like the second transistor stack 402 of FIG. 4 , and the transistors have the same labels.
- the first transistor stack 701 implements a function (D 0 OR CLK) NAND FB 1 , as illustrated in FIG. 2 .
- the second transistor stack 702 implements a function (D 1 NAND CLK), as illustrated in FIG. 2 .
- the third transistor stack 703 implements a function (D 1 AND CLK) NOR FB 2 , as illustrated in FIG. 2 .
- the fourth transistor stack 704 implements a function (D 2 NOR CLK), as illustrated in FIG. 2 .
- FIG. 8 is a timing diagram showing operation of the clocked storage element of FIG. 7 , based on simulations assuming a 5 nm manufacturing node, like the simulation used to produce FIG. 5 .
- This disclosure describes various embodiments of a clocked storage element where the signal from the input D to the output Q traverses two logic blocks, each of which is implemented using a single transistor stack. Further, two possible configurations are selected in such a way that the complementary clock signals are selected. This allows for achieving a Master-Slave function without the need to invert the clock signal, as commonly implemented.
- the data insertion point, and the feedback logic are selected in a way which is implementable as a single logic block. This process is applied in both latch structures: OR-NAND and AND-NOR.
- the selection of the logic blocks is made so that they do not contain more than two PMOS or two NMOS transistors in the path to the supply voltage VDD or VSS (ground). This is the minimal transistor stack necessary to implement the given function.
- the resistance of the PMOS transistor is roughly equivalent to the resistance of an NMOS transistor of the same size, when in saturation. This fact is used to advantage in generating the logic structure employed in both latch structures, as the new technology no longer favors the NMOS transistor path over the PMOS path.
- the PMOS transistors connected to the clock signal can be combined to form a single transistor and shared between the two latches (the third p-channel transistor P 3 in the first transistor stack 401 , and the sixth p-channel transistor P 6 in the third transistor stack 403 ).
- This combined PMOS transistor (P 3 /P 6 ) is made larger, and both effectively shortens the path to power supply and reduces the number of transistors.
- the size of the clocked storage element is roughly proportional to the number of transistors used to build the clocked storage element. Therefore, minimizing the number of transistors does impact the area in a beneficial way.
- the objective in designing the clocked storage element is to make the Data-to-Output (D-Q) delay as small as possible.
- This objective will be achieved if, among other criteria, there is the most direct path from the input D to the output Q.
- by “most direct path” we mean that the smallest number of transistor stacks implementing the logic, or complex logic gates, is traversed, and that those transistor stacks are of the least complexity possible.
- the third objective of the lowest power consumption is usually achieved if the number of active components is minimized. There are also other factors, such as switching activity of the nodes, charging, and discharging of the nodes etc., that do affect power consumption.
- the cell library can be applied by electronic design automation tools in the implementation of an integrated circuit.
- An integrated circuit on a single chip can include both a falling edge clocked storage element ( FIGS. 1 and 4 ) and a rising edge clocked storage element ( FIGS. 2 and 7 ) implemented as described herein, which have clock signals with the same polarity applied to the corresponding clock input nodes.
- the circuit of FIGS. 4 and 7 , and other embodiments of the present technology can be embodied in a computer readable form using a hardware description language such as Verilog or VHDL, and stored in a non-transitory data storage medium or media, and used for example as an entry in a cell library.
- Embodiments can include a latch with the first circuit configuration and a latch with the second circuit configuration, as independently placeable circuit elements in the cell library.
- a design tool can be configured to place and route the first and second circuit configurations as master or as slave as desired for a particular use of the clocked storage element.
- this independent placement ability enables placement of the first and second latches of a clocked storage element according to placements of the source or producer of the input data (D) and the destination or consumer of the output data (Q), which placements may not be adjacent in some situations.
- the first latch comprises a circuit cell in a cell library and the second latch comprises a second circuit cell in a cell library, and the first latch and second latch are placed as separate cells by a place and route tool.
- the use of logic synthesis allows for automatic optimal sizing of the transistors used in the standard cell libraries to achieve the fastest D-Q path of the described clocked storage element, or the lowest power consumption, or both, depending on the design point.
- the use of logic synthesis allows for separating the first and second latches (i.e., Master and Slave logic blocks) and placing them in the most appropriate places on the chip, as determined by the Place and Route (PnR) Computer Aided Design (CAD) tools. This ability to separately place the first and second latches helps achieve the optimal PnR solution.
- the Data input inverter can be replaced with another functional block, combining the latching function with the logic function, thus enhancing the utilization of the clocked storage element.
- block 310 represents a multiplexer function which is commonly used in conjunction with the latch.
- VDD and VSS are voltages on upper and lower supply voltage lines in the circuit, referred to herein as a VDD supply line and VSS supply line, respectively.
- VDD is a positive voltage and VSS is ground.
- VSS is any voltage less than VDD. In some cases, VSS may be a negative voltage.
- the letters DD and SS are used for historical reasons and do not imply that the supply lines are connected to the drain or source. For example, in the circuit of FIG. 4 , VDD is the voltage on the VDD supply line connected to the sources of p-channel transistors.
Abstract
Description
- This application claims the benefit of U.S. Provisional Patent Application No. 63/230,782 filed 8 Aug. 2021; which application is incorporated herein by reference.
- This disclosure describes a clocked storage element, sometimes referred to as a “flip-flop”, used for temporarily storing information in digital systems. Clocked storage elements are essential in constructing the Finite-State Machine (FSM), which is the core of every digital system. A few important characteristics of the clocked storage element include low “insertion delay” (Data-to-Q delay), low power consumption and small footprint (area).
- Clocked storage elements are very important elements in a digital system. They may take up to 20% of the clock cycle from the useful time allotted for computation. In addition, they may account for a quarter of the dynamic power consumed in the digital system, and a larger share of the static power. The area taken by clocked storage elements similarly contributes to the total chip area, where chip area directly affects the cost, performance, power, and the total amount of functionality that the chip can provide. Thus, there has been a continuous effort to design clocked storage elements which are smaller, faster and less power consuming.
- A technology is described for implementation of clocked storage elements that, according to various aspects, are compact and fast, and allow for flexible layouts and configurations. Embodiments are described having an insertion delay less than 50 picoseconds, and less than 40 picoseconds.
- According to one aspect of the technology, a clocked storage element comprises a first latch having an input data node, a clock input node and a first latch output data node, the first latch having a current path consisting of two p-channel transistors between the first latch output data node and a VDD supply line, and two n-channel transistors between the first latch output data node and a VSS supply line; and a second latch having an input connected to the first latch output data node, a clock input node and a second latch output data node, the second latch having a current path consisting of two n-channel transistors between the second latch output data node and a VSS supply line, and two p-channel transistors between the second latch output data node and the VDD supply line.
- According to another aspect of the technology, a clocked storage element comprises a first latch having an input data node, a clock input node and a first latch output data node; and a second latch having an input connected to the first latch output data node, a clock input node and a second latch output data node, wherein a critical timing path from the input data node of the first latch to the second latch output data node has only two transistor path delays, and two transistors in the path from the first latch output to the second latch output data node. The total delay between the input data node and the second latch output is no greater than four signal passes, including a signal pass through two p-channel transistors to pull up the latch output data node of one of the first and second latches, and a signal pass through two n-channel transistors to pull down the latch output data node of the other of the first and second latches.
- According to another aspect of the technology, a clocked storage element comprising a first latch and a second latch does not require the clock input to be inverted. That is, the first latch and second latch have respective clock input nodes which receive the clock signal with the same polarity. One advantage of this feature arises in connection with insertion delay, because no margin is needed to account for the settling of signals on the output of a clock inverter that would otherwise be required to drive one of the latches.
- Also described is an integrated circuit having a rising edge clocked storage element having a master latch with a first circuit configuration (e.g., a merged OR-NAND configured transistor stack and a NAND transistor stack configured as feedback) and a slave latch with a second circuit configuration (e.g., a merged AND-NOR configured transistor stack and a NOR transistor stack configured as feedback), and a negative edge clocked storage element having a master latch with the second circuit configuration and a slave latch with the first circuit configuration.
- Other aspects and advantages of the present technology can be seen on review of the drawings, the detailed description and the claims, which follow.
- FIG. 1 illustrates a Master-Slave latch constructed by combining the two individual latches, with inverters added at the input and the output.
- FIG. 2 illustrates a “rising edge” Master-Slave latch constructed by swapping the “master” and “slave” latches of FIG. 1.
- FIG. 3 illustrates a Master-Slave latch like that of FIG. 1, where the input inverter is replaced by an arbitrary function.
- FIG. 4 illustrates a transistor schematic diagram of the falling-edge M-S latch of FIG. 1.
- FIG. 5 is a timing diagram for operation of the circuit of FIG. 4.
- FIG. 6 is a timing diagram showing D to Q delay (insertion delay) of the circuit of FIG. 4.
- FIG. 7 illustrates a transistor schematic diagram of the rising-edge M-S latch of FIG. 2.
- FIG. 8 is a timing diagram for operation of the circuit of FIG. 7.
- A detailed description of embodiments of the technology is provided with reference to FIGS. 1 to 8.
- FIG. 1 is a logic diagram of a clocked storage element. In the illustrated example, the clocked storage element, configured as a falling-edge triggered flip-flop, has a buffered input receiving a data signal D and a buffered output producing an output signal Q. In this example, the data signal D is applied to the input of an inverter 101 acting as a buffer. The output of the inverter 101 is a data signal D0 which can be considered the input of a first latch in the clocked storage element.
- The first latch is implemented using a first circuit configuration, which includes a first transistor stack 110A and a second transistor stack 110B. The first transistor stack 110A implements a merged AND-NOR gate, which generates the first latch output data signal D1. The second transistor stack 110B implements a NOR gate 104, which generates a first feedback signal FB1.
- The inputs to the merged AND-NOR gate include the data signal D0, the clock signal CLK and the first feedback signal FB1. The inputs to the NOR gate 104 in the second transistor stack include the first latch output data signal D1 and the clock signal CLK.
- The second latch is implemented using a second circuit configuration, which includes a third transistor stack 111A and a fourth transistor stack 111B. The third transistor stack 111A implements a merged OR-NAND gate, which generates the second latch output data signal D2. The signal D2 is applied to an inverter 108, and the output of the inverter 108 is the buffered output signal Q. The fourth transistor stack 111B implements a NAND gate 107 which generates a second feedback signal FB2.
- The inputs to the merged OR-NAND gate include the first latch output data signal D1, the clock signal CLK and the second feedback signal FB2. The inputs to the NAND gate 107 in the fourth transistor stack include the output data signal D2 and the clock signal CLK.
- As seen, the critical timing path between the input signal D0 and the output data signal D2 traverses only two transistor stacks.
- Also, embodiments as described herein implement the transistor stack 110A of the AND-NOR gate such that it includes a clocked pull-up current path consisting of two p-channel transistors between the first latch output data node (signal D1) and a VDD supply line, and a pull-down current path consisting of two n-channel transistors between the first latch output data node and a VSS supply line. Also, embodiments described herein implement the transistor stack 111A of the OR-NAND gate such that it includes a clocked pull-up current path consisting of two p-channel transistors between the second latch output data node (signal D2) and a VDD supply line, and a pull-down current path consisting of two n-channel transistors between the second latch output data node (signal D2) and a VSS supply line.
- Also, embodiments are described in which the two p-channel transistors in the clocked pull-up current path of the first latch and the two p-channel transistors in the clocked pull-up current path of the second latch have channel lengths of about 7 nm or less, manufacturable for example using so-called 7 nanometer or 5 nanometer nodes.
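- The logic of the FIG. 1 element can be sketched behaviorally using the stack functions given for the falling-edge embodiment: D1 = (D0 AND CLK) NOR FB1, FB1 = D1 NOR CLK, D2 = (D1 OR CLK) NAND FB2 and FB2 = D2 NAND CLK. The following is a minimal zero-delay sketch, not the simulation used for the figures. The function names, the initial node values and the rule that each feedback node is evaluated before the data node it guards are assumptions made for illustration; in silicon that ordering depends on transistor sizing and loading.

```python
def settle(d0, clk, state):
    """Return settled (D1, FB1, D2, FB2) for the given D0 and CLK levels."""
    d1, fb1, d2, fb2 = state
    for _ in range(8):
        # Modeling assumption: each feedback node is evaluated before the
        # data node it guards, so a latch holds its value when it closes.
        fb1_n = int(not (d1 or clk))               # NOR feedback: D1 NOR CLK
        d1_n = int(not ((d0 and clk) or fb1_n))    # (D0 AND CLK) NOR FB1
        fb2_n = int(not (d2 and clk))              # NAND feedback: D2 NAND CLK
        d2_n = int(not ((d1_n or clk) and fb2_n))  # (D1 OR CLK) NAND FB2
        nxt = (d1_n, fb1_n, d2_n, fb2_n)
        if nxt == (d1, fb1, d2, fb2):
            return nxt
        d1, fb1, d2, fb2 = nxt
    raise RuntimeError("nodes did not settle")


def d2_trace(stimulus, state=(1, 0, 0, 1)):
    """Apply (D0, CLK) pairs in order and record D2 after each one."""
    out = []
    for d0, clk in stimulus:
        state = settle(d0, clk, state)
        out.append(state[2])
    return out


# D0 is high across the first falling edge and low across the second.
stimulus = [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]
print(d2_trace(stimulus))  # [0, 1, 1, 1, 0]
```

In this sketch D2 changes only at the two steps where CLK goes from 1 to 0, and each time it takes the value D0 had just before the edge, which is the falling-edge behavior described for FIG. 1.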
- The embodiment of FIG. 1 implements a clocked storage element which triggers the transition of the output data signal D2 on the negative, or falling, edge of the clock signal CLK.
- To implement a clocked storage element configured as a rising-edge triggered flip-flop from the embodiment of FIG. 1, an inverse /CLK of the clock signal CLK can be applied instead. In either case, the polarity of the CLK signal applied on the latch clock input nodes of the first and second latches is the same.
- The embodiment of FIG. 2 implements a clocked storage element, configured as a rising-edge triggered flip-flop, which triggers the transition of the output data signal D2 on the rising edge of the clock signal CLK, without an added clock signal inverter.
- As in the illustrated example shown in FIG. 1, the clocked storage element has a buffered input receiving a data signal D and a buffered output producing an output signal Q. In the example shown in FIG. 2, the data signal D is applied to the input of an inverter 201 acting as a buffer. The output of the inverter 201 is a data signal D0 which can be considered the input of the first latch. The output Q is produced by an inverter 208 connected to the D2 signal of the second latch. The D2 signal can be considered the output of the second latch.
- The first latch has the second circuit configuration as described with reference to FIG. 1, including the third transistor stack 111A that implements a merged OR-NAND gate, which generates the first latch output data signal D1, and the fourth transistor stack 111B which implements a NAND gate 204 which generates a first feedback signal FB1.
- The inputs to the merged OR-NAND gate include the data signal D0, the clock signal CLK and the first feedback signal FB1. The inputs to the NAND gate 204 in the second transistor stack include the first latch output data signal D1 and the clock signal CLK.
- The second latch is implemented using the first circuit configuration as described above, including the first transistor stack 110A and the second transistor stack 110B. The first transistor stack 110A implements a merged AND-NOR gate, which generates the second latch output data signal D2 applied to the inverter 208. The output of the inverter 208 is the buffered output signal Q. The second transistor stack 110B implements a NOR gate 207 which generates a second feedback signal FB2.
- The inputs to the merged AND-NOR gate include the first latch output data signal D1, the clock signal CLK and the second feedback signal FB2. The inputs to the NOR gate 207 in the fourth transistor stack include the output data signal D2 and the clock signal CLK.
- As seen in this example as well, a critical timing path between the input signal D0 and the output data signal D2 traverses only two transistor stacks. As a result, a critical timing path from data to output traverses only four transistor gate delays.
- Also, embodiments as described herein implement the transistor stack forming the OR-NAND gate and the transistor stack forming the AND-NOR gate with the clocked two-transistor pull-up current paths and two-transistor pull-down current paths described with reference to FIG. 1.
- As with the embodiment of FIG. 1, embodiments are described in which the two p-channel transistors in the clocked pull-up current path of the first latch and the two p-channel transistors in the clocked pull-up current path of the second latch have channel lengths of about 7 nm or less.
- In the embodiments described with respect to FIG. 1 and FIG. 2, the input data signal D is applied through inverters 101, 201. Other circuitry can drive the signal D0, as illustrated in FIG. 3. The circuit shown in FIG. 3 is the same as that of FIG. 1, except that the inverter 101 is replaced with a functional block 310. The same reference numerals are applied in FIG. 3 as in FIG. 1 for like elements. The functional block 310 shown in FIG. 3 is a combination of NAND and NOR gates. This is a schematic representation of any variety of combinational logic or other kind of electronic circuit that can be used to drive signal D0 to be captured by the clocked storage element. Also, in other embodiments, the buffered output signal Q can be driven by circuitry other than the inverter 108 illustrated.
- FIG. 4 is a transistor schematic diagram of a clocked storage element like that of FIG. 1. In this example, the input D is applied through inverter 400 to the input data node for signal D0 (first stack input data node). Also, the output Q is driven by the output inverter 410, which receives as input the second latch output data signal D2 (second stack output data node). Other types of circuitry can be used to buffer the inputs and outputs of the clocked storage element of FIG. 4.
- The embodiment shown in FIG. 4 includes a first transistor stack 401 (like 110A), a second transistor stack 402 (like 110B), a third transistor stack 403 (like 111A) and a fourth transistor stack 404 (like 111B). A transistor stack as the term is used herein includes a pull-up circuit path between a VDD supply line and an output data node, and a pull-down circuit path between the same output data node and a VSS supply line.
- The first transistor stack 401 includes a first p-channel transistor P1 and a second p-channel transistor P2 connected in series between a VDD supply line and a first latch output data node (signal D1), a first n-channel transistor N1 and a second n-channel transistor N2 connected in series between the first latch output data node (signal D1) and a VSS supply line, a third p-channel transistor P3 connected in parallel with the first p-channel transistor P1, and a third n-channel transistor N3 connected in parallel with the first and second n-channel transistors N1, N2. The first p-channel transistor P1 and first n-channel transistor N1 have gates connected to a data input node (signal D0), and the third p-channel transistor P3 and the second n-channel transistor N2 have gates connected to a clock input node CLK.
- The pull-up circuit in the stack 401 includes two current paths, P2-P3 and P2-P1. These current paths each consist of only two p-channel transistors. The pull-down circuit in the stack 401 includes two current paths, N1-N2 and N3. The N1-N2 current path is the longest current path and consists of only two n-channel transistors.
- In the illustrated embodiment, the first transistor stack 401 implements a function (D0 AND CLK) NOR FB1, as illustrated in FIG. 1.
- The second transistor stack 402 includes a fourth p-channel transistor P4 and a fifth p-channel transistor P5 connected in series between the VDD supply line and a first stack feedback node (signal FB1), and a fourth n-channel transistor N4 and a fifth n-channel transistor N5 connected in parallel between the first stack feedback node (signal FB1) and the VSS supply line. The fourth p-channel transistor P4 and the fourth n-channel transistor N4 have gates connected to the clock input node CLK, and the fifth p-channel transistor P5 and the fifth n-channel transistor N5 have gates connected to the first latch output data node (signal D1). The second p-channel transistor P2 and the third n-channel transistor N3 in the first stack 401 have gates connected to the first stack feedback node FB1.
- In the illustrated embodiment, the second transistor stack 402 implements a function (D1 NOR CLK), as illustrated in FIG. 1.
- The third transistor stack 403 includes a sixth p-channel transistor P6 and a seventh p-channel transistor P7 connected in series between a VDD power supply line and a data output node (signal D2) (D2 is also a third stack data output node), and a sixth n-channel transistor N6 and a seventh n-channel transistor N7 connected in series between the data output node and a VSS supply line. An eighth p-channel transistor P8 is connected in parallel with the sixth and seventh p-channel transistors P6, P7. An eighth n-channel transistor N8 is connected in parallel with the seventh n-channel transistor N7. The seventh p-channel transistor P7 and seventh n-channel transistor N7 have gates connected to the first stack output data node (signal D1). The sixth p-channel transistor P6 and the eighth n-channel transistor N8 have gates connected to the clock input node.
- The pull-up circuit in the stack 403 includes two current paths, P7-P6 and P8. The P7-P6 current path is the longest current path and consists of only two p-channel transistors. The pull-down circuit in the stack 403 includes two current paths, N6-N7 and N6-N8. These current paths each consist of only two n-channel transistors.
- In the illustrated embodiment, the third transistor stack 403 implements a function (D1 OR CLK) NAND FB2, as illustrated in FIG. 1.
- The fourth transistor stack 404 includes a ninth p-channel transistor P9 and a tenth p-channel transistor P10 connected in parallel between a VDD power supply line and a third stack feedback node (signal FB2). Also, the fourth transistor stack 404 includes a ninth n-channel transistor N9 and a tenth n-channel transistor N10 connected in series between the third stack feedback node (signal FB2) and the VSS supply line. The ninth p-channel transistor P9 and the tenth n-channel transistor N10 have gates connected to the clock input node CLK, and the tenth p-channel transistor P10 and the ninth n-channel transistor N9 have gates connected to the data output node (signal D2).
- In the illustrated embodiment, the fourth transistor stack 404 implements a function (D2 NAND CLK), as illustrated in FIG. 1.
- The circuit illustrated in FIG. 4, excluding the input buffer 400 and the output buffer 410, consists of 20 CMOS transistors. In one embodiment, the third p-channel transistor P3 in the first transistor stack 401 and the sixth p-channel transistor P6 in the third transistor stack 403 are combined and implemented as a single transistor (represented by box 410). As a result, the circuit shown in FIG. 4 can be implemented in an embodiment consisting of 19 CMOS transistors.
- In order for the data D0 to be captured in the first (Master) latch, the clock signal CLK has to be CLK=1. That means that the Master latch will be “transparent”, i.e. any change of D0 will be reflected on the node D1 (D1 will take the opposite value of D0). When the clock signal turns to CLK=0, data on the line D0 will be “captured” in the Master latch, as the circuits 401 and 402 each turn into an inverter, keeping the value on the D1 line in the loop. However, for the “capture” to be reliable, data on D0 cannot change at the same time the clock transitions from 1-to-0, and should be held stable (“frozen”) for at least some time (“setup time” U) before the clock signal changes. This time U is designated as the “setup time”, marking the last moment data D0 can change before the clock transition from 1-to-0 (“falling edge” of the clock).
- When the clock transitions from 1-to-0, the circuit 403 will pass the change on the D1 line to D2. The time for this change to propagate to D2 will be the time from the clock transition 1-to-0 to the time D2 changes its value. This is designated as the CLK-to-Q delay, tCQ (as D2 represents the Q signal when the input and output inverters are removed).
- The portion of the delay a signal spends traveling through the latch (designated as “insertion delay”) is the sum of the setup time U and the CLK-to-Q delay, i.e. this is the time from the latest allowed change on the input data D to the change of the output Q and is designated as DQ delay (tDQ), or insertion delay.
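- Written as a relation, with U the setup time and the CLK-to-Q delay denoted t_CQ (a notation assumed here for readability), the insertion delay defined above is:

```latex
t_{DQ} = U + t_{CQ}
```

Reducing either term shortens the portion of the clock cycle that the storage element takes from computation.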
- To properly measure D-to-Q delay tDQ, we must bring the change on the data line D closer and closer to the “falling edge” of the clock CLK until the output Q fails to capture the proper value of D. This “signal sweep” is shown in FIG. 6, and the value of tDQ was determined to be about 31 pS for the particular simulation using 5 nm technology.
- FIG. 5 is a timing diagram based on simulation illustrating operation of the circuit of FIG. 4 for a condition in which the input data signal D0 transitions from high to low while the clock signal CLK is high. It is noted that the first latch in the circuit of FIG. 4 is transparent while the clock is high but generates an inverted output D1. Also, the second latch in the circuit of FIG. 4 is transparent while the clock is low, generating the clocked output D2 on the falling edge of the clock signal CLK.
- In FIG. 5, the signal names are shown on the left, and match the corresponding signal names shown in FIG. 4.
- Referring to FIG. 5, at initialization, when the clock signal CLK starts cycling, assuming D0 is high, the internal data signal D1 falls or is set low on the first rising edge of the clock signal CLK because transistors N1 and N2 turn on while transistors P1 and P3 turn off. While D1 remains low, the first feedback signal FB1 is an inverse of the clock signal CLK, controlled by the clock signal CLK on the gates of transistors P4 and N4. So, after the next falling edge of the clock signal CLK, D1 is held low while D0 is high by the feedback signal FB1 on the gate of transistor N3, because the feedback signal FB1 is held high by the low clock CLK on the gate of transistor P4 and low D1 on the gate of transistor P5.
- While D1 is low, the signal D2 transitions high on the falling edge of the clock signal CLK via transistors P6 and P7, capturing the data signal D0. The second feedback signal FB2 follows the inverse of the clock signal CLK while D2 is high, turning on transistor N9, as a result of transistors P9 and N10.
- As illustrated, if the signal D0 transitions from high to low while the clock signal CLK is low, the first latch output data signal D1 transitions high on the next rising edge of the clock signal CLK. This causes the first feedback signal FB1 to go low and remain low as long as D1 is high, as a result of transistor N5.
- The output data signal D2 remains high until the next falling edge of the clock signal CLK, because the second feedback signal FB2 is low. When the second feedback signal FB2 transitions high, turning on transistors N6 and N7, the data signal D2 transitions low, capturing the input data signal D0. When the data signal D2 is low, the second feedback signal FB2 is held high.
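- The transistor connections described for FIG. 4 can be cross-checked against the stated stack functions with a small switch-level sketch. It treats each p-channel transistor as conducting when its gate is low and each n-channel transistor as conducting when its gate is high, and confirms that exactly one of the pull-up and pull-down networks conducts for every input combination. The table layout and function names are hypothetical, and the connection of the gates of P8 and N6 to FB2 is an assumption inferred from the function (D1 OR CLK) NAND FB2.

```python
from itertools import product

# Each entry: (pull-up conducts?, pull-down conducts?, stated function, transistor count).
# Arguments are (data, clk, fb); the feedback stacks 402 and 404 ignore fb.
STACKS = {
    "401": (  # P2 in series with (P1 || P3); (N1 - N2) || N3
        lambda d, clk, fb: (not fb) and ((not d) or (not clk)),
        lambda d, clk, fb: (d and clk) or fb,
        lambda d, clk, fb: not ((d and clk) or fb),   # (D0 AND CLK) NOR FB1
        6,
    ),
    "402": (  # P4 - P5 in series; N4 || N5
        lambda d, clk, fb: (not clk) and (not d),
        lambda d, clk, fb: clk or d,
        lambda d, clk, fb: not (d or clk),            # D1 NOR CLK
        4,
    ),
    "403": (  # (P6 - P7) || P8; N6 in series with (N7 || N8)
        lambda d, clk, fb: ((not d) and (not clk)) or (not fb),
        lambda d, clk, fb: fb and (d or clk),
        lambda d, clk, fb: not ((d or clk) and fb),   # (D1 OR CLK) NAND FB2
        6,
    ),
    "404": (  # P9 || P10; N9 - N10 in series
        lambda d, clk, fb: (not clk) or (not d),
        lambda d, clk, fb: clk and d,
        lambda d, clk, fb: not (d and clk),           # D2 NAND CLK
        4,
    ),
}


def check_stack(name):
    """True if the networks are complementary and match the stated function."""
    pull_up, pull_down, expected, _ = STACKS[name]
    for d, clk, fb in product((False, True), repeat=3):
        up, down = bool(pull_up(d, clk, fb)), bool(pull_down(d, clk, fb))
        if up == down:                      # floating node or VDD-to-VSS short
            return False
        if up != bool(expected(d, clk, fb)):
            return False
    return True


def transistor_count(share_clocked_pmos=False):
    """Total transistors in the four stacks; sharing P3/P6 removes one."""
    total = sum(entry[3] for entry in STACKS.values())
    return total - 1 if share_clocked_pmos else total


print(all(check_stack(name) for name in STACKS))
print(transistor_count(), transistor_count(share_clocked_pmos=True))
```

Under these assumptions the four stacks account for the 20 transistors mentioned for FIG. 4, or 19 when the clocked p-channel transistors P3 and P6 are merged.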
- FIG. 6 illustrates a simulation result for the circuit of FIG. 4. The functionality of the circuit of FIG. 4 is demonstrated by running it on the HSPICE circuit simulator utilizing 5 nm technology node transistor parameters under worst-case environmental conditions and with parasitic parameters extracted from the technology. The insertion delay of the clocked storage element, D-to-Q, is determined by moving the change in the data signal D closer to the falling edge of the clock until the output Q fails. The last stable D-Q transition simulated shows a D-to-Q delay tDQ equal to about 39 pS. Thus embodiments of the present technology achieve insertion delays less than 50 pS, or less than 40 pS, for accessible technology nodes, which is substantially faster than comparable clocked storage elements implemented in the same 5 nm technology.
- FIG. 7 is a transistor schematic diagram of a clocked storage element like that of FIG. 2. In this example, the input D is applied through inverter 700 to the input data node for signal D0. Also, the output Q is driven by the output inverter 710, which receives as input the second latch output data signal D2. Other types of circuitry can be used to buffer the inputs and outputs of the clocked storage element of FIG. 7.
- The embodiment shown in FIG. 7 includes a first transistor stack 701 (like 111A), a second transistor stack 702 (like 111B), a third transistor stack 703 (like 110A) and a fourth transistor stack 704 (like 110B). A transistor stack as the term is used herein includes a pull-up circuit path between a VDD supply line and an output data node, and a pull-down circuit path between the same output data node and a VSS supply line.
- The first transistor stack 701 is like the third transistor stack 403 of FIG. 4, and the transistors have the same labels. The second transistor stack 702 is like the fourth transistor stack 404 of FIG. 4, and the transistors have the same labels. The third transistor stack 703 is like the first transistor stack 401 of FIG. 4, and the transistors have the same labels. The fourth transistor stack 704 is like the second transistor stack 402 of FIG. 4, and the transistors have the same labels.
- In the illustrated embodiment, the first transistor stack 701 implements a function (D0 OR CLK) NAND FB1, as illustrated in FIG. 2.
- In the illustrated embodiment, the second transistor stack 702 implements a function (D1 NAND CLK), as illustrated in FIG. 2.
- In the illustrated embodiment, the third transistor stack 703 implements a function (D1 AND CLK) NOR FB2, as illustrated in FIG. 2.
- In the illustrated embodiment, the fourth transistor stack 704 implements a function (D2 NOR CLK), as illustrated in FIG. 2.
- The operation of the stacks is not described again. However, FIG. 8 is a timing diagram showing operation of the clocked storage element of FIG. 7, based on simulations assuming a 5 nm manufacturing node, like the simulation used to produce FIG. 5.
- This disclosure describes various embodiments of a clocked storage element where a signal from the input D to the output Q traverses two logic blocks, each of which is implemented using a single transistor stack. Further, the two latch configurations are selected so that they respond to complementary levels of the same clock signal. This allows for achieving a Master-Slave function without the need to invert the clock signal, as commonly implemented.
- The data insertion point, and the feedback logic, are selected in a way which is implementable as a single logic block. This process is applied in both latch structures: OR-NAND and AND-NOR.
- The selection of the logic blocks is made so that they do not contain more than two PMOS or two NMOS transistors in the path to the supply voltage VDD or VSS (ground). This is the minimal transistor stack necessary to implement the given function.
- In deep sub-micron technology, such as 7 nm and 5 nm technology nodes, the resistance of the PMOS transistor is roughly equivalent to the resistance of an NMOS transistor of the same size, when in saturation. This fact is used to advantage in generating the logic structure employed in both latch structures, as the new technology no longer favors an NMOS transistor path over a PMOS one.
- In further transistor embodiments of the clocked storage element, it is observed that the PMOS transistors connected to the clock signal can be combined to form a single transistor and shared between the two latches (the third p-channel transistor P3 in the first transistor stack 401, and the sixth p-channel transistor P6 in the third transistor stack 403). This combined PMOS transistor (P3/P6) is made larger, and both effectively shortens the path to the power supply and reduces the number of transistors. This provides an embodiment of the clocked storage element consisting of 19 transistors, thus contributing to the small size of the clocked storage element.
- The size of the clocked storage element is roughly proportional to the number of transistors used to build the clocked storage element. Therefore, minimizing the number of transistors does impact the area in a beneficial way. The speed of the clocked storage element, or the amount of time taken from the cycle, is equal to the time from the signal entering the latch to the signal exiting the latch, i.e., the D-Q delay. This is described in the equation $T_m = T \ge D_{Lmax} + D_{DQmax}$, which states that the fastest the system can run (the highest frequency) is determined by the maximal delay of the signal in the logic critical path ($D_{Lmax}$) and the maximal delay of the signal through the clocked storage element ($D_{DQmax}$). Consequently, the objective in designing the clocked storage element is to make the Data-to-Output (D-Q) delay as small as possible. This objective will be achieved if, among other criteria, there is the most direct path from the input D to the output Q. By “most direct path” we understand that the smallest number of transistor stacks implementing the logic, or complex logic gates, is traversed, and that those transistor stacks are of the least complexity if possible. The third objective of the lowest power consumption is usually achieved if the number of active components is minimized. There are also other factors, such as switching activity of the nodes and charging and discharging of the nodes, that do affect power consumption.
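- The cycle-time relation above can be worked through numerically. The following minimal sketch uses hypothetical delay values chosen only for illustration (160 ps of logic delay and 40 ps of D-Q delay); the function names are likewise assumptions and the numbers are not simulation results from this disclosure.

```python
def min_cycle_time_ps(logic_delay_max_ps, dq_delay_max_ps):
    """Smallest clock period T satisfying T >= D_Lmax + D_DQmax."""
    return logic_delay_max_ps + dq_delay_max_ps


def max_frequency_ghz(logic_delay_max_ps, dq_delay_max_ps):
    """Highest clock frequency allowed by the minimum period (1000 ps = 1 ns)."""
    return 1000.0 / min_cycle_time_ps(logic_delay_max_ps, dq_delay_max_ps)


def storage_share(logic_delay_max_ps, dq_delay_max_ps):
    """Fraction of the minimum clock period taken by the storage element."""
    return dq_delay_max_ps / min_cycle_time_ps(logic_delay_max_ps, dq_delay_max_ps)


# Hypothetical values for illustration only.
logic_ps, dq_ps = 160.0, 40.0
print(min_cycle_time_ps(logic_ps, dq_ps))   # 200.0
print(max_frequency_ghz(logic_ps, dq_ps))   # 5.0
print(storage_share(logic_ps, dq_ps))       # 0.2
```

With these example numbers the storage element takes 20% of the cycle, so any reduction in D-Q delay translates directly into either a shorter period or more time for logic.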
- Providing the logic equivalent of the clocked storage element as a library function, as opposed to a transistor diagram, consisting of the logic blocks supplied by a standard cell library, allows for the use of logic synthesis (CAD tools) in creating the described clocked storage element. The cell library can be applied by electronic design automation tools in the implementation of an integrated circuit.
- An integrated circuit on a single chip can include both a falling edge clocked storage element (FIGS. 1 and 4) and a rising edge clocked storage element (FIGS. 2 and 7) implemented as described herein, which have clock signals with the same polarity applied to the corresponding clock input nodes.
- The circuits of FIGS. 4 and 7, and other embodiments of the present technology, can be embodied in a computer readable form using a hardware description language such as Verilog or VHDL, stored in a non-transitory data storage medium or media, and used for example as an entry in a cell library. Embodiments can include a latch with the first circuit configuration and a latch with the second circuit configuration, as independently placeable circuit elements in the cell library. A design tool can be configured to place and route the first and second circuit configurations as master or as slave as desired for a particular use of the clocked storage element. Also, this independent placement ability enables placement of the first and second latches of a clocked storage element according to placements of the source or producer of the input data (D) and the destination or consumer of the output data (Q), which placements may not be adjacent in some situations. Thus, an embodiment is provided in which the first latch comprises a circuit cell in a cell library and the second latch comprises a second circuit cell in a cell library, and the first latch and second latch are placed as separate cells by a place and route tool.
- The use of logic synthesis allows for automatic optimal sizing of the transistors used in the standard cell libraries to achieve the fastest D-Q path of the described clocked storage element, or the lowest power consumption, or both, depending on the design point.
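- The idea of two independently usable latch configurations that a tool can assign as master or slave can be illustrated with a behavioral sketch. Each function below models one configuration using the stack functions given earlier, and `make_element` (a hypothetical helper) composes them in either order with the same clock polarity. This is a zero-delay model that assumes each feedback node settles before its data node; it is not a cell-library description and all names are illustrative.

```python
def and_nor_latch(d, clk, out):
    """First circuit configuration: inverting, transparent while clk is 1."""
    fb = int(not (out or clk))            # NOR feedback
    return int(not ((d and clk) or fb))   # merged AND-NOR


def or_nand_latch(d, clk, out):
    """Second circuit configuration: inverting, transparent while clk is 0."""
    fb = int(not (out and clk))           # NAND feedback
    return int(not ((d or clk) and fb))   # merged OR-NAND


def make_element(master, slave):
    """Compose two latch models; both receive the same, uninverted clock."""
    state = {"d1": 1, "d2": 0}            # assumed initial node values

    def step(d0, clk):
        state["d1"] = master(d0, clk, state["d1"])
        state["d2"] = slave(state["d1"], clk, state["d2"])
        return state["d2"]

    return step


falling = make_element(and_nor_latch, or_nand_latch)   # like FIG. 1
rising = make_element(or_nand_latch, and_nor_latch)    # like FIG. 2

falling_d2 = [falling(d0, clk) for d0, clk in [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]]
rising_d2 = [rising(d0, clk) for d0, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]]
print(falling_d2)  # [0, 1, 1, 1, 0]
print(rising_d2)   # [0, 1, 1, 1, 0]
```

In the first trace D2 updates only when CLK falls, and in the second only when CLK rises, although the same two latch models and the same clock polarity are used in both compositions.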
- The use of logic synthesis allows for separating the first and second latches (i.e., Master and Slave logic blocks) and placing them in the most appropriate places on the chip, as determined by the Place and Route (PnR) Computer Aided Design (CAD) tools. This ability to separately place the first and second latches helps achieve the optimal PnR solution.
- The Data input inverter can be replaced with another functional block, combining the latching function with the logic function, thus enhancing the utilization of the clocked storage element. In the example shown in FIG. 3, block 310 represents a multiplexer function, which is commonly used in conjunction with the latch.
- VDD and VSS are voltages on upper and lower supply voltage lines in the circuit, referred to herein as a VDD supply line and VSS supply line, respectively. Typically, VDD is a positive voltage and VSS is ground. More generally, VSS is any voltage less than VDD. In some cases, VSS may be a negative voltage. The letters DD and SS are used for historical reasons and do not imply that the supply lines are connected to the drain or source. For example, in the circuit of FIG. 4, VDD is the voltage on the VDD supply line connected to the sources of p-channel transistors.
- While the present technology is disclosed by reference to various embodiments and examples detailed above, it is to be understood that these examples are intended in an illustrative rather than in a limiting sense. It is contemplated that modifications and combinations will readily occur to those skilled in the art, which modifications and combinations will be within the spirit of the invention and the scope of the following claims.
Claims (26)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/551,610 US11558041B1 (en) | 2021-08-08 | 2021-12-15 | Fast clocked storage element |
US18/096,515 US11967955B2 (en) | 2021-08-08 | 2023-01-12 | Fast clocked storage element |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163230782P | 2021-08-08 | 2021-08-08 | |
US17/551,610 US11558041B1 (en) | 2021-08-08 | 2021-12-15 | Fast clocked storage element |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/096,515 Continuation US11967955B2 (en) | 2021-08-08 | 2023-01-12 | Fast clocked storage element |
Publications (2)
Publication Number | Publication Date |
---|---|
US11558041B1 (en) | 2023-01-17 |
US20230045265A1 (en) | 2023-02-09 |
Family
ID=84922804
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/551,610 Active US11558041B1 (en) | 2021-08-08 | 2021-12-15 | Fast clocked storage element |
US18/096,515 Active US11967955B2 (en) | 2021-08-08 | 2023-01-12 | Fast clocked storage element |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/096,515 Active US11967955B2 (en) | 2021-08-08 | 2023-01-12 | Fast clocked storage element |
Country Status (1)
Country | Link |
---|---|
US (2) | US11558041B1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130261815A1 (en) * | 2012-03-26 | 2013-10-03 | Kabushiki Kaisha Toshiba | Battery cell monitoring circuit and battery cell monitoring system |
US20160098506A1 (en) * | 2014-10-07 | 2016-04-07 | Freescale Semiconductor, Inc. | Signal delay flip-flop cell for fixing hold time violation |
US9941881B1 (en) * | 2017-03-23 | 2018-04-10 | Qualcomm Incorporated | Apparatus and method for latching data including AND-NOR or OR-NAND gate and feedback paths |
US20180123571A1 (en) * | 2016-10-27 | 2018-05-03 | Arm Limited | Flip-Flop |
US20220190813A1 (en) * | 2020-12-10 | 2022-06-16 | Qualcomm Incorporated | Fault resilient flip-flop with balanced topology and negative feedback |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3513376B2 (en) * | 1997-11-19 | 2004-03-31 | シャープ株式会社 | Flip-flop circuit |
US6895061B1 (en) * | 1999-10-26 | 2005-05-17 | Agilent Technologies, Inc. | Scannable synchronizer having a deceased resolving time |
US6538471B1 (en) * | 2001-10-10 | 2003-03-25 | International Business Machines Corporation | Multi-threshold flip-flop circuit having an outside feedback |
TW200943720A (en) | 2008-04-03 | 2009-10-16 | Faraday Tech Corp | Apparatus of data retention for multi power domains |
GB2471067B (en) | 2009-06-12 | 2011-11-30 | Graeme Roy Smith | Shared resource multi-thread array processor |
US8723548B2 (en) | 2012-03-06 | 2014-05-13 | Broadcom Corporation | Hysteresis-based latch design for improved soft error rate with low area/performance overhead |
JP2015177222A (en) * | 2014-03-13 | 2015-10-05 | 株式会社東芝 | semiconductor integrated circuit |
US10411677B2 (en) * | 2016-07-14 | 2019-09-10 | Samsung Electronics Co., Ltd. | Flip-flop including 3-state inverter |
US11050423B1 (en) | 2020-01-16 | 2021-06-29 | Taiwan Semiconductor Manufacturing Company Ltd. | Flip-flop device and method of operating flip-flop device |
US11552622B1 (en) | 2022-03-23 | 2023-01-10 | SambaNova Systems, Inc. | High-performance flip-flop |
Also Published As
Publication number | Publication date |
---|---|
US20230155579A1 (en) | 2023-05-18 |
US11967955B2 (en) | 2024-04-23 |
US11558041B1 (en) | 2023-01-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9762214B2 (en) | Flip-flop circuit | |
US7525361B2 (en) | High speed flip-flops and complex gates using the same | |
US7772889B2 (en) | Programmable sample clock for empirical setup time selection | |
US6396307B1 (en) | Semiconductor integrated circuit and method for designing the same | |
US7710155B2 (en) | Dynamic dual output latch | |
CA2338114C (en) | Single rail domino logic for four-phase clocking scheme | |
US6798249B2 (en) | Circuit for asynchronous reset in current mode logic circuits | |
JPH11186882A (en) | D flip-flop | |
US7737757B2 (en) | Low power level shifting latch circuits with gated feedback for high speed integrated circuits | |
US20110260764A1 (en) | Semiconductor integrated circuit, method for designing semiconductor integrated circuit, and computer readable recording medium | |
US6509761B2 (en) | Logical circuit | |
US8181073B2 (en) | SRAM macro test flop | |
US11558041B1 (en) | Fast clocked storage element | |
JP3582967B2 (en) | Latch circuit and flip-flop circuit with clock signal level conversion function | |
CN110798179A (en) | D flip-flop with low clock dissipation power | |
US8604854B1 (en) | Pseudo single-phase flip-flop (PSP-FF) | |
CN210380808U (en) | Circuit for storing data in an integrated circuit device | |
US10812055B2 (en) | Flip flop circuit | |
US20020043990A1 (en) | Logic circuit | |
Jain et al. | Sinusoidal power clock based PFAL | |
Manna et al. | Adiabatic SRAM cell and array | |
US6215344B1 (en) | Data transmission circuit | |
Karplus | Formal Model of MOS Clocking Disciples | |
JP4263841B2 (en) | Semiconductor integrated circuit and semiconductor integrated circuit design method | |
JP2003330984A (en) | Semiconductor integrated circuit and its design method |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: SAMBANOVA SYSTEMS, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: OKLOBDZIJA, VOJIN G.; REEL/FRAME: 058397/0345; Effective date: 20211214 |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |