WO2022021091A1 - Clock tree architecture, clock signal transmission method, and device

Clock tree architecture, clock signal transmission method, and device

Info

Publication number
WO2022021091A1
Authority
WO
WIPO (PCT)
Prior art keywords
clock
frequency
edge
clock signal
target
Prior art date
Application number
PCT/CN2020/105288
Other languages
English (en)
French (fr)
Inventor
金志刚
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202080104699.XA (CN116209968A)
Priority to PCT/CN2020/105288 (WO2022021091A1)
Publication of WO2022021091A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/08 Clock generators with changeable or programmable clock frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/39 Circuit design at the physical level
    • G06F30/396 Clock trees

Definitions

  • The present application relates to the technical field of integrated circuits, and in particular to a clock tree architecture, a clock signal transmission method, and a device.
  • The industry usually adopts the H-tree technique, which uses specially customized clock tree buffer units (buffers, shown as triangles in the figure) and high-level metal routing to improve the quality of the clock signal, significantly reduce the delay of the clock tree, and lower the cost of timing closure for globally synchronous designs.
  • However, specially customized clock tree buffers and high-level metal routing significantly increase the power consumption of the clock tree, and the clock signal integrity risk rises significantly as the clock frequency continues to increase.
  • In the course of research and practice, the inventor of the present application found that the prior art also transmits the clock signal at a low frequency. As shown in FIG. 2, the clock source directly generates a low-frequency clock signal; after the low-frequency clock signal is transmitted to the modules that need the clock signal (e.g., module 1 and module 2), the modules generate high-frequency clock signals (e.g., high-frequency clock signal 1 and high-frequency clock signal 2) through frequency multiplier circuits (e.g., frequency multiplier circuit 1 and frequency multiplier circuit 2).
  • However, generating the high-frequency clock inside each module is costly and significantly increases the power consumption of the chip.
  • The present application provides a clock tree architecture, a clock signal transmission method, and a device, which can reduce the power consumption of clock signal transmission, enhance the reliability of clock signal transmission, and offer higher applicability.
  • In a first aspect, the present application provides a clock tree architecture that includes a clock source, a divide-by-two frequency divider, and a clock tree.
  • The clock source is used to generate the clock signal.
  • The divide-by-two frequency divider is used to reduce the clock frequency of the clock signal generated by the clock source from the target clock frequency to the first clock frequency, to obtain the to-be-transmitted clock signal.
  • The first clock frequency is half of the target clock frequency.
  • The clock tree is used to receive the to-be-transmitted clock signal and transmit it to the target module, where the sequential logic circuit of the clock tree is implemented by double-edge registers and double-edge gating units.
  • The high-frequency clock signal generated by the clock source can be reduced to a half-frequency clock signal by the divide-by-two frequency divider, and transmitting the half-frequency clock signal on the clock tree reduces the power consumption of clock signal transmission.
  • Implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering, enhances the reliability of clock signal transmission, and offers higher applicability.
  • The target module includes a frequency multiplier with adjustable clock pulse width, which is used to adjust the clock frequency of the to-be-transmitted clock signal from the first clock frequency to the target frequency.
  • Generating the multiplied clock inside the target module with this frequency multiplier allows different frequency multipliers to be synchronized, reduces the power consumption of the frequency multiplier, enhances the reliability of half-frequency clock signal transmission, and offers higher applicability.
  • The frequency multiplier with adjustable clock pulse width includes a delay selection terminal and at least one pulse width adjustment unit.
  • The number of pulse width adjustment units may be determined according to the bit width required for delay selection. The input signal of the delay selection terminal determines whether each pulse width adjustment unit is connected or bypassed, so that the high-level pulse width of the multiplied clock can be adjusted; the operation is flexible and the applicability is high.
  • The pulse width adjustment unit includes multiple buffers or multiple inverters.
  • The number of buffers or inverters in the pulse width adjustment unit can be determined by how much the high-level pulse width of the multiplied clock needs to be adjusted, so this number can be set flexibly and the applicability is high.
  • The target module includes a target double-edge gating unit and a target double-edge register, and the sequential logic circuit of the target module consists of the target double-edge gating unit and the target double-edge register.
  • The target module can receive the half-frequency clock signal transmitted on the clock tree through its double-edge gating unit and double-edge register, and its sequential logic circuit is built from those same units. The function of the target module can therefore be realized directly from the half-frequency clock signal, without a frequency multiplier, so the target module has a simple structure and high applicability.
  • In a second aspect, the present application provides a clock signal transmission method, which is applicable to the divide-by-two frequency divider in the clock tree architecture provided in any one of the first aspect to the fourth possible implementation of the first aspect
  • The method includes: receiving a clock signal from a clock source; and reducing the clock frequency of the clock signal generated by the clock source from the target clock frequency to a first clock frequency, to obtain the to-be-transmitted clock signal.
  • The first clock frequency is half of the target clock frequency.
  • The sequential logic circuit of the clock tree is implemented by double-edge registers and double-edge gating units.
  • The method further includes adjusting the clock frequency of the to-be-transmitted clock signal from the first clock frequency to the target frequency by using the frequency multiplier with adjustable clock pulse width in the target module.
  • The method further includes receiving the to-be-transmitted clock signal through the target double-edge gating unit and the target double-edge register included in the target module, so that the sequential logic circuit of the target module is realized by the target double-edge gating unit and the target double-edge register.
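The method steps above (divide at the source, transmit at half frequency, restore in the target module) can be sketched as a small sampled-signal simulation. This is only an illustrative model, not the patented circuit: the function names, the sample counts, and the use of a simple delay-XOR doubler for the restore step are assumptions made for the example.

```python
def rising_edges(sig):
    """Indices at which the sampled signal goes from 0 to 1."""
    return [i + 1 for i, (a, b) in enumerate(zip(sig, sig[1:])) if a == 0 and b == 1]


def divide_by_two(clk):
    """Toggle the output on every rising edge of clk (register-style divider)."""
    q, prev, out = 0, clk[0], []
    for level in clk:
        if prev == 0 and level == 1:
            q ^= 1
        out.append(q)
        prev = level
    return out


def double_frequency(clk, delay):
    """XOR the clock with a copy of itself delayed by `delay` samples (delay > 0)."""
    delayed = [clk[0]] * delay + clk[:len(clk) - delay]
    return [a ^ b for a, b in zip(clk, delayed)]


# Hypothetical clock at the target frequency: 8 samples per period, 8 periods.
target_clk = ([0] * 4 + [1] * 4) * 8
to_transmit = divide_by_two(target_clk)      # first clock frequency = target / 2
restored = double_frequency(to_transmit, 2)  # restore step inside the target module

print(len(rising_edges(target_clk)), len(rising_edges(to_transmit)), len(rising_edges(restored)))
```

In this model the transmitted clock has half as many rising edges as the source clock, and the restored clock has its rising edges at the same sample positions as the source clock.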
  • In a third aspect, the present application provides a chip that includes the clock tree architecture provided in any one of the first aspect to the fourth possible implementation of the first aspect.
  • In a fourth aspect, the present application provides an electronic device that includes the clock tree architecture provided in any one of the first aspect to the fourth possible implementation of the first aspect, or the chip provided in the third aspect.
  • The high-frequency clock signal generated by the clock source can be reduced to a half-frequency clock signal by the divide-by-two frequency divider, and transmitting the half-frequency clock signal on the clock tree reduces the power consumption of clock signal transmission.
  • Implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering, enhances the reliability of clock signal transmission, and offers higher applicability.
  • FIG. 1 is a schematic structural diagram of a clock tree.
  • FIG. 2 is another schematic structural diagram of a clock tree.
  • FIG. 3 is a schematic structural diagram of a clock tree architecture provided by the present application.
  • FIG. 4 is a schematic waveform diagram of the divide-by-two frequency divider provided by the present application.
  • FIG. 5 is a schematic structural diagram of a clock tree architecture provided by the present application and a traditional clock tree architecture.
  • FIG. 6 is a schematic diagram of a timing model of a single-edge register.
  • FIG. 7 is a schematic diagram of the timing check of the sending register and the capture register.
  • FIG. 8 is a schematic diagram of a timing model of a double-edge register provided by the present application.
  • FIG. 9 is a schematic structural diagram of the timing check between the double-edge register and the double-edge gating unit.
  • FIG. 10 is a schematic diagram of the timing check between the double-edge register and the double-edge gating unit.
  • FIG. 11 is another schematic diagram of the timing check between the double-edge register and the double-edge gating unit.
  • FIG. 12 is a schematic diagram of the timing check between a double-edge register and a single-edge register.
  • FIG. 13 is a schematic diagram of the timing check between a single-edge register and a double-edge register/gating unit.
  • FIG. 14 is a diagram of the correspondence between the double-edge register and double-edge gating unit and the pseudo-single-edge register and pseudo-single-edge gating unit.
  • FIG. 15 is a design flow chart of the digital integrated circuit provided by the present application.
  • FIG. 16 is a schematic circuit diagram of a traditional clock frequency multiplication unit.
  • FIG. 17 is a schematic circuit diagram of a frequency multiplier with adjustable clock pulse width provided by the present application.
  • FIG. 21 is a schematic flowchart of the clock signal transmission method provided by the present application.
  • The clock tree architecture provided herein is applicable to large digital SoCs, which may be used in computer systems or servers that operate with numerous other general-purpose or special-purpose computing systems, environments, or configurations.
  • Computing systems, environments, and/or configurations suitable for use with such computer systems or servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the foregoing.
  • A computer system or server may be described in the general context of computer-system-executable instructions, such as program modules, being executed by the computer system.
  • Program modules may include routines, programs, object programs, components, logic, data structures, and so on, that perform particular tasks or implement particular abstract data types.
  • Computer systems or servers may be implemented in distributed cloud computing environments where tasks are performed by remote processing devices linked through a communications network.
  • Program modules may be located on local or remote computing system storage media, including storage devices.
  • FIG. 3 is a schematic structural diagram of the clock tree architecture provided by the present application.
  • The clock tree architecture provided in this application includes a clock source, a divide-by-two frequency divider, and a clock tree. When a clock is transmitted over a long distance, the architecture reduces the clock frequency to half of the target clock frequency, which significantly reduces the dynamic power consumption of the clock tree and enhances the reliability of clock signal transmission.
  • The clock source is connected to the divide-by-two frequency divider, and through the divider it is connected to the clock tree.
  • The clock signal generated by the clock source passes through the divide-by-two frequency divider, which outputs the to-be-transmitted clock signal; that signal is then transmitted to the target modules (such as module 1 and module 2) through the clock tree.
  • The sequential logic circuit carried on the clock tree is implemented by double-edge registers and double-edge gating units.
  • The clock source may be a phase-locked loop (PLL) or another functional module that generates a clock signal; this can be determined according to the actual application scenario and is not limited here. The clock source is used to generate the clock signal, and in this case the generated clock signal may be a high-frequency clock signal.
  • The clock frequency of the clock signal generated by the clock source is the target frequency. That is, in the clock tree architecture provided by this application, the clock source directly generates the high-frequency clock signal and does not need to directly generate a low-frequency clock. This avoids the effect that a clock source output with a duty cycle other than 1:1 would have on the clocks of downstream circuits in the clock tree architecture; the operation is simple and the applicability is high.
  • A divide-by-two frequency divider is added at the high-frequency clock output, and it halves the clock frequency of the high-frequency clock signal generated by the clock source to obtain a half-frequency clock signal. Specifically, after the clock source generates the high-frequency clock signal, the signal is output to the divide-by-two frequency divider, which reduces its clock frequency to half of the target frequency to obtain the to-be-transmitted clock signal. That is, the clock frequency of the to-be-transmitted clock signal is half of the target frequency.
  • The divide-by-two frequency divider can be a register-based divider, whose waveform is shown in FIG. 4.
  • FIG. 4 is a schematic waveform diagram of the divide-by-two frequency divider provided by the present application. As FIG. 4 shows, every transition of the divider's output clock Q is triggered by a rising edge of the input clock CLK. Commonly used electronic design automation (EDA) tools can therefore accurately calculate the change in output clock duty cycle introduced by the divide-by-two frequency divider.
  • The divide-by-two frequency divider can also be of a type other than a register-based divider, as long as it prevents a clock source output duty cycle other than 1:1 from affecting the clocks in subsequent circuits. The type of divider can therefore be chosen according to the actual application scenario and is not limited here.
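The duty-cycle point above can be illustrated with a minimal sketch of a register-style divider that toggles only on rising edges. The sampled-clock representation, the function names, and the 30% input duty cycle are assumptions chosen for the example.

```python
def make_clock(period, high, cycles):
    """Sampled clock: `high` samples at 1, then `period - high` samples at 0, repeated."""
    return [1 if (t % period) < high else 0 for t in range(period * cycles)]


def divide_by_two(clk_samples):
    """Toggle output Q on each rising edge of the sampled input clock CLK."""
    q, prev, out = 0, 0, []
    for level in clk_samples:
        if prev == 0 and level == 1:  # rising edge of CLK
            q ^= 1
        out.append(q)
        prev = level
    return out


clk = make_clock(period=10, high=3, cycles=8)  # input duty cycle of 30%, not 1:1
q = divide_by_two(clk)

input_duty = sum(clk) / len(clk)
output_duty = sum(q) / len(q)
print(input_duty, output_duty)
```

Because Q changes only on rising edges of CLK, each Q half-period spans exactly one full CLK period, so the output duty cycle in this idealized model does not depend on the input duty cycle.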
  • The to-be-transmitted clock signal is input into the clock tree and transmitted through the clock tree to the target module. Replacing the transmission of the high-frequency clock signal with the transmission of a half-frequency clock signal (that is, a clock signal whose clock frequency is half of the target frequency) reduces the power consumption of clock signal transmission and enhances its reliability.
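A rough feel for the power saving can be had from the textbook CMOS switching-power relation P = a * C * V^2 * f. The patent text does not state this formula; it is used here as a standard first-order model, and the capacitance, voltage, and frequency values are hypothetical.

```python
def dynamic_power(c_farads, v_volts, f_hz, activity=1.0):
    """First-order CMOS switching power: activity * C * V^2 * f (watts)."""
    return activity * c_farads * v_volts ** 2 * f_hz


# Hypothetical clock tree: 50 pF of switched capacitance at 0.8 V.
full_rate = dynamic_power(50e-12, 0.8, 2e9)  # clock tree driven at the target frequency
half_rate = dynamic_power(50e-12, 0.8, 1e9)  # same tree driven at half the target frequency

ratio = half_rate / full_rate
print(full_rate, half_rate, ratio)
```

With capacitance and voltage held fixed, halving the frequency on the tree halves its switching power in this model; real savings depend on the buffers, wires, and leakage of the actual design.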
  • The clock tree receives the to-be-transmitted clock signal output by the divide-by-two frequency divider and transmits it to the target module.
  • The target module can be any functional module in the computer system and/or server that performs specific tasks and needs to be driven by the clock transmitted over the clock tree; it can be determined according to the actual application scenario and is not limited here.
  • The clock frequency of the to-be-transmitted clock signal output by the divide-by-two frequency divider is half of the target frequency.
  • The sequential logic circuit carried on the clock tree can be implemented by double-edge registers and double-edge gating units, which makes transmitting the half-frequency clock signal on the clock tree practical for engineering and more widely applicable.
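Why double-edge registers fit a half-frequency clock can be sketched behaviorally: a register that samples on both edges of the half-frequency clock sees the same sampling instants as a rising-edge register on the full-frequency clock. The model below is a simplified illustration with assumed names and sample counts, not a description of the actual cells.

```python
def toggle_on_rising(clk):
    """Half-frequency clock derived by toggling on each rising edge of clk."""
    q, prev, out = 0, clk[0], []
    for level in clk:
        if prev == 0 and level == 1:
            q ^= 1
        out.append(q)
        prev = level
    return out


def single_edge_capture(clk, data):
    """Values of `data` sampled on rising edges of clk only."""
    return [d for prev, level, d in zip(clk, clk[1:], data[1:]) if prev == 0 and level == 1]


def double_edge_capture(clk, data):
    """Values of `data` sampled on every edge (rising or falling) of clk."""
    return [d for prev, level, d in zip(clk, clk[1:], data[1:]) if prev != level]


full_clk = [0, 0, 1, 1] * 6            # hypothetical full-frequency clock
half_clk = toggle_on_rising(full_clk)  # half-frequency clock carried on the tree
data = list(range(len(full_clk)))      # data value = sample index, for easy inspection

captured_full = single_edge_capture(full_clk, data)
captured_half = double_edge_capture(half_clk, data)
print(captured_full, captured_half)
```

Both registers capture the same data sequence, while the half-frequency clock itself toggles fewer times over the same interval.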
  • FIG. 5 is a schematic structural diagram of a clock tree architecture provided by the present application and a conventional clock tree architecture.
  • In the traditional clock tree architecture, the clock source generates a high-frequency clock signal and feeds it into the full-frequency clock tree, which transmits the high-frequency clock signal to the modules (such as module 1 and module 2).
  • The sequential logic circuits carried on the high-frequency clock tree are implemented by registers and gating units, and long-distance transmission of the high-frequency clock signal on the full-frequency clock tree brings clock signal integrity risks.
  • The clock tree architecture provided by this application replaces long-distance transmission of the high-frequency clock signal on a full-frequency clock tree with transmission of a half-frequency clock signal on a half-frequency clock tree.
  • The sequential logic circuit on the half-frequency clock tree can be implemented by double-edge registers and double-edge gating units, which makes half-frequency clock signal transmission more feasible and reduces the power consumption of clock signal transmission.
  • Sequential logic circuits driven by a divided clock are widely distributed, so to reduce design complexity the half-frequency clock usually does not drive sequential logic circuits. If, in a practical application scenario, the functional design does require the half-frequency clock to drive sequential logic circuits, then to keep the logic function of those circuits consistent with the full-frequency clock transmission mode, the only option is to restore the clock of the sequential logic circuits at each location from half frequency to full frequency.
  • Implementing the sequential logic circuit carried on the half-frequency clock tree with double-edge registers and double-edge gating units avoids adding a large number of clock frequency multiplication units to the clock tree architecture, which gives high applicability.
  • Pseudo-single-edge registers and pseudo-single-edge gating units are virtual models (pseudo-models) designed to realize the functions of real double-edge registers and double-edge gating units, and these pseudo-models are modeled in the same way as the traditional models of registers and gating units.
  • The types of models that need to be established for traditional registers and gating units can include a functional model, a timing model, a physical model, and a Scan test model (see the first column of Table 1; Table 1 is the modeling relationship table for traditional registers and gating units, double-edge registers and double-edge gating units, and pseudo-single-edge registers and pseudo-single-edge gating units). For double-edge registers and double-edge gating units, and for pseudo-single-edge registers and pseudo-single-edge gating units, the models to be established and the way they are established differ from those required for traditional registers and gating units, as shown in Table 1 below:
  • The timing model can be implemented by a timing library, the physical model can be implemented by a circuit physical model (library exchange format, LEF), and the test model can be implemented by a Scan test model.
  • The functional model can also be implemented in the industry-standard VHDL or Verilog language, the timing model can use the timing modeling method provided in this application, and there is no need to establish a physical model or a test model (that is, the physical model and the test model are empty).
  • The models that need to be established also include a functional model, a timing model, a physical model, and a test model.
  • The functional model can be implemented in the industry-standard VHDL or Verilog language, and the timing model can be implemented using the industry's classic register description method.
  • The physical model contains only the information required for physical implementation, without double-edge information, and can be implemented using the industry's classic physical model description method. The test model is used for test logic generation and test vector generation, and can be implemented by the Scan test model.
  • How each of the models for the pseudo-single-edge register and the pseudo-single-edge gating unit is implemented can be determined according to the actual application scenario and is not limited here.
  • FIG. 6 is a schematic diagram of a timing model of a single-edge register.
  • The timing model of a traditional register (i.e., a single-edge register) usually includes the following timing information (timing arcs):
  • Setup time between the clock (CLK) and the input data (D) (referred to as setup time);
  • Hold time between the clock (CLK) and the input data (D) (referred to as hold time);
  • The timing check of a traditional digital integrated circuit consists mainly of the setup time check and the hold time check.
  • Both checks take place between a preceding-stage register (the sending register, also called the launch register) and a following-stage register (the capture register).
  • FIG. 7 is a schematic diagram of the timing check of the sending register and the capture register. As shown in FIG. 7, the output data terminal Q1 of the sending register is connected to the input data terminal D2 of the capture register.
  • The timing check of the sending register and the capture register can include the setup time check and the hold time check from the sending register to the capture register. In FIG. 7, the arrowed curves labeled setup time and hold time represent these two checks.
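The setup and hold checks between a launch register and a capture register can be written as two slack calculations. The formulas below follow the common textbook static timing analysis convention rather than anything stated in the patent text, and all field names and delay values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    clk_to_q: float           # launch register clock-to-output delay (Q1)
    data_max: float           # slowest combinational delay from Q1 to D2
    data_min: float           # fastest combinational delay from Q1 to D2
    setup: float              # setup time of the capture register
    hold: float               # hold time of the capture register
    launch_clk: float = 0.0   # clock arrival time at the launch register
    capture_clk: float = 0.0  # clock arrival time at the capture register


def setup_slack(p, period):
    """Data must arrive `setup` before the next capture edge; positive slack passes."""
    arrival = p.launch_clk + p.clk_to_q + p.data_max
    required = period + p.capture_clk - p.setup
    return required - arrival


def hold_slack(p):
    """Data must stay stable for `hold` after the same capture edge; positive slack passes."""
    arrival = p.launch_clk + p.clk_to_q + p.data_min
    required = p.capture_clk + p.hold
    return arrival - required


# Hypothetical delays in nanoseconds.
path = Path(clk_to_q=0.10, data_max=0.60, data_min=0.08, setup=0.05, hold=0.03)
print(setup_slack(path, period=1.0), hold_slack(path))
```

Shortening the clock period eats into the setup slack only; the hold slack in this model is independent of the period.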
  • The timing model of the double-edge register provided by the present application establishes two sets of timing checks, one for each clock edge (rising and falling), and each set includes a setup time check and a hold time check.
  • FIG. 8 is a schematic diagram of a timing model of a dual-edge register provided by the present application.
  • The timing model of the double-edge register provided by the present application includes a rising-edge-triggered timing model (shown by a dotted line in FIG. 8) and a falling-edge-triggered timing model (shown by a solid line in FIG. 8), and each of the two models includes its own setup time and hold time timing information.
  • With this timing model, timing analysis tools commonly used in the industry (hereinafter referred to as timing analysis tools) can achieve double-edge timing checking. The specific proof is as follows:
  • FIG. 9 is a schematic structural diagram of the timing check between the double-edge register and the double-edge gate control unit.
  • The sending register is a double-edge register and the capture register is a double-edge register/gating unit; the output data terminal Q1 of the sending register is connected to the input data terminal D2 of the capture register. Because a double-edge register is triggered on both rising and falling edges, the timing check has to consider two cases: the clocks of the launch register and the capture register are in phase, or they are inverted.
  • FIG. 10 is a schematic diagram of the timing check between the double-edge register and the double-edge gating unit.
  • The timing analysis tool performs four different setup time (setup) / hold time (hold) timing checks, represented by four different lines.
  • Part b of FIG. 10 shows the correct setup time/hold time timing checks required for the double-edge register/gating unit. The checks performed by the timing analysis tool cover the correct timing checks required by the double-edge register/gating unit; that is, the timing check of the double-edge register can be achieved with a traditional timing analysis tool and the timing modeling method described above.
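The coverage argument above can be illustrated by enumerating the launch-edge and capture-edge combinations a tool would consider once both a rising-edge and a falling-edge model exist. The sketch assumes in-phase clocks with a 50% duty cycle and assumes that a double-edge register captures on the very next clock edge; which checks the patent figure marks as required is not stated in the text, so treat the result as an illustration under those assumptions.

```python
from itertools import product


def edge_time(kind, period, n=0):
    """Time of the n-th rising or falling edge of an ideal 50% duty-cycle clock."""
    return n * period + (0.0 if kind == "rise" else period / 2)


def setup_window(launch, capture, period):
    """Time from the first launch edge to the next capture edge of the given kind."""
    t_launch = edge_time(launch, period)
    later = [edge_time(capture, period, n) for n in range(3)]
    return min(t - t_launch for t in later if t > t_launch)


period = 2.0  # hypothetical half-frequency clock period

# All four combinations checked when both edge models are present.
tool_checks = {
    (launch, capture): setup_window(launch, capture, period)
    for launch, capture in product(("rise", "fall"), repeat=2)
}

# Assumed requirement: capture happens on the next edge, i.e. the tightest window.
tightest = min(tool_checks.values())
needed = {pair: w for pair, w in tool_checks.items() if w == tightest}
print(tool_checks)
print(needed)
```

Under these assumptions the required checks (rise to fall and fall to rise, each half a period apart) are a subset of the four checks the tool performs, which matches the coverage claim in the text.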
  • The case where the clocks of the launch register and the capture register are inverted is similar to the in-phase case.
  • FIG. 11 is another schematic diagram of the timing check between the double-edge register and the double-edge gating unit. As shown in FIG. 11, the checks performed by the timing analysis tool again cover the correct timing checks required by the double-edge register/gating unit. So when the launch register and capture register clocks are inverted, the timing check of double-edge registers can still be achieved with traditional timing analysis tools and the timing modeling method.
  • When the launch register (i.e., the sending register) is a double-edge register and the capture register is a single-edge register, the timing check is as shown in FIG. 12.
  • FIG. 12 is a schematic diagram of the timing check between the double-edge register and the single-edge register.
  • When the launch register is a single-edge register and the capture register is a double-edge register/gating unit, the timing check is as shown in FIG. 13.
  • FIG. 13 is a schematic diagram of the timing check between the single-edge register and the double-edge register/gating unit. With the timing modeling of the double-edge register/gating unit described above, the setup time/hold time checks performed by the timing analysis tool are the correct timing checks.
  • FIG. 14 is a diagram showing the corresponding relationship between the double-edge register and the double-edge gating unit and the pseudo-single-edge register and the pseudo-single-edge gating unit.
  • Shaded boxes represent the real double-edge register and real double-edge gating unit models.
  • Unshaded boxes represent the pseudo-single-edge register and pseudo-single-edge gating unit models. The following mainly describes how to use these models at each stage of digital integrated circuit design, so that double-edge registers and double-edge gating units can be used for the functional logic and DFT logic on the clock tree.
  • FIG. 15 is a design flow chart of the digital integrated circuit provided by the present application.
  • The design flow of a digital integrated circuit includes logic design, functional verification, logic synthesis, formal verification and timing analysis after logic synthesis, design for testability, formal verification and timing analysis after design for testability, physical design, test vector generation, and physical verification and test vector verification.
  • The design process of these stages includes:
  • Verilog hardware description language code can be used to convert the design into logic gates.
  • Design for testability is followed by formal verification and timing analysis of the design for testability.
  • After the design-for-testability insertion is completed, physical design is carried out. After the physical design is implemented, formal verification is performed, and the final design is converted into the final files required by the manufacturing plant. At the same time, test vector generation is performed on the final logical netlist.
  • Formal verification ensures that the functions are consistent across the different stages of design implementation; timing analysis ensures that the timing of the entire design meets the original design requirements.
  • Netlist functional simulation, post-simulation, and test (DFT) vector verification are performed on the netlist to ensure the correctness of the logic functions and the test vectors.
  • Physical verification and power integrity analysis ensure the correctness of the physical implementation.
  • The pseudo-single-edge register model is used for the timing analysis of each stage, and logic design, functional verification, logic synthesis, design for testability, physical design, test vector generation, and formal verification are based on the pseudo-single-edge registers.
  • In this way, double-edge registers and double-edge gating units integrate seamlessly with the traditional flow and timing analysis tools.
  • The half-frequency clock scheme thus becomes practical for engineering directly, without adding a frequency multiplier circuit.
  • The design and implementation methods described above for double-edge registers and double-edge gating units are not limited to the sequential logic driven by the half-frequency clock; they can also be extended to general sequential logic design, as determined by the actual application scenario, and this is not limited here.
  • The target module can restore the half-frequency clock signal to the full frequency through a clock frequency multiplier (doubler) with adjustable clock pulse width.
  • The target module may include a frequency multiplier with adjustable clock pulse width, which adjusts the clock frequency of the to-be-transmitted clock signal from the first clock frequency to the target frequency.
  • FIG. 16 is a schematic circuit diagram of a conventional clock frequency multiplying unit. As shown in Figure 16, the traditional clock frequency multiplication unit is implemented by a delay XOR unit.
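A delay-XOR doubler of the kind described above can be modeled on a sampled clock: the output is high whenever the clock and its delayed copy differ, which happens for one delay interval after every input edge. The function names and sample counts below are assumptions for illustration.

```python
def xor_doubler(clk, delay):
    """XOR the sampled clock with a copy of itself delayed by `delay` samples (delay > 0)."""
    delayed = [clk[0]] * delay + clk[:len(clk) - delay]
    return [a ^ b for a, b in zip(clk, delayed)]


def high_runs(sig):
    """Lengths of the consecutive runs of 1s in a sampled signal."""
    runs, n = [], 0
    for level in sig:
        if level:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs


half_clk = ([0] * 8 + [1] * 8) * 4  # hypothetical half-frequency clock, 16 samples per period
out = xor_doubler(half_clk, delay=3)

print(high_runs(out))                             # one pulse per input edge, 3 samples wide
print(high_runs(xor_doubler(half_clk, delay=5)))  # a longer delay gives wider pulses
```

The model makes the dependency explicit: the output high pulse width equals the delay, so anything that changes the delay changes the pulse width of the multiplied clock.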
  • However, the pulse width of a frequency multiplier varies with process, voltage, temperature, and other factors. If only the traditional clock frequency multiplication unit is used, different multiplication units drive different numbers of registers, so the clock trees after the multipliers differ in length, and the clock pulse width becomes a reliability risk over long-distance transmission. In addition, if the design must operate over a wide voltage range, the traditional multiplication unit also carries a pulse width risk, so the multiplication unit limits the usable voltage range of the design.
  • The clock frequency multiplication unit with adjustable clock pulse width provided by the present application allows the clock pulse width to be adjusted and has high applicability. Referring to FIG. 17, FIG. 17 is a schematic circuit diagram of the clock pulse width adjustable frequency multiplier provided by the present application.
  • the clock pulse width adjustable frequency multiplier includes a delay selection terminal and one or more pulse width adjustment units (assuming 2 pulse width adjustment units), and one pulse width adjustment unit includes multiple buffers or multiple inverters. Based on the input signal of the delay selection terminal, it can be determined to connect the pulse width adjustment unit or bypass the pulse width adjustment unit.
  • As shown in FIG. 17, the delay is adjusted through the delay selection terminal (DSEL). With two bits, DSEL provides three selectable settings; an actual design can add more settings as needed by increasing the bit width of DSEL. When DSEL is 00, the default setting is selected; when DSEL is 01, the pulse width is increased; when DSEL is 10, the pulse width is decreased.
  • When DSEL is 00, the high pulse width of the multiplied clock is determined by the number of buffers or inverters in the dotted box. That is, the pulse width adjustment unit in the dotted box is active (connected into the clock pulse width adjustable frequency multiplier), while the buffers in the dashed box (the pulse width adjustment unit in the dashed box) are bypassed. This is the default pulse width setting.
  • When DSEL is 01, the buffers or inverters in both the dotted box and the dashed box take effect. That is, both pulse width adjustment units are connected into the clock pulse width adjustable frequency multiplier and are active at the same time, and the high-level pulse width of the multiplied clock reaches its maximum.
  • When DSEL is 10, the buffers or inverters in both the dotted box and the dashed box are bypassed. That is, both pulse width adjustment units are bypassed, and the high-level pulse width of the multiplied clock is at its minimum. Bypassing of the frequency multiplier can be controlled through the EDGE_MODE signal.
  • The default delay (DSEL = 00, set by the number of buffers or inverters in the dotted box in FIG. 17) is determined by the clock frequency. A common choice is to design the delay to be half a clock cycle at low voltage; it can also be determined by actual design requirements and is not limited here.
  • FIG. 18 is another schematic structural diagram of the clock tree architecture provided by the present application.
  • As shown in FIG. 18, the clock source (assumed to be a PLL) generates a high-frequency clock, and the frequency is divided by two before the clock enters the clock tree. That is, the high-frequency clock signal generated by the clock source passes through the divide-by-two frequency divider to obtain a half-frequency clock signal, which is input into the clock tree and transmitted to the target module through the clock tree.
  • the functional logic on the circuit driven by the clock tree and the registers and gating units in the DFT OCC logic use double-edge registers and double-edge gating units.
  • the target modules (such as module 1 and module 2) may include frequency multipliers (such as clock pulse width adjustable frequency multipliers), and high-frequency clocks are recovered in module 1 and module 2 through the frequency multiplier.
  • Structure 2 of the target module:
  • FIG. 19 is another schematic structural diagram of the clock tree architecture provided by the present application.
  • As shown in FIG. 19, this is an application scenario that saves even more clock power. The clock stays in half-frequency mode until it reaches the sequential cells at the ends of the clock tree, where the half-frequency clock is multiplied back to full frequency.
  • The frequency multiplier and register 1 are applicable to target module 1, and the frequency multiplier and register 2 are applicable to target module 2. Because the frequency multiplier is very close to the register it drives, the pulse width of the frequency multiplier can be reduced.
  • Optionally, the register after the frequency multiplier can be replaced with a latch, which can be determined according to the actual application scenario and is not limited here.
  • Structure 3 of the target module:
  • FIG. 20 is another schematic structural diagram of the clock tree architecture provided by the present application.
  • The present application provides a complete implementation scheme for double-edge registers and double-edge gating units, so the target module can also use double-edge registers and double-edge gating units directly, without a frequency multiplication unit. Operation is simple and applicability is high.
  • the target module may include a target double-edge gate control unit and a target double-edge register, and the sequential logic circuit of the target module is implemented by the target double-edge gate control unit and the target double-edge register.
  • the double-edge gating unit and the double-edge register 1 are applicable to the target module 1
  • the double-edge gating unit and the double-edge register 2 are applicable to the target module 2.
  • In the present application, the divide-by-two frequency divider converts the high-frequency clock signal generated by the clock source into a half-frequency clock signal, instead of having a traditional clock source such as a PLL output a low-frequency clock directly. This avoids the effect that a PLL output duty cycle other than 1:1 would have on the clocks in subsequent circuits.
  • Transmitting half-frequency clock signals on the clock tree saves the power consumed by clock signal transmission.
  • Implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering use and enhances the reliability of clock signal transmission, with higher applicability.
  • FIG. 21 is a schematic flowchart of the clock signal transmission method provided by the present application.
  • The clock signal transmission method provided by the present application is applicable to the divide-by-two frequency divider in the clock tree architecture provided by the present application, and the method includes the following steps:
  • S210: The divide-by-two frequency divider receives a clock signal from the clock source.
  • S211: Reduce the target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted. Here, the first clock frequency is half of the target clock frequency.
  • S212: Input the clock signal to be transmitted into the clock tree, and transmit the clock signal to be transmitted to the target module through the clock tree.
  • the sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gate control units.
  • the method further includes:
  • the clock frequency of the clock signal to be transmitted is adjusted from the first clock frequency to the target frequency through the adjustable frequency multiplier of the clock pulse width of the target module.
  • the method further includes:
  • the clock signal to be transmitted is received through the target double-edge gating unit and the target double-edge register included in the target module, so as to realize the sequential logic circuit of the target module through the target double-edge gating unit and the target double-edge register.
  • The divide-by-two frequency divider converts the high-frequency clock signal generated by the clock source into a half-frequency clock signal, instead of having a traditional clock source such as a PLL output a low-frequency clock directly. This avoids the effect that a PLL output duty cycle other than 1:1 would have on the clocks in subsequent circuits. Transmitting half-frequency clock signals on the clock tree saves the power consumed by clock signal transmission.
  • Implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering use and enhances the reliability of clock signal transmission, with higher applicability.
  • the present application further provides a chip, where the chip includes the above clock tree architecture provided by the present application.
  • the present application provides an electronic device, and the electronic device includes the clock tree architecture provided by the present application or the above chip.

Abstract

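The present application provides a clock tree architecture, a clock signal transmission method, and a device. The clock tree architecture includes a clock source, a divide-by-two frequency divider, and a clock tree. The clock source is used to generate a clock signal. The divide-by-two frequency divider is used to reduce the target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted, where the first clock frequency is half of the target frequency. The clock tree is used to receive the clock signal to be transmitted and transmit it to a target module, and the sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gating units. The present application can save the power consumed by clock signal transmission, enhance the reliability of clock signal transmission, and offer higher applicability.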

Description

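Clock tree architecture, clock signal transmission method and device
Technical Field
The present application relates to the technical field of integrated circuits, and in particular to a clock network, a clock signal transmission method, and a device.
Background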
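As chip scale grows and clock frequencies rise, long-distance transmission of high-frequency clock signals puts clock signal integrity at risk, and the delay of long-distance high-frequency clock transmission makes timing closure of globally synchronous designs difficult. For large digital system-on-chip (SOC) devices, clock power consumption also increases significantly. Therefore, as shown in FIG. 1, the industry usually adopts the H-tree clock technology, which uses specially customized clock tree buffer units (buffers, shown as triangles in the figure) and high-level metal traces to improve clock signal quality while significantly reducing clock tree delay and the cost of timing closure for globally synchronous designs. However, the specially customized clock tree buffers and high-level metal traces significantly increase clock tree power consumption, and as clock frequencies continue to rise, the risk to clock signal integrity increases significantly.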
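In the course of research and practice, the inventor of the present application found that the prior art transmits clock signals at low frequency. As shown in FIG. 2, the clock source directly generates a low-frequency clock signal, the low-frequency clock signal is transmitted to the modules that need a clock, and each module (such as module 1 and module 2) then generates a high-frequency clock signal (such as high-frequency clock signal 1 and high-frequency clock signal 2) through a frequency multiplier circuit (such as frequency multiplier circuit 1 and frequency multiplier circuit 2). However, generating high-frequency clocks in the modules is costly and significantly increases chip power consumption, and the high-frequency clocks generated by different modules are difficult to synchronize, so applicability is poor.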
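Summary
The present application provides a clock tree architecture, a clock signal transmission method, and a device, which can save the power consumed by clock signal transmission, enhance the reliability of clock signal transmission, and offer higher applicability.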
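In a first aspect, the present application provides a clock tree architecture that includes a clock source, a divide-by-two frequency divider, and a clock tree. The clock source is used to generate a clock signal, and the divide-by-two frequency divider is used to reduce the target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted. Here the first clock frequency is half of the target frequency. The clock tree is used to receive the clock signal to be transmitted and transmit it to a target module, and the sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gating units. In the present application, the divider converts the high-frequency clock signal generated by the clock source into a half-frequency clock signal, and transmitting the half-frequency clock signal on the clock tree saves the power consumed by clock signal transmission. At the same time, implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering use, enhances the reliability of clock signal transmission, and offers higher applicability.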
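With reference to the first aspect, in a first possible implementation, the target module includes a clock pulse width adjustable frequency multiplier, which is used to adjust the clock frequency of the clock signal to be transmitted from the first clock frequency to the target frequency. In the present application, generating the multiplied clock in the target module through the clock pulse width adjustable frequency multiplier allows different frequency multipliers to be synchronized, reduces the power consumption of the frequency multipliers, and enhances the reliability of half-frequency clock signal transmission, with higher applicability.
With reference to the first possible implementation of the first aspect, in a second possible implementation, the clock pulse width adjustable frequency multiplier includes a delay selection terminal and at least one pulse width adjustment unit. Here, the number of pulse width adjustment units can be determined by the bit-width adjustment requirement of the delay selection. Based on the input signal of the delay selection terminal, the pulse width adjustment unit is either connected or bypassed, so that the high-level pulse width of the multiplied clock can be adjusted. Operation is flexible and applicability is high.
With reference to the second possible implementation of the first aspect, in a third possible implementation, the pulse width adjustment unit includes multiple buffers or multiple inverters. In the present application, the number of buffers or inverters in a pulse width adjustment unit can be determined by the adjustment requirement for the high pulse width of the multiplied clock and can be adjusted flexibly, so applicability is high.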
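With reference to the first aspect, in a fourth possible implementation, the target module includes a target double-edge gating unit and a target double-edge register, and the sequential logic circuit of the target module is implemented by the target double-edge gating unit and the target double-edge register. In the present application, the target module can receive the half-frequency clock signal transmitted on the clock tree through the double-edge gating unit and the double-edge register and implement its sequential logic circuit with them, so the functions of the target module are realized directly from the half-frequency clock signal. No frequency multiplier is needed, the structure of the target module is simple, and applicability is high.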
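In a second aspect, the present application provides a clock signal transmission method applicable to the divide-by-two frequency divider in the clock tree architecture provided by any one of the first aspect to the fourth possible implementation of the first aspect. The method includes: receiving a clock signal from the clock source; and reducing the target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted. Here, the first clock frequency is half of the target clock frequency. The clock signal to be transmitted is input into the clock tree and transmitted to the target module through the clock tree, where the sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gating units.
With reference to the second aspect, in a first possible implementation, the method further includes adjusting the clock frequency of the clock signal to be transmitted from the first clock frequency to the target frequency through the clock pulse width adjustable frequency multiplier of the target module.
With reference to the second aspect, in a second possible implementation, the method further includes receiving the clock signal to be transmitted through the target double-edge gating unit and the target double-edge register included in the target module, so as to implement the sequential logic circuit of the target module through the target double-edge gating unit and the target double-edge register.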
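In a third aspect, the present application provides a chip that includes the clock tree architecture provided by any one of the first aspect to the fourth possible implementation of the first aspect.
In a fourth aspect, the present application provides an electronic device that includes the clock tree architecture provided by any one of the first aspect to the fourth possible implementation of the first aspect, or the chip provided by the third aspect.
In the present application, the divide-by-two frequency divider converts the high-frequency clock signal generated by the clock source into a half-frequency clock signal, and transmitting the half-frequency clock signal on the clock tree saves the power consumed by clock signal transmission. At the same time, implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering use, enhances the reliability of clock signal transmission, and offers higher applicability.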
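Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a clock tree;
FIG. 2 is another schematic structural diagram of a clock tree;
FIG. 3 is a schematic structural diagram of the clock tree architecture provided by the present application;
FIG. 4 is a schematic waveform diagram of the divide-by-two frequency divider provided by the present application;
FIG. 5 is a schematic structural diagram of the clock tree architecture provided by the present application and a traditional clock tree architecture;
FIG. 6 is a schematic diagram of the timing model of a single-edge register;
FIG. 7 is a schematic diagram of the timing check between a launch register and a capture register;
FIG. 8 is a schematic diagram of the timing model of the double-edge register provided by the present application;
FIG. 9 is a schematic structural diagram of the timing check between a double-edge register and a double-edge gating unit;
FIG. 10 is a schematic diagram of the timing check between a double-edge register and a double-edge gating unit;
FIG. 11 is another schematic diagram of the timing check between a double-edge register and a double-edge gating unit;
FIG. 12 is a schematic diagram of the timing check between a double-edge register and a single-edge register;
FIG. 13 is a schematic diagram of the timing check between a single-edge register and a double-edge register/gating unit;
FIG. 14 is a diagram of the correspondence between double-edge registers and double-edge gating units and pseudo-single-edge registers and pseudo-single-edge gating units;
FIG. 15 is a design flowchart of the digital integrated circuit provided by the present application;
FIG. 16 is a schematic circuit diagram of a traditional clock frequency multiplication unit;
FIG. 17 is a schematic circuit diagram of the clock pulse width adjustable frequency multiplier provided by the present application;
FIG. 18 is another schematic structural diagram of the clock tree architecture provided by the present application;
FIG. 19 is another schematic structural diagram of the clock tree architecture provided by the present application;
FIG. 20 is another schematic structural diagram of the clock tree architecture provided by the present application;
FIG. 21 is a schematic flowchart of the clock signal transmission method provided by the present application.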
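Detailed Description
The clock tree architecture provided by the present application is applicable to large digital SOCs, which can be used in computer systems or servers that operate together with numerous other general-purpose or special-purpose computing systems, environments, or configurations. Computing systems, environments, and/or configurations suitable for use with such computer systems or servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems. The computer system or server may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures, and so on, which perform particular tasks or implement particular abstract data types. The computer system or server may be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media, including storage devices.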
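Referring to FIG. 3, FIG. 3 is a schematic structural diagram of the clock tree architecture provided by the present application. The clock tree architecture includes a clock source, a divide-by-two frequency divider, and a clock tree. When the clock is transmitted over a long distance, the clock frequency is reduced to half of the target clock frequency, which significantly reduces the dynamic power consumption of the clock tree and enhances the reliability of clock signal transmission. As shown in FIG. 3, the clock source is connected to the divider and, through the divider, to the clock tree. The clock signal generated by the clock source passes through the divider, which outputs the clock signal to be transmitted, and the clock tree transmits that signal to the target modules (such as module 1 and module 2). The sequential logic circuit carried by the clock tree is implemented with double-edge registers and double-edge gating units. Here, the clock source may be a phase-locked loop (PLL) or another functional module used to generate a clock signal, which can be determined according to the actual application scenario and is not limited here. The clock signal generated by the clock source may be a high-frequency clock signal; for ease of description, its clock frequency is assumed to be the target frequency. In other words, in the clock tree architecture provided by the present application, the clock source can generate the high-frequency clock signal directly and does not need to generate a low-frequency clock. This avoids the effect that a clock source output duty cycle other than 1:1 would have on the clocks of subsequent circuits in the architecture; operation is simple and applicability is high.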
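As a rough illustration of why halving the transmitted frequency lowers the dynamic power of the clock tree, the following worked estimate assumes the standard CMOS switching-power approximation. The formula and the symbols $\alpha$ (activity factor), $C_{\text{tree}}$ (switched capacitance of the clock tree), and $V_{DD}$ (supply voltage) are assumptions introduced for illustration and do not appear in the application.

```latex
P_{\text{dyn}}(f) \approx \alpha \, C_{\text{tree}} \, V_{DD}^{2} \, f
\qquad\Longrightarrow\qquad
\frac{P_{\text{dyn}}\!\left(f_{\text{target}}/2\right)}{P_{\text{dyn}}\!\left(f_{\text{target}}\right)}
= \frac{\alpha \, C_{\text{tree}} \, V_{DD}^{2} \, f_{\text{target}}/2}{\alpha \, C_{\text{tree}} \, V_{DD}^{2} \, f_{\text{target}}}
= \frac{1}{2}
```

The factor of one half holds only if the switched capacitance and the supply voltage stay the same; any extra cells added in the target modules would offset part of the saving.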
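In some feasible implementations, a divide-by-two frequency divider is added at the high-frequency clock output, and the divider halves the clock frequency of the high-frequency clock signal generated by the clock source to obtain a half-frequency clock signal. Specifically, after the clock source generates the high-frequency clock signal, the signal is output to the divider, which reduces its clock frequency to half of the target frequency to obtain the clock signal to be transmitted. That is, the clock frequency of the clock signal to be transmitted is half of the target frequency. The divider may be a register-based divider, whose waveform is shown in FIG. 4. FIG. 4 is a schematic waveform diagram of the divide-by-two frequency divider provided by the present application. As the waveform in FIG. 4 shows, every transition of the divider's output clock Q is produced by a rising edge of the input clock CLK, so common electronic design automation (EDA) tools can accurately calculate the change in output clock duty cycle caused by the divider. In practical application scenarios, the divider may also be of a type other than a register-based divider, as long as it avoids the effect that a clock source output duty cycle other than 1:1 would have on the clocks in subsequent circuits. The type of divider can therefore be selected according to the actual application scenario and is not limited here.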
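The behaviour of a register-based divide-by-two stage can be sketched with a small sampled-waveform model. This is a minimal illustration, not the circuit of FIG. 4: the function names, the sample-based time axis, and the 30% input duty cycle are all assumptions chosen for the example. It shows that when the output toggles only on rising edges of the input, the output duty cycle is 1:1 even if the input duty cycle is not.

```python
def sample_clock(period, high_time, n_cycles):
    """Sampled input clock: 1 for the first `high_time` samples of each period."""
    return [1 if (t % period) < high_time else 0 for t in range(period * n_cycles)]


def divide_by_two(clk):
    """Register-style divider: toggle Q on every rising edge of CLK."""
    q, prev, out = 0, 0, []
    for level in clk:
        if prev == 0 and level == 1:  # rising edge of the input clock
            q ^= 1
        out.append(q)
        prev = level
    return out


def duty_cycle(wave):
    """Fraction of samples at logic 1."""
    return sum(wave) / len(wave)


def rising_edges(wave):
    """Count 0 -> 1 transitions, treating the level before the first sample as 0."""
    return sum(1 for a, b in zip([0] + wave, wave) if (a, b) == (0, 1))


clk = sample_clock(period=10, high_time=3, n_cycles=8)  # 30% duty-cycle input
q = divide_by_two(clk)
print(duty_cycle(clk), duty_cycle(q))
print(rising_edges(clk), rising_edges(q))
```

The output has half as many rising edges as the input, which is the halved frequency that is then sent down the clock tree.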
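In some feasible implementations, after the divider converts the high-frequency clock signal generated by the clock source into a clock signal to be transmitted whose clock frequency is only half of the target frequency, the clock signal to be transmitted is input into the clock tree and transmitted to the target module through the clock tree. Converting the transmission of a high-frequency clock signal into the transmission of a half-frequency clock signal (that is, a clock signal whose clock frequency is half of the target frequency) reduces the power consumed by clock signal transmission and enhances its reliability. In other words, the clock tree receives the clock signal to be transmitted from the divider and transmits it to the target module. Here the target module may be any functional module in a computer system and/or server that performs specific tasks and needs to be driven by the clock delivered by the clock tree, which can be determined according to the actual application scenario and is not limited here.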
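In some feasible implementations, the clock frequency of the clock signal to be transmitted output by the divider is half of the target frequency. To better transmit this signal to the target module, the sequential logic circuit carried on the clock tree can be implemented with double-edge registers and double-edge gating units, which makes transmission of the half-frequency clock signal on the clock tree practical for engineering use, with higher applicability. Referring to FIG. 5, FIG. 5 is a schematic structural diagram of the clock tree architecture provided by the present application and a traditional clock tree architecture. As shown in FIG. 5, in the traditional clock tree architecture, the clock source generates a high-frequency clock signal and feeds it into a full-frequency clock tree, which transmits the high-frequency clock signal to the modules (such as module 1 and module 2). The sequential logic circuit carried on the full-frequency clock tree is implemented with registers and gating units, and long-distance transmission of the high-frequency clock signal on the full-frequency clock tree puts clock signal integrity at risk. The clock tree architecture provided by the present application converts long-distance transmission of the high-frequency clock signal on a full-frequency clock tree into transmission of a half-frequency clock signal on a half-frequency clock tree. The sequential logic circuit on the half-frequency clock tree can then be implemented with double-edge registers and double-edge gating units, which makes half-frequency clock signal transmission more feasible while saving the power consumed by clock signal transmission.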
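In some feasible implementations, because the half-frequency clock signal travels a relatively long distance on the clock tree and the sequential logic circuits carried by the divided clock are widely distributed, the half-frequency clock usually carries no sequential logic circuits in order to reduce design complexity. If, in an actual application scenario, the functional design does require the half-frequency clock to carry sequential logic circuits, then to keep their logic functions consistent with those under full-frequency clock transmission, the only option is to restore the clock frequency of the sequential logic circuits at their various locations from half frequency to full frequency. However, this approach adds a large number of clock frequency multiplication units to the clock tree architecture, which significantly increases its power consumption and makes long-distance transmission of the half-frequency clock signal impractical for engineering use. In the clock tree architecture provided by the present application, the sequential logic circuit carried on the half-frequency clock tree can be implemented with double-edge registers and double-edge gating units, which avoids adding a large number of clock frequency multiplication units to the clock tree architecture, so applicability is high.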
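In some feasible implementations, in the design implementation flow of an integrated circuit, different models can be used for registers and gating units at different design implementation stages. Therefore, in the design implementation flow of the clock tree architecture provided by the present application, the double-edge registers and double-edge gating units are modeled as two corresponding models: one is the real double-edge register and double-edge gating unit model, and the other is the pseudo-single-edge register and pseudo-single-edge gating unit model. Here, the pseudo-single-edge register and pseudo-single-edge gating unit can be understood as a virtual single-edge register and a virtual single-edge gating unit. They are virtual models (or pseudo models) designed to realize the functions of the real double-edge registers and double-edge gating units, and they are modeled in the same way as traditional registers and gating units. In the specific design implementation flow, the model types that need to be built for traditional registers and gating units include a functional model, a timing model, a physical model, and a scan test model (as shown in the first column of Table 1, which is the modeling relationship table for traditional registers and gating units, double-edge registers and double-edge gating units, and pseudo-single-edge registers and pseudo-single-edge gating units). The models to be built for double-edge registers and double-edge gating units and for pseudo-single-edge registers and pseudo-single-edge gating units, and the way those models are built, are not entirely the same as for traditional registers and gating units, as shown in Table 1 below:
Table 1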
Figure PCTCN2020105288-appb-000001
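As shown in Table 1, among the model types built for traditional registers and gating units, the functional model can be implemented in the industry-standard very-high-speed integrated circuit hardware description language (VHDL) or in Verilog, the timing model can be implemented with a timing library, the physical model can be implemented with a circuit physical model in library exchange format (LEF), and the test model can be implemented with a scan test model. The way each of these models is built for traditional registers and gating units can be determined according to the actual application scenario and is not limited here. For double-edge registers and double-edge gating units, the functional model can likewise be implemented in VHDL or Verilog, the timing model can use the timing modeling method provided by the present application, and no physical model or test model needs to be built (that is, the physical model and the test model are empty). As shown in Table 1, for pseudo-single-edge registers and pseudo-single-edge gating units, the models to be built likewise include a functional model, a timing model, a physical model, and a test model. Similarly, the functional model can be implemented in VHDL or Verilog, and the timing model can be implemented with the industry-standard register description method. The physical model contains only the information needed for physical implementation and no double-edge information, and it can be implemented with the industry-standard physical model description method. The test model is used for test logic generation and test vector generation and can be implemented with a scan test model. The way each of these models is built for pseudo-single-edge registers and pseudo-single-edge gating units can also be determined according to the actual application scenario and is not limited here.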
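In some feasible implementations, referring to FIG. 6, FIG. 6 is a schematic diagram of the timing model of a single-edge register. As shown in FIG. 6, the timing model of a traditional register (that is, a single-edge register) usually includes the following timing information (timing arcs):
1. The setup time between the clock (CLK) and the input data (D) (setup time for short);
2. The hold time between the clock (CLK) and the input data (D) (hold time for short);
3. The output time from the clock (CLK) to the output data (Q) (CLK-to-Q delay, output time for short).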
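Timing checks in traditional digital integrated circuits are mainly setup time checks and hold time checks, which take place between a preceding register (the launch register) and a following register (the capture register). Referring to FIG. 7, FIG. 7 is a schematic diagram of the timing check between a launch register and a capture register. As shown in FIG. 7, the output data terminal Q1 of the launch register is connected to the input data terminal D2 of the capture register. Assuming that both the launch register and the capture register are rising-edge-triggered, the timing check between them includes the setup time check and the hold time check from the launch register to the capture register, shown in FIG. 7 by the arrowed curves labeled setup time and hold time.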
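In some feasible implementations, the timing model of the double-edge register provided by the present application establishes two sets of timing checks, one for each clock edge (rising and falling), and each set includes a setup time check and a hold time check. Referring to FIG. 8, FIG. 8 is a schematic diagram of the timing model of the double-edge register provided by the present application. As shown in FIG. 8, the timing model includes a rising-edge-triggered timing model (dashed lines in FIG. 8) and a falling-edge-triggered timing model (solid lines in FIG. 8), and each of them includes the following timing information:
1. The setup time between the clock (CLK) and the input data (D);
2. The hold time between the clock (CLK) and the input data (D);
3. The output time from the clock (CLK) to the output data (Q).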
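In some feasible implementations, because the industry currently has no timing analysis tool aimed directly at double-edge registers and double-edge gating units, the timing modeling provided by the present application uses the timing analysis tools commonly used in the industry (hereinafter, timing analysis tools) to achieve double-edge timing checks. The justification is as follows:
Assume that the design contains double-edge registers, double-edge gating units, traditional single-edge registers, and traditional single-edge gating units at the same time. An actual design may then contain the following two kinds of timing checks related to double-edge registers:
1) Timing checks between double-edge registers and double-edge gating units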
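Referring to FIG. 9, FIG. 9 is a schematic structural diagram of the timing check between a double-edge register and a double-edge gating unit. As shown in FIG. 9, assume that the launch register is a double-edge register, the capture register is a double-edge register or gating unit, and the output data terminal Q1 of the launch register is connected to the input data terminal D2 of the capture register. Because a double-edge register can be triggered on both the rising edge and the falling edge, the timing check must consider two cases: the clocks of the launch register and the capture register are in phase, or they are inverted. Referring to FIG. 10, FIG. 10 is a schematic diagram of the timing check between a double-edge register and a double-edge gating unit. As shown in part a of FIG. 10, when the launch and capture clocks are in phase, the timing analysis tool performs four different setup/hold timing checks under the above timing model of the double-edge register, drawn with four different line styles. Part b of FIG. 10 shows the correct setup/hold timing checks required for a double-edge register or gating unit. It can be seen that the checks performed by the timing analysis tool cover the correct timing checks required for the double-edge register or gating unit. That is, a traditional timing analysis tool combined with the timing modeling method can perform the timing checks of double-edge registers. The case where the launch and capture clocks are inverted is similar to the in-phase case. Referring to FIG. 11, FIG. 11 is another schematic diagram of the timing check between a double-edge register and a double-edge gating unit. As shown in FIG. 11, the checks performed by the timing analysis tool again cover the correct timing checks required for the double-edge register or gating unit, so for inverted launch and capture clocks a traditional timing analysis tool combined with the timing modeling method can still perform the timing checks of double-edge registers.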
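2) Timing checks between double-edge registers or gating units and single-edge registers or gating units
Checks between a double-edge register or gating unit and a traditional single-edge register also fall into two cases. In the first case, the launch register is a double-edge register and the capture register is a single-edge register, as shown in FIG. 12, which is a schematic diagram of the timing check between a double-edge register and a single-edge register. In the second case, the launch register is a single-edge register and the capture register is a double-edge register or gating unit, as shown in FIG. 13, which is a schematic diagram of the timing check between a single-edge register and a double-edge register/gating unit. It can be seen that, under the above timing model of the double-edge register or gating unit, the setup time/hold time checks performed by the timing analysis tool are exactly the correct timing checks.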
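The argument that a tool checking separate rising and falling arcs covers the real double-edge requirement can be illustrated with a small edge-time calculation. This is a hedged sketch, not the applicant's timing model or any tool's algorithm: the function names, the ideal 50% duty-cycle clock, the picosecond units, and the 2000 ps half-frequency period are all assumptions for illustration, and only setup relations are modeled. For each launch/capture polarity pair it finds the nearest later capture edge, then takes the tightest of the four.

```python
from itertools import product


def edge_times(period_ps, polarity, n_cycles, inverted=False):
    """Edge times (ps) of an ideal 50%-duty clock; 'r' = rising, 'f' = falling."""
    half = period_ps // 2
    rising_offset = half if inverted else 0
    if polarity == "r":
        offset = rising_offset
    else:
        offset = (rising_offset + half) % period_ps
    return [k * period_ps + offset for k in range(n_cycles)]


def setup_relation(period_ps, launch_pol, capture_pol, capture_inverted=False):
    """Time from the first launch edge to the next capture edge of the given polarity."""
    launch = edge_times(period_ps, launch_pol, 1)[0]
    captures = edge_times(period_ps, capture_pol, 3, capture_inverted)
    return min(t - launch for t in captures if t > launch)


def tightest_setup(period_ps, capture_inverted=False):
    """Tightest of the four rise/fall launch-capture combinations."""
    return min(
        setup_relation(period_ps, lp, cp, capture_inverted)
        for lp, cp in product("rf", repeat=2)
    )


HALF_FREQ_PERIOD_PS = 2000  # assumed: half-frequency clock for a 1000 ps target period
print(tightest_setup(HALF_FREQ_PERIOD_PS))
print(tightest_setup(HALF_FREQ_PERIOD_PS, capture_inverted=True))
```

In both the in-phase and the inverted case the tightest relation equals half of the half-frequency period, which is the full-frequency period, so under these assumptions the data path between two double-edge registers has the same setup budget as it would between two single-edge registers on the full-frequency clock.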
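In some feasible implementations, different models of the double-edge register and double-edge gating unit are used at different stages of the design. Referring to FIG. 14, FIG. 14 is a diagram of the correspondence between double-edge registers and double-edge gating units and pseudo-single-edge registers and pseudo-single-edge gating units. For convenience, as shown in FIG. 14, shaded boxes denote the real double-edge register and real double-edge gating unit models, and unshaded boxes denote the pseudo-single-edge register and pseudo-single-edge gating unit models. The following describes how these models are used at each stage of digital integrated circuit design so that double-edge registers and double-edge gating units can be used for the functional logic and DFT logic carried on the clock tree.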
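Referring to FIG. 15, FIG. 15 is a design flowchart of the digital integrated circuit provided by the present application. As shown in FIG. 15, the design flow of a digital integrated circuit includes logic design, functional verification, logic synthesis, formal verification and timing analysis of the logic synthesis, design for testability, formal verification and timing analysis of the design for testability, physical design, test vector generation, and physical verification and test vector verification. The flow proceeds as follows:
1) First, the flow starts from logic function design, for example logic design based on pseudo-single-edge registers.
This part is usually described in the industry-standard Verilog or VHDL language.
2) After the logic design is completed, functional verification is performed on it.
3) After functional verification is completed, logic synthesis is performed.
At this point, the design described in Verilog hardware description language code is converted into logic gates.
4) After logic synthesis, formal verification and timing analysis are performed on the synthesis result, followed by design for testability.
After design for testability, formal verification and timing analysis are performed on the result.
5) After design-for-testability insertion is completed, physical design is performed, and formal verification is performed after the physical design is implemented. The design is finally converted into the final files required by the manufacturing plant. At the same time, test vector generation is performed on the final logic netlist.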
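As shown in FIG. 15, formal verification and timing analysis run through the entire design flow. Formal verification ensures that the function stays consistent across the different stages of design implementation, and timing analysis ensures that the timing of the entire design meets the original design requirements. In addition, netlist functional simulation, DFT vector verification, post-layout simulation, and test vector verification are performed on the netlist to ensure the correctness of the logic functions and test vectors. Physical verification and power integrity analysis ensure the correctness of the physical implementation.
In some feasible implementations, because the industry has no implementation flow for double-edge registers and double-edge gating units, the flow provided by the present application, which uses different double-edge register and double-edge gating unit models at different stages, makes it possible to use double-edge registers and double-edge gating units within the traditional digital integrated circuit design flow.
In the design implementation stages, including logic design based on pseudo-single-edge registers, functional verification, logic synthesis, design for testability, physical design, test vector generation, and formal verification, the timing analysis of each stage uses the pseudo-single-edge register model, so that double-edge registers and double-edge gating units integrate seamlessly with the traditional flow and timing analysis tools.
In the netlist functional simulation, DFT vector verification, test vector verification, physical verification, power integrity analysis, and final timing analysis stages, the real double-edge register and double-edge gating unit models are used.
With the above implementation, the half-frequency clock scheme becomes practical for engineering use without adding frequency multiplier circuits. The above design implementation method for double-edge registers and double-edge gating units is not limited to the sequential logic carried by the half-frequency clock; it can also be extended to general sequential logic design, which can be determined according to the actual application scenario and is not limited here.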
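In some feasible implementations, after the clock tree transmits the half-frequency clock signal to the target module, the target module can restore the half-frequency clock signal to full frequency through a clock frequency multiplier (doubler) with adjustable clock pulse width. In other words, the target module may include a clock pulse width adjustable frequency multiplier, which adjusts the clock frequency of the clock signal to be transmitted from the first clock frequency to the target frequency. Referring to FIG. 16, FIG. 16 is a schematic circuit diagram of a traditional clock frequency multiplication unit. As shown in FIG. 16, the traditional clock frequency multiplication unit is implemented with a delay-XOR unit. However, the pulse width of a frequency multiplier varies with process, voltage, temperature, and other factors. If only the traditional clock frequency multiplication unit is used, different multiplication units drive different numbers of registers, so the clock trees after the multipliers differ in length, and the clock pulse width becomes a reliability risk over long-distance transmission. In addition, if the design must operate over a wide voltage range, the traditional multiplication unit also carries a pulse width risk, so the multiplication unit limits the usable voltage range of the design. The clock frequency multiplication unit with adjustable clock pulse width provided by the present application allows the clock pulse width to be adjusted and has high applicability. Referring to FIG. 17, FIG. 17 is a schematic circuit diagram of the clock pulse width adjustable frequency multiplier provided by the present application. As shown in FIG. 17, the clock pulse width adjustable frequency multiplier includes a delay selection terminal and one or more pulse width adjustment units (two are assumed here), and each pulse width adjustment unit includes multiple buffers or multiple inverters. Based on the input signal of the delay selection terminal, a pulse width adjustment unit is either connected or bypassed. As shown in FIG. 17, the delay is adjusted through the delay selection terminal (DSEL). With two bits, DSEL provides three selectable settings; an actual design can add more settings as needed by increasing the bit width of DSEL. When DSEL is 00, the default setting is selected; when DSEL is 01, the pulse width is increased; when DSEL is 10, the pulse width is decreased.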
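In some feasible implementations, when DSEL is 00, the high pulse width of the multiplied clock is determined by the number of buffers or inverters in the dotted box. That is, the pulse width adjustment unit in the dotted box is active (connected into the clock pulse width adjustable frequency multiplier), while the buffers in the dashed box (the pulse width adjustment unit in the dashed box) are bypassed. This is the default pulse width setting. When DSEL is 01, the buffers or inverters in both the dotted box and the dashed box take effect. That is, both pulse width adjustment units are connected into the clock pulse width adjustable frequency multiplier and are active at the same time, and the high-level pulse width of the multiplied clock reaches its maximum. When DSEL is 10, the buffers or inverters in both the dotted box and the dashed box are bypassed. That is, both pulse width adjustment units are bypassed, and the high-level pulse width of the multiplied clock is at its minimum. Bypassing of the frequency multiplier can be controlled through the EDGE_MODE signal.
The default delay (DSEL = 00, set by the number of buffers or inverters in the dotted box in FIG. 17) is determined by the clock frequency. A common choice is to design the delay to be half a clock cycle at low voltage; it can also be determined by actual design requirements and is not limited here.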
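The delay-XOR principle and the three DSEL settings described above can be sketched behaviourally as follows. This is an illustrative model only, not the circuit of FIG. 17: the delay values (`base`, `default_unit`, `extra_unit`) are hypothetical numbers in sample units, the residual `base` delay that remains when both units are bypassed is an assumption needed to give the narrowest setting a nonzero pulse, and the EDGE_MODE bypass is not modeled.

```python
def pulse_width(dsel, base=1, default_unit=3, extra_unit=2):
    """High pulse width (in samples) selected by the 2-bit DSEL code."""
    if dsel == 0b00:  # default unit connected, second unit bypassed
        return base + default_unit
    if dsel == 0b01:  # both units connected: widest pulse
        return base + default_unit + extra_unit
    if dsel == 0b10:  # both units bypassed: narrowest pulse
        return base
    raise ValueError("DSEL code 11 is not described in the source")


def double_clock(clk, delay):
    """Delay-XOR doubler: XOR the clock with a copy delayed by `delay` samples."""
    delayed = [clk[0]] * delay + clk[:len(clk) - delay]
    return [a ^ b for a, b in zip(clk, delayed)]


def edges(wave):
    """Number of level changes (rising or falling) in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a != b)


def rising_edges(wave):
    """Number of 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if (a, b) == (0, 1))


# Half-frequency clock with a 20-sample period, starting low.
half = [1 if (t % 20) >= 10 else 0 for t in range(100)]

for dsel in (0b00, 0b01, 0b10):
    out = double_clock(half, pulse_width(dsel))
    print(format(dsel, "02b"), rising_edges(out), sum(out))
```

Every edge of the half-frequency clock, rising or falling, produces one output pulse, so the output pulse rate is twice the input clock rate, and the selected delay sets the width of each pulse.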
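Structure 1 of the target module:
Referring to FIG. 18, FIG. 18 is another schematic structural diagram of the clock tree architecture provided by the present application.
In some feasible implementations, as shown in FIG. 18, the clock source (assumed to be a PLL) generates a high-frequency clock, and the frequency is divided by two before the clock enters the clock tree. That is, the high-frequency clock signal generated by the clock source passes through the divide-by-two frequency divider to obtain a half-frequency clock signal, which is input into the clock tree and transmitted to the target module through the clock tree. The registers and gating units in the functional logic and the DFT OCC logic on the circuits driven by the clock tree are double-edge registers and double-edge gating units. The target modules (such as module 1 and module 2) may include frequency multipliers (such as clock pulse width adjustable frequency multipliers), through which the high-frequency clock is restored in module 1 and module 2.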
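Structure 2 of the target module:
Referring to FIG. 19, FIG. 19 is another schematic structural diagram of the clock tree architecture provided by the present application.
In some feasible implementations, as shown in FIG. 19, this is an application scenario that saves even more clock power. In the clock tree architecture shown in FIG. 19, the clock stays in half-frequency mode until it reaches the sequential cells at the ends of the clock tree, where the half-frequency clock is multiplied back to full frequency. The frequency multiplier and register 1 are applicable to target module 1, and the frequency multiplier and register 2 are applicable to target module 2. Because the frequency multiplier is very close to the register it drives, the pulse width of the frequency multiplier can be reduced. Optionally, in the clock tree architecture shown in FIG. 19, the register after the frequency multiplier can be replaced with a latch, which can be determined according to the actual application scenario and is not limited here.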
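Structure 3 of the target module:
Referring to FIG. 20, FIG. 20 is another schematic structural diagram of the clock tree architecture provided by the present application.
In some feasible implementations, because the present application provides a complete implementation scheme for double-edge registers and double-edge gating units, the target module can also use double-edge registers and double-edge gating units directly, without a frequency multiplication unit. Operation is simple and applicability is high. In other words, the target module may include a target double-edge gating unit and a target double-edge register, and the sequential logic circuit of the target module is implemented by the target double-edge gating unit and the target double-edge register. As shown in FIG. 20, the double-edge gating unit and double-edge register 1 are applicable to target module 1, and the double-edge gating unit and double-edge register 2 are applicable to target module 2.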
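Why a double-edge register on the half-frequency clock can stand in for a single-edge register on the full-frequency clock can be sketched with a sampled model. This is an assumption-laden illustration rather than the applicant's cell: the function names and the 4-sample full-frequency period are hypothetical, the half-frequency clock is written out as the waveform a rising-edge divide-by-two stage would produce, and setup/hold timing is ignored.

```python
def single_edge_capture(clk, data):
    """Values of `data` sampled on rising edges of `clk` only."""
    previous = [clk[0]] + clk  # level one sample earlier (no edge at index 0)
    return [d for prev, cur, d in zip(previous, clk, data) if (prev, cur) == (0, 1)]


def dual_edge_capture(clk, data):
    """Values of `data` sampled on every edge of `clk`, rising or falling."""
    previous = [clk[0]] + clk
    return [d for prev, cur, d in zip(previous, clk, data) if prev != cur]


# Full-frequency clock: 4-sample period. Half-frequency clock: 8-sample period,
# toggling wherever the full-frequency clock rises (divide-by-two behaviour).
full_clk = [0, 0, 1, 1] * 8
half_clk = [0, 0, 1, 1, 1, 1, 0, 0] * 4
data = list(range(len(full_clk)))  # a distinct value at every sample

captured_full = single_edge_capture(full_clk, data)
captured_half = dual_edge_capture(half_clk, data)
print(captured_full)
print(captured_half)
```

Both registers capture at the same instants and therefore see the same data sequence, which is the sense in which the target module can realize its function directly from the half-frequency clock without a frequency multiplier.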
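In the present application, the divide-by-two frequency divider converts the high-frequency clock signal generated by the clock source into a half-frequency clock signal, instead of having a traditional clock source such as a PLL output a low-frequency clock directly. This avoids the effect that a PLL output duty cycle other than 1:1 would have on the clocks in subsequent circuits. Transmitting the half-frequency clock signal on the clock tree saves the power consumed by clock signal transmission. At the same time, implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering use, enhances the reliability of clock signal transmission, and offers higher applicability.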
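Referring to FIG. 21, FIG. 21 is a schematic flowchart of the clock signal transmission method provided by the present application. The method is applicable to the divide-by-two frequency divider in the clock tree architecture provided by the present application and includes the following steps:
S210: The divide-by-two frequency divider receives a clock signal from the clock source.
S211: Reduce the target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted.
Here, the first clock frequency is half of the target clock frequency.
S212: Input the clock signal to be transmitted into the clock tree, and transmit the clock signal to be transmitted to the target module through the clock tree.
Here, the sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gating units.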
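In some feasible implementations, the method further includes:
adjusting the clock frequency of the clock signal to be transmitted from the first clock frequency to the target frequency through the clock pulse width adjustable frequency multiplier of the target module.
In some feasible implementations, the method further includes:
receiving the clock signal to be transmitted through the target double-edge gating unit and the target double-edge register included in the target module, so as to implement the sequential logic circuit of the target module through the target double-edge gating unit and the target double-edge register.
In specific implementations, for how each module carries out each of the above steps, refer to the implementations of the corresponding functional modules in the clock tree architecture provided by the present application; details are not repeated here.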
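At the level of frequencies only, steps S210 to S212 and the optional restoration in the target module can be summarized in a short sketch. All names here (`ClockSignal`, the step functions, the module names, and the 2 GHz source frequency) are hypothetical and chosen for illustration; the sketch tracks frequency values and does not model waveforms, the clock tree itself, or timing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockSignal:
    """A clock described only by its frequency in hertz."""
    frequency_hz: float


def s210_receive(source_frequency_hz):
    """S210: the divider receives the clock signal from the clock source."""
    return ClockSignal(source_frequency_hz)


def s211_divide_by_two(clock):
    """S211: reduce the target clock frequency to the first clock frequency."""
    return ClockSignal(clock.frequency_hz / 2)


def s212_transmit(clock, target_modules):
    """S212: deliver the clock signal to be transmitted to each target module."""
    return {name: clock for name in target_modules}


def restore_in_target(clock):
    """Optional step: a frequency multiplier in the target module restores full frequency."""
    return ClockSignal(clock.frequency_hz * 2)


source = s210_receive(2e9)  # assumed 2 GHz target frequency
to_transmit = s211_divide_by_two(source)
delivered = s212_transmit(to_transmit, ["module1", "module2"])
restored = {name: restore_in_target(clk) for name, clk in delivered.items()}
print(to_transmit.frequency_hz, restored["module1"].frequency_hz)
```

In the alternative implementation, where the target module uses double-edge registers and double-edge gating units, the `restore_in_target` step would simply be skipped and the module would consume the delivered half-frequency clock directly.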
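In the present application, the divide-by-two frequency divider converts the high-frequency clock signal generated by the clock source into a half-frequency clock signal, instead of having a traditional clock source such as a PLL output a low-frequency clock directly. This avoids the effect that a PLL output duty cycle other than 1:1 would have on the clocks in subsequent circuits. Transmitting the half-frequency clock signal on the clock tree saves the power consumed by clock signal transmission. At the same time, implementing the sequential logic circuit of the clock tree with double-edge registers and double-edge gating units makes half-frequency clock transmission practical for engineering use, enhances the reliability of clock signal transmission, and offers higher applicability.
In some feasible implementations, the present application further provides a chip that includes the clock tree architecture provided by the present application.
In some feasible implementations, the present application provides an electronic device that includes the clock tree architecture provided by the present application or the above chip.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.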

Claims (10)

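  1. A clock tree architecture, characterized in that the clock tree architecture comprises:
    a clock source, configured to generate a clock signal;
    a divide-by-two frequency divider, configured to reduce a target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted, the first clock frequency being half of the target frequency;
    a clock tree, configured to receive the clock signal to be transmitted and transmit the clock signal to be transmitted to a target module, wherein a sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gating units.
  2. The clock tree architecture according to claim 1, characterized in that the target module comprises a clock pulse width adjustable frequency multiplier, and the clock pulse width adjustable frequency multiplier is configured to adjust the clock frequency of the clock signal to be transmitted from the first clock frequency to the target frequency.
  3. The clock tree architecture according to claim 2, characterized in that the clock pulse width adjustable frequency multiplier comprises a delay selection terminal and at least one pulse width adjustment unit;
    an input signal of the delay selection terminal is used to determine whether to connect the pulse width adjustment unit or bypass the pulse width adjustment unit.
  4. The clock tree architecture according to claim 3, characterized in that the pulse width adjustment unit comprises multiple buffers or multiple inverters.
  5. The clock tree architecture according to claim 1, characterized in that the target module comprises a target double-edge gating unit and a target double-edge register, and a sequential logic circuit of the target module is implemented by the target double-edge gating unit and the target double-edge register.
  6. A clock signal transmission method, characterized in that the clock signal transmission method is applicable to the divide-by-two frequency divider of the clock tree architecture according to any one of claims 1 to 5, and the method comprises:
    receiving a clock signal from the clock source;
    reducing a target clock frequency of the clock signal generated by the clock source to a first clock frequency to obtain a clock signal to be transmitted, the first clock frequency being half of the target clock frequency;
    inputting the clock signal to be transmitted into a clock tree, and transmitting the clock signal to be transmitted to a target module through the clock tree, wherein a sequential logic circuit of the clock tree is implemented with double-edge registers and double-edge gating units.
  7. The method according to claim 6, characterized in that the method further comprises:
    adjusting the clock frequency of the clock signal to be transmitted from the first clock frequency to the target frequency through a clock pulse width adjustable frequency multiplier of the target module.
  8. The method according to claim 6, characterized in that the method further comprises:
    receiving the clock signal to be transmitted through a target double-edge gating unit and a target double-edge register included in the target module, so as to implement a sequential logic circuit of the target module through the target double-edge gating unit and the target double-edge register.
  9. A chip, characterized in that the chip comprises the clock tree architecture according to any one of claims 1 to 5.
  10. An electronic device, characterized in that the electronic device comprises the clock tree architecture according to any one of claims 1 to 5 or the chip according to claim 9.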
PCT/CN2020/105288 2020-07-28 2020-07-28 Clock tree architecture, clock signal transmission method and device WO2022021091A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
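CN202080104699.XA CN116209968A (zh) 2020-07-28 2020-07-28 Clock tree architecture, clock signal transmission method and device
PCT/CN2020/105288 WO2022021091A1 (zh) 2020-07-28 2020-07-28 Clock tree architecture, clock signal transmission method and device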

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/105288 WO2022021091A1 (zh) 2020-07-28 2020-07-28 Clock tree architecture, clock signal transmission method and device

Publications (1)

Publication Number Publication Date
WO2022021091A1 true WO2022021091A1 (zh) 2022-02-03

Family

ID=80036965

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105288 WO2022021091A1 (zh) 2020-07-28 2020-07-28 Clock tree architecture, clock signal transmission method and device

Country Status (2)

Country Link
CN (1) CN116209968A (zh)
WO (1) WO2022021091A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
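CN101539958A (zh) * 2008-03-18 2009-09-23 北京芯慧同用微电子技术有限责任公司 Design method and apparatus for a standard cell library and an integrated circuit
US20090327792A1 (en) * 2008-06-27 2009-12-31 Intel Corporation Bus frequency adjustment circuitry for use in a dynamic random access memory device
CN102047340A (zh) * 2008-05-28 2011-05-04 美光科技公司 Apparatus and method for multi-phase clock generation
CN102831273A (zh) * 2012-08-30 2012-12-19 锐迪科科技有限公司 Design method for a digital integrated circuit containing double-edge-triggered flip-flops
CN110827872A (zh) * 2018-08-14 2020-02-21 三星电子株式会社 Delay-locked loop circuit, semiconductor memory device, and method of operating the circuit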

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JI, YUNFEI: "Research and Preparation of Epoxy Resins Toughened by Dimer Acid Diglycidyl Ester", DISCOURSE OF THE FIFTEENTH NATIONWIDE SYMPOSIUM OF EPOXY RESIN APPLIED TECHNOLOGY(THE THIRTEENTH NATIONWIDE SYMPOSIUM OF E.R.A.T.HUAZHONG INSTITUTION, 31 May 2012 (2012-05-31), XP055892689, [retrieved on 20220216] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
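CN114818595A (zh) * 2022-06-24 2022-07-29 飞腾信息技术有限公司 Chip module interface clock construction method, apparatus, storage medium, and electronic device
CN114818595B (zh) * 2022-06-24 2022-09-13 飞腾信息技术有限公司 Chip module interface clock construction method, apparatus, storage medium, and electronic device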

Also Published As

Publication number Publication date
CN116209968A (zh) 2023-06-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20947719

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20947719

Country of ref document: EP

Kind code of ref document: A1