WO2023098064A1 - Clock skew-adjustable chip clock architecture of programmable logic chip - Google Patents

Clock skew-adjustable chip clock architecture of programmable logic chip

Info

Publication number
WO2023098064A1
Authority
WO
WIPO (PCT)
Prior art keywords
clock
delay
chip
regional
adjustment unit
Prior art date
Application number
PCT/CN2022/102672
Other languages
English (en)
French (fr)
Inventor
匡晨光
张艳飞
陈波寅
范继聪
Original Assignee
无锡中微亿芯有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 无锡中微亿芯有限公司 filed Critical 无锡中微亿芯有限公司
Priority to US17/955,581 priority Critical patent/US20230016311A1/en
Publication of WO2023098064A1 publication Critical patent/WO2023098064A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/10 Distribution of clock signals, e.g. skew
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/39 Circuit design at the physical level
    • G06F30/396 Clock trees

Definitions

  • The invention relates to the field of clock design, and in particular to a clock skew-adjustable chip clock architecture of a programmable logic chip.
  • Clock trees are often used in the design of programmable logic chips.
  • The current clock tree structure of programmable logic chips is a hierarchical clock design.
  • Fishbone routing is usually used to implement the clock structure, as shown in FIG. 1.
  • This structure is simple, has a clear hierarchy, and fits well with programmable resources.
  • Small clock skew is an important metric and goal of clock tree design, but the existing fishbone clock architecture has large clock skew, which affects overall performance. Moreover, clock skew grows with the scale of the programmable logic chip. When the number of clock regions in the chip grows to a certain scale, the clock skew reaches an unacceptable value.
  • General ASIC clock tree synthesis can reduce clock skew, but because it does not plan the clock hierarchy, partitioning, placement, and routing in advance, it cannot be integrated with the other programmable logic units. It is therefore ill-suited to placing and routing clocks per design and to making clock resources programmable, so it cannot be applied to the clock tree design of programmable logic chips to optimize clock skew.
  • The existing fishbone clock architecture has large clock skew, which affects overall performance.
  • Although general ASIC clock tree synthesis can reduce clock skew, it cannot be applied to the clock tree design of a programmable logic chip to optimize clock skew.
  • In view of the above problems and technical requirements, the inventors propose a clock skew-adjustable chip clock architecture of a programmable logic chip.
  • The technical solution of the present invention is as follows:
  • A clock skew-adjustable chip clock architecture of a programmable logic chip includes a global clock and several regional clocks. The clock input of the global clock is connected to a clock source, the clock outputs of the global clock are connected to the clock inputs of the respective regional clocks, and each regional clock is connected to the clock loads of its corresponding chip region and provides a clock signal.
  • A delay adjustment unit is provided in the path of at least one regional clock. The delay adjustment unit includes several parallel delay paths with different delay values, and gates one of them according to the obtained configuration signal so that the connected regional clock has the corresponding target delay. The target delay of each regional clock corresponds to the clock skew working mode of the programmable logic chip.
  • In a further technical solution, the clock skew working mode of the programmable logic chip includes at least one of a zero-skew working mode, a leading-skew working mode, and a lagging-skew working mode.
  • The zero-skew working mode is the mode in which the clock signals of all clock loads, whatever their distance from the clock source, have the same phase. The leading-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal leads in phase. The lagging-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal lags in phase.
  • In a further technical solution, the delay adjustment unit parses a configuration bitstream input from outside the programmable logic chip to obtain the configuration signal; or, the chip clock architecture also includes a configuration signal generation unit implemented with resources inside the programmable logic chip, which generates a configuration signal corresponding to the clock skew working mode of the programmable logic chip and supplies it to the delay adjustment unit.
  • In a further technical solution, the configuration signal generation unit generates the configuration signal of the delay adjustment unit connected to a regional clock according to the layout positions of the clock loads in each chip region and the clock signals they receive, combined with the target delay of the corresponding regional clock.
  • In a further technical solution, the delay adjustment unit also includes an enable terminal. On receiving an enable signal at the active level, it gates one of the delay paths according to the obtained configuration signal; on receiving an enable signal at the inactive level, it turns off all delay paths.
  • The enable signal of a delay adjustment unit is at the active level when the resources of the chip region corresponding to the connected regional clock are in use, and at the inactive level when those resources are not in use.
  • In a further technical solution, any i-th delay path of a delay adjustment unit includes one gating buffer cascaded with i delay buffers, where i is a parameter whose starting value is 0. According to the obtained configuration signal, the delay adjustment unit turns on one gating buffer and turns off the others, so that the delay path containing the turned-on gating buffer is gated.
  • In a further technical solution, the (i+1)-th delay path differs in delay from the i-th delay path by the delay of one delay buffer, and the delays of all delay buffers in the delay adjustment unit are equal and match the delay T_RE of the global clock between two adjacent chip regions.
  • In a further technical solution, the global clock is a longitudinal clock routed vertically and each regional clock is a transverse clock routed horizontally. The delay T_RE of the global clock between two adjacent chip regions is the delay of the global clock over the vertical height of the chip region corresponding to one regional clock. Since the chip regions corresponding to the regional clocks all have the same vertical height, every individual delay buffer in every delay adjustment unit produces the same delay.
  • In a further technical solution, the global clock is connected to the regional clocks through delay adjustment units, and the global clock is connected to at least one regional clock through several cascaded stages of delay adjustment units.
  • The present application discloses a clock skew-adjustable chip clock architecture of a programmable logic chip.
  • The architecture uses delay adjustment units that each contain multiple delay paths with different delay values, and gates the appropriate delay path in each unit.
  • This gives the connected regional clocks different delays and so adjusts the clock skew between regional clocks, allowing the chip's clock skew to be adjusted over a relatively large range.
  • Under the same resource configuration, different path selections in the delay adjustment units also produce different clock skews, meeting the different clock skew working modes required in different application scenarios.
  • FIG. 1 is a schematic structural diagram of an existing clock architecture with a fishbone structure.
  • FIG. 2 is a schematic structural diagram of the chip clock architecture of the present application.
  • FIG. 3 is a schematic diagram of the internal structure of the delay adjustment unit.
  • FIG. 4 is a schematic diagram of the configuration gating of the chip clock architecture in one example.
  • Referring to FIG. 2, the chip clock architecture includes a global clock and several regional clocks; the clock input of the global clock is connected to the clock source (root clk).
  • The clock outputs of the global clock are connected to the clock inputs of the respective regional clocks.
  • Each regional clock is connected to the clock loads of its corresponding chip region and provides a clock signal.
  • A clock load is a resource inside the programmable logic chip that needs a clock signal, such as a register.
  • Optionally, in one embodiment, the global clock is a longitudinal clock routed vertically and each regional clock is a transverse clock routed horizontally.
  • In another embodiment, as shown in FIG. 2, each regional clock is implemented with a unidirectional fishbone structure.
  • A delay adjustment unit is provided in the path of at least one regional clock. Optionally, a delay adjustment unit is provided in the path of every regional clock, or only in the paths of some regional clocks. FIG. 2 takes as an example seven regional clocks, each with a delay adjustment unit in its path; the seven units are denoted cell0~cell6.
  • In one embodiment, the global clock is connected to the regional clock through the delay adjustment unit, that is, the delay adjustment unit sits at the junction of the global clock and the regional clock. The rest of this application uses this arrangement as the example.
  • In another embodiment, depending on application requirements, the delay adjustment unit is placed elsewhere in the path of the regional clock.
  • The delay adjustment unit includes several parallel delay paths with different delay values. As shown in FIG. 3, any i-th delay path of a delay adjustment unit includes one gating buffer BUF0 cascaded with i delay buffers BUF1, where i is a parameter whose starting value is 0. The (i+1)-th delay path therefore has one more delay buffer BUF1 than the i-th delay path, and its delay is larger by the delay of that added BUF1.
  • The delays of any two delay buffers BUF1 in a delay adjustment unit may be equal or unequal. In one embodiment, all delay buffers BUF1 in the unit have equal delays, so that every (i+1)-th delay path differs from the i-th delay path by the same amount and the unit produces a set of delay values in arithmetic progression.
  • Further, in another embodiment, the delay of each delay buffer BUF1 matches the delay T_RE of the global clock between two adjacent chip regions, so that the clock skew can be adjusted effectively.
  • As shown in FIG. 2, in an architecture where the global clock is longitudinal and the regional clocks are transverse, T_RE is the delay of the global clock over the vertical height of the chip region corresponding to one regional clock. For ease of scaling, the chip regions corresponding to the regional clocks all have the same vertical height, so T_RE is the same between any two adjacent chip regions.
  • Different delay adjustment units may sit at the same or different positions in the paths of their regional clocks, and in general every individual delay buffer in every delay adjustment unit produces the same delay.
  • Each delay adjustment unit adopts the structure shown in FIG. 3. Any two delay adjustment units may contain the same or a different number of delay paths, and may produce the same or different sets of delay values.
  • Each delay adjustment unit receives a configuration signal; these are denoted Flag0~Flag6 in FIG. 2.
  • During operation, the delay adjustment unit gates one of the delay paths according to the obtained configuration signal so that the connected regional clock has the corresponding target delay.
  • Specifically, according to the obtained configuration signal, the delay adjustment unit turns on one gating buffer BUF0 and turns off the other gating buffers BUF0, so that the delay path containing the turned-on BUF0 is gated.
  • As shown in FIG. 3, the configuration signal specifically includes the S signal and the SN signal of each gating buffer.
  • When the configuration signal has too many configuration bits or there are too many regions, delay adjustment units can be connected in series into multiple stages to save configuration resources while achieving the same effect; that is, the global clock is connected to at least one regional clock through several cascaded stages of delay adjustment units.
  • The delay adjustment unit can thus adjust the delay of a chip region's regional clock as needed to reach the required target delay.
  • The target delay of each regional clock corresponds to the clock skew working mode of the programmable logic chip, so adjusting every regional clock to its target delay puts the chip in the required clock skew working mode.
  • The clock skew working mode of the programmable logic chip of the present application includes at least one of the zero-skew working mode, the leading-skew working mode, and the lagging-skew working mode, as follows:
  • (1) The zero-skew working mode is the mode in which the clock signals of all clock loads, whatever their distance from the clock source, have the same phase.
  • For example, in the structure of FIG. 2, the chip clock architecture includes a global clock and seven regional clocks, regional clocks 0~6. Regional clocks 0~2 lie above the horizontal position of the clock source, at gradually decreasing distance from it; regional clock 3 lies at the same horizontal position as the clock source; regional clocks 4~6 lie below it, at gradually increasing distance. The delay of the global clock between any two adjacent chip regions is T_RE.
  • Assume each of the seven delay adjustment units cell0~cell6 has eight delay paths R0~R7 with delays R0 = 0, R1 = T_RE, and so on up to R7 = 7T_RE. Gating R0 inside cell0 and cell6, R1 inside cell1 and cell5, R2 inside cell2 and cell4, and R3 inside cell3 compensates for the vertical transmission delay of the different regional clocks, so that the chip's clock skew is unaffected by the number of regions and stays at a relatively small value, and the chip works in the zero-skew working mode.
  • (2) The leading-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal leads in phase.
  • Based on the example in case (1), gating R0 inside cell0 and cell6, R2 inside cell1 and cell5, R4 inside cell2 and cell4, and R6 inside cell3 realizes the leading-skew working mode.
  • (3) The lagging-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal lags in phase.
  • Based on the example in case (1), gating R0 inside all seven delay adjustment units cell0~cell6 realizes the lagging-skew working mode.
  • Within any one clock skew working mode, the specific target delay of each regional clock can also vary, and so can the configuration of the corresponding delay adjustment unit, which makes the scheme flexible and highly adjustable.
  • The delay adjustment unit obtains its configuration signal in two main ways:
  • (1) The delay adjustment unit parses a configuration bitstream input from outside the programmable logic chip to obtain the configuration signal. That is, the user configures the delay adjustment unit through an external configuration bitstream to produce the required delay and realize the required clock skew working mode.
  • (2) The chip clock architecture also includes a configuration signal generation unit implemented with resources inside the programmable logic chip, which generates a configuration signal corresponding to the clock skew working mode of the programmable logic chip and supplies it to the delay adjustment unit. Specifically, the configuration signal generation unit generates the configuration signal of each delay adjustment unit according to the layout position of the predetermined feedback terminal of each regional clock and the clock signal fed back, combined with the target delay of each regional clock.
  • There are two typical choices of predetermined feedback terminal. (1) The input of each regional clock's delay adjustment unit is used as the predetermined feedback terminal; in the architecture where the delay adjustment unit sits at the junction of the global clock and the regional clock, the signal fed back is then the input signal that the global clock provides to that regional clock. (2) The clock terminal of one of the clock loads connected to the regional clock is used as the predetermined feedback terminal.
  • However the predetermined feedback terminal is chosen, the configuration signal generation unit can generate the configuration signal of each delay adjustment unit by combining the feedback with the target delay of each regional clock.
  • For example, in FIG. 2, denote the clock signal at the input of cell0 as T0 and the clock signal at the input of cell3 as T1. The layout positions of cell0 and cell3 show that cell0 is farther from the clock source than cell3.
  • Configuration signals for cell0 and cell3 can therefore be generated that gate delay path R0, with a smaller delay, inside cell0 and delay path R6, with a larger delay, inside cell3, so that the two clock signals have the same phase.
  • One clock tree may correspond to one configuration signal generation unit, which obtains the layout position of the predetermined feedback terminal of each regional clock and the clock signal fed back, and generates the configuration signals of the delay adjustment units connected to the regional clocks. Alternatively, each delay adjustment unit may have its own configuration signal generation unit, which obtains the layout position and fed-back clock signal of the predetermined feedback terminal of its own regional clock and of the other regional clocks, and generates the configuration signal for its corresponding delay adjustment unit.
  • Optionally, in another embodiment, with low power in mind and because clock resources are used differently in different application scenarios, the delay adjustment unit further includes an enable terminal EN for receiving an enable signal.
  • The enable signal is generated according to the resource usage of the chip region corresponding to the regional clock connected to the delay adjustment unit; resources inside the programmable logic chip can perform this operation.
  • When the resources of that chip region are in use, an enable signal at the active level is generated. On receiving it, the delay adjustment unit follows the process above and gates one of the delay paths according to the obtained configuration signal to adjust delay and clock skew.
  • When the resources of that chip region are not in use, an enable signal at the inactive level is generated. On receiving it, the delay adjustment unit turns off all delay paths, reducing the chip's power consumption.
  • For example, based on the structure of FIG. 2 and as shown in FIG. 4, suppose a scenario uses only the clock loads in the chip regions of regional clocks 1, 2, and 3 and needs the zero-skew working mode. Delay adjustment units cell0, cell4, cell5, and cell6 then receive enable signals at the inactive level (assumed here to be low) and are all turned off.
  • Delay adjustment units cell1, cell2, and cell3 receive enable signals at the active level; according to their configuration signals, cell1 gates its internal delay path R1, cell2 gates R2, and cell3 gates R3.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)
  • Pulse Circuits (AREA)

Abstract

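The present invention discloses a clock skew-adjustable chip clock architecture of a programmable logic chip and relates to the field of clock design. A delay adjustment unit is provided in the path of at least one regional clock of the chip clock architecture, and the delay adjustment unit includes several parallel delay paths with different delay values. The delay adjustment unit gates one of the delay paths according to the obtained configuration signal so that the connected regional clock has the corresponding target delay, and the target delay of each regional clock corresponds to the clock skew working mode of the programmable logic chip. By controlling which delay path is gated in each delay adjustment unit, the clock skew between different regional clocks can be adjusted, so that the chip's clock skew can be adjusted over a relatively large range. Under the same resource configuration, different path selections in the delay adjustment units also produce different clock skews, meeting the different clock skew working modes required in different application scenarios.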

Description

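Clock skew-adjustable chip clock architecture of programmable logic chip
Technical Field
The present invention relates to the field of clock design, and in particular to a clock skew-adjustable chip clock architecture of a programmable logic chip.
Background Art
Clock trees are often used in the design of programmable logic chips. The current clock tree structure of programmable logic chips is a hierarchical clock design, usually implemented with fishbone routing as shown in FIG. 1. This structure is simple, has a clear hierarchy, and fits well with programmable resources.
Small clock skew is an important metric and goal of clock tree design, but the existing fishbone clock architecture has large clock skew, which affects overall performance. Clock skew also grows with the scale of the programmable logic chip: when the number of clock regions in the chip grows to a certain scale, the clock skew reaches an unacceptable value. General ASIC clock tree synthesis can reduce clock skew, but because it does not plan the clock hierarchy, partitioning, placement, and routing in advance, it cannot be integrated with the other programmable logic units. It is therefore ill-suited to placing and routing clocks per design and to making clock resources programmable, so it cannot be applied to the clock tree design of programmable logic chips to optimize clock skew.
Technical Problem
The existing fishbone clock architecture has large clock skew, which affects overall performance. Although general ASIC clock tree synthesis can reduce clock skew, it cannot be applied to the clock tree design of programmable logic chips to optimize clock skew.
Technical Solution
In view of the above problems and technical requirements, the inventors propose a clock skew-adjustable chip clock architecture of a programmable logic chip. The technical solution of the present invention is as follows: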
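A clock skew-adjustable chip clock architecture of a programmable logic chip includes a global clock and several regional clocks. The clock input of the global clock is connected to a clock source, the clock outputs of the global clock are connected to the clock inputs of the respective regional clocks, and each regional clock is connected to the clock loads of its corresponding chip region and provides a clock signal.
A delay adjustment unit is provided in the path of at least one regional clock. The delay adjustment unit includes several parallel delay paths with different delay values, and gates one of them according to the obtained configuration signal so that the connected regional clock has the corresponding target delay. The target delay of each regional clock corresponds to the clock skew working mode of the programmable logic chip.
In a further technical solution, the clock skew working mode of the programmable logic chip includes at least one of a zero-skew working mode, a leading-skew working mode, and a lagging-skew working mode.
The zero-skew working mode is the mode in which the clock signals of all clock loads, whatever their distance from the clock source, have the same phase. The leading-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal leads in phase. The lagging-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal lags in phase.
In a further technical solution, the delay adjustment unit parses a configuration bitstream input from outside the programmable logic chip to obtain the configuration signal; or, the chip clock architecture also includes a configuration signal generation unit implemented with resources inside the programmable logic chip, which generates a configuration signal corresponding to the clock skew working mode of the programmable logic chip and supplies it to the delay adjustment unit.
In a further technical solution, the configuration signal generation unit generates the configuration signal of the delay adjustment unit connected to a regional clock according to the layout positions of the clock loads in each chip region and the clock signals they receive, combined with the target delay of the corresponding regional clock.
In a further technical solution, the delay adjustment unit also includes an enable terminal. On receiving an enable signal at the active level, it gates one of the delay paths according to the obtained configuration signal; on receiving an enable signal at the inactive level, it turns off all delay paths.
The enable signal of a delay adjustment unit is at the active level when the resources of the chip region corresponding to the connected regional clock are in use, and at the inactive level when those resources are not in use.
In a further technical solution, any i-th delay path of a delay adjustment unit includes one gating buffer cascaded with i delay buffers, where i is a parameter whose starting value is 0. According to the obtained configuration signal, the delay adjustment unit turns on one gating buffer and turns off the others, so that the delay path containing the turned-on gating buffer is gated.
In a further technical solution, the (i+1)-th delay path differs in delay from the i-th delay path by the delay of one delay buffer, and the delays of all delay buffers in the delay adjustment unit are equal and match the delay T_RE of the global clock between two adjacent chip regions.
In a further technical solution, the global clock is a longitudinal clock routed vertically and each regional clock is a transverse clock routed horizontally. The delay T_RE of the global clock between two adjacent chip regions is the delay of the global clock over the vertical height of the chip region corresponding to one regional clock. Since the chip regions corresponding to the regional clocks all have the same vertical height, every individual delay buffer in every delay adjustment unit produces the same delay.
In a further technical solution, any two delay adjustment units contain the same or a different number of delay paths, and produce the same or different sets of delay values.
In a further technical solution, the global clock is connected to the regional clocks through delay adjustment units, and the global clock is connected to at least one regional clock through several cascaded stages of delay adjustment units.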
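Beneficial Effects
The present application discloses a clock skew-adjustable chip clock architecture of a programmable logic chip. The architecture uses delay adjustment units that each contain multiple delay paths with different delay values. Gating the appropriate delay path gives the connected regional clocks different delays and so adjusts the clock skew between regional clocks, allowing the chip's clock skew to be adjusted over a relatively large range. Under the same resource configuration, different path selections in the delay adjustment units also produce different clock skews, meeting the different clock skew working modes required in different application scenarios.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an existing clock architecture with a fishbone structure.
FIG. 2 is a schematic structural diagram of the chip clock architecture of the present application.
FIG. 3 is a schematic diagram of the internal structure of the delay adjustment unit.
FIG. 4 is a schematic diagram of the configuration gating of the chip clock architecture in one example.
Embodiments of the Invention
Specific embodiments of the present invention are further described below with reference to the drawings.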
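The present application discloses a clock skew-adjustable chip clock architecture of a programmable logic chip. Referring to the schematic diagram in FIG. 2, the chip clock architecture includes a global clock and several regional clocks, and the clock input of the global clock is connected to the clock source (root clk). The clock outputs of the global clock are connected to the clock inputs of the respective regional clocks. Each regional clock is connected to the clock loads of its corresponding chip region and provides a clock signal; a clock load is a resource inside the programmable logic chip that needs a clock signal, such as a register. Optionally, in one embodiment, the global clock is a longitudinal clock routed vertically and each regional clock is a transverse clock routed horizontally. In another embodiment, as shown in FIG. 2, each regional clock is implemented with a unidirectional fishbone structure.
A delay adjustment unit is provided in the path of at least one regional clock. Optionally, a delay adjustment unit is provided in the path of every regional clock, or only in the paths of some regional clocks. FIG. 2 takes as an example seven regional clocks, each with a delay adjustment unit in its path; the seven units are denoted cell0~cell6. In one embodiment, the global clock is connected to the regional clock through the delay adjustment unit, that is, the delay adjustment unit sits at the junction of the global clock and the regional clock; the rest of this application uses this arrangement as the example. In another embodiment, depending on application requirements, the delay adjustment unit is placed elsewhere in the path of the regional clock.
The delay adjustment unit includes several parallel delay paths with different delay values. As shown in FIG. 3, any i-th delay path of a delay adjustment unit includes one gating buffer BUF0 cascaded with i delay buffers BUF1, where i is a parameter whose starting value is 0. The (i+1)-th delay path therefore has one more delay buffer BUF1 than the i-th delay path, and its delay is larger by the delay of that added BUF1. The delays of any two delay buffers BUF1 in a delay adjustment unit may be equal or unequal. In one embodiment, all delay buffers BUF1 in the unit have equal delays, so that every (i+1)-th delay path differs from the i-th delay path by the same amount and the unit produces a set of delay values in arithmetic progression.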
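The path structure described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the class name `DelayAdjustUnit`, the picosecond values, and the choice to give every gating buffer BUF0 the same delay are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class DelayAdjustUnit:
    """Behavioral model: path i = one gating buffer BUF0 + i delay buffers BUF1."""

    num_paths: int            # number of parallel delay paths R0..R(num_paths-1)
    buf1_delay: float         # delay of one BUF1 (hypothetical value, e.g. in ps)
    buf0_delay: float = 0.0   # BUF0 delay, assumed identical on every path

    def path_delay(self, i: int) -> float:
        """Delay of path Ri under the equal-BUF1 embodiment."""
        if not 0 <= i < self.num_paths:
            raise ValueError(f"path index {i} out of range")
        return self.buf0_delay + i * self.buf1_delay


unit = DelayAdjustUnit(num_paths=8, buf1_delay=50.0)
delays = [unit.path_delay(i) for i in range(unit.num_paths)]
# Difference between consecutive paths: always one BUF1 delay.
steps = [later - earlier for earlier, later in zip(delays, delays[1:])]
print(delays)
print(steps)
```

With equal BUF1 delays the list of path delays forms the arithmetic progression the text describes; unequal BUF1 delays would need a per-buffer list instead of a single `buf1_delay`.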
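Further, in another embodiment, the delay of each delay buffer BUF1 in the delay adjustment unit matches the delay T_RE of the global clock between two adjacent chip regions, so that the clock skew can be adjusted effectively. As shown in FIG. 2, in an architecture where the global clock is longitudinal and the regional clocks are transverse, T_RE is the delay of the global clock over the vertical height of the chip region corresponding to one regional clock. For ease of scaling, the chip regions corresponding to the regional clocks all have the same vertical height, so T_RE is the same between any two adjacent chip regions.
Different delay adjustment units may sit at the same or different positions in the paths of their regional clocks, and in general every individual delay buffer in every delay adjustment unit produces the same delay. Each delay adjustment unit adopts the structure shown in FIG. 3. Any two delay adjustment units may contain the same or a different number of delay paths, and may produce the same or different sets of delay values.
Each delay adjustment unit receives a configuration signal; these are denoted Flag0~Flag6 in FIG. 2. During operation of the chip clock architecture, the delay adjustment unit gates one of the delay paths according to the obtained configuration signal so that the connected regional clock has the corresponding target delay. Specifically, according to the obtained configuration signal, the delay adjustment unit turns on one gating buffer BUF0 and turns off the other gating buffers BUF0, so that the delay path containing the turned-on BUF0 is gated. As shown in FIG. 3, the configuration signal specifically includes the S signal and the SN signal of each gating buffer. When the configuration signal has too many configuration bits or there are too many regions, delay adjustment units can be connected in series into multiple stages to save configuration resources while achieving the same effect; that is, the global clock is connected to at least one regional clock through several cascaded stages of delay adjustment units.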
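The one-hot gating described above can be illustrated with a short sketch. The text only names the S and SN signals of each gating buffer; treating S as active-high and SN as its complement is an assumption made here, and the function name `gating_signals` is hypothetical.

```python
def gating_signals(selected: int, num_paths: int) -> list[tuple[int, int]]:
    """Return one (S, SN) pair per gating buffer BUF0.

    Assumption: S = 1 turns a buffer on and SN is the complement of S.
    Exactly one buffer is turned on, so exactly one delay path is gated.
    """
    if not 0 <= selected < num_paths:
        raise ValueError(f"path index {selected} out of range 0..{num_paths - 1}")
    return [(1, 0) if i == selected else (0, 1) for i in range(num_paths)]


signals = gating_signals(selected=3, num_paths=8)
for i, (s, sn) in enumerate(signals):
    print(f"R{i}: S={s} SN={sn}")
```

An encoding like this needs one S/SN pair per path, which is why a unit with many paths consumes many configuration bits and why the text suggests cascading smaller units when configuration resources are tight.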
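The delay adjustment unit can thus adjust the delay of a chip region's regional clock as needed to reach the required target delay. The target delay of each regional clock corresponds to the clock skew working mode of the programmable logic chip, so adjusting the regional clock of every chip region to its target delay adjusts the clock skew of the programmable logic chip and puts the chip in the required clock skew working mode.
The clock skew working mode of the programmable logic chip of the present application includes at least one of the zero-skew working mode, the leading-skew working mode, and the lagging-skew working mode, as follows:
(1) The zero-skew working mode is the mode in which the clock signals of all clock loads, whatever their distance from the clock source, have the same phase.
For example, in the structure of FIG. 2, the chip clock architecture includes a global clock and seven regional clocks, regional clocks 0~6. Regional clocks 0~2 lie above the horizontal position of the clock source, at gradually decreasing distance from it; regional clock 3 lies at the same horizontal position as the clock source; regional clocks 4~6 lie below it, at gradually increasing distance. The delay of the global clock between any two adjacent chip regions is T_RE.
Assume delay adjustment units cell0~cell6 are placed between the seven regional clocks and the global clock, that the delay paths inside the seven units have identical structure, and that each unit has eight delay paths R0~R7 with delays R0 = 0, R1 = T_RE, R2 = 2T_RE, R3 = 3T_RE, R4 = 4T_RE, R5 = 5T_RE, R6 = 6T_RE, and R7 = 7T_RE.
Then gating R0 inside cell0 and cell6, R1 inside cell1 and cell5, R2 inside cell2 and cell4, and R3 inside cell3 compensates for the vertical transmission delay of the different regional clocks, so that the chip's clock skew is unaffected by the number of regions and stays at a relatively small value, and the chip works in the zero-skew working mode.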
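The zero-skew selection for the FIG. 2 example can be checked with a few lines of arithmetic. This is a minimal sketch, assuming an idealized model in which the only delays are the global clock's T_RE per region crossed and the gated path's multiple of T_RE; the names `global_delay` and `zero_skew_paths` are hypothetical.

```python
T_RE = 1            # global-clock delay across one chip region (arbitrary unit)
SOURCE_REGION = 3   # regional clock 3 is level with the clock source
NUM_REGIONS = 7     # regional clocks 0..6, served by cell0..cell6


def global_delay(region: int) -> int:
    """Delay of the global clock from the source to the given region."""
    return abs(region - SOURCE_REGION) * T_RE


def zero_skew_paths() -> list[int]:
    """Path index per cell that pads every region up to the farthest one."""
    farthest = max(global_delay(r) for r in range(NUM_REGIONS))
    return [(farthest - global_delay(r)) // T_RE for r in range(NUM_REGIONS)]


paths = zero_skew_paths()
# Total delay seen at each regional clock: global delay plus gated path delay.
arrivals = [global_delay(r) + paths[r] * T_RE for r in range(NUM_REGIONS)]
print(paths)
print(arrivals)
```

Under this model the selected paths are R0, R1, R2, R3, R2, R1, R0 for cell0 through cell6, and every regional clock sees the same total delay of 3 T_RE.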
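(2) The leading-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal leads in phase.
Based on the example in case (1), gating R0 inside cell0 and cell6, R2 inside cell1 and cell5, R4 inside cell2 and cell4, and R6 inside cell3 realizes the leading-skew working mode.
(3) The lagging-skew working mode is the mode in which the farther a clock load is from the clock source, the more its clock signal lags in phase.
Based on the example in case (1), gating R0 inside all seven delay adjustment units cell0~6 realizes the lagging-skew working mode.
Within any one clock skew working mode, the specific target delay of each regional clock can also vary, and so can the configuration of the corresponding delay adjustment unit, which makes the scheme flexible and highly adjustable. For example, in case (3) above, gating R1 inside all seven delay adjustment units cell0~6 also realizes the lagging-skew working mode, but the target delays of the seven regional clocks then all differ from those obtained with R0.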
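The three working modes can be compared by computing, for a given path selection, the total delay at each regional clock and checking how it changes with distance from the clock source. The sketch below uses the same idealized FIG. 2 model (T_RE per region crossed, path Ri adding i times T_RE); the function names and the "mixed" fallback label are assumptions for illustration.

```python
T_RE = 1            # global-clock delay across one chip region (arbitrary unit)
SOURCE_REGION = 3   # regional clock 3 is level with the clock source


def arrivals(paths: list[int]) -> list[int]:
    """Total delay per regional clock: global delay plus gated path delay."""
    return [abs(r - SOURCE_REGION) * T_RE + p * T_RE for r, p in enumerate(paths)]


def skew_mode(paths: list[int]) -> str:
    """Classify a selection by how arrival time varies with distance."""
    per_distance: dict[int, set[int]] = {}
    for region, t in enumerate(arrivals(paths)):
        per_distance.setdefault(abs(region - SOURCE_REGION), set()).add(t)
    if any(len(times) != 1 for times in per_distance.values()):
        return "mixed"  # regions at equal distance disagree
    times = [next(iter(per_distance[d])) for d in sorted(per_distance)]
    pairs = list(zip(times, times[1:]))
    if all(near == far for near, far in pairs):
        return "zero"
    if all(near > far for near, far in pairs):
        return "leading"   # farther regions receive the edge earlier
    if all(near < far for near, far in pairs):
        return "lagging"   # farther regions receive the edge later
    return "mixed"


print(skew_mode([0, 1, 2, 3, 2, 1, 0]))
print(skew_mode([0, 2, 4, 6, 4, 2, 0]))
print(skew_mode([0] * 7))
print(skew_mode([1] * 7))
```

In this model the R0/R2/R4/R6 selection makes farther regions arrive earlier, while gating the same path (R0 or R1) in every cell leaves only the global clock's own delay, so farther regions arrive later.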
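The delay adjustment unit obtains its configuration signal in two main ways:
(1) The delay adjustment unit parses a configuration bitstream input from outside the programmable logic chip to obtain the configuration signal. That is, the user configures the delay adjustment unit through an external configuration bitstream to produce the required delay and realize the required clock skew working mode.
(2) The chip clock architecture also includes a configuration signal generation unit implemented with resources inside the programmable logic chip, which generates a configuration signal corresponding to the clock skew working mode of the programmable logic chip and supplies it to the delay adjustment unit. Specifically, the configuration signal generation unit generates the configuration signal of each delay adjustment unit according to the layout position of the predetermined feedback terminal of each regional clock and the clock signal fed back, combined with the target delay of each regional clock.
There are two typical choices of predetermined feedback terminal. (1) The input of each regional clock's delay adjustment unit is used as the predetermined feedback terminal; in the architecture where the delay adjustment unit sits at the junction of the global clock and the regional clock, the signal fed back is then the input signal that the global clock provides to that regional clock. (2) The clock terminal of one of the clock loads connected to the regional clock is used as the predetermined feedback terminal.
However the predetermined feedback terminal is chosen, the configuration signal generation unit can generate the configuration signal of each delay adjustment unit by combining the feedback with the target delay of each regional clock. For example, in FIG. 2, denote the clock signal at the input of cell0 as T0 and the clock signal at the input of cell3 as T1; the layout positions of cell0 and cell3 show that cell0 is farther from the clock source than cell3. Configuration signals for cell0 and cell3 can therefore be generated that gate delay path R0, with a smaller delay, inside cell0 and delay path R6, with a larger delay, inside cell3, so that the two clock signals have the same phase.
One clock tree may correspond to one configuration signal generation unit, which obtains the layout position of the predetermined feedback terminal of each regional clock and the clock signal fed back, and generates the configuration signals of the delay adjustment units connected to the regional clocks. Alternatively, each delay adjustment unit may have its own configuration signal generation unit, which obtains the layout position and fed-back clock signal of the predetermined feedback terminal of its own regional clock and of the other regional clocks, and generates the configuration signal for its corresponding delay adjustment unit.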
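One way to picture the configuration signal generation unit is as a function from fed-back delays and a target delay to a path index per cell. The sketch below is a software analogy only, not the on-chip implementation: the function name `generate_config`, the representation of feedback as a measured delay per cell, and the numeric values are all hypothetical.

```python
def generate_config(
    feedback_delay: dict[str, int],
    target_delay: int,
    step: int,
    num_paths: int,
) -> dict[str, int]:
    """Pick, per cell, the path index that brings its clock to target_delay.

    feedback_delay: delay observed at each cell's predetermined feedback
                    terminal, in the same unit as step (hypothetical input).
    step:           delay added by one BUF1 delay buffer.
    """
    config = {}
    for cell, measured in feedback_delay.items():
        index, remainder = divmod(target_delay - measured, step)
        if remainder != 0 or not 0 <= index < num_paths:
            raise ValueError(f"{cell}: target delay not reachable")
        config[cell] = index
    return config


# Hypothetical measurements: cell0's input lags cell3's input by 6 steps.
config = generate_config({"cell0": 6, "cell3": 0}, target_delay=6, step=1, num_paths=8)
print(config)
```

With these assumed numbers the farther cell gets the short path R0 and the nearer cell gets the long path R6, so both reach the same total delay. A single generator per clock tree would call this once with all cells; a per-cell generator would compute only its own entry.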
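Optionally, in another embodiment, with low power in mind and because clock resources are used differently in different application scenarios, the delay adjustment unit further includes an enable terminal EN for receiving an enable signal. The enable signal is generated according to the resource usage of the chip region corresponding to the regional clock connected to the delay adjustment unit; resources inside the programmable logic chip can perform this operation. When the resources of that chip region are in use, an enable signal at the active level is generated, and on receiving it the delay adjustment unit follows the process above and gates one of the delay paths according to the obtained configuration signal to adjust delay and clock skew. When the resources of that chip region are not in use, an enable signal at the inactive level is generated, and on receiving it the delay adjustment unit turns off all delay paths, reducing the chip's power consumption.
For example, based on the structure of FIG. 2 and as shown in FIG. 4, suppose a scenario uses only the clock loads in the chip regions of regional clocks 1, 2, and 3, leaves the clock loads of the other chip regions unused, and needs the zero-skew working mode. Delay adjustment units cell0, cell4, cell5, and cell6 then receive enable signals at the inactive level (assumed here to be low) and are all turned off. Delay adjustment units cell1, cell2, and cell3 receive enable signals at the active level; according to their configuration signals, cell1 gates its internal delay path R1, cell2 gates R2, and cell3 gates R3.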
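The FIG. 4 scenario can be reproduced with the same idealized delay model, adding the enable terminal. This is a sketch under assumptions: a disabled unit is represented by `None` (no clock delivered), and the function name `regional_arrival` is hypothetical.

```python
from typing import Optional

T_RE = 1            # global-clock delay across one chip region (arbitrary unit)
SOURCE_REGION = 3   # regional clock 3 is level with the clock source


def regional_arrival(region: int, path: int, enabled: bool) -> Optional[int]:
    """Total delay at a regional clock, or None when EN is inactive.

    With EN inactive the unit turns off every delay path, so the region
    receives no clock at all.
    """
    if not enabled:
        return None
    return abs(region - SOURCE_REGION) * T_RE + path * T_RE


used_regions = {1, 2, 3}        # only these chip regions have resources in use
paths = {1: 1, 2: 2, 3: 3}      # cell1 -> R1, cell2 -> R2, cell3 -> R3
result = {
    r: regional_arrival(r, paths.get(r, 0), r in used_regions) for r in range(7)
}
print(result)
```

Regions 1 to 3 all end up with a total delay of 3 T_RE, which is the zero-skew condition, while cell0 and cell4 to cell6 stay off.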
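The above are only preferred embodiments of the present application, and the present invention is not limited to the above embodiments. It should be understood that other improvements and variations that those skilled in the art derive directly or conceive of without departing from the spirit and concept of the present invention shall all be regarded as falling within the scope of protection of the present invention.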

Claims (10)

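  1. A clock skew-adjustable chip clock architecture of a programmable logic chip, characterized in that the chip clock architecture comprises: a global clock and several regional clocks, wherein the clock input of the global clock is connected to a clock source, the clock outputs of the global clock are connected to the clock inputs of the respective regional clocks, and each regional clock is connected to the clock loads of its corresponding chip region and provides a clock signal;
    a delay adjustment unit is provided in the path of at least one regional clock, and the delay adjustment unit comprises several parallel delay paths with different delay values; the delay adjustment unit gates one of the delay paths according to the obtained configuration signal so that the connected regional clock has the corresponding target delay, and the target delay of each regional clock corresponds to the clock skew working mode of the programmable logic chip.
  2. The chip clock architecture according to claim 1, characterized in that the clock skew working mode of the programmable logic chip comprises at least one of a zero-skew working mode, a leading-skew working mode, and a lagging-skew working mode;
    wherein the zero-skew working mode is the clock skew working mode in which the clock signals of all clock loads at different distances from the clock source have the same phase; the leading-skew working mode is the clock skew working mode in which the farther a clock load is from the clock source, the more its clock signal leads in phase; and the lagging-skew working mode is the clock skew working mode in which the farther a clock load is from the clock source, the more its clock signal lags in phase.
  3. The chip clock architecture according to claim 1, characterized in that
    the delay adjustment unit parses a configuration bitstream input from outside the programmable logic chip to obtain the configuration signal;
    or, the chip clock architecture further comprises a configuration signal generation unit implemented with resources inside the programmable logic chip, and the configuration signal generation unit generates a configuration signal corresponding to the clock skew working mode of the programmable logic chip and supplies it to the delay adjustment unit.
  4. The chip clock architecture according to claim 3, characterized in that the configuration signal generation unit generates the configuration signal of each delay adjustment unit according to the layout position of the predetermined feedback terminal of each regional clock and the clock signal fed back, combined with the target delay of each regional clock.
  5. The chip clock architecture according to claim 1, characterized in that the delay adjustment unit further comprises an enable terminal, gates one of the delay paths according to the obtained configuration signal on receiving an enable signal at the active level, and turns off all delay paths on receiving an enable signal at the inactive level;
    wherein the enable signal of the delay adjustment unit is at the active level when the resources of the chip region corresponding to the regional clock connected to the delay adjustment unit are in use, and at the inactive level when those resources are not in use.
  6. The chip clock architecture according to any one of claims 1 to 5, characterized in that any i-th delay path of a delay adjustment unit comprises one gating buffer cascaded with i delay buffers, where i is a parameter whose starting value is 0; the delay adjustment unit, according to the obtained configuration signal, turns on one gating buffer and turns off the other gating buffers, so that the delay path containing the turned-on gating buffer is gated.
  7. The chip clock architecture according to claim 6, characterized in that the (i+1)-th delay path differs in delay from the i-th delay path by the delay of one delay buffer, and the delays of all delay buffers in the delay adjustment unit are equal and match the delay T_RE of the global clock between two adjacent chip regions.
  8. The chip clock architecture according to claim 7, characterized in that the global clock is a longitudinal clock routed vertically, each regional clock is a transverse clock routed horizontally, the delay T_RE of the global clock between two adjacent chip regions is the delay of the global clock over the vertical height of the chip region corresponding to one regional clock, and the chip regions corresponding to the regional clocks all have the same vertical height, so that every individual delay buffer in every delay adjustment unit produces the same delay.
  9. The chip clock architecture according to claim 1, characterized in that any two delay adjustment units contain the same or a different number of delay paths, and any two delay adjustment units produce the same or different sets of delay values.
  10. The chip clock architecture according to claim 1, characterized in that the global clock is connected to the regional clocks through delay adjustment units, and the global clock is connected to at least one regional clock through several cascaded stages of delay adjustment units.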
PCT/CN2022/102672 2021-12-03 2022-06-30 Clock skew-adjustable chip clock architecture of programmable logic chip WO2023098064A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/955,581 US20230016311A1 (en) 2021-12-03 2022-09-29 Clock skew-adjustable chip clock architecture of progarmmable logic chip

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111470014.6A CN114167943A (zh) 2021-12-03 2021-12-03 Clock skew-adjustable chip clock architecture of programmable logic chip
CN202111470014.6 2021-12-03

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/955,581 Continuation US20230016311A1 (en) 2021-12-03 2022-09-29 Clock skew-adjustable chip clock architecture of progarmmable logic chip

Publications (1)

Publication Number Publication Date
WO2023098064A1 true WO2023098064A1 (zh) 2023-06-08

Family

ID=80482908

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/102672 WO2023098064A1 (zh) 2021-12-03 2022-06-30 Clock skew-adjustable chip clock architecture of programmable logic chip

Country Status (2)

Country Link
CN (1) CN114167943A (zh)
WO (1) WO2023098064A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117075684A (zh) * 2023-10-16 2023-11-17 中诚华隆计算机技术有限公司 Adaptive clock grid calibration method for Chiplet chip

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114167943A (zh) * 2021-12-03 2022-03-11 无锡中微亿芯有限公司 Clock skew-adjustable chip clock architecture of programmable logic chip

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08274602A (ja) * 1995-03-31 1996-10-18 Ando Electric Co Ltd Variable delay circuit
US20080141061A1 (en) * 2006-12-12 2008-06-12 Chan Yuen H Programmable local clock buffer capable of varying initial settings
US20100117705A1 (en) * 2008-11-11 2010-05-13 Nec Electronics Corporation Semiconductor integrated circuit device having plural delay paths and controller capable of Blocking signal transmission in delay path
CN105786087A (zh) * 2016-02-23 2016-07-20 无锡中微亿芯有限公司 Method for reducing clock skew for programmable device
CN107453736A (zh) * 2016-05-27 2017-12-08 台湾积体电路制造股份有限公司 Delay circuit, and method of powering and operating delay element
CN114167943A (zh) * 2021-12-03 2022-03-11 无锡中微亿芯有限公司 Clock skew-adjustable chip clock architecture of programmable logic chip

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7864625B2 (en) * 2008-10-02 2011-01-04 International Business Machines Corporation Optimizing SRAM performance over extended voltage or process range using self-timed calibration of local clock generator
KR20130125036A (ko) * 2012-05-08 2013-11-18 삼성전자주식회사 System on chip, method of operating the same, and system including the same

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117075684A (zh) * 2023-10-16 2023-11-17 中诚华隆计算机技术有限公司 Adaptive clock grid calibration method for Chiplet chip
CN117075684B (zh) * 2023-10-16 2023-12-19 中诚华隆计算机技术有限公司 Adaptive clock grid calibration method for Chiplet chip

Also Published As

Publication number Publication date
CN114167943A (zh) 2022-03-11

Similar Documents

Publication Publication Date Title
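WO2023098064A1 (zh) Clock skew-adjustable chip clock architecture of programmable logic chip
WO2020248318A1 (zh) Bidirectional adaptive clock circuit supporting wide frequency range
TWI416302B (zh) Power-mode-aware clock tree and synthesis method thereof
US7315594B2 (en) Clock data recovering system with external early/late input
US6720810B1 (en) Dual-edge-correcting clock synchronization circuit
CN103516359B (zh) Clock generation circuit and semiconductor device including clock generation circuit
US20080127003A1 (en) Opposite-phase scheme for peak current reduction
CN100541385C (zh) Device and method for generating synchronous frequency-divided clock in digital television modulator chip
JP2010200090A (ja) Clock synchronization circuit for phase compensation
WO2020140782A1 (zh) Analog-to-digital converter and clock generation circuit thereof
CN101615912A (zh) Parallel-to-serial converter and implementation method thereof
TWI544305B (zh) Clock tree in circuit, and synthesis method and operation method thereof
JP4642417B2 (ja) Semiconductor integrated circuit device
JP2005100269A (ja) Semiconductor integrated circuit
US9467152B2 (en) Output circuit
JP2000124795A (ja) Digital DLL circuit
US20230016311A1 (en) Clock skew-adjustable chip clock architecture of progarmmable logic chip
CN111381654B (zh) Load detection circuit, SoC system, and configuration method of load detection circuit
JP3508762B2 (ja) Frequency divider circuit
US6351168B1 (en) Phase alignment system
JPH0865173A (ja) Parallel-to-serial conversion circuit
JP5580763B2 (ja) Semiconductor integrated circuit
US6373302B1 (en) Phase alignment system
US7319348B2 (en) Circuits for locally generating non-integral divided clocks with centralized state machines
JP5639740B2 (ja) DLL circuit and control method thereof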

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22899884

Country of ref document: EP

Kind code of ref document: A1