TWI244015B - A clock tree synthesizing tool synchronously considering low clock skew and low power consumption - Google Patents

A clock tree synthesizing tool synchronously considering low clock skew and low power consumption

Info

Publication number
TWI244015B
Authority
TW
Taiwan
Prior art keywords
clock
buffer
power consumption
tree
clock tree
Prior art date
Application number
TW93124824A
Other languages
Chinese (zh)
Other versions
TW200428240A (en
Inventor
Wu-Shiung Feng
Ming-Hong Lai
Jau-Kai Jang
Original Assignee
Univ Chang Gung
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Chang Gung
Priority to TW93124824A
Publication of TW200428240A
Application granted
Publication of TWI244015B

Landscapes

  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The invention proposes a method that applies buffer insertion, removal, swapping, and shifting to quickly determine the type and location of every buffer in a clock tree. The method fits into the clock tree synthesis design flow, so that the whole design meets the design rules of the cell library and the clock skew meets its constraint. The computation tool takes three inputs: the clock tree structure to be processed, including the information for each wire and the buffer insertion locations; a specific component library giving characteristics such as input load, power consumption, clock delay, and output signal transition time; and an initial state configuration. The tool first determines whether the clock tree structure is a feasible solution that complies with the design specification; if it is not, load balancing and buffer balancing are applied to obtain a feasible solution. The tool then quickly determines the buffer types so as to reduce the power consumption of the clock tree, and finally applies an optimization technique incorporating simulated annealing to seek the globally optimal buffer combination with minimum power consumption.

Description

IX. Description of the Invention:

[Technical Field of the Invention]

The present invention relates to computer-aided design for synthesizing low-power clock tree circuits in high-speed very-large-scale integrated (VLSI) circuits, and in particular to a technique that inserts, removes, swaps, and shifts the buffers on a clock tree in order to reduce both the power consumed by the clock tree and its clock skew.

[Prior Art]

In a synchronous system the clock frequency serves as an indicator of data processing speed. While signals are being transmitted, signal integrity must be preserved as far as possible; that is, the design must satisfy the following timing design specifications: clock delay, clock skew, and the design rules of the cell library. Clock delay is the longest time the clock signal takes to travel from the clock source to any receiver of the synchronous system. Shortening the clock delay speeds up clock distribution and therefore allows a higher operating frequency. Clock skew is the difference in path delay from the clock source to any two receivers in the clock tree. If the skew is too large, the clock signals arriving at the receivers are no longer synchronized, which can cause signal distortion and logic malfunction. In addition, each buffer has allowable upper and lower limits on its input signal transition time and on its output load. When the input transition time or the output load falls outside those limits, the buffer's behavior becomes unpredictable, so every buffer and flip-flop on the clock tree must operate within the design rules.

With the rapid progress of deep-submicron VLSI process technology, signal delay on the interconnect has become far larger than the delay in the components themselves. For clock tree synthesis, conventional practice reduces wire delay in two ways. The first is to make the clock tree wires as short as possible: the longer a wire, the larger its delay and the more power is consumed when the signal switches on it. In VLSI layout design, shortening wire length is generally the first consideration for reducing delay.

The second is to insert buffers at suitable positions along the wires. A buffer divides a wire into segments and reduces wire delay. Because a buffer isolates capacitance, it also reduces the equivalent load of each segment, so the signal charges faster and the rise time drops, which shortens the wire delay. From the power standpoint, the shorter input transition time also lowers the energy consumed by the components. Figure 1 is an expanded view of a clock tree network. The input pad of the clock signal is called the root, and the clock input of each flip-flop is called a leaf. A complete path runs from the pad (root) to the clock input of a flip-flop (leaf) and includes the buffers and connecting wires along the way; the accumulated buffer and wire delay is the path delay. The difference between the path delays of two root-to-leaf paths in the figure (one of them ends at the leaf labeled F1.3) is one clock skew value. Inserting buffers at suitable positions in the clock tree network not only reduces delay; by choosing different buffer types, and thereby changing the delay seen at each receiver, it can also reduce the clock skew between the paths of the clock tree.

Figure 2 is the layout design flow of a conventional technique (a U.S. patent whose number is only partly legible in the source as 5,...22). Traditional layout design performs only placement (54) and routing (62). A logic synthesis tool produces a netlist describing the logic gates and their connections. Placement (54) then puts the standard cells that represent the various logic gates in the netlist at selected positions on the chip; the distance between positions is normally tied to the connectivity of the elements in the netlist, with the aim of shortening the later routing.

Routing (62) then makes the actual connections according to the placement result. Current designs add clock tree synthesis steps, beginning with step (56), to the layout flow. Step (56) synthesizes and routes an optimized clock tree network and therefore modifies part of the logic-synthesized netlist. Step (56) also includes a hardware-description analysis processor that, according to the given timing design specification, analyzes the netlist for the most suitable buffer insertion points; the positions of those buffers relative to the other elements serve as the physical design reference for buffer insertion in the next step.

Because the standard cells describing the logic gates have already been placed in step (54), inserting buffers disturbs the elements that are already in place, so the following step must readjust the placement. Moreover, the inserted buffers divide the wires that originally ran from the clock input to each receiver, so the clock tree routing completed in step (56) must be modified. In general, to keep the clock delay minimal, the buffers are given priority when positions are readjusted. A further step computes the path delay from the clock input to each receiver. If the timing design specification is met, the physical routing of all elements (62) proceeds; otherwise the flow returns to readjust the element placement and the physical routing of the clock tree network.

In the clock tree synthesis flow of Figure 2, however, every inserted buffer is of the same type. This ignores the fact that in a real design the standard-cell library offers several buffer types to choose from, which gives more flexibility in meeting the timing design specification; this is the main drawback of that flow. Choosing the type of each buffer on the clock tree appropriately makes it possible to adjust the path delays, reduce the clock skew until it meets the timing design specification, and lower the power consumption of the clock tree at the same time. Buffer types in a standard-cell library normally differ in area and in drive strength. A larger buffer can drive a larger load than a smaller one and speeds up the signal rise time at the load. A larger buffer, however, also has a larger delay of its own and consumes more power. Finding a compromise between clock delay and power consumption is one of the main concerns of the algorithm of the present invention.

Suppose that n buffer types are available and that m buffers must be inserted on the clock tree network. An exhaustive search has to examine all n^m buffer combinations (n choices for each of the m buffers) to find the one that gives the minimum clock delay and meets the clock skew limit. When buffer insertion and buffer type selection are considered together, building the clock tree network requires repeated circuit-level simulations driven by the timing analysis results, which costs a large amount of computation time in the design flow. For a standard-cell VLSI design tool it is therefore very important, during clock tree synthesis, to find a suitable set of buffer types quickly from a given buffer timing library, so as to cut simulation time, meet the clock skew timing specification, and reach the lowest power consumption.

Among conventional techniques, A. Vittal and M. Marek-Sadowska ("Low-Power Buffered Clock Tree Design," IEEE Trans. on CAD of Integrated CAS, Vol. 16, No. 9, pp. 965-975, 1997) apply buffer insertion and construct an H-tree structure to reach the best solution for area and power consumption. J. L. Neves and E. G. Friedman ("Design Methodology for Synthesizing Clock Distribution Networks Exploiting Nonzero Localized Clock Skew," IEEE Trans. on VLSI Systems, Vol. 4, No. 2, pp. 286-291, 1996) apply buffer insertion to reduce the clock skew of the clock tree. S. Pullela, N. Menezes, and L. T. Pillage ("Low Power IC Clock Tree Design," Proceedings of the IEEE Custom Integrated Circuits Conference 1995, pp. 263-266) apply buffer insertion to reduce the variation between buffers and to reduce short-circuit current, avoiding extra current consumption. K. M. Carrig et al. ("A New Direction in ASIC High-Performance Clock Methodology," Proceedings of the IEEE Custom Integrated Circuits Conference 1998, pp. 593-596) apply buffer insertion and remove redundant routing to reduce noise and power consumption. All of these algorithms use a single buffer type, which does not match the real circuit design flow. They also make no use of removing, swapping, or shifting buffers on the clock tree, so they cannot reach the buffer combination with the best timing and power consumption for the clock tree wiring.

Several U.S. patents address clock tree synthesis as well. US6696863 inserts buffers in clusters to obtain equal clock delays. Other patents, whose numbers are illegible in the source, apply hierarchical buffering to approach zero clock skew, apply balancing to lower the clock skew, balance each level of the tree so that it meets the design specification, or choose the depth of an H-tree to reduce power consumption. These patents rely on inserting a single buffer type or on segmenting wires to reduce the power consumption or the clock skew of the clock tree. None of them offers an effective algorithm that handles several buffer types at once and applies buffer insertion, removal, swapping, and shifting to reduce both clock skew and power consumption.

The present inventors previously filed a Republic of China patent application for a circuit synthesis tool that quickly determines buffer types and meets the timing design specification. That tool targets clock tree synthesis in high-speed VLSI circuits and chooses the type of each buffer on the clock tree so as to reach minimum clock delay and satisfy the clock skew timing specification. Its only buffer operation, however, is swapping, and it performs no optimization analysis of the effect on power consumption.
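As a concrete reading of the two timing quantities defined in the prior-art discussion above, the following minimal sketch computes clock delay (the longest root-to-leaf path delay) and clock skew (the largest difference between any two root-to-leaf path delays) for a small tree. The `Node` class, the node names, and the delay values are illustrative assumptions and are not taken from the patent; each node's `delay` is assumed to already include the wire feeding it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A clock-tree element: the root pad, a buffer, or a flip-flop leaf (hypothetical model)."""
    name: str
    delay: float = 0.0  # assumed: element delay plus the delay of the wire feeding it (ns)
    children: list = field(default_factory=list)


def leaf_delays(node, acc=0.0):
    """Return {leaf name: accumulated root-to-leaf path delay}."""
    total = acc + node.delay
    if not node.children:
        return {node.name: total}
    out = {}
    for child in node.children:
        out.update(leaf_delays(child, total))
    return out


def clock_delay_and_skew(root):
    """Clock delay is the longest path delay; clock skew is the spread between paths."""
    delays = list(leaf_delays(root).values())
    return max(delays), max(delays) - min(delays)


# Made-up example: two buffers under the root, three flip-flop leaves.
tree = Node("root", 0.0, [
    Node("B1", 0.30, [Node("FF1", 0.10), Node("FF2", 0.12)]),
    Node("B2", 0.35, [Node("FF3", 0.10)]),
])
delay, skew = clock_delay_and_skew(tree)
```

In this made-up tree the slowest leaf is FF3 and the fastest is FF1, so the skew is the difference between those two path delays.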

[Summary of the Invention]

The object of the present invention is to provide a method that inserts, removes, swaps, and shifts buffers on a clock tree in order to determine quickly the position and type of every buffer. The method fits into the existing clock tree synthesis flow, quickly determines buffer positions and types so as to lower the power consumed on the clock tree, and brings the clock skew within the timing design specification. In the clock tree synthesis flow, buffer positions and types are decided after logic synthesis in the traditional chip design flow. The information available at that point comprises the clock tree netlist, the delay of each wire in the clock tree, the buffer delays, and the buffer power consumption. After quickly determining the position and type of each buffer, the invention outputs a clock tree netlist with the lowest power consumption that satisfies the clock skew design specification.

The proposed method achieves low-power clock tree synthesis because it decides the position and type of each buffer efficiently. Its features are: (1) the invention builds a complete data structure that records the relation between each buffer and its preceding and following stages, so that in every optimization pass the clock delay and power consumption of each buffer can be computed in linear time; (2) to meet the clock skew specification, the invention adjusts the buffers on each level of the clock tree using an empirical rule relating buffer type to load, aiming at minimum power consumption under a strict clock skew limit; (3) the optimization uses a heuristic instead of exhaustive search, which saves time in finding the best solution. The clock tree holds a finite number of buffers, but their positions are not restricted. After the heuristic has decided the buffer positions, simulated annealing (SA) is applied to obtain a local optimal solution, and load balancing is then applied to obtain the global optimal solution; (4) within the limits of the available equipment, the software tool developed for the invention can handle high-speed VLSI circuits with a million logic gates.

In one specific embodiment, the low-power method for quickly determining buffer positions and types comprises the following steps: (1) input the clock tree netlist to be processed, including the information for each wire and the buffer insertion positions, together with the buffer library and the flip-flop library on which the program bases its calculations; (2) set the initial state; (3) determine whether the circuit structure satisfies the clock skew design specification and whether the circuit would violate the limits of the component library in operation; (4) if a specification is violated, insert, remove, swap, and shift buffers to obtain a feasible solution for the clock tree; (5) apply the heuristic algorithm together with simulated annealing to the feasible solution to find the best solution, which has minimum power consumption and satisfies the design specification.

[Embodiments]

As stated above, the object of the invention is a computation tool that quickly determines the position and type of every buffer in the clock tree, so that the power consumed by the clock tree is minimal and the clock skew meets the design specification. Because a real buffer library offers only a limited number of buffer types, the simplest way to solve this optimization problem is exhaustive search: swap the type of each buffer in the clock tree one by one, compare the power consumption and clock skew of each combination, and finally pick the combination of buffer types with the lowest power consumption that meets the clock skew timing specification. The main drawbacks of this approach are: (1) it takes a large amount of computation time; (2) it does not cover buffer insertion, removal, or shifting, so the program has difficulty reaching the best solution.
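The exhaustive baseline described above can be sketched as follows. This is a toy model under stated assumptions, not the patent's tool: the three-entry `LIBRARY`, its delay and power numbers, and the rule that a buffer's delay depends only on its type (ignoring load and input transition time) are all hypothetical simplifications made so that the n^m enumeration is easy to see.

```python
from itertools import product

# Hypothetical library: type name -> (delay in ns, power in mW).
LIBRARY = {"X1": (0.40, 1.0), "X4": (0.30, 2.5), "X8": (0.25, 4.0)}


def exhaustive_search(paths, skew_limit):
    """Try every type assignment and keep the lowest-power one that meets the skew limit.

    paths: root-to-leaf paths, each given as a list of buffer indices 0..m-1.
    Returns ((power, assignment) or None, number of assignments tried).
    """
    m = 1 + max(i for path in paths for i in path)
    best, tried = None, 0
    for combo in product(LIBRARY, repeat=m):  # n**m assignments in total
        tried += 1
        delays = [sum(LIBRARY[combo[i]][0] for i in path) for path in paths]
        if max(delays) - min(delays) > skew_limit:
            continue  # violates the clock skew limit
        power = sum(LIBRARY[t][1] for t in combo)
        if best is None or power < best[0]:
            best = (power, combo)
    return best, tried


# One path holds a single buffer (0); the other holds two buffers (1 and 2).
best, tried = exhaustive_search([[0], [1, 2]], skew_limit=0.1)
```

With 3 types and 3 buffers the loop visits 27 assignments; the count grows as n^m, which is why the text turns to a heuristic. In this example the skew limit forces the two-buffer path onto the fastest type, so the cheapest feasible assignment is not the one with the smallest buffers everywhere.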

For these reasons, the invention provides a tool that quickly decides how buffers in the clock tree are inserted, removed, swapped, and shifted, and that achieves the following: (1) an original clock tree design that violates the design specification is adjusted by buffer insertion, removal, swapping, and shifting until a feasible solution is obtained; (2) a heuristic algorithm based on load balancing efficiently determines the combination of buffer positions and types that has low power consumption and minimum clock delay and that meets the clock skew design specification. Figure 3 is a system block diagram of a specific embodiment of the invention. The inputs are the clock tree netlist of the embodiment, the upper limit of the clock skew design specification, and the power and timing library containing the buffers and flip-flops. A first step determines whether the design specification and the clock skew limit are met. If they are violated, the next step adjusts the clock tree so that it meets the design specification. Otherwise the flow proceeds to swap buffer types in order to lower the power consumed by the clock tree quickly. Because this swap may break the design specification, a following step checks whether the clock tree still complies. If it does not, the flow restores the previous state; if it does, the flow continues with the optimization algorithm that reduces power consumption, after which the low-power clock tree synthesis algorithm of the invention ends.

Power Analysis of a Specific Library

In the low-power clock tree synthesis flow, analyzing the power consumption of the buffers in a particular design library gives the algorithm a basis for its decisions. Figure 4 is the power consumption analysis, for a specific buffer library, of the clock tree structure of Figure 5. In this library the buffer type CLKBUFXL has the smallest input load, the smallest power consumption, and the smallest drive strength; CLKBUFX1 has the second smallest input load, power consumption, and drive strength; and CLKBUFX20 has the largest input load, the largest power consumption, and the largest drive strength. The clock tree of Figure 5 has three levels, A, B, and C. Level A is the main driving buffer, of fixed type CLKBUFX8. Level B consists of second-stage buffers whose type and number may vary; these are the X-axis and Y-axis variables of Figure 4. Level C consists of the load flip-flops, of fixed type and fixed number, whose load can be estimated as the flip-flop input capacitance plus the average wire load. The level-B buffers, up to a fixed maximum number, are driven by the level-A buffer, and the level-C flip-flops are divided evenly among the level-B buffers. The power shown in Figure 4 is the total power consumed by the clock tree, comprising the operating power of the buffers, the operating power of the flip-flops, and the power consumed when the signal switches on the wires. A power of 0 indicates that the clock tree structure violates the library's design rules and is therefore not considered. When the input transition time or output load of a buffer cannot be read directly from the library by table look-up, the power values at the four surrounding table corners are looked up from the known input transition times and load capacitances, and polynomial interpolation is then used to approximate the buffer's power consumption. The invention uses an empirical equation with four constants A, B, C, and D: the four constants are solved from the four corners, and the actual input transition time and output load are then substituted to obtain the result. The power plot shows a concave distribution. When there are few level-B buffers, the larger load raises the input transition time of the flip-flops and so increases the power consumed by the flip-flops. When the level-B buffers are large, the input transition time drops, but the power consumed by the buffers rises slightly. This analysis of the power consumption of a specific library allows the invention to adopt a different clock tree buffer selection strategy for design libraries with different characteristics.
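The four-corner interpolation described above can be sketched as follows. The source's empirical equation is illegible, so the bilinear form P(x, y) = A + B*x + C*y + D*x*y used here is an assumption, chosen because it has exactly four constants that four table corners determine uniquely. The function names and the corner values are hypothetical.

```python
def fit_bilinear(x0, x1, y0, y1, p00, p01, p10, p11):
    """Solve P(x, y) = A + B*x + C*y + D*x*y through four table corners.

    x: input transition time, y: output load; pij is the table value at (x_i, y_j).
    The bilinear form itself is an assumption, not confirmed by the source.
    """
    dx, dy = x1 - x0, y1 - y0
    d = (p00 - p01 - p10 + p11) / (dx * dy)
    b = (p10 - p00) / dx - d * y0
    c = (p01 - p00) / dy - d * x0
    a = p00 - b * x0 - c * y0 - d * x0 * y0
    return a, b, c, d


def evaluate(coeffs, x, y):
    """Substitute the actual transition time and load into the fitted equation."""
    a, b, c, d = coeffs
    return a + b * x + c * y + d * x * y


# Made-up corner values: transition times in ns, loads in pF, power in arbitrary units.
coeffs = fit_bilinear(0.1, 0.5, 0.01, 0.05, p00=1.0, p01=2.0, p10=1.4, p11=2.8)
mid_power = evaluate(coeffs, 0.3, 0.03)
```

A quick sanity check on the fit is that it reproduces each corner value, and that the value at the center of the cell equals the average of the four corners, which holds for any bilinear surface.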

Clock Skew and Clock Delay Considerations

All the timing design specifications considered by the invention depend on the path delay of a signal from input to output, that is, the sum of the buffer delays and wire delays accumulated along a complete path from the root to a leaf. The calculation follows these principles: (1) To compute path delays quickly, the electrical parameters of the wires are described in RSPF (reduced standard parasitic format). Swapping a buffer's type then affects only buffer delay and has no effect on wire delay. Figure 6(b) shows the wire parameters in RSPF format. The PI model (R1, C1, C2) is the approximation of y(s) by the first three terms of its Taylor expansion and can be regarded as the equivalent wire load seen by the buffer; the total wire capacitance is C1 + C2. The effect of wire delay is neglected, so buffers driven by the same buffer have equal input arrival times. Buffer delay is obtained by table look-up in the timing library; as Figure 6 shows, polynomial interpolation gives an approximate buffer delay, and the output signal transition time is obtained in the same way. (2) Swapping a buffer's type changes that buffer's input pin capacitance. Seen from the output of the preceding buffer, the load has changed, so the delay of the preceding buffer changes as well. In Figure 7, for example, changing the type of one buffer changes the equivalent load of the buffer driving it, so the delays of both buffers must be looked up again. (3) Shifting a buffer changes the wiring and therefore the buffer loads, so the buffer delays are affected too. In Figure 6, for example, changing the relative physical position of buffer (162) and a neighboring buffer changes the equivalent load of the driving buffer, so the delays of both buffers must be looked up and recomputed. (4) The flip-flops of the synchronous system are the load of the last buffer stage. One buffer usually drives several flip-flops; in other words, there are more flip-flops than last-stage buffers.

Figure 8 is the detailed flow, within the low-power tool for quickly determining buffer positions and types, of the step that judges whether the originally input clock tree design meets the design specification. It comprises modules that adjust the positions of the flip-flops on the clock tree (204) and that balance the load of the clock tree. After each module has made its adjustment, the path delay calculation and design rule check steps (202) and (210) must be carried out. Each module is described in detail below.
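Principles (1) and (2) above can be illustrated with a minimal sketch: a buffer's output load is the total wire capacitance C1 + C2 of the net it drives plus the input pin capacitance of every cell on that net, so swapping a buffer's type changes the load of the buffer that drives it. The `Cell` class, the `PIN_CAP` values, and the helper names are hypothetical; the list returned by `stale_after_swap` follows the rule as stated in the text (the swapped buffer and its driver) and makes no claim about further propagation.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical input pin capacitances in pF; real values come from the cell library.
PIN_CAP = {"X1": 0.002, "X4": 0.006, "FF": 0.003}


@dataclass(eq=False)
class Cell:
    name: str
    kind: str              # key into PIN_CAP
    wire_c1: float = 0.0   # PI-model C1 of the net this cell drives (pF)
    wire_c2: float = 0.0   # PI-model C2 of the net this cell drives (pF)
    parent: Optional["Cell"] = None
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach `child` to the net driven by this cell and return it."""
        child.parent = self
        self.children.append(child)
        return child


def output_load(cell):
    """Wire capacitance C1 + C2 plus the pin capacitance of every fan-out cell."""
    return cell.wire_c1 + cell.wire_c2 + sum(PIN_CAP[c.kind] for c in cell.children)


def stale_after_swap(cell):
    """Cells whose delay must be looked up again after `cell` changes type."""
    return [c for c in (cell.parent, cell) if c is not None]


root = Cell("root", "X4", wire_c1=0.004, wire_c2=0.004)
b1 = root.add(Cell("b1", "X1"))
b2 = root.add(Cell("b2", "X1"))
load_before = output_load(root)
b1.kind = "X4"                      # swap b1 to a larger type
load_after = output_load(root)
stale = [c.name for c in stale_after_swap(b1)]
```

Keeping a parent pointer on every cell is what lets a swap be re-evaluated locally instead of recomputing the whole tree, which matches the linear-time bookkeeping the summary claims for the data structure.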
The first is to check the program and _, including two observation points: 拖 in the database of the buffer power consumption and clock delay calculations 拖 州 ㈣ Green riding L signal conversion between day guard and rotation load size. Therefore, for the target gallop-level buffer, changing the setting or type will cause a 1244015 change of the front-level buffer's rotation signal conversion time. The conversion time of the input signals of components (buffers and flip-flops) in the "same%" lean library is limited by the upper and lower bounds. If the conversion time of the input signal is less than the allowable lower bound, or it is transferred to any Zhi ^ chang # is larger than the allowable upper bound, which will cause the component to fail to predict its phenomenon during operation, which is called a violation of design specifications. Similarly, for the components connected to the target buffer, the load value is in the data Kuyu has a mosquito and τ boundary Gu will also cause the buffer benefits to be unable to assess its neglect in detail, which is a violation of design specifications. ⑺ In digital circuit design, the clock tree is the longest-a signal line, and the clock signal is passed to the chip via clk. , J pass the clock Lai Qing Office, _shirt delete, coffee _ _ series of delay time is approaching _ the better, of which the circuit design to regulate the longest delay time and the shortest delay in property rights-_ It is also necessary to judge whether the curvature of the circuit meets the design specifications. The step of this program will become the most tender input in the present invention. The age judgment program must improve the efficiency as much as possible. The solution process towel The main consideration is @ 素 即 树 Buffer Zhen Zai. In addition to the above points, her design is feasible and affects the overall power consumption. It has significant effects on the clock delay and output signal conversion time of the buffer. 
It is reasonable to avoid the impact of Guang Yang ^ 'the input of the lack of buffers on the load of her time design rules' makes the clock delay effect approach π results. The time of turning the mind and body has a positive effect. The method in step (204) for quickly resolving clock skew design specifications is for the strict time limit of 16 1244015 clocks to the curvature limit. The clock delay caused by the -stage buffer is therefore the original clock. In the tree design # in force xcs, moaning bell distortion rate limitation, ~ the right exists-the flip-flop and other pros and cons are on the same level, which will cause the clock distortion rate limit to be violated. The position on the bell tree is not reconnected to the positive and negative __ on the rails, picking rails, this step will choose to buffer the balance and size of the front axis of the same object, so that the delay effect of the buffer meets the strict Clock skew limit. The force of the moon dagger and the loaded turn _) are those of the M ′ balanced clock shot buffer taken when the above-mentioned steps of moving the flip-flop still fail to meet the clock skew.
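The table lookup and the design-rule check described above can be illustrated with a minimal sketch. It assumes bilinear interpolation (the text only says "polynomial interpolation" and does not state the degree), a delay table indexed by input transition time and output load, and hypothetical helper names (`bilinear`, `violations`, `clock_skew`); none of these come from the patent itself.

```python
from bisect import bisect_right


def bilinear(xs, ys, table, x, y):
    """Interpolate table[i][j], sampled at (xs[i], ys[j]), at the point (x, y).

    Assumption: first-degree interpolation in each axis. Points outside the
    grid are extrapolated from the nearest edge cell.
    """
    def cell(axis, v):
        # Index of the grid cell containing v, clamped to the table range.
        i = bisect_right(axis, v) - 1
        return max(0, min(i, len(axis) - 2))

    i, j = cell(xs, x), cell(ys, y)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    low = table[i][j] * (1 - ty) + table[i][j + 1] * ty
    high = table[i + 1][j] * (1 - ty) + table[i + 1][j + 1] * ty
    return low * (1 - tx) + high * tx


def violations(slew, load, slew_bounds, load_bounds):
    """Return the design rules broken by one component (empty list if none)."""
    broken = []
    if not slew_bounds[0] <= slew <= slew_bounds[1]:
        broken.append("slew")
    if not load_bounds[0] <= load <= load_bounds[1]:
        broken.append("load")
    return broken


def clock_skew(arrival_times):
    """Clock skew as the gap between the latest and earliest flip-flop arrival."""
    return max(arrival_times) - min(arrival_times)


# Hypothetical 2 x 2 delay table: rows = input transition time, columns = load.
SLEWS = [0.1, 0.5]
LOADS = [0.01, 0.05]
DELAY = [[0.10, 0.20],
         [0.30, 0.40]]

delay = bilinear(SLEWS, LOADS, DELAY, 0.3, 0.03)
```

A full checker would walk the tree from the clock pin, look up each buffer's delay and output transition time this way, and report a violation as soon as a bound or the skew limit is exceeded.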

Besides its effect on the delay of the signal from the input pin to the buffers and flip-flops, load balancing also lowers the power consumed in the buffers and flip-flops. As shown in Figure 9(a), buffer A (252) drives buffer B and buffer C, and the output load computed downstream from the output of buffer A is smaller than the maximum load allowed for buffer A's type. The output loads of buffer B (254) and buffer C (256) are likewise acceptable, so the buffer group A, B, C in the clock tree is a feasible solution.

Now consider the level below buffers B and C. Buffer B (254) drives buffers D, E, F, and buffer C (256) drives buffers G, H, I, J, K, L. Assume that the physical wires from each buffer to its next-level buffers are of approximately equal length and that the next-level buffers have equal input loads. The output load of buffer B (254) is then smaller than that of buffer C (256), so the clock delay through buffer B is smaller than the clock delay through buffer C. The clock arrival times at buffers D, E, F therefore differ from those at buffers G, H, I, J, K, L, which violates a strict clock-skew specification.

To keep the clock skew from becoming too large, this step adopts a load-balancing strategy. If two or more buffers exist at the same level of the clock tree, the buffer with the largest load and the buffer with the smallest load at that level are selected. Suppose buffer K, a next-level buffer connected to buffer C, is chosen. The connection between buffer C and buffer K is removed, and buffer K is reconnected to buffer B. The operation must ensure that the output load added to buffer B is smaller than the output load removed from buffer C, so that the output loads of the two buffers move toward balance, as shown in Figure 9(b). If the next-level components driven by buffers B and C are flip-flops, the same criterion applies.

This step is applied from the leaves of the clock tree toward the root, with all buffers of one level considered together. After the adjustment, the design-rule check of step (212) is performed. If the specification is still not met, the flow returns to step (210); if it is met, the step of adjusting the circuit into a feasible solution ends (214).
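The load-balancing move described above can be sketched for a single tree level. This is an illustrative simplification, not the patent's implementation: it assumes every child contributes a fixed input load, ignores wire length, and omits the design-rule re-check of step (212). The function name `balance_level` and the unit loads are hypothetical.

```python
def balance_level(children, load_of):
    """Rebalance one level of a clock tree in place.

    children: dict mapping each driving buffer to the list of components it drives.
    load_of:  dict mapping each driven component to its input load.
    A child is moved from the most loaded to the least loaded buffer as long
    as the move narrows the load gap between those two buffers.
    """
    def total(parent):
        return sum(load_of[c] for c in children[parent])

    while True:
        heavy = max(children, key=total)
        light = min(children, key=total)
        gap = total(heavy) - total(light)
        best = None
        for child in children[heavy]:
            # Moving a child of load L turns the gap into |gap - 2L|.
            new_gap = abs(gap - 2 * load_of[child])
            if new_gap < gap and (best is None or new_gap < best[0]):
                best = (new_gap, child)
        if best is None:
            return children
        children[heavy].remove(best[1])
        children[light].append(best[1])


# The situation of Figure 9(a): B drives three buffers, C drives six.
tree_level = {"B": ["D", "E", "F"], "C": ["G", "H", "I", "J", "K", "L"]}
unit_loads = {name: 1.0 for kids in tree_level.values() for name in kids}
balance_level(tree_level, unit_loads)
```

With equal unit loads, one child of C is moved to B, leaving B with four children and C with five; a further move would not narrow the gap, so the loop stops.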

Figure 10 shows the detailed flow of step (112), which quickly seeks the best solution. For a clock-tree design that meets the design rules and the clock-skew limit, a fast method of reducing power consumption is applied. Starting at step (302) from a clock tree that satisfies the design specification, there are two feasible ways to reduce power consumption: reducing the number of buffers, and using smaller buffer types.

Step (304) is the buffer-removal step.

(1) If a buffer with a single branch exists on the path shared by all buffers on the way to the root, it is a candidate for removal. Removing it affects the clock delay of every flip-flop equally, so the removal cannot cause a clock-skew violation. It must be checked, however, whether the preceding buffer has enough drive capability for the next-level components and whether any component design rule is violated. If such a violation would result, the buffer must remain in the clock-tree design.

(2) Buffers at the same level whose connections to the preceding and following levels are similar are also candidates for removal. Because this operation involves more components, the buffer is removed only after confirming that the loads and input transition times of the preceding and following components meet the database design rules and that the computed clock skew at the last-level flip-flops stays within the limit.

The clock tree obtained after the necessary removals in step (304) is the basic structure on which the optimization steps operate.

Step (306) provides a fast way of determining the buffer types in the clock tree. Based on the flip-flop load driven by the last-level buffers, the size of the last-level buffers is varied step by step, and the buffers in the clock tree are set to that type. Step (308) then checks the database design rules and the clock-skew limit. If either is not met, step (310) enlarges the buffer type by one size and recomputes the clock delay, input transition time, and output load of the buffers at every level.
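The loop formed by steps (306), (308), and (310) can be sketched as follows. The sketch assumes a simplified feasibility test that only compares each last-level load with the maximum load of the candidate type; the real check also covers transition times and clock skew. `BufferType`, `pick_uniform_type`, and the example library are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferType:
    name: str
    max_load: float  # largest output load this type may drive
    power: float     # power per buffer (not used by this simplified check)


def pick_uniform_type(types, last_level_loads):
    """Return the smallest buffer type that can drive every last-level load.

    Types are tried from smallest to largest, mirroring the "enlarge by one
    size and re-check" loop. None means even the largest type fails, in
    which case the caller would restore the initial buffer types.
    """
    for candidate in sorted(types, key=lambda t: t.max_load):
        if all(load <= candidate.max_load for load in last_level_loads):
            return candidate
    return None


LIBRARY = [
    BufferType("X1", max_load=4.0, power=1.0),
    BufferType("X2", max_load=8.0, power=1.8),
    BufferType("X4", max_load=16.0, power=3.5),
]

chosen = pick_uniform_type(LIBRARY, last_level_loads=[5.0, 6.0, 7.0])
```

Here the smallest type cannot drive a load of 7.0, so the search settles on the next size up. Starting from the smallest type keeps power low because the first feasible type is also the cheapest one in this toy library.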

Once the database design rules and the clock-skew limit are both met, the flow reaches step (312) and the fast determination of buffer types ends.

In the main flow chart of the proposed low-power clock-tree design (Figure 3), step (114) checks once more whether the clock tree produced by the fast buffer-type determination satisfies the design specification. One of the termination conditions of step (310) is that the buffers have been enlarged to the largest type and the design rules still cannot be met. In that case step (116) restores the buffer types to their initial state before step (112), so that the design entering the optimization step is guaranteed to satisfy the database design rules and the clock-skew limit.

Figure 11 shows the detailed flow of step (118), which adjusts the buffer types one by one using rules of thumb. Suitable insertion, swap, and shift operations are chosen from an analysis of the clock tree. The optimization starts at step (402). Step (404) balances the flip-flop loads. It resembles the earlier balancing step, but it considers only the flip-flop with the largest clock delay and the flip-flop with the smallest clock delay rather than all flip-flops at once. This shortens the run time, and balancing the buffer loads in turn lowers the overall power consumption. Step (406) checks the result; if the preceding step has worsened the clock skew or produced a clock tree that violates the design rules, step (408) restores the previous state, so that the clock tree remains a feasible solution throughout the process.

Step (410) is the optimization by the known technique of simulated annealing. A temperature parameter T is introduced and lowered by a cooling coefficient in order to find the buffer combination with the best power consumption. To limit the effect of swapping buffer types on the clock skew, all buffers at the same level of the clock tree are given the same type. Besides addressing that effect, this strategy also lowers the computational complexity of the optimization. Step (412) judges whether the clock-tree power consumption after simulated annealing is lower than in the previous stage. If it is, the flow continues with the flip-flop balancing of step (404), in the hope of moving from the best solution found so far toward the global optimum. If it is not, step (414) restores the previous best buffer combination and the optimization ends at step (416). Figure 12 gives the pseudocode of the optimization based on simulated annealing.

The proposed low-skew, low-power clock-tree synthesis program has two main parts. First, a clock tree that violates the design rules is adjusted into a feasible solution. Then, for a clock tree that meets the design rules, the fast and the annealing-based optimization steps are carried out. The computational complexity of the algorithm is analyzed below.

In the step that judges whether the design is a feasible solution, the input transition time at each buffer is computed starting from the input pin, its load is computed from the connection information, and the output transition time, power consumption, and related data are obtained by table lookup. Let M be the number of buffers in the clock tree and N the number of flip-flops.
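The annealing loop of step (410), with one buffer type per tree level, can be sketched as below. This is only a sketch under stated assumptions: the cooling coefficient of 0.9, the starting temperature, the toy power model (buffer count per level times power per type), and the `feasible` callback standing in for the design-rule check are all illustrative choices, not values taken from the patent.

```python
import math
import random


def anneal_levels(level_counts, type_power, feasible, start,
                  t0=1.0, cooling=0.9, t_min=1e-3, moves_per_t=20, seed=0):
    """Search for a low-power assignment of one buffer type per tree level.

    level_counts: number of buffers at each level (root level first).
    type_power:   power of one buffer of each type (index = type).
    feasible:     callable(state) -> bool, a stand-in for the rule check.
    start:        initial feasible state, one type index per level.
    """
    rng = random.Random(seed)

    def power(state):
        return sum(n * type_power[t] for n, t in zip(level_counts, state))

    current = list(start)
    best = list(start)
    t = t0
    while t > t_min:
        for _ in range(moves_per_t):
            candidate = list(current)
            level = rng.randrange(len(candidate))
            candidate[level] = rng.randrange(len(type_power))
            if not feasible(candidate):
                continue  # stay within feasible solutions at all times
            delta = power(candidate) - power(current)
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if power(current) < power(best):
                    best = list(current)
        t *= cooling
    return best, power(best)


def last_level_strong_enough(state):
    # Hypothetical rule: the last level needs at least type 1 to drive flip-flops.
    return state[-1] >= 1


best_state, best_power = anneal_levels(
    level_counts=[1, 2, 4],
    type_power=[1.0, 2.0, 4.0],
    feasible=last_level_strong_enough,
    start=[2, 2, 2],
)
```

Changing a whole level at once keeps sibling paths symmetric, which is the reason the text gives for the same-type-per-level strategy; it also shrinks the search space from one choice per buffer to one choice per level.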

The number of flip-flops is typically several tens of times the number of buffers. Let P be the complexity of one database lookup; the complexity of judging whether a design is a feasible solution is then $O(P \cdot (M+N))$.

In the part that modifies the design into a feasible solution, the key step is balancing the tree structure: load balancing is performed between the buffer with the largest delay and the buffer with the smallest delay. For a complete binary tree, the last-level buffers make up about half of all buffers, that is, M/2. In the worst case every one of them has to be rearranged, and the complexity is $O(P \cdot (M+N) \cdot M/2)$.

The optimization step uses simulated annealing to optimize the power consumption. Because the algorithm changes buffer types one level at a time, and a complete binary tree of M buffers has approximately $\log_2 M$ levels (the tree height), and because every swap step requires a design-rule check, this part costs $O(P \cdot (M+N) \cdot \log_2 M)$. In addition, when the size of a buffer changes it must be checked that the buffer does not overlap existing buffers and flip-flops, which would violate the rules; this overlap check costs $O(M+N)$ per buffer change. The simulated-annealing algorithm therefore needs $O(P \cdot (M+N)^2 \cdot \log_2 M)$ in total. Combining the above, the worst-case complexity of the proposed clock-tree synthesis algorithm is $O(P \cdot (M+N) \cdot M/2) + O(P \cdot (M+N)^2 \cdot \log_2 M)$.

The proposed clock-tree synthesis algorithm considers power consumption and minimum clock skew together and works by inserting, removing, swapping, and shifting buffers. The clock tree of a large integrated circuit may contain thousands of buffers, so to lower the computational complexity the buffer-swap step replaces all buffers of the same level at once. This also serves the strict clock-skew limit, and the goal of lowering the clock-tree power consumption can reach a near-optimal solution.

Applying the algorithm of Figure 3, Table 1 lists five examples used to test the invention. Test case 1 contains 7 buffers and 123 flip-flops; test case 2 contains 30 buffers and 500 flip-flops; test case 3 contains 113 buffers and 1234 flip-flops; test case 4 contains 251 buffers and 5000 flip-flops; test case 5 contains 821 buffers and 10000 flip-flops. The table lists the power consumption and clock skew of the original clock-tree netlists. In Table 2 the proposed algorithm is applied to each test case, and the change in power consumption is observed while the clock-skew limit is varied. Table 2 shows that the algorithm lowers the power consumption of the circuit once the netlist has been modified. A stricter clock-skew constraint raises the clock-tree power consumption, which nevertheless stays below that of the original structure. Table 2 also shows that the buffers added to meet a strict skew limit increase the maximum clock delay to the flip-flops; since the maximum and minimum clock delays increase together, the operation of the circuit is not affected.

In summary, the invention proposes a method that applies buffer insertion, removal, swap, and shift to quickly determine the type and position of every buffer in the clock tree. It can be combined with an established clock-tree synthesis design flow to decide buffer types and positions quickly, so that the overall design meets the database design rules and the clock skew meets the constraint. The tool of the invention for quickly determining buffer types and positions in the clock tree comprises: inputting the clock-tree structure to be processed, including the information of each wire and the buffer insertion positions; inputting the database of the specific components, including input load, power consumption, clock delay, output signal transition time, and similar characteristics; setting the initial state; judging whether the clock-tree structure is a feasible solution that meets the design rules and, if it is not, applying load balancing and buffer balancing to obtain one; quickly determining the buffer types to reduce the power consumption of the clock tree; and applying an optimization combined with simulated annealing to obtain the globally optimal buffer combination with minimum power consumption.

The foregoing merely illustrates the working principle of the invention and is not to be taken as limiting its scope. All equivalent changes and modifications made within the scope of the claims retain the essence of the invention, do not depart from its spirit and scope, and are to be regarded as further embodiments of the invention.

[Brief description of the drawings]
Figure 1 is an expanded view of a clock-tree network.
Figure 2 is a flow chart of a basic IC design layout flow.
Figure 3 is the input/output block diagram of the computational tool of the invention.
Figure 4 is a schematic of the power-consumption analysis of connection structures in the database.
Figure 5 is a schematic of the buffer connection structure in the clock tree.
Figure 6 shows the database table-lookup calculation of buffer delay, output transition time, and power consumption.
Figure 7: (a) is the parasitic RC circuit of a wire; (b) gives its wire electrical parameters in a parasitic-exchange format.
Figure 8 is a flow chart of the feasible-solution step of the invention.
Figure 9 is a schematic of the clock-tree load-balancing step: (a) before balancing, (b) after balancing.
Figure 10 is a flow chart for quickly determining the low-power clock-tree buffer combination.
Figure 11 is a flow chart of the low-power clock-tree optimization algorithm based on simulated annealing.
Figure 12 is the pseudocode of the low-power clock-tree optimization algorithm based on simulated annealing.

[Description of main reference numerals]
Conventional design:
(10) Pad
(12) Clock-tree network
(14) Sub-circuit
(16) Flip-flop
(18) Buffer
(52) Start
(54) Placement
(56) Optimize clock-net synthesis and routing; analyze the buffer insertion scheme
(58) Insert buffers and adjust placement
(60) Timing constraints violated?
(62) Routing
(64) End
The invention:
(102) Clock-tree synthesis netlist
(104) Upper limit on clock skew
(106) Buffer and flip-flop power and timing database
(108) Does the original circuit design meet the design rules?
(110) Turn the circuit into a feasible solution
(112) Quickly seek the best feasible solution by buffer type
(114) Design rules violated or power consumption increased?
(116) Restore the previous state
(118) Adjust buffer sizes one by one using rules of thumb
(120) End of program
(152) (154) Buffers
(156) Flip-flop
(162) (164) Buffers
(166) Parasitic RC circuit
(202) Circuit violating the design rules
(204) Connect flip-flops to the last level
(206) (212) Feasible solution?
(208) End of program
(210) Balance the distribution of components inside the clock tree
(214) End of program

(252) Buffer A

(254) Buffer B

(256) Buffer C
(258) Buffer
(302) Feasible circuit solution
(304) Buffer-removal step
(306) Determine the buffer type from the flip-flop load
(308) Feasible solution?
(310) Already adjusted to the largest buffer?
(312) End
(402) Start of the optimization step
(404) Balance the distribution of flip-flops
(406) Has the clock skew decreased?
(408) (414) Restore the previous state
(410) Simulated annealing
(412) Has the power consumption decreased?
(416) End of execution

Appendix 1
Table 1. Analysis of the test examples

                                      Case 1    Case 2    Case 3    Case 4    Case 5
Number of buffers                     7         30        113       251       821
Number of flip-flops                  123       500       1234      5000      10000
Original power consumption (mW)       4.9822    21.4612   52.0567   210.505   419.8569
Maximum clock delay (ns)              0.6404    0.8741    1.1298    1.2143    1.5029
Original clock skew (ns)              0.0035    0.0140    0.0204    0.0547    0.0586

Appendix 2
Table 2. Optimized power consumption and clock skew of the test examples

                                      Case 1    Case 2    Case 3    Case 4    Case 5
Test 1
  Clock-skew limit (ns)               0.005     0.02      0.05      0.1       0.2
  Change in maximum clock delay       -12%      +14%      +33%      +15%      +90%
  Optimized power consumption (mW)    4.373     18.728    45.038    192.383   359.918
  Power reduction (%)                 12.22     12.73     13.48     8.60      14.27
Test 2
  Clock-skew limit (ns)               0.003     0.01      0.02      0.05      0.1
  Change in maximum clock delay       -13%      +1%       +7%       +14%      +60%
  Optimized power consumption (mW)    4.610     19.858    51.434    195.837   400.698
  Power reduction (%)                 7.47      7.47      1.19      6.96      4.56
Test 3
  Clock-skew limit (ns)               0.002     0.005     0.015     0.03      0.05
  Change in maximum clock delay       -12%      -6%       +4%       +4%       +12%
  Optimized power consumption (mW)    4.794     21.394    51.723    204.903   419.832
  Power reduction (%)                 3.78      0.31      0.64      2.66      0.01
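The power-reduction rows of Table 2 can be cross-checked against the original power figures of Table 1. The short script below assumes the reduction is defined as (original - optimized) / original, which the patent does not spell out; the variable and function names are illustrative.

```python
# Original power consumption per test case (mW), from Table 1.
ORIGINAL_MW = [4.9822, 21.4612, 52.0567, 210.505, 419.8569]

# Optimized power consumption (mW) and reported reduction (%), from Table 2.
OPTIMIZED_MW = {
    "test 1": [4.373, 18.728, 45.038, 192.383, 359.918],
    "test 2": [4.610, 19.858, 51.434, 195.837, 400.698],
    "test 3": [4.794, 21.394, 51.723, 204.903, 419.832],
}
REPORTED_PERCENT = {
    "test 1": [12.22, 12.73, 13.48, 8.60, 14.27],
    "test 2": [7.47, 7.47, 1.19, 6.96, 4.56],
    "test 3": [3.78, 0.31, 0.64, 2.66, 0.01],
}


def reduction_percent(before, after):
    """Relative power saving in percent (assumed definition)."""
    return 100.0 * (before - after) / before


# Largest gap between a recomputed and a reported value, in percentage points.
worst_gap = max(
    abs(reduction_percent(before, after) - reported)
    for name, optimized in OPTIMIZED_MW.items()
    for before, after, reported in zip(ORIGINAL_MW, optimized, REPORTED_PERCENT[name])
)
```

Under this definition every recomputed value agrees with the table to within about 0.01 percentage point; the small residual differences are consistent with rounding of the tabulated power figures.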

Claims (1)

X. Claims

1. A clock-tree synthesis method that simultaneously considers low clock skew and low power consumption, comprising the steps of:
(a) inputting the clock-tree synthesis netlist to be processed, including the electrical parameter information of each wire, together with the buffer timing and power database information;
(b) inputting the upper limit of the timing specification for clock skew;
(c) a design-rule checker that computes, starting from the root of the clock tree where the clock signal enters, the input transition time at each buffer and flip-flop of the clock tree and the output load of each buffer, and judges whether the buffer combination meets the design rules of the specific database;
(d) clock-tree load balancing, which resolves violations of the database design rules in the clock tree and cases in which the clock skew exceeds the design limit;
(e) a fast search for a buffer combination that lowers the power consumption of the clock tree;
(f) an optimization process based on rules of thumb for the specific database and on simulated annealing, which searches the combinations of buffer connections in the clock tree to obtain the globally optimal power consumption of the clock tree.

2. The clock-tree synthesis method simultaneously considering low clock skew and low power consumption of claim 1, wherein the design-rule checker comprises:
(a) judging, from the characteristics of the buffers, whether the input signal transition time meets the design rules;
(b) judging, from the characteristics of the buffers, whether the buffer output load meets the design rules;
(c) computing the clock delay and the output signal transition time of the buffers by table lookup with interpolation.

3. The clock-tree synthesis method simultaneously considering low clock skew and low power consumption of claim 1, wherein the clock-tree load-balancing step comprises:
(a) for flip-flops that are not at the same level of the clock tree, reconnecting them so that all flip-flops lie at the same level, taking into account the buffers to which they are connected;
(b) for the flip-flops driven by the last-level buffers, balancing the output loads of the last-level buffers so as to minimize the overall clock delay and the clock skew;
(c) for the buffers of the clock tree, whose loads affect the clock skew, balancing the buffer output loads so as to resolve the clock-skew limit effectively, the balanced load also improving the output signal transition time and lowering the power consumption of the buffers.

4. The clock-tree synthesis method simultaneously considering low clock skew and low power consumption of claim 1, wherein the fast search for a buffer combination comprises:
(a) selecting a buffer type according to the load of the last-level buffers in the clock tree;
(b) changing all buffers in the clock tree to that type;
(c) among all buffer types, selecting the assignment that meets the design rules with the least power consumption as the solution of the fast search step.

5. [The preamble and steps (a) and (b) of this claim are largely illegible in the source; the surviving text ends with "... the power consumption".]
(c) applying the known technique of simulated annealing to search for the best buffer combination, wherein, to respect the strict clock-skew limit and to lower the computational complexity of the algorithm, the buffer combination is varied level by level, seeking a locally optimal power consumption;
(d) improving once more the balance of the flip-flop loads at the last level of the clock tree and reapplying simulated annealing to search for the combination of clock buffers, seeking the globally optimal power consumption.

XI. Drawings: see the following pages.
TW93124824A 2004-08-18 2004-08-18 A clock tree synthesizing tool synchronously considering low clock skew and low power consumption TWI244015B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW93124824A TWI244015B (en) 2004-08-18 2004-08-18 A clock tree synthesizing tool synchronously considering low clock skew and low power consumption

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW93124824A TWI244015B (en) 2004-08-18 2004-08-18 A clock tree synthesizing tool synchronously considering low clock skew and low power consumption

Publications (2)

Publication Number Publication Date
TW200428240A TW200428240A (en) 2004-12-16
TWI244015B true TWI244015B (en) 2005-11-21

Family

ID=37154671

Family Applications (1)

Application Number Title Priority Date Filing Date
TW93124824A TWI244015B (en) 2004-08-18 2004-08-18 A clock tree synthesizing tool synchronously considering low clock skew and low power consumption

Country Status (1)

Country Link
TW (1) TWI244015B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404352B (en) * 2014-09-11 2018-05-11 北京华大九天软件有限公司 It is a kind of to check clock tree synthesis result bottleneck so as to the method for improving comprehensive quality
CN105930591A (en) * 2016-04-26 2016-09-07 东南大学 Realization method for register clustering in clock tree synthesis
CN114117974A (en) * 2020-08-31 2022-03-01 深圳市中兴微电子技术有限公司 Chip clock driving unit kit, design method, and chip
CN112464612B (en) * 2020-11-26 2023-01-24 海光信息技术股份有限公司 Clock routing method and device, and clock tree

Also Published As

Publication number Publication date
TW200428240A (en) 2004-12-16

Similar Documents

Publication Publication Date Title
JP5373906B2 (en) System and method for designing integrated circuits using adaptive voltage and scaling optimization
US6505322B2 (en) Logic circuit design method and cell library for use therewith
CN109376467B (en) Clock tree layout flow method and clock tree deviation compensation device in integrated circuit
KR102324782B1 (en) Method of performing static timing analysis for an integrated circuit
JP2011530763A5 (en)
US8181130B1 (en) Method for jitter reduction by shifting current consumption
Hyman et al. A clock control strategy for peak power and RMS current reduction using path clustering
TWI244015B (en) A clock tree synthesizing tool synchronously considering low clock skew and low power consumption
Ganeshpure et al. On ATPG for multiple aggressor crosstalk faults
Beheshti-Shirazi et al. A reinforced learning solution for clock skew engineering to reduce peak current and IR drop
Chakraborty et al. Analysis and optimization of NBTI induced clock skew in gated clock trees
Lu et al. Clock tree synthesis with XOR gates for polarity assignment
Tsai et al. Clock planning for multi-voltage and multi-mode designs
Kiamehr et al. A layout-aware X-filling approach for dynamic power supply noise reduction in at-speed scan testing
CN103488457A (en) Variable time delay predicting method and prediction based variable time delay summator
Lin et al. NBTI and leakage reduction using ILP-based approach
Chakrabarti Clock tree skew minimization with structured routing
Ganeshpure et al. On ATPG for multiple aggressor crosstalk faults in presence of gate delays
Tie et al. Dual-Vth leakage reduction with fast clock skew scheduling enhancement
JP2000286342A (en) Computer-readable memory medium, method of design for semiconductor integrated circuit, and method of design for semiconductor device
US11526642B1 (en) Clock network power estimation for logical designs
Yelamarthi et al. Process variation-aware timing optimization for dynamic and mixed-static-dynamic CMOS logic
Lu et al. Register on MEsh (ROME): A novel approach for clock mesh network synthesis
Pham et al. Design of radix-4 SRT dividers in 65 nanometer CMOS technology
Saurab et al. Design and Optimization of Timing Errors on Swapping of Threshold Voltage

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees