CN110688723A

CN110688723A - Rapid design method for clock distribution network

Info

Publication number: CN110688723A
Application number: CN201910835765.XA
Authority: CN
Inventors: 胡向东; 潘达杉; 童中华; 黄金明
Original assignee: Shanghai Integrated Circuits with Highperformance Center
Current assignee: Shanghai Integrated Circuits with Highperformance Center
Priority date: 2019-09-05
Filing date: 2019-09-05
Publication date: 2020-01-14
Anticipated expiration: 2039-09-05
Also published as: CN110688723B

Abstract

The invention relates to a rapid design method for a clock distribution network. The clock network is divided into a first-level clock network, driven by first-level clock network driving units, and a second-level clock network, driven by second-level clock network driving units. The method comprises the following steps: performing a timing-first placement and obtaining the positions of the flip-flops; dividing the flip-flops into a plurality of local areas with a clustering algorithm according to the layout parameters, and establishing the second-level clock network; dividing the second-level clock network driving units into a plurality of uniformly loaded areas with a clustering algorithm according to the layout parameters, and establishing the first-level clock network; and routing the first-level and second-level clock networks. The invention can reduce both the delay and the load of the clock network.

Description

Rapid design method for clock distribution network

Technical Field

The invention relates to the technical field of sequential circuit design, in particular to a rapid design method of a clock distribution network.

Background

In high-performance microprocessors, the clock network is a critical component that affects both performance and power consumption. A low-delay, low-skew clock network effectively reduces timing overhead, and processor performance can be further improved by reducing the on-chip skew and delay on the clock path. Meanwhile, the clock network accounts for more than 40% of the dynamic power consumption of the whole processor, so reducing clock network power helps to reduce processor power. Clock network power is generally reduced by reducing the load of the clock network. A low-delay, low-skew, low-power clock network is therefore critical to implementing an energy-efficient processor.

To realize a low-delay, low-skew clock network, the number of driving stages in the clock network must be kept as small as possible, and all flip-flops must have similar delays to the driving point. Common practice is to strengthen the drive and reduce delay either by gathering the flip-flops near the driving point or by forming the clock network into a large grid with multi-point drive. The former degrades timing because it moves the flip-flops away from their timing-optimal positions. The latter increases clock network power because multi-point drive creates direct-current paths, and the many clock grid lines occupy a large share of high-quality routing resources, which hurts design performance. An energy-efficient clock network design must therefore solve both problems at once: it must provide a single-point-driven clock network with few driving stages and low delay, without affecting the timing or the routing of the chip.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a rapid design method for a clock distribution network that can reduce both the delay and the load of the clock network.

The technical solution adopted by the invention to solve this problem is as follows: a rapid design method for a clock distribution network is provided, in which the clock network is divided into a first-level clock network, driven by first-level clock network driving units, and a second-level clock network, driven by second-level clock network driving units, the method comprising the following steps:

(1) performing a timing-first placement and obtaining the positions of the flip-flops;

(2) dividing the flip-flops into a plurality of local areas with a clustering algorithm according to the layout parameters, establishing the connection relation of the second-level clock network, determining the position of the second-level clock network driving unit in each local area, and establishing a fishbone-shaped backbone clock network within each local area;

(3) dividing the second-level clock network driving units into a plurality of uniformly loaded areas with a clustering algorithm according to the layout parameters, establishing the connection relation of the first-level clock network, determining the first-level clock network driving unit of each uniformly loaded area, and establishing a binary-tree-shaped backbone clock network within each uniformly loaded area;

(4) routing the first-level clock network and the second-level clock network.
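By way of non-limiting illustration, the fishbone-shaped backbone of step (2) can be pictured with a small Python sketch. The geometry is an assumption made for this example only: one horizontal spine through the driving unit, with each flip-flop attached to the spine by a vertical rib. The function name `fishbone_wirelength` is hypothetical and does not come from the source.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fishbone_wirelength(driver: Point, flops: List[Point]) -> float:
    """Total wire length of a simplified fishbone: one spine plus vertical ribs.

    Assumption (not from the source): the spine runs horizontally at the
    driver's y-coordinate and spans the x-extent of the driver and all
    flip-flops; every flip-flop reaches the spine through its own rib.
    """
    if not flops:
        return 0.0
    xs = [x for x, _ in flops] + [driver[0]]
    spine = max(xs) - min(xs)                          # horizontal backbone
    ribs = sum(abs(y - driver[1]) for _, y in flops)   # vertical branches
    return spine + ribs
```

A figure of this kind can be used to compare candidate driving unit positions inside one local area: the lower the total wire length, the lower the wire load seen by the driving unit.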

In step (2), after multiple iterations the clustering algorithm gives the positions of the second-level clock network driving units for which the total wire length from the flip-flops to the second-level clock network is shortest, so that the total load of the second-level clock network is minimized.
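As a minimal sketch of this clustering step, the following Python code runs plain Lloyd-style K-means over flip-flop coordinates and returns one candidate driving unit position per local area. It is an illustration under stated assumptions, not the patented implementation: squared Euclidean distance stands in for wire length, the seeding rule is an arbitrary deterministic choice, and the delay, routing layer and driver count constraints of the method are not modelled.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def kmeans(points: List[Point], k: int, iterations: int = 20):
    """Plain K-means over 2-D positions; returns (centres, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Deterministic seeding (an assumption): pick seeds spread through the list.
    step = len(points) / k
    centres = [points[int(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment: each flip-flop joins its nearest candidate driver.
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centres[c][0]) ** 2
                + (p[1] - centres[c][1]) ** 2)
            for p in points
        ]
        # Update: move each driver to the mean position of its flip-flops.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append((sum(x for x, _ in members) / len(members),
                                    sum(y for _, y in members) / len(members)))
            else:
                new_centres.append(centres[c])  # leave an empty cluster in place
        if new_centres == centres:
            break  # converged: no driver moved
        centres = new_centres
    return centres, labels
```

Here `k` plays the role of the driving point number constraint: a larger `k` shortens the connections inside each local area at the cost of more second-level driving units.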

In step (3), after multiple iterations the clustering algorithm gives the positions of the first-level clock network driving units for which the total wire length from the second-level clock network driving units to the first-level clock network is shortest, so that the total load of the first-level clock network is minimized.
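The source does not state how the uniformly loaded areas of step (3) are enforced inside the clustering. One possible reading, sketched below in Python, is a capacity-limited assignment in which each second-level driving unit joins the nearest first-level centre that still has load budget left. The function `assign_balanced`, the `capacity` parameter and the use of Manhattan distance are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def assign_balanced(units: List[Point], loads: List[float],
                    centres: List[Point], capacity: float) -> List[int]:
    """Assign each unit to the nearest centre whose load budget is not exceeded."""
    used = [0.0] * len(centres)
    labels = [-1] * len(units)
    # Place the heaviest units first so they are not squeezed out later.
    for i in sorted(range(len(units)), key=lambda i: -loads[i]):
        x, y = units[i]
        by_distance = sorted(
            range(len(centres)),
            key=lambda c: abs(x - centres[c][0]) + abs(y - centres[c][1]))
        for c in by_distance:
            if used[c] + loads[i] <= capacity:
                used[c] += loads[i]
                labels[i] = c
                break
        else:
            raise ValueError("capacity too small to place every unit")
    return labels
```

An assignment step of this kind could replace the nearest-centre step of an ordinary K-means loop, trading a little wire length for evenly loaded first-level regions.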

The layout parameters in steps (2) and (3) comprise a clock network delay constraint, a routing layer selection constraint, and a constraint on the number of driving points.
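To illustrate how these three constraints might interact, the Python sketch below groups them in one record and checks a candidate region against them using the Elmore delay of a distributed RC wire, 0.5·r·c·L². The per-layer resistance and capacitance values, the layer names and the corner-to-corner worst-case length are hypothetical placeholders, not data from the source; driver resistance and flip-flop pin capacitance are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LevelConstraints:
    max_delay_ps: float   # clock network delay constraint
    layer: str            # routing layer selection
    max_drivers: int      # upper bound on the number of driving points


# Hypothetical wire parasitics per layer: (ohm per um, fF per um).
LAYER_RC: Dict[str, Tuple[float, float]] = {
    "M8": (0.05, 0.20),
    "M4": (0.50, 0.20),
}


def wire_delay_ps(length_um: float, layer: str) -> float:
    """Elmore delay of a distributed RC wire: 0.5 * r * c * L^2."""
    r, c = LAYER_RC[layer]
    # ohm * fF = 1e-15 s = 1e-3 ps
    return 0.5 * r * c * length_um ** 2 * 1e-3


def meets_constraints(width_um: float, height_um: float,
                      n_drivers: int, cons: LevelConstraints) -> bool:
    # Assumed worst case: a Manhattan run from one corner of the region
    # to the opposite corner.
    worst_um = width_um + height_um
    return (n_drivers <= cons.max_drivers
            and wire_delay_ps(worst_um, cons.layer) <= cons.max_delay_ps)
```

Because the wire delay grows with the square of the length, halving the region size or moving to a lower-resistance layer are the two levers that such a check exposes.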

The method further comprises the following step after step (4): extracting the parasitic parameters of the whole clock system, analyzing the load of the first-level and second-level clock networks, and adjusting the drive strength and the load of the first-level and second-level clock network driving units accordingly.

The method further comprises the following step after step (4): performing redundancy trimming on the backbones of the first-level and second-level clock networks, wherein the effective part of each backbone is determined by identifying the positions of the vias (connecting holes) between the backbone and its branches, the length attribute of each wire is changed one by one, and a backbone to which no branch is connected is deleted entirely.
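The adjustment after parasitic extraction can be sketched as follows. This Python example assumes one possible policy, which the source does not spell out: every region is padded towards the heaviest extracted load by switching in steps of the capacitor array, and the weakest buffer setting that can drive the resulting load is selected. The names `balance_loads` and `pick_drive`, the step size and the drive table are hypothetical.

```python
from typing import Dict, List, Tuple


def balance_loads(extracted_ff: Dict[str, float],
                  cap_step_ff: float) -> Dict[str, int]:
    """Capacitor-array steps to switch in per region so that loads match.

    Each region is padded towards the heaviest extracted load, rounded to
    the nearest whole step.
    """
    target = max(extracted_ff.values())
    return {name: round((target - load) / cap_step_ff)
            for name, load in extracted_ff.items()}


def pick_drive(load_ff: float, strengths: List[Tuple[str, float]]) -> str:
    """Pick the weakest setting whose assumed maximum load covers load_ff."""
    for name, max_load_ff in sorted(strengths, key=lambda s: s[1]):
        if load_ff <= max_load_ff:
            return name
    raise ValueError("no drive setting is strong enough")
```

Padding lighter regions costs some power, so in practice the target load and the drive settings would be chosen together against the clock slope requirement.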
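The trimming rule described above can be illustrated with a one-dimensional Python sketch. It assumes that a backbone is a straight segment described by its start and end coordinates, that branch connections are given as via positions along that segment, and that the driving point must stay connected. The function `trim_backbone` is a hypothetical helper, not an interface of any layout tool.

```python
from typing import List, Optional, Tuple


def trim_backbone(start: float, end: float, drive_pos: float,
                  branch_vias: List[float]) -> Optional[Tuple[float, float]]:
    """Shrink a backbone to the span that branches actually use.

    Returns the new (start, end) of the backbone, or None when no branch
    via lies on it, in which case the whole backbone can be deleted.
    """
    on_wire = [v for v in branch_vias if start <= v <= end]
    if not on_wire:
        return None  # no branch connected: delete the backbone
    # Keep everything between the outermost vias and the driving point.
    keep = on_wire + [drive_pos]
    return (min(keep), max(keep))
```

For example, a backbone spanning 0 to 100 with its driving point at 50 and branch vias at 60 and 70 would be shortened to the span from 50 to 70, removing the unused wire on both ends.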

Advantageous effects

Owing to the above technical solution, the invention has the following advantages and positive effects compared with the prior art: the positions of the flip-flops need not be adjusted, which reduces the impact on timing; the positions of the driving points are adjusted by the K-means algorithm, which reduces both the delay and the load of the clock network; as a result, the maximum operating frequency of the processor is raised and the power consumption of the microprocessor is reduced.

Drawings

FIG. 1 is a schematic diagram of a clock network according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the structure of a first-level clock network driving unit;

FIG. 3 is a schematic diagram of the structure of a second-level clock network driving unit;

FIG. 4 is a flow chart of the present invention.

Detailed Description

The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.

The embodiment of the invention relates to a rapid design method for a clock distribution network. As shown in FIG. 1, the clock network is divided into a first-level clock network (FLCN for short) and a second-level clock network (SLCN for short). The first-level clock network is driven by first-level clock network driving units (FLCD for short), and the second-level clock network is driven by second-level clock network driving units (SLCD for short). The first-level clock network interconnects the inputs of the second-level clock network driving units and uses an H-type interconnection. The second-level clock network interconnects the clock inputs of the flip-flops and uses a fishbone-shaped interconnection.

The structure of the first-level clock network driving unit is shown in FIG. 2. It comprises a high/low-frequency switching circuit that prevents glitches, a buffer with adjustable drive, and a load-adjustable capacitor array. Drive and load adjustment are realized through the adjustable-drive buffer and the load-adjustable capacitor array.

The structure of the second-level clock network driving unit is shown in FIG. 3. It comprises a duty cycle adjusting circuit, a clock gating circuit, a buffer with adjustable drive, and a load-adjustable capacitor array. Drive and load adjustment are realized through the adjustable-drive buffer and the load-adjustable capacitor array.

FIG. 4 is a flow chart of the present embodiment. The input of the flow is the timing-first flip-flop placement result. Area planning of the second-level clock network is then performed under the user-defined clock network delay constraint, routing layer selection constraint and driving point number constraint, and the positions of the second-level clock network driving units are optimized by a K-means algorithm. Area planning of the first-level clock network is performed according to the positions of the second-level clock network driving units, and the positions of the first-level clock network driving units are optimized by an optimized K-means algorithm. After the extent and the driving point positions of the two-level clock network are determined, clock grid routing is carried out. The method works bottom-up from the timing-first flip-flop placement result, optimizing the positions of the clock driving units and designing the clock network level by level. The specific steps are as follows:

(1) Performing a timing-first placement and obtaining the positions of the flip-flops. That is, after the chip has completed timing-first placement, the positions of the flip-flops are obtained first, in preparation for the area planning of the second-level clock network.

(2) Dividing the flip-flops into a plurality of local areas with a clustering algorithm according to the layout parameters, establishing the connection relation of the second-level clock network, determining the position of the second-level clock network driving unit in each local area, and establishing a fishbone-shaped backbone clock network in each local area. Specifically, according to the user-defined delay constraint of the second-level clock network, the routing layer selection constraint of the second-level clock network and the constraint on the number of second-level clock network driving units, the flip-flops are divided into a plurality of local areas with a K-means algorithm, the connection relation of the second-level clock network is established, the position of the second-level clock network driving unit of each local area is determined, and a fishbone-shaped backbone clock network is established in each local area. After multiple iterations, the K-means algorithm gives the positions of the second-level clock network driving units for which the total wire length between the flip-flops and the second-level clock network is shortest, so that the total load of the second-level clock network is minimized.

(3) Dividing the second-level clock network driving units into a plurality of uniformly loaded areas with a clustering algorithm according to the layout parameters, establishing the connection relation of the first-level clock network, determining the first-level clock network driving unit of each uniformly loaded area, and establishing a binary-tree-shaped backbone clock network in each uniformly loaded area. Specifically, according to the user-defined delay constraint of the first-level clock network, the routing layer selection constraint of the first-level clock network and the constraint on the number of first-level clock network driving units, the second-level clock network driving units are divided into a plurality of uniformly loaded areas with a K-means algorithm, the connection relation of the first-level clock network is established, the position of the first-level clock network driving unit of each uniformly loaded area is determined, and a binary-tree-shaped backbone clock network is established in each uniformly loaded area. After multiple iterations, the K-means algorithm gives the positions of the first-level clock network driving units for which the total wire length between the second-level clock network driving units and the first-level clock network is shortest, so that the total load of the first-level clock network is minimized.

It is easy to see that, in this embodiment, the propagation delay of the clock network is adjusted, and the clock skew is controlled, by setting in the K-means algorithm the length and width of the target clock network area, the metal routing layer of the clock network, and the upper limits on the numbers of first-level and second-level clock network driving units. The delay and the skew of the two-level clock network are thus both configurable.

(4) Routing the first-level clock network and the second-level clock network. During routing, the driving end and the load end of the same clock network are routed according to different rules, which reduces the probability of SEM violations. Both the first-level and the second-level clock networks have high fan-out and high load. During routing, the load is therefore temporarily disconnected from the clock network, and the driving end is routed on high metal layers with a large line width. The driving-end routing is then locked, and the fishbone-shaped second-level clock network at the load end is routed with a small line width. In this way the finished driving-end routing cannot be changed, the driving end of the clock network keeps the stronger routing, and SEM violations are largely avoided.

(5) Simulating key parameters of the routed clock network, such as delay, load and shielding, and modifying the drive size and the load value in real time, so that the delay of the clock network is reduced as far as possible and the load of the clock network is balanced. Specifically, after the routing of the two-level clock network is completed, the parasitic parameters of the whole clock system are extracted, the load of each first-level and each second-level clock network is analyzed, and the drive strength and the internal load of the first-level and second-level clock network driving units are adjusted accordingly. While the clock network load is balanced, the slope of the clock signal is kept within the design requirement.

(6) Trimming the redundant clock backbones of the finally optimized clock network, reducing the load of the clock network as far as possible and thereby reducing both its power consumption and its delay. Specifically, after routing is completed, redundancy trimming is performed on the backbones of the first-level and second-level clock networks: the effective part of each backbone is determined by identifying the positions of the vias (connecting holes) between the backbone and its branches, the length attribute of each wire is changed one by one, and a backbone to which no branch is connected is deleted entirely. Redundant wiring is thus removed and the load of the clock network is reduced, which benefits both delay and power consumption.

The invention therefore does not need to adjust the positions of the flip-flops, which reduces the impact on timing. It adjusts the positions of the driving points by the K-means algorithm, which reduces both the delay and the load of the clock network, raising the maximum operating frequency of the processor and reducing the power consumption of the microprocessor. The two-level clock network generated by the method has short delay and small skew. With this design method, a design with hundreds of thousands of flip-flops can reach a clock skew of less than 15 ps, which greatly reduces timing overhead, raises the design frequency and reduces the design power consumption.

Claims

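1. A rapid design method for a clock distribution network, characterized in that the clock network is divided into a first-level clock network, driven by first-level clock network driving units, and a second-level clock network, driven by second-level clock network driving units, the method comprising the following steps:

(1) performing a timing-first placement and obtaining the positions of the flip-flops;

(2) dividing the flip-flops into a plurality of local areas with a clustering algorithm according to the layout parameters, establishing the connection relation of the second-level clock network, determining the position of the second-level clock network driving unit in each local area, and establishing a fishbone-shaped backbone clock network within each local area;

(3) dividing the second-level clock network driving units into a plurality of uniformly loaded areas with a clustering algorithm according to the layout parameters, establishing the connection relation of the first-level clock network, determining the first-level clock network driving unit of each uniformly loaded area, and establishing a binary-tree-shaped backbone clock network within each uniformly loaded area;

(4) routing the first-level clock network and the second-level clock network.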

2. The rapid design method for a clock distribution network according to claim 1, wherein, after multiple iterations, the clustering algorithm in step (2) gives the positions of the second-level clock network driving units for which the total wire length from the flip-flops to the second-level clock network is shortest, so that the total load of the second-level clock network is minimized.

3. The rapid design method for a clock distribution network according to claim 1, wherein, after multiple iterations, the clustering algorithm in step (3) gives the positions of the first-level clock network driving units for which the total wire length from the second-level clock network driving units to the first-level clock network is shortest, so that the total load of the first-level clock network is minimized.

4. The rapid design method for a clock distribution network according to claim 1, wherein the layout parameters in steps (2) and (3) comprise a clock network delay constraint, a routing layer selection constraint and a constraint on the number of driving points.

5. The rapid design method for a clock distribution network according to claim 1, wherein step (4) is followed by the step of: extracting the parasitic parameters of the whole clock system, analyzing the load of the first-level and second-level clock networks, and adjusting the drive strength and the load of the first-level and second-level clock network driving units accordingly.

6. The rapid design method for a clock distribution network according to claim 5, wherein step (4) is followed by the step of: performing redundancy trimming on the backbones of the first-level and second-level clock networks, wherein the effective part of each backbone is determined by identifying the positions of the vias (connecting holes) between the backbone and its branches, the length attribute of each wire is changed one by one, and a backbone to which no branch is connected is deleted entirely.