Calibration of clock generators in System-on-Chip integrated circuits
This invention relates generally to the calibration of clock generators in System-on-Chip (SoC) integrated circuits and, more particularly, to a method and apparatus for calibrating the clock generators in a System-on-Chip (SoC) integrated circuit, which is particularly useful, but not necessarily exclusively, for use in respect of globally asynchronous and locally synchronous (GALS) integrated circuit designs.
Faster and smaller CMOS technologies and the increase in available die size pave the way for integrating large systems on a single chip, but exploiting these abilities is largely hindered by several obstacles in common design methodologies. Globally-asynchronous locally-synchronous (GALS) architectures are a systematic approach toward solving these problems in a way that does not limit the performance or size of a system. The basic idea is to partition a system into several independently-clocked modules that communicate in a self-timed manner. Thus, the functionality of each subsystem or "Locally-Synchronous Island" is still described and synthesized along well-established synchronous design flows, while communication between such Locally-Synchronous Islands is handled by specialist asynchronous parts.
Figure 1 of the drawings is a schematic block diagram illustrating the general configuration of a GALS module consisting of a Locally-Synchronous Island (LSI) and a Self-Timed Wrapper. The GALS approach restricts the asynchronous parts to some well-known circuits contained in a Self-Timed Wrapper 10 around each Synchronous Island 12. The Self-Timed Wrapper 10 contains asynchronous port controllers 14, an extension 16 for adding testability, and a Local Clock Generator 18 for the Synchronous Island 12. The self-timed approach eliminates the need to time-align the operation of all modules within the framework of a common base clock period. Instead, each module is driven from a Local Pausable Clock Generator 18, ideally a programmable ring oscillator, which is controlled (paused) by the asynchronous port controllers 14 so as to prevent any timing violations from occurring within the Synchronous Island's data interface. Thus, globally-asynchronous locally-synchronous (GALS) operation is a known approach to VLSI systems design that combines the following features:
a) All major modules are designed in accordance with proven synchronous clocking disciplines.
b) Data exchange between any two modules strictly follows a full handshake protocol.
c) Each module is run from its own local clock.
d) Any asynchronous circuitry necessary for coordinating clock-driven with self-timed operation is confined to "asynchronous wrappers" arranged around each Synchronous Island.
As explained above, in a GALS-based system, the problem of distributing a global clock, which arises in other System-on-Chip paradigms, is avoided because the various locally-synchronous modules communicate asynchronously with each other. As a result, on the one hand, the problems related to a single global clock are eliminated, but on the other hand it is necessary to provide a clock generator for each module. Each of these clocks should operate as fast as the logic block can perform, such that each clock generator must be tuned to the delays observed in the respective particular block. In the past, tuning of the clock generators was achieved by using static timing analysis tools, by means of which a certain test vector is applied, after the chip has been fabricated, to the clock generator, which has a delay line in it. However, this process is not automated.
We have now devised an improved arrangement, and it is an object of the present invention to provide a method and apparatus for the automated and accurate calibration of a clock generator to the delay observed in its respective module in System-on-Chip IC's, such as GALS-based IC's and the like.
In accordance with the present invention, there is provided a method of calibrating a clock generator for generating a clock in respect of a plurality of sub-blocks of an integrated circuit, the method comprising:
a) identifying a critical path for each of said plurality of sub-blocks, and obtaining test patterns to sensitize the critical paths;
b) setting said clock generator at an initial frequency;
c) performing an at-speed path delay test in respect of the critical paths identified for each of said sub-blocks;
d) determining whether or not all of the critical paths pass the path delay test, and if not:
i) modifying the frequency of said clock generator; and
ii) repeating said at-speed path delay test;
until all of said critical paths are determined to pass the path delay test.
Beneficially, the critical path for each sub-block is identified by means of a static timing analysis tool. The test patterns may, for example, be generated by an on-chip test pattern generator or they may be loaded from an on-chip scratch pad memory or the like. In one embodiment, the integrated circuit comprises a Globally-Asynchronous Locally-Synchronous (GALS) based circuit, wherein the sub-blocks comprise Locally-Synchronous Islands. Preferably, the clock generator comprises an on-chip programmable ring oscillator having a delay line, wherein a control word is used to change the delay for which the clock generator is configured, and thereby modify the frequency thereof. Beneficially, the initial frequency is the maximum frequency at which the clock generator can operate and the step of modifying said frequency comprises reducing said frequency in a step-wise manner. In one exemplary embodiment, the step of determining whether or not all of the critical paths pass the path delay test comprises comparing the value of the signals propagated therethrough with a predetermined valid signature. The method is preferably performed at boot time of the integrated circuit. However, it will be appreciated that the method can, in principle, be carried out at any time. For example, if it is required to change the operating point (frequency/voltage) of the circuit for power management purposes, the operating frequency will be scaled, which in turn will require the re-calibration of the local clock generators. The present invention also extends to apparatus for calibrating a clock generator for generating a clock in respect of a plurality of sub-blocks of an integrated circuit, and an integrated circuit including such apparatus.
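By way of illustration only, steps b) to d) of the method set out above may be sketched in software as follows. This is a minimal sketch and not a definitive implementation: the function names, the list of candidate frequencies and the path delays are hypothetical, and the pass criterion (path delay not exceeding the clock period, with flip-flop setup time ignored) is a simplifying assumption standing in for the on-chip at-speed path delay test.

```python
from typing import Callable, Sequence


def calibrate_clock(
    frequencies_mhz: Sequence[float],
    all_paths_pass: Callable[[float], bool],
) -> float:
    """Return the first candidate frequency at which every critical path passes.

    frequencies_mhz: candidate settings ordered from highest to lowest, so the
                     first entry plays the role of the initial (maximum) frequency.
    all_paths_pass:  stands in for the at-speed path delay test; it reports
                     whether all critical paths passed at the given frequency.
    """
    for freq in frequencies_mhz:      # step b), then step d) i) on each retry
        if all_paths_pass(freq):      # steps c) and d)
            return freq
    raise RuntimeError("no candidate frequency passed the path delay test")


# Hypothetical critical path delays (in ns) for three sub-blocks.
CRITICAL_PATH_DELAYS_NS = [2.0, 2.6, 3.1]


def paths_pass_at(freq_mhz: float) -> bool:
    """Assumed pass criterion: every path delay fits within one clock period."""
    period_ns = 1000.0 / freq_mhz
    return all(delay <= period_ns for delay in CRITICAL_PATH_DELAYS_NS)


selected = calibrate_clock([500.0, 400.0, 333.0, 300.0, 250.0], paths_pass_at)
print(selected)  # 300.0
```

In this sketch the frequency is reduced step-wise from the maximum, so the value returned is the highest candidate frequency at which the slowest of the assumed paths (3.1 ns) still fits within one clock period.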
These and other aspects of the invention will be apparent from, and elucidated with reference to, the embodiment described herein.
An embodiment of the present invention will now be described by way of example only and with reference to the accompanying drawings, in which:
Fig. 1 is a schematic block diagram illustrating the general configuration of a GALS module;
Fig. 2 is a schematic timing diagram illustrating the operation of a clock for at-speed path delay testing;
Fig. 3 is a schematic illustration of the configuration of a sub-block of an integrated circuit;
Fig. 4 is a schematic block diagram illustrating a method of calibrating a ring oscillator according to an exemplary embodiment of the present invention; and
Fig. 5 is a schematic flow diagram illustrating the principal steps in a method of calibrating a ring oscillator according to an exemplary embodiment of the present invention.
The present invention provides a novel method of using at-speed scan test vectors for analyzing the delay of critical paths in IC designs. This information is then used for calibrating on-chip ring oscillators used for generating the clocks. By way of general background, at-speed scan based testing is a well-known technique for stuck-at-fault structural testing of an integrated circuit. Existing static timing analysis tools generally perform two types of at-speed testing: transition delay testing and path delay testing. Both of these types of testing work by generating scan patterns that can be scanned in at low speed. After a scan vector is scanned in, two or more capture clocks are applied at full speed and then the captured result is scanned out, usually at slow speed, as illustrated schematically in Figure 2 of the drawings.
Referring to Figure 3 of the drawings, in general, all signals start and end in registers (namely, a launching flip-flop 100 and a capture flip-flop 200 respectively) every clock period. There is generally only one clock in every sub-block (or Locally-Synchronous Island) of a chip, and each flip-flop is clocked every cycle. As explained above, each of the clocks in a chip should operate as fast as the respective logic block can perform, such that each clock generator must be tuned to the delays observed in its particular block. In other words, the clock speed is determined by the slowest path between the flip-flops 100, 200 in a design, which path is often referred to as "the critical path" 300. A known static timing analyzer, such as the Cadence Pearl Static Timing Analyzer, can be used to trace through the actual logic and calculate the slowest and fastest delay for each true path through the logic. It then outputs a timing report for critical (slowest) and other paths. Path delay testing is used to test known critical paths at-speed. For path delay testing, known Design for Test (DFT) tools generate a pattern, which, when scanned in,
sensitizes the path of interest (i.e. the critical path) by setting up the value to be propagated through the path. Many of the gates in the critical path will have additional inputs which need to be set appropriately in order to propagate the signal from the launching flip-flop 100 to the capture flip-flop 200, as illustrated schematically in Figure 3. The test passes if the capture flip-flop 200 captures the correct value at-speed.
As explained above, however, accurate calibration of on-chip clock generators is crucial in extracting the optimal performance from a circuit, such that conventional static timing analysis methods are no longer adequate for use in generating sufficiently accurate calibration data for clock generators. One of the factors contributing to this inaccuracy is the on-chip variation of process, voltage and temperature (PVT). In deep submicron technologies, on-chip variation has a significant impact on circuit delays, which determine the performance of the chip, such that on-chip variations have become too significant to be ignored when calibrating clock generators in circuits such as GALS-based IC's.
The present invention takes into account the on-chip variations by using real in-circuit critical paths for delay measurements. This is the most accurate data for the calibration of, for example, ring oscillators generating the clocks for respective Locally-Synchronous Islands of a GALS-based integrated circuit. Prior art schemes model the critical path using delay line elements for estimating the maximum delay of the circuit, usually based on static timing analysis, as explained above. On the other hand, the present invention employs at-speed path delay testing for sensitizing the critical paths for delay measurements. The method according to the invention, as defined above, comprises the following steps:
a) using static timing analysis tools (for example, Cadence Pearl or Synopsys PrimeTime), a list of the top N functional critical paths is generated;
b) an automatic test pattern generator (ATPG) tool is used to generate test patterns to sensitize the N critical paths;
c) the calibration process in respect of the ring oscillator generating the clocks is started: in order to be correct, all of the critical paths should clear the path delay test (the capture flip-flop 200 should capture the correct results), and this is checked against a pre-stored signature, in accordance with an exemplary embodiment of the invention.
It will, of course, be appreciated that the test patterns could also be loaded from an on-chip scratch pad memory. An exemplary embodiment of the present invention is illustrated schematically in Figure 4 of the drawings. The ring oscillator 400 has a programmable delay line 500,
which can be configured for different delays using a control word 600. Scan patterns are loaded into the N modules at the respective launching flip-flops 100 and the resultant signal value captured by the capture flip-flops is output to a register 700. The resultant signature of values is compared with a valid signature 800. If it does not match, the delay of the ring oscillator 400 is changed by the control word 600, and the path delay test is carried out again. This process is repeated until the resultant signature 700 matches the valid signature 800, at which time, the calibration process is complete. This process, which can be carried out at boot time, is illustrated in more detail in Figure 5 of the drawings. Critical paths and test patterns to sensitize the critical paths are generated using known techniques, in respect of each of the N modules or Islands in the chip, and the test patterns for all of the N critical paths are scanned in. Next, the control word for the delay in the delay line (maximum frequency possible for the ring oscillator) is loaded, and the ring oscillator then generates at-speed clock pulses for all of the critical paths. The pattern of resultant values captured by the capture flip-flops is compared with the known valid signature to determine whether or not all of the critical paths passed the path delay test. If not, the delay on the delay line of the ring oscillator is increased, and the process is repeated. Once all of the critical paths pass the path delay test, the calibration process is complete and the ring oscillator is set at the delay for which it was determined that all of the critical paths passed the path delay test. In other words, depending on the captured data from the critical paths, the delay on the delay line of the programmable ring oscillator is modified until all of the critical paths pass the path delay test. 
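By way of illustration only, the calibration loop of Figures 4 and 5 may be modelled in software as follows. This is a minimal sketch under stated assumptions and not a description of the actual hardware: the names `CriticalPath`, `capture_signature` and `calibrate_ring_oscillator` are hypothetical, the clock period is assumed to grow linearly with the control word, and a path that fails the test is assumed to capture the inverse of the expected value.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class CriticalPath:
    delay_ps: int       # hypothetical launch-to-capture delay of the path
    expected_bit: int   # value the capture flip-flop should hold on a pass


def capture_signature(paths: Sequence[CriticalPath], period_ps: int) -> Tuple[int, ...]:
    """Model the capture flip-flops: a path slower than the clock period
    is assumed to capture the wrong (inverted) value."""
    return tuple(
        p.expected_bit if p.delay_ps <= period_ps else 1 - p.expected_bit
        for p in paths
    )


def calibrate_ring_oscillator(
    paths: Sequence[CriticalPath],
    valid_signature: Tuple[int, ...],
    base_period_ps: int,
    step_ps: int,
    max_word: int,
) -> Tuple[int, int]:
    """Increase the delay-line control word until the captured signature
    matches the valid signature; return (control word, clock period in ps)."""
    for control_word in range(max_word + 1):
        # Assumed linear delay line: word 0 gives the maximum frequency.
        period_ps = base_period_ps + control_word * step_ps
        if capture_signature(paths, period_ps) == valid_signature:
            return control_word, period_ps
    raise RuntimeError("signature never matched within the control word range")


# Hypothetical example: three critical paths and their expected capture values.
paths = [CriticalPath(1800, 1), CriticalPath(2350, 0), CriticalPath(2100, 1)]
valid = tuple(p.expected_bit for p in paths)

result = calibrate_ring_oscillator(paths, valid, base_period_ps=2000, step_ps=100, max_word=15)
print(result)  # (4, 2400)
```

With these assumed values, control words 0 to 3 give periods of 2000 ps to 2300 ps, all shorter than the slowest path (2350 ps), so the signature first matches at control word 4.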
In this case, this is done by comparing the captured value with a known good value from the outputs of the critical paths, but other methods of verification are envisaged. Advantages of the present invention include: accuracy - since on-chip real critical paths are used for measuring the delays, the clock generators can be fine-tuned to run at optimal operating frequency (closest to the maximum possible frequency); simplicity - the proposed solution has no major costs in terms of silicon area; it uses existing tools (for generating test patterns for activating critical paths in a circuit) and minimum hardware for calibration. An embodiment of the present invention has been described above by way of example only, and it will be apparent to a person skilled in the art that modifications and variations can be made to the described embodiments without departing from the scope of the
invention as defined by the appended claims. Further, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The term "comprising" does not exclude the presence of elements or steps other than those listed in a claim. The terms "a" or "an" do not exclude a plurality. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that measures are recited in mutually different independent claims does not indicate that a combination of these measures cannot be used to advantage.